[ensembl-dev] help to retrieve the genomic sequence for 28S rRNA

Ian Dunham dunham at ebi.ac.uk
Wed May 4 17:18:17 BST 2011


Hi Bin

I probably should have answered you earlier on this, since I have looked
at this quite a bit over the years, but I've been busy.

In human, the ribosomal RNA gene clusters containing the 28S, 18S and
5.8S RNAs are located on the short arms of the acrocentric chromosomes
(13, 14, 15, 21 and 22) in long tandem arrays, with a single repeat unit
of 40-odd kb. Each array contains many copies, and copy number is
variable between individuals. These regions are difficult to assemble
into what are now called supercontigs because of this repeating
structure and other satellite repeats that lie on these chromosome arms,
so although there have been some BACs sequenced representing the
clusters, the contigs are not integrated into the chromosome
supercontigs. Instead they lie on the set of unplaced supercontigs.
Supercontig GL000220.1 for instance is one of these. There are others.
In the sequence databases there are several BACs that cover these
clusters. I don't have a complete list to hand (although I initiated
the sequencing of several of these), but one example can be found in
accession U13369.1; see http://www.ebi.ac.uk/ena/data/view/U13369
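
As a rough illustration of pulling such a record and cutting out a region
with flanking sequence, here is a minimal Python sketch. The ENA Browser
API URL pattern is an assumption and should be checked against the current
ENA documentation; the function names are hypothetical, and the feature
coordinates have to be supplied by you (for example from the record's own
annotation), since none are given here.

```python
from urllib.request import urlopen

# Assumption: the ENA Browser API serves FASTA at this path. Verify against
# the current ENA documentation before relying on it.
ENA_FASTA_URL = "https://www.ebi.ac.uk/ena/browser/api/fasta/{accession}"


def ena_fasta_url(accession):
    """Build the (assumed) ENA FASTA URL for an accession such as U13369.1."""
    return ENA_FASTA_URL.format(accession=accession.strip())


def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # stop when a second record starts
            header = line[1:]
        elif header is not None:
            chunks.append(line)
    if header is None:
        raise ValueError("no FASTA header found")
    return header, "".join(chunks).upper()


def flanking_window(sequence, start, end, flank=20000):
    """Return sequence[start:end] plus up to `flank` bases on each side.

    Coordinates are 0-based and end-exclusive. The window is clipped to
    the ends of the record, so it may be shorter than requested.
    """
    if not 0 <= start < end <= len(sequence):
        raise ValueError("feature coordinates fall outside the sequence")
    left = max(0, start - flank)
    right = min(len(sequence), end + flank)
    return sequence[left:right]


def fetch_record(accession, timeout=30):
    """Download one record from ENA (network access required)."""
    with urlopen(ena_fasta_url(accession), timeout=timeout) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

Usage would be along the lines of fetch_record("U13369.1") followed by
flanking_window(seq, start, end) with the 28S coordinates you have
determined. Because the repeat unit is only 40-odd kb, a 20 kb flank on
each side will usually be clipped by the ends of a single-unit record.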

Unfortunately Ensembl doesn't annotate these RNAs because the ncRNA
pipeline uses the RFAM database, and the 28S and 18S RNAs are not
annotated in RFAM because they are too long. It would be useful to have
manual annotation on these but this is currently restricted to the
canonical chromosomes.

Off the top of my head I don't know where the murine ribosomal gene
clusters are, although they have a very similar cluster organisation to
human.  But I suspect it is easy enough to find out from the literature.

This is one of those areas that was worked on a lot in the early days of
genomics/molecular biology because the copy number is very helpful, but
latterly has been less active, and is painful for modern sequencing methods.

Let me know if there is something else specific that you need.

Cheers
Ian Dunham

On 04/05/2011 16:32, Liu,Bin wrote:
> Hi, Nath,
> 
>             Thanks for the help. However, the 28S rRNA I am looking for
> doesn't code for any protein. It is ribosomal RNA. I guess it should be
> categorized as a non-coding gene in the genome annotation.
> 
> I got a partial 28S sequence for Chinese hamster from GenBank
> (GenBank: AY390526.1). 28S rRNA should be conserved between Chinese
> hamster and B6 mouse (the ref genome). I ran BLAT against the mouse
> genome with the sequence from GenBank. Strangely, I couldn't get any
> significant hit (all of them were short hits scattered across the genome).
> 
> I would like to get about 20K flanking sequence around the 28S rRNA if I
> can locate the genomic sites. There could be more than one genomic
> location for 28S rRNA on the genome due to multiple copies of the gene.
> However, it seems that the genome annotation did not index any 28S rRNA.
> 
> I wonder if there is a way to search all of the sequenced mouse BAC
> clones in Ensembl. As long as I can find a sequenced BAC that contains
> the 20K region with 28S rRNA, we can use the BAC sequence to design our
> experiment.
> 
> Thanks a lot for the help.
> 
>  
> 
> Bin
> 
>  
> 
> On 29 Apr 2011, at 23:38, Liu,Bin wrote:
> 
> 
> 
> Hi,
> 
>                 I was trying to retrieve the chromosome location of 28S
> rRNA from Human and Mouse genome. I was surprised that I couldn't find it.
> 
> I wonder whether the 28S rRNA is annotated in either genome, since
> 28S rRNA is a well-known non-coding RNA. Thanks for the help.
> 
>  
> 
>  
> 
> Bin Liu, Ph.D.
> 
> Department of Genetics
> 
> The University of Texas M. D. Anderson Cancer Center
> 
> 1515 Holcombe Blvd
> 
> Houston, TX 77030
> 
> Phone:  713-792-0878
> 
> Email: bliu1 at mdanderson.org <mailto:bliu1 at mdanderson.org>
> 
> Room No: S13.8316A
> 
>  
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org <mailto:Dev at ensembl.org>
> List admin (including
> subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
> 
>  
> 
> Nathan Johnson
> 
> Senior Scientific Programmer
> 
> Ensembl Regulation
> 
> European Bioinformatics Institute
> 
> Wellcome Trust Genome Campus
> 
> Hinxton
> 
> Cambridge CB10 1SD
> 
>  
> 
> http://www.ensembl.info/
> 
> http://twitter.com/#!/ensembl
> 
>  
> 
>  
> 
> 
> 
> 
> 
>  
> 

-- 
Cheers
Ian


Ian Dunham M.A. D.Phil

European Bioinformatics Institute (EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge, CB10 1SD, UK
Tel:  01223 492636  FAX:  01223 494468
dunham at ebi.ac.uk



