[ensembl-dev] loading NCBI exon structures into Ensembl

Bronwen Aken ba1 at sanger.ac.uk
Wed Jun 1 09:39:05 BST 2011


Hi Kiran,

For the RefSeq models, RefSeq provides us with a flat file giving the genomic coordinates for all genes, transcripts and exons in their gene set. We load this directly into the otherfeatures database and do not change any coordinates.
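
As a rough illustration only (this is not the actual Ensembl loading code), the idea of reading exon rows from a flat file and grouping them into transcript models with untouched coordinates could be sketched like this in Python. The tab-separated column layout (gene ID, transcript ID, chromosome, strand, exon start, exon end), the IDs in the example, and the names Exon and load_exons are all assumptions made up for the sketch; the real RefSeq file format is not described in this thread.

```python
import csv
import io
from collections import defaultdict
from typing import NamedTuple


class Exon(NamedTuple):
    chrom: str
    start: int   # 1-based, inclusive genomic start
    end: int     # 1-based, inclusive genomic end
    strand: int  # 1 or -1


def load_exons(handle):
    """Group exon rows by transcript ID without altering any coordinate.

    Assumed (hypothetical) columns, tab-separated:
    gene_id, transcript_id, chrom, strand, exon_start, exon_end
    """
    transcripts = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        _gene_id, transcript_id, chrom, strand, start, end = row
        transcripts[transcript_id].append(
            Exon(chrom, int(start), int(end), int(strand))
        )
    # Order exons 5'->3' along the transcript: ascending start on the
    # forward strand, descending start on the reverse strand.
    for exons in transcripts.values():
        exons.sort(key=lambda e: e.start, reverse=(exons[0].strand == -1))
    return dict(transcripts)


# Made-up example rows, not real RefSeq data.
EXAMPLE = (
    "#gene\ttranscript\tchrom\tstrand\tstart\tend\n"
    "G1\tTX1\t1\t1\t300\t349\n"
    "G1\tTX1\t1\t1\t100\t199\n"
    "G2\tTX2\t2\t-1\t100\t199\n"
    "G2\tTX2\t2\t-1\t300\t349\n"
)

models = load_exons(io.StringIO(EXAMPLE))
```

The exons are only reordered into transcript order; the start and end values are stored exactly as they appear in the file.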

For the CCDS models, our collaboration in the project gives us access to an FTP site that provides daily dumps of the most current CCDS set. Again, the locations of genes, transcripts and exons are provided in genomic coordinates, and we just load the models into the otherfeatures database. The date for the gene stable IDs is the date of the data freeze when we dumped from the FTP site.
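
Because both sets arrive as exons in genomic coordinates, mapping a transcript position back to the genome (the goal in the original question) only needs the ordered exon list. The following is a minimal Python sketch of that arithmetic, not the Ensembl API; the function name transcript_to_genome and the example coordinates are hypothetical, and it assumes 1-based inclusive coordinates with exons listed in 5'->3' transcript order.

```python
def transcript_to_genome(exons, strand, pos):
    """Map a 1-based transcript position to a 1-based genomic position.

    exons  : list of (start, end) genomic pairs, 1-based inclusive,
             given in 5'->3' transcript order
    strand : 1 for forward, -1 for reverse
    pos    : 1-based position along the spliced transcript
    """
    if pos < 1:
        raise ValueError("transcript positions are 1-based")
    remaining = pos
    for start, end in exons:
        length = end - start + 1
        if remaining <= length:
            # The position falls inside this exon.
            if strand == 1:
                return start + remaining - 1
            return end - remaining + 1
        remaining -= length  # skip this exon and keep walking
    raise ValueError("position lies beyond the end of the transcript")


# Made-up exon structures for illustration.
forward_exons = [(100, 199), (300, 349)]   # forward strand
reverse_exons = [(300, 349), (100, 199)]   # reverse strand, 5'->3' order

first_base_of_second_exon = transcript_to_genome(forward_exons, 1, 101)
first_base_on_reverse = transcript_to_genome(reverse_exons, -1, 1)
```

The same exon walk works in the other direction (genome to transcript) by accumulating exon lengths until the exon containing the genomic position is reached.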

Thanks,
Bronwen



On 27 May 2011, at 18:25, Kiran Mukhyala wrote:

> Could someone also explain how RefSeq and CCDS are imported into the otherfeatures database? I'd like to know the source of the exon structures.
> 
> Thanks,
> -Kiran
> 
> On Thu, May 26, 2011 at 9:48 PM, Reece Hart <reece at harts.net> wrote:
> Hi-
> 
> Does anyone know whether it would work to load NCBI exon structures directly into Ensembl?
> 
> The goal is to use the Ensembl API to map between genome, transcript, and protein coordinates for variants specified using NCBI accessions. This requires exact NCBI exon structures.
> 
> I'm hoping that populating the transcript, transcript_stable_id, exon, and exon_transcript tables with original NCBI data would suffice.
> 
> As Kiran Mukhyala pointed out in a separate thread, the RefSeq sequence differs from the GRCh37 sequence in some cases. I am content with using RefSeq exon structures on GRCh37.
> 
> All code and advice is appreciated.
> 
> Thanks,
> Reece
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
> 
> 
