[ensembl-dev] transcript coordinates via REST?

Andrew Yates ayates at ebi.ac.uk
Wed Aug 5 20:08:10 BST 2015


Hi Reece,

Apologies for the long wait for a reply. I don’t believe this is as much of an issue as you may think. Ensembl & GENCODE annotation is performed directly against the assembly in question; there is no alignment of a transcript model to the genome. In the example you gave, in GENCODE 19 this was a processed transcript [1]. In all GRCh38 annotations it became a protein coding gene [2], thanks to the GRC injecting KF495714.1 [3] into the assembly contig AC107373.4, which allowed us to annotate NEFL correctly.

The upshot of this is that GENCODE annotation reflects what occurs in the genome, so an alignment file would simply match the transcript models.
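
To illustrate the point: when a transcript model is annotated directly on the assembly, the exon-level correspondence between genomic and transcript coordinates is just a running sum of exon lengths. The following is a minimal sketch under that assumption, not Ensembl code; the function names and the exon coordinates in the usage note are hypothetical, and coordinates are taken to be 1-based and inclusive.

```python
from typing import List, NamedTuple, Tuple


class ExonRow(NamedTuple):
    rank: int      # exon order along the transcript (1 = 5' most)
    g_start: int   # genomic start, 1-based inclusive
    g_end: int     # genomic end, 1-based inclusive
    t_start: int   # transcript (cDNA) start, 1-based inclusive
    t_end: int     # transcript (cDNA) end, 1-based inclusive


def exon_table(exons: List[Tuple[int, int]], strand: int) -> List[ExonRow]:
    """Build a genomic <-> transcript exon table by cumulative exon length.

    Assumes the transcript is exactly the spliced genomic sequence,
    i.e. there are no indels between transcript and genome.
    """
    if strand not in (1, -1):
        raise ValueError("strand must be +1 or -1")
    # On the reverse strand the first exon is the one with the highest coordinates.
    ordered = sorted(exons, reverse=(strand == -1))
    rows: List[ExonRow] = []
    offset = 0
    for rank, (g_start, g_end) in enumerate(ordered, start=1):
        length = g_end - g_start + 1
        rows.append(ExonRow(rank, g_start, g_end, offset + 1, offset + length))
        offset += length
    return rows


def transcript_to_genomic(rows: List[ExonRow], t_pos: int, strand: int) -> int:
    """Map a single transcript position to its genomic position."""
    for row in rows:
        if row.t_start <= t_pos <= row.t_end:
            delta = t_pos - row.t_start
            return row.g_start + delta if strand == 1 else row.g_end - delta
    raise ValueError("position lies outside the transcript")
```

For example, with two made-up exons (100-199 and 300-349) on the forward strand, exon_table([(100, 199), (300, 349)], 1) places the second exon at transcript positions 101-150, and transcript position 101 maps back to genomic position 300.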

As for using accessions versus names, we have considered this previously. We understand the confusion that can occur when accessions are not used; however, a large number of users rely on these long-standing names. We have planned to develop a tool which can convert the FTP dumps for a number of analyses, and changing names, where applicable, is a possible feature. I’m happy to keep you in the loop on further developments if you’d like.
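
As a rough idea of what a name-changing feature could look like (a hypothetical sketch only, not the planned tool): a tab-delimited dump line such as a GTF record carries the sequence name in its first column, so a lookup table can swap it for an accession. The single table entry below uses the accession quoted later in this thread (NC_000001.10, chromosome 1 in GRCh37); the function name and the sample record are made up for illustration.

```python
from typing import Dict

# Hypothetical lookup table: chromosome name -> sequence accession.
# Only one entry is shown; a real table would be specific to one assembly.
NAME_TO_ACCESSION: Dict[str, str] = {"1": "NC_000001.10"}


def rename_seqid(line: str, mapping: Dict[str, str]) -> str:
    """Replace the first tab-delimited column with its accession, if known.

    Comment lines, blank lines, and unknown names are returned unchanged.
    """
    if line.startswith("#") or not line.strip():
        return line
    fields = line.split("\t")
    fields[0] = mapping.get(fields[0], fields[0])
    return "\t".join(fields)
```

Applied to a made-up record whose first column is "1", rename_seqid(record, NAME_TO_ACCESSION) returns the same record with "NC_000001.10" in that column; names missing from the table pass through untouched, so existing pipelines relying on them would not break.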

Cheers

Andy

1 - http://grch37.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000104725;r=8:24808468-24814624;t=ENST00000221169
2 - http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000277586;r=8:24950955-24957110
3 - http://www.ebi.ac.uk/ena/data/view/KF495714.1

------------
Andrew Yates - Genomics Technology Infrastructure Team Leader
The European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Campus
Hinxton, Cambridge
CB10 1SD, United Kingdom
Tel: +44-(0)1223-492538
Fax: +44-(0)1223-494468
Skype: andrewyatz
http://www.ebi.ac.uk/
http://www.ensembl.org/

> On 3 Aug 2015, at 07:12, Reece Hart <reece at harts.net> wrote:
> 
> 
> On Fri, Jul 31, 2015 at 2:38 AM, mag <mr6 at ebi.ac.uk> wrote:
> And a similar endpoint can provide you with transcript coordinates translated to genomic coordinates, although the reverse is not available
> http://rest.ensembl.org/map/cdna/ENST00000288602/100..300?content-type=application/json
> 
> I am not sure if that covers your use case, so please don't hesitate to give us some feedback.
> The REST API is still in active development and we are always happy to get some feature requests that allow us to prioritise future endpoints.
> 
> Hi Magali-
> 
> My primary goal is to build a table of the correspondence between exons in genomic and transcript coordinates so that we can map between g., c., and p. sequence coordinates. The mapping is trivial in most cases, of course, but indels and sequence errors are common enough that "most cases" isn't good enough for diagnostic purposes. (For example, NEFL contains an insert of a single G in a poly-G tract in GRCh37 that leads to a frameshift; the refseq transcript is correct and doesn't have this pathology.) 
> 
> NCBI started providing files with exon-level correspondence in April. These are extremely useful. See ftp://ftp.ncbi.nih.gov/refseq/H_sapiens/alignments/ for examples.
> 
> So, the request/suggestion is to provide a similar file or REST interface for ENST transcripts.
> 
> While I'm at it, I would encourage Ensembl (everyone, actually) to use sequence accessions (like NC_000001.10) rather than chromosome names. Chromosome names are imprecise and unnecessarily create opportunities for errors. (It also makes it hard to even consider a world in which multiple assemblies are in one database schema.)
> 
> Thanks,
> Reece
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
