[ensembl-dev] Ensemble GRCh38 Transcript ID

Matthew Laird lairdm at ebi.ac.uk
Wed Jan 4 14:57:41 GMT 2017


Hello Pankaj,

The GTF does have versions for stable ids; they are just split out 
into separate fields. The different file formats represent elements 
such as stable ids slightly differently.
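
As a rough illustration of the point above, here is a minimal Python sketch that rebuilds a versioned id from a GTF attribute column. It assumes the attribute keys are named transcript_id and transcript_version, as in recent Ensembl GTF dumps. The helper names and the sample attribute line are made up for this example and are not part of any Ensembl tool.

```python
import re

# Matches key "value"; pairs in a GTF attribute column.
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')

def parse_attributes(column):
    """Return the GTF attribute column as a dict (last value wins for repeated keys)."""
    return dict(ATTR_RE.findall(column))

def versioned_transcript_id(column):
    """Join transcript_id and transcript_version with '.', if both are present."""
    attrs = parse_attributes(column)
    stable_id = attrs.get("transcript_id")
    version = attrs.get("transcript_version")
    if stable_id is None:
        return None
    return f"{stable_id}.{version}" if version else stable_id

# Illustrative attribute column, not copied from a real file.
example = (
    'gene_id "ENSG00000000001"; gene_version "2"; '
    'transcript_id "ENST00000000001"; transcript_version "3"; '
    'gene_name "EXAMPLE1";'
)
print(versioned_transcript_id(example))
```

The same parsed dict also carries gene_name when the record has one, so it could serve as the basis for a transcript-to-gene-name lookup.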

We do not have a tool to strip versions from stable ids; however, for 
human it is just a split of the string on '.'. There are three 
simple options. First, go back to the GTF file, which has gene names 
in its records. Second, split the stable ids on '.' and use the 
unversioned part in Biomart as planned. Third, use the REST service 
lookup endpoint (http://rest.ensembl.org/documentation/info/lookup), 
which takes versioned stable ids.
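
A minimal Python sketch of the second option, assuming human ids in which the only '.' is the one separating the version. The function name strip_version is made up for this example; the ids are the ones quoted further down the thread.

```python
def strip_version(stable_id):
    """Drop a trailing '.<version>' from an Ensembl stable id, if there is one."""
    return stable_id.split(".", 1)[0]

versioned = ["ENST00000414852.1", "ENST00000390399.3", "ENST00000610439.4"]
unversioned = [strip_version(s) for s in versioned]

# One id per line, ready to paste into a Biomart id filter.
print("\n".join(unversioned))
```

Ids that carry no version pass through unchanged, so the function can be applied to a mixed list.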

On 04/01/17 00:26, Pankaj Agarwal wrote:
>
> Thank you for providing this clarification.  I had just a couple of 
> quick follow up questions.
>
> These ids came out of the Ensembl transcript FASTA file 
> (Homo_sapiens.GRCh38.cdna.all.fa), which I used for the first time in 
> RNA-seq data analysis. Earlier I had been using the GTF file, which 
> probably does not have the version id.
>
> 1. Is there a function within Biomart or any Ensembl tool to remove 
> the version numbers from a list of transcript ids?
>
> and/or
>
> 2. Is there a way to get a mapping from the transcript IDs to gene 
> names that will circumvent the issue of the version id?
>
> Thanks,
>
> - Pankaj
>
> From: dev-bounces at ensembl.org [mailto:dev-bounces at ensembl.org] On 
> Behalf Of Matthew Laird
> Sent: Wednesday, December 21, 2016 8:02 AM
> To: Ensembl developers list
> Subject: Re: [ensembl-dev] Ensemble GRCh38 Transcript ID
>
> Hello Pankaj,
>
> Those are version numbers. They have been part of Ensembl stable ids 
> since the start but are not displayed in every context. You can 
> read more about the rules for how they are incremented here:
>
> http://www.ensembl.org/info/genome/stable_ids/versions.html 
>
> Biomart does not accept versioned stable ids in queries. If you are 
> submitting a batch of stable ids to Biomart, you must ensure they are 
> unversioned.
>
> Cheers.
>
> On 21/12/16 12:52, Pankaj Agarwal wrote:
>
>     Hi,
>
>     The Ensembl transcript IDs for GRCh38 have a period at
>     the end followed by a number.  For example:
>
>     ENST00000414852.1
>
>     ENST00000390399.3
>
>     ENST00000610439.4
>
>     When I query these in Biomart it does not work.  I have to remove the
>     period and the following number.
>
>     I was wondering why these were introduced, what they mean, and how
>     to get these to work with Biomart.
>
>     Thanks,
>
>     - Pankaj
>
>     -----------------------------
>
>     Pankaj Agarwal, M.S
>
>     Bioinformatician
>
>     Data Analyst
>
>     Applied Therapeutics
>
>     Div. of Surgical Sciences
>
>     Dept. of Surgery
>
>     Duke University
>
>     M: 919-244-6389
>
>     O: 919-681-2251
>
>     p.agarwal at duke.edu <mailto:p.agarwal at duke.edu>
>
>
>
>
>     _______________________________________________
>
>     Dev mailing list    Dev at ensembl.org
>
>     Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>
>     Ensembl Blog: http://www.ensembl.info/
>
