[ensembl-dev] Question regarding canonical transcripts
Cyriac Kandoth
kandothc at mskcc.org
Mon Aug 1 19:03:14 BST 2016
Hi, hope this is still relevant to this thread - what is the rationale for
choosing 5kb? Is there no evidence for promoter regions beyond that? Is it
the same limit at the 3' end?
~C
On Jul 29, 2016 4:24 AM, "Will McLaren" <wm2 at ebi.ac.uk> wrote:
> Hi Lin,
>
> This is not a case of Ensembl failing to provide a canonical
> transcript. Your input variant overlaps only one transcript of the
> gene, and that transcript is not the canonical one.
>
> If you look at the transcript diagram [1] you can see ENST00000497517
> extends many kb 5' of the other transcripts' start sites (beyond the 5kb
> range within which VEP will call an overlap), so only that transcript is
> annotated.
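
As a rough illustration of the overlap rule described above, the sketch below treats a variant as overlapping a transcript when its position falls within the transcript's span padded by a flanking distance. The `Transcript` class, the function names, the stable IDs and the coordinates are all hypothetical, and strand is ignored. Only the 5 kb figure comes from this thread; applying the same distance at the 3' end is an assumption of the sketch, not something confirmed here.

```python
from dataclasses import dataclass
from typing import List

FLANK_BP = 5000  # the 5 kb range mentioned in this thread


@dataclass
class Transcript:
    stable_id: str
    start: int  # genomic start, 1-based inclusive
    end: int    # genomic end, 1-based inclusive
    is_canonical: bool = False


def overlaps_with_flank(pos: int, tr: Transcript, flank: int = FLANK_BP) -> bool:
    """True if pos lies within the transcript span padded by flank on both sides.

    Using the same padding on both sides is an assumption of this sketch.
    """
    return tr.start - flank <= pos <= tr.end + flank


def annotate(pos: int, transcripts: List[Transcript]) -> List[Transcript]:
    """Return only the transcripts whose padded span contains the position."""
    return [t for t in transcripts if overlaps_with_flank(pos, t)]


# Made-up coordinates: one transcript starts far 5' of the canonical one.
transcripts = [
    Transcript("T_LONG", start=1_000, end=60_000),
    Transcript("T_CANONICAL", start=40_000, end=60_000, is_canonical=True),
]

hits = annotate(2_000, transcripts)
print([t.stable_id for t in hits])        # ['T_LONG']
print(any(t.is_canonical for t in hits))  # False
```

With these made-up coordinates, a variant near the start of the long transcript lies more than 5 kb away from the canonical transcript, so no canonical transcript appears among the hits even though the gene has one.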
>
> Regards
>
> Will McLaren
> Ensembl Variation
>
> [1] : http://www.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000115705
>
> On 29 July 2016 at 07:32, 林琼芬 <qiongfen0 at gmail.com> wrote:
> Yes, like the ones below:
> 1 25372580 rs12731221 G A
> 1 28733759 rs78873359 CA C
> 2 1397282 rs9326165 G A
> 2 1405785 rs74412499 G A
> 2 88285154 rs149707353 C T
> 3 85008865 . C A
> 3 180575632 rs58197854 AT A
> 3 180575641 rs114361217 A T
> 5 42842763 rs9686343 C A
> 6 5109555 rs149371287 G A
> 6 143929729 rs6899521 T C
> 7 72024054 rs193119573 G A
> 7 72024079 rs376943542 G A
> 7 89571465 rs10226999 C G
> 10 11639703 rs77896587 G A
>
> The VEP result looks like this; it does not include the canonical transcript.
> Thanks a lot!
> [inline image 1]
>
> Best regards!
> Lin
>
> 2016-07-27 20:50 GMT+08:00 Will McLaren <wm2 at ebi.ac.uk>:
> Hi Lin,
>
> Can you provide an example of some input for which VEP does not provide a
> canonical transcript?
>
> Regards
>
> Will McLaren
> Ensembl Variation
>
> On 27 July 2016 at 08:02, 林琼芬 <qiongfen0 at gmail.com> wrote:
> Hi Magali,
> As you say, a canonical transcript is usually the transcript with the
> longest translation for a given gene, so presumably every gene has a
> canonical transcript. However, when I use VEP release 77, some variants
> have no canonical transcript in their annotation results. Do you know
> why this happens?
> I hope to hear from you.
>
> Best regards!
> Lin
>
> 2016-07-26 23:06 GMT+08:00 mag <mr6 at ebi.ac.uk>:
> Hi Duarte,
>
> A canonical transcript is usually the transcript with the longest
> translation for a given gene
> http://www.ensembl.org/Help/Glossary?id=346
>
> In your example, XP_005244832.1 has a translation of 730 aa while
> NP_003027.1 only has 728.
> Hence, it is chosen as the canonical transcript.
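
The comparison above can be sketched as picking the entry with the largest translation length. This is a minimal illustration of the "longest translation" rule only: the function name and the dictionary layout are hypothetical, the thread says the rule "usually" applies, and it does not say how ties are broken.

```python
from typing import Dict


def pick_longest_translation(translation_lengths: Dict[str, int]) -> str:
    """Return the ID whose translation (in amino acids) is longest.

    Ties fall to whichever entry was inserted first; the thread does not
    say how real ties are resolved, so that behaviour is arbitrary here.
    """
    return max(translation_lengths, key=translation_lengths.get)


# Translation lengths quoted in this thread for the SKI example.
lengths_aa = {"XP_005244832.1": 730, "NP_003027.1": 728}

print(pick_longest_translation(lengths_aa))  # XP_005244832.1
```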
>
> As Kieron mentioned, if you want specifically curated RefSeq annotation,
> it might be better to fetch all external annotations then filter out the
> ones you are interested in.
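
One way to do the filtering step suggested above, once all external annotations have been fetched, is to keep only accessions with the RefSeq prefixes used for curated records (`NM_`, `NR_`, `NP_`) and drop the model prefixes (`XM_`, `XR_`, `XP_`). The sketch below works on plain accession strings; the function name and the input list are hypothetical, and it does not call the Ensembl API.

```python
from typing import Iterable, List

# RefSeq accession prefixes for curated ("known") records.
CURATED_PREFIXES = ("NM_", "NR_", "NP_")


def curated_refseq_only(accessions: Iterable[str]) -> List[str]:
    """Keep accessions with a curated RefSeq prefix, preserving input order."""
    return [acc for acc in accessions if acc.startswith(CURATED_PREFIXES)]


# Accessions mentioned in this thread, used here as an illustrative input.
xrefs = ["XP_005244832.1", "NP_003027.1", "NM_003036.3"]

print(curated_refseq_only(xrefs))  # ['NP_003027.1', 'NM_003036.3']
```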
>
>
> Regards,
> Magali
>
>
> On 25/07/2016 17:07, Duarte Molha wrote:
> I will try to reproduce the relevant parts of the script here.
>
> But I am still at a loss as to why XP_005244832.1
> <http://www.ncbi.nlm.nih.gov/protein/XP_005244832.1> has been tagged as
> canonical.
>
> What you are saying is that I simply might not have cycled through all
> of the RefSeq transcripts... but is there going to be more than one RefSeq
> transcript tagged as canonical for each gene?
>
> Not sure I follow!
>
> Thanks
>
> Duarte
>
> On 25 July 2016 at 11:58, Kieron Taylor <ktaylor at ebi.ac.uk> wrote:
> Hi Duarte,
>
> Can you send us a snippet of code that accesses the external database
> adaptor (DBEntryAdaptor?). It sounds like you may not be reading enough of
> your results to get the RefSeq ID you expect. All of the RefSeq IDs you
> mention are associated with the transcript at some level, but some come
> from "RefSeq peptide predicted", for example.
>
> Kieron
>
>
>
> Kieron Taylor PhD.
> Ensembl Developer
>
> EMBL, European Bioinformatics Institute
>
> > On 22 Jul 2016, at 10:47, Duarte Molha <duartemolha at gmail.com> wrote:
> >
> > Hi Guys
> >
> > I have a script that, given a gene symbol, connects to Ensembl and
> > retrieves the canonical transcript, then does the same using the
> > external database adaptor to get the canonical RefSeq transcript.
> >
> > However, this does not seem to give me the correct result.
> >
> > Take for example the gene SKI (I am using the GRCh37 assembly, btw).
> >
> > If you open this gene in the Ensembl browser:
> >
> > http://grch37.ensembl.org/Homo_sapiens/Location/View?db=core;g=ENSG00000157933;r=1:2159997-2161343
> >
> > For SKI, Ensembl annotates ENST00000378536 as the canonical transcript.
> >
> > However, using my script, the external database adaptor returns the
> > RefSeq XP_005244832.1 as the RefSeq canonical transcript, even though the
> > correct canonical transcript is NM_003036.3:
> >
> > http://www.ncbi.nlm.nih.gov/gene/6497
> >
> > Unless I am understanding this incorrectly, if the coding region is the
> > same length in two transcripts, the longer one should be canonical.
> >
> > The longer RefSeq is NM_003036.3 (it has a longer 5' UTR).
> >
> > Can you help me understand this?
> >
> > Many thanks
> >
> > Duarte
> > _______________________________________________
> > Dev mailing list Dev at ensembl.org
> > Posting guidelines and subscribe/unsubscribe info:
> > http://lists.ensembl.org/mailman/listinfo/dev
> > Ensembl Blog: http://www.ensembl.info/
>
> --
>
> Arron Lin
>
> BGI Research Institute
>
> Email: qiongfen0 at gmail.com
>
> Beishan Industrial Zone| Yantian District| Shenzhen 518083