[ensembl-dev] Question regarding canonical transcripts

Will McLaren wm2 at ebi.ac.uk
Wed Jul 27 13:50:23 BST 2016


Hi Lin,

Can you provide an example of some input for which VEP does not provide a
canonical transcript?
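
One way to pull out such examples is sketched below, under some assumptions: VEP was run with the --canonical flag and wrote its default tab-delimited output, in which the first column is the uploaded variant ID and the last column (Extra) holds KEY=VALUE pairs separated by semicolons. The function name and the sample rows are made up for illustration; they are not real VEP results.

```python
def variants_without_canonical(lines):
    """Return variant IDs for which no output line carries CANONICAL=YES.

    Assumes default VEP tab-delimited output produced with --canonical.
    """
    has_canonical = {}  # variant ID -> True once a canonical line is seen
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        variant_id, extra = fields[0], fields[-1]
        has_canonical.setdefault(variant_id, False)
        # Parse the Extra column into a dict; "-" means no extra fields.
        pairs = dict(p.split("=", 1) for p in extra.split(";") if "=" in p)
        if pairs.get("CANONICAL") == "YES":
            has_canonical[variant_id] = True
    return [v for v, found in has_canonical.items() if not found]


# Hypothetical rows, for illustration only.
SAMPLE_LINES = [
    "#Uploaded_variation\tLocation\tAllele\tGene\tFeature\tFeature_type\t"
    "Consequence\tcDNA_position\tCDS_position\tProtein_position\t"
    "Amino_acids\tCodons\tExisting_variation\tExtra",
    "\t".join(["var1", "1:1000", "A", "GENE1", "TRANSCRIPT1", "Transcript",
               "missense_variant", "10", "10", "4", "R/H", "cGc/cAc", "-",
               "CANONICAL=YES;STRAND=1"]),
    "\t".join(["var1", "1:1000", "A", "GENE1", "TRANSCRIPT2", "Transcript",
               "missense_variant", "25", "25", "9", "R/H", "cGc/cAc", "-",
               "STRAND=1"]),
    "\t".join(["var2", "2:2000", "T", "-", "-", "-",
               "intergenic_variant", "-", "-", "-", "-", "-", "-", "-"]),
]

missing = variants_without_canonical(SAMPLE_LINES)
```

The input lines for any variant ID this returns would make a useful example to send to the list.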

Regards

Will McLaren
Ensembl Variation

On 27 July 2016 at 08:02, 林琼芬 <qiongfen0 at gmail.com> wrote:

> Hi Magali,
> As you say, a canonical transcript is usually the transcript with the
> longest translation for a given gene, so presumably every gene has a
> canonical transcript. However, when I use VEP release 77, some variants
> have no canonical transcript in the annotation output. Do you know why
> this happens?
> I hope to hear from you.
>
> Best regards,
> Lin
>
> 2016-07-26 23:06 GMT+08:00 mag <mr6 at ebi.ac.uk>:
>
>> Hi Duarte,
>>
>> A canonical transcript is usually the transcript with the longest
>> translation for a given gene
>> http://www.ensembl.org/Help/Glossary?id=346
>>
>> In your example, XP_005244832.1 has a translation of 730 aa, while
>> NP_003027.1 has only 728 aa.
>> Hence, XP_005244832.1 is chosen as the canonical transcript.
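
The "longest translation" rule quoted above can be illustrated with a minimal sketch. It uses only the two protein IDs and lengths given in this thread, and it is not Ensembl's actual implementation: the rule is described as holding "usually", and this sketch ignores any other criteria that may apply. The type and function names are hypothetical.

```python
from typing import NamedTuple


class Translation(NamedTuple):
    protein_id: str
    length_aa: int  # translation length in amino acids


def pick_longest_translation(candidates):
    """Return the candidate with the longest translation."""
    return max(candidates, key=lambda t: t.length_aa)


# Lengths as quoted in the thread for the SKI example.
candidates = [
    Translation("NP_003027.1", 728),
    Translation("XP_005244832.1", 730),
]

chosen = pick_longest_translation(candidates)
```

With these two entries, the 730 aa XP_005244832.1 is selected over the 728 aa NP_003027.1, which matches the behaviour described above.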
>>
>> As Kieron mentioned, if you want specifically curated RefSeq annotation,
>> it might be better to fetch all external annotations then filter out the
>> ones you are interested in.
>>
>>
>> Regards,
>> Magali
>>
>>
>> On 25/07/2016 17:07, Duarte Molha wrote:
>>
>> I will try to produce the relevant parts of the script here.
>>
>> But I am still at a loss as to why XP_005244832.1
>> <http://www.ncbi.nlm.nih.gov/protein/XP_005244832.1> has been tagged as
>> canonical.
>>
>> What you are saying is that I might simply not have cycled through all
>> of the RefSeq transcripts... but is there going to be more than one
>> RefSeq transcript tagged as canonical for each gene?
>>
>> Not sure I follow!
>>
>> Thanks
>>
>> Duarte
>>
>> Duarte Molha
>> about.me/duarte <https://about.me/duarte?promo=email_sig>
>>
>> On 25 July 2016 at 11:58, Kieron Taylor <ktaylor at ebi.ac.uk> wrote:
>>
>>> Hi Duarte,
>>>
>>> Can you send us a snippet of the code that accesses the external database
>>> adaptor (DBEntryAdaptor?)? It sounds like you may not be reading enough of
>>> your results to get the RefSeq ID you expect. We have all of the RefSeq IDs
>>> you mention associated with the transcript at some level, but some come
>>> from "RefSeq peptide predicted", for example.
>>>
>>> Kieron
>>>
>>>
>>>
>>> Kieron Taylor PhD.
>>> Ensembl Developer
>>>
>>> EMBL, European Bioinformatics Institute
>>>
>>> > On 22 Jul 2016, at 10:47, Duarte Molha <duartemolha at gmail.com> wrote:
>>> >
>>> > Hi Guys
>>> >
>>> > I have a script that, given a gene symbol, connects to Ensembl and
>>> > retrieves the canonical transcript, and then does the same using the
>>> > external database adaptor to get the canonical RefSeq transcript.
>>> >
>>> > However, this does not seem to give me the correct result.
>>> >
>>> > Take for example the gene SKI (I am using the GRCh37 assembly, btw).
>>> >
>>> > If you open this gene in the Ensembl browser:
>>> >
>>> > http://grch37.ensembl.org/Homo_sapiens/Location/View?db=core;g=ENSG00000157933;r=1:2159997-2161343
>>> >
>>> > For SKI, Ensembl annotates ENST00000378536 as the canonical transcript.
>>> >
>>> > However, using my script, the external database adaptor returns the
>>> > RefSeq XP_005244832.1 as the RefSeq canonical transcript, even though the
>>> > correct canonical transcript is NM_003036.3:
>>> >
>>> > http://www.ncbi.nlm.nih.gov/gene/6497
>>> >
>>> > Unless I am misunderstanding this, if the coding region is the same
>>> > length in two transcripts, the longer transcript should be the
>>> > canonical one.
>>> >
>>> > The longer RefSeq is NM_003036.3 (it has a longer 5' UTR).
>>> >
>>> > Can you help me understand this?
>>> >
>>> > Many thanks
>>> >
>>> > Duarte
>>> > _______________________________________________
>>> > Dev mailing list    Dev at ensembl.org
>>> > Posting guidelines and subscribe/unsubscribe info:
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>> > Ensembl Blog: http://www.ensembl.info/
>
>
> --
>
> Arron Lin
>
> BGI Research Institute
>
> Email: qiongfen0 at gmail.com
>
> Beishan Industrial Zone| Yantian  District| Shenzhen 518083