[ensembl-dev] canonical transcript

Andy Yates ayates at ebi.ac.uk
Fri Apr 27 14:22:22 BST 2012


Hi Sung,

1).

Canonical transcripts are defined by a number of rules, which for most species boil down to "the longest transcript wins". However, some species, like human, have a more complicated assignment method:

- if a gene has protein-producing transcripts, divide them into groups
	- Group A: protein_coding biotype transcripts in CCDS
	- Group B: protein_coding biotype transcripts in havana
	- Group C: protein-producing biotypes in havana
	- Group D: remaining protein-producing biotypes

Order the transcripts in each group by length, then ask the groups in order (A, B, C, D) for a transcript; the first non-empty group supplies its longest transcript. For example, if group A had no transcripts but group B had 2 transcripts, then we would use the longest from B. Equality to CCDS is based on an identical coding exon model.

- if a gene has no protein producing transcripts
	- Group A: take transcripts in havana
	- Group B: all other transcripts

Apply the same rules but using just groups A & B.
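
The grouping rules above can be sketched in a few lines of Python. This is only an illustration of the ordering logic described in this mail, not the code Ensembl runs: the Transcript record and its fields (in_ccds, in_havana, has_translation) are hypothetical names, "protein producing" is approximated here by a has_translation flag, and tie-breaking between transcripts of equal length is not covered by the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Transcript:
    # Hypothetical record for illustration; not the Ensembl API object.
    stable_id: str
    biotype: str
    length: int
    in_ccds: bool = False          # coding exon model identical to a CCDS entry
    in_havana: bool = False        # annotated by havana
    has_translation: bool = False  # assumption: stands in for "protein producing"


def pick_canonical(transcripts: List[Transcript]) -> Optional[Transcript]:
    """Return the longest transcript from the first non-empty group."""
    coding = [t for t in transcripts if t.has_translation]
    if coding:
        groups = [
            # Group A: protein_coding biotype, in CCDS
            [t for t in coding if t.biotype == "protein_coding" and t.in_ccds],
            # Group B: protein_coding biotype, in havana
            [t for t in coding if t.biotype == "protein_coding" and t.in_havana],
            # Group C: any protein-producing biotype, in havana
            [t for t in coding if t.in_havana],
            # Group D: whatever protein-producing transcripts remain
            coding,
        ]
    else:
        groups = [
            # Group A: transcripts in havana
            [t for t in transcripts if t.in_havana],
            # Group B: all other transcripts
            list(transcripts),
        ]
    for group in groups:
        if group:
            # Ties on length fall to the first transcript seen (an assumption).
            return max(group, key=lambda t: t.length)
    return None


# Example from the text: group A is empty, group B holds two transcripts.
example = [
    Transcript("T1", "protein_coding", 2000, in_havana=True, has_translation=True),
    Transcript("T2", "protein_coding", 2500, in_havana=True, has_translation=True),
    Transcript("T3", "protein_coding", 5000, has_translation=True),
]
canonical = pick_canonical(example)
```

Here T3 is the longest transcript overall, but it only falls into group D, so the longer of the two group B transcripts (T2) is chosen.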

3). The canonical_translation_id field in the transcript table refers to an entry in the translation table. It indicates the canonical translation when we have transcripts producing more than one protein product due to alternative initiation.

4). Ensembl does not annotate canonical translations since we maintain a 1:1 relationship between transcripts and translations. Ensembl Genomes do have this data & they can better explain their rules for assignment.
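
To make the links in 3) and 4) concrete, here is a toy example using Python's sqlite3 module. The three tables are heavily reduced stand-ins for gene, transcript and translation (only the key columns are kept, and the IDs are invented), so treat it as a sketch of how the foreign keys chain together rather than as the real schema. The function name canonical_ids is made up for this example.

```python
import sqlite3

# In-memory toy database; the real tables have many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gene (
    gene_id INTEGER PRIMARY KEY,
    canonical_transcript_id INTEGER      -- points at transcript.transcript_id
);
CREATE TABLE transcript (
    transcript_id INTEGER PRIMARY KEY,
    gene_id INTEGER,
    canonical_translation_id INTEGER     -- points at translation.translation_id
);
CREATE TABLE translation (
    translation_id INTEGER PRIMARY KEY,
    transcript_id INTEGER
);
-- Invented rows: one gene, two transcripts, one translation each (1:1).
INSERT INTO gene VALUES (1, 10);
INSERT INTO transcript VALUES (10, 1, 100);
INSERT INTO transcript VALUES (11, 1, 101);
INSERT INTO translation VALUES (100, 10);
INSERT INTO translation VALUES (101, 11);
""")


def canonical_ids(connection, gene_id):
    """Follow gene -> canonical transcript -> canonical translation."""
    return connection.execute(
        """
        SELECT g.gene_id, t.transcript_id, tl.translation_id
        FROM gene g
        JOIN transcript t
          ON t.transcript_id = g.canonical_transcript_id
        LEFT JOIN translation tl
          ON tl.translation_id = t.canonical_translation_id
        WHERE g.gene_id = ?
        """,
        (gene_id,),
    ).fetchone()


row = canonical_ids(conn, 1)
```

The LEFT JOIN keeps genes whose canonical transcript has no translation (non-coding genes); for those the last column comes back as NULL.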

Hope this helps,

Andy

Andrew Yates                   Ensembl Core Software Project Leader
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensembl.org/

On 27 Apr 2012, at 13:31, Sung Gong wrote:

> Hi,
> 
> The 'gene' table contains a column 'canonical_transcript_id' which is
> a foreign key to the 'transcript' table.
> 
> My questions are:
> 1. How do you define whether a transcript is canonical or not? Any
> documentation on the Ensembl web site?
> 2. Within the 'gene' table, all the 'canonical_annotation' column are null?
> 3. I could not find 'translation_translation' table, whereas there is
> a column 'canonical_translation_id' in the 'transcript' table.
> 4. How do you define a canonical translation?
> 
> Cheers,
> Sung
> 




