[ensembl-dev] RNA Edits

Dan Staines dstaines at ebi.ac.uk
Fri Mar 23 11:49:35 GMT 2012


Hi Trevor,

To expand in a bit more detail on some of Michael's comments, 
particularly as they apply to Ensembl Genomes: the translation seq edits 
contain peptide coordinates and the amino acids to insert or replace. They 
are stored in the translation_attrib table under the codes 
_selenocysteine, initial_met and amino_acid_sub. All three are 
applied by the same mechanism, modify_translation in 
Bio::EnsEMBL::Translation, but are used in different situations.
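
As a rough illustration of what such an edit does to a peptide, here is a 
minimal Python sketch. It is not the Perl code in the API: the function 
names are made up for this example, and it assumes that values have the 
form "start end replacement" with 1-based, inclusive peptide coordinates, 
as the examples in this mail suggest.

```python
def parse_seq_edit(value):
    """Split an attrib value such as "109 109 U" into (start, end, alt)."""
    start, end, alt = value.split()
    return int(start), int(end), alt


def apply_seq_edits(peptide, values):
    """Return the peptide with every edit applied.

    Assumption: coordinates are 1-based and inclusive.
    """
    # Apply the right-most edit first so that a replacement which changes
    # the length cannot shift the coordinates of edits further left.
    edits = sorted((parse_seq_edit(v) for v in values), reverse=True)
    for start, end, alt in edits:
        peptide = peptide[:start - 1] + alt + peptide[end:]
    return peptide


# Toy peptide: GUG start read as valine, internal stop shown as "*".
print(apply_seq_edits("VKT*LE", ["1 1 M", "4 4 U"]))  # MKTULE
```

The same function covers all three codes, since they differ only in why 
the edit was made, not in how it is applied.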

_selenocysteine addresses a specific biological situation where 
selenocysteine is not coded for directly: the edit denotes that the stop 
codon at the position indicated is to be replaced by selenocysteine. These 
aren't that common, but I see there are four of them in D. melanogaster 
(of the form "109 109 U").

initial_met addresses a similar biological situation, where an 
alternative start codon (like GUG, which normally codes for valine) is in 
fact translated as methionine in this context. These attribs are found 
very commonly in bacterial genomes and have the value "1 1 M".

amino_acid_sub is much more general and is used to correct errors 
in the underlying assembly (for instance where the correct peptide 
sequence is known). It's used frequently in bacteria, e.g. "625 625 E".

Hope this helps,

Dan.

-- 
Dan Staines, PhD               Ensembl Genomes Technical Coordinator
EMBL-EBI                       Tel: +44-(0)1223-492507
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/



