[ensembl-dev] Translations

Daniel Hughes dsth at ebi.ac.uk
Thu Oct 4 17:04:47 BST 2012


If you want to discuss why there are stop codons in the translations
that haven't been marked as having genomic errors, selenocysteine, etc., then
probably Gramene. If you want to discuss why they aren't rendered in
Ensembl, then EG (Ensembl Genomes) Plants.

dan.

Daniel S. T. Hughes M.Biochem (Hons; Oxford), Ph.D (Cambridge)
-------------------------------------------------------------------------------------
dsth at cantab.net
dsth at cpan.org



2012/10/4 Sam Seaver <samseaver at gmail.com>

> Arnaud,
>
> One example we have in Oryza sativa is: LOC_Os10g21210.1
>
> which translates to:
> MTIALGRVTKEENDLFDIMDDWLRRDRFVFVGWSGLFFFLVLISL*EVGLQGQLL*LLGI
> PMDWRVPIWKVAIS*PQQFPPLPIV*HTLCCYYGARKHKGILLVGVN*VVCGLLLLSMGL
> LH**VSCYVNLNLLGLFNCGLIMQFHSLAQSLFLFPYS*FIHWGNPVGSLRRVLA*QRYF
> DSSSSSKDFIIGR*THFI*WELPEY*ARLCYALFMGQPWKTLYLRTVMVQIPSALLTQLK
> LKKLIQWSPLIAFGPKSLVLLFPINVGYISLCYLYRSPVYG*VLLA*SAWL*TYVPMTSF
> PRKSVQRKILNLRLSTPKIFF*TRVFVRGWQLRISLMKILYSLRRFYHVEMLF
>
> However, I'm also trying to find the actual Ensembl release this came
> from; we got the data from Gramene and the release numbers don't
> match.  To be perfectly honest with you, we are confused as to whether
> to discuss these issues with Gramene or Ensembl Plants. Does this
> depend on the species?
>
> S
>
> On Thu, Oct 4, 2012 at 10:23 AM, Arnaud Kerhornou
> <arnaudbioinfo at gmail.com> wrote:
> > On 04/10/2012 15:45, Sam Seaver wrote:
> >>
> >> Dear Arnaud,
> >>
> >> Apparently these embedded stop codons were found in a few sequences in
> >> O. sativa and V. vinifera.  There was a miscommunication and by
> >> "ignored", my colleague actually meant '*'.
> >
> > Re. V. vinifera, we have noticed that some genes had translations holding
> > an internal stop codon. This will be fixed in the next release, which is
> > coming at the end of this month.
> > Because of their number (44 cases), it would be difficult to go through
> > each of them to find out how to fix them, so we have removed their
> > translation and updated their biotype to 'nontranslating_cds'.
> >
> > Re. O. sativa, I cannot find any cases of translations with internal stop
> > codons, or of translations where we perform amino acid substitution. Can
> > you direct us to a gene or translation?
> >
> >> However, your email provokes another question: how do you determine
> >> whether a stop codon actually encodes another amino acid, such as
> >> selenocysteine? Is this a case where, for the species, every instance
> >> of TGA is known to encode selenocysteine?
> >
> > Not all TGAs are selenocysteine. Selenocysteine residues are identified by
> > the presence of an RNA motif, called SECIS, in the 3' UTR of the transcript.
> > Ideally, they are specified in the GFF3 file we load to build our core
> > databases, but that is not always the case.
> > What I usually do is look at the gene function, as these genes are
> > associated with oxidation-reduction reactions. Then in Ensembl we have
> > mechanisms to substitute one or more amino acids at a given position in
> > the protein sequence.
> > That is what we did for Chlamydomonas, e.g.:
> >
> > http://plants.ensembl.org/Chlamydomonas_reinhardtii/Transcript/Sequence_Protein?db=core;g=CHLREDRAFT_206086;r=DS496117:1347779-1349885;t=EDP05676
> >
> > Arnaud
> >
> >>
> >> Thanks
> >> Sam
> >>
> >> On Thu, Oct 4, 2012 at 8:50 AM, Arnaud Kerhornou <arnaud at ebi.ac.uk> wrote:
> >>>
> >>> Dear Sam,
> >>>
> >>> Could you give us the list of species where this is the case?
> >>> There are some cases where the transcribed DNA sequence has stop codons
> >>> that are not real, and we have a mechanism in the Ensembl API to replace
> >>> the stop codon with the right amino acid.
> >>>
> >>> A typical case is selenocysteine genes, where an internal stop codon
> >>> (TGA) is replaced by a 'U' in the amino acid sequence.
> >>>
> >>> In all cases, they should not be ignored. If we don't specify the correct
> >>> amino acid behind a stop codon, it is not discarded and the amino acid
> >>> sequence will hold an internal '*' character.
> >>>
> >>> Arnaud
> >>>
> >>>
> >>> On 04/10/2012 14:30, Sam Seaver wrote:
> >>>>
> >>>> Dear ensembl-dev,
> >>>>
> >>>> A colleague has discovered that in a few of the plant genomes, the
> >>>> underlying DNA sequence of a CDS may have some embedded stop codons.
> >>>> He subsequently found that the resulting translation, as performed by
> >>>> Ensembl, ignores these completely.
> >>>>
> >>>> We were wondering what, if any, other problems are encountered when
> >>>> translating plant genes, and what the Ensembl translation code does to
> >>>> address these?
> >>>>
> >>>> Thanks
> >>>> Sam
> >>>>
> >>
> >>
> >
>
>
>
> --
> Postdoctoral Fellow
> Mathematics and Computer Science Division
> Argonne National Laboratory
> 9700 S. Cass Avenue
> Argonne, IL 60439
>
> http://www.linkedin.com/pub/sam-seaver/0/412/168
> samseaver at gmail.com
> (773) 796-7144
>
> "We shall not cease from exploration
> And the end of all our exploring
> Will be to arrive where we started
> And know the place for the first time."
>    --T. S. Eliot
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info:
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>
>
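
As a rough illustration of the two behaviours discussed in this thread (an
unhandled internal stop codon left as '*', and a known selenocysteine position
substituted with 'U'), here is a minimal Python sketch. It is not the Ensembl
API, which is written in Perl; the function names and the toy sequence are
hypothetical, and only the first 60 residues of the LOC_Os10g21210.1
translation quoted above are used.

```python
# Minimal sketch, NOT the Ensembl API: all names here are hypothetical.

def internal_stop_positions(protein):
    """Return 1-based positions of '*' characters, excluding the last residue.

    A '*' in the final position is treated as the normal terminal stop.
    """
    return [i + 1 for i, aa in enumerate(protein[:-1]) if aa == "*"]


def apply_substitutions(protein, edits):
    """Replace stop characters at given 1-based positions.

    `edits` maps position -> replacement amino acid, e.g. {3: "U"} for a
    selenocysteine. Positions that do not hold a '*' are rejected, so a
    wrong coordinate cannot silently change a real residue.
    """
    residues = list(protein)
    for pos, aa in edits.items():
        if residues[pos - 1] != "*":
            raise ValueError(f"position {pos} is not a stop codon")
        residues[pos - 1] = aa
    return "".join(residues)


# First 60 residues of the LOC_Os10g21210.1 translation from this thread.
os10g21210_start = (
    "MTIALGRVTKEENDLFDIMDDWLRRDRFVFVGWSGLFFFLVLISL*EVGLQGQLL*LLGI"
)
print(internal_stop_positions(os10g21210_start))  # [46, 56]

# Toy sequence (assumption, not real data): one internal and one terminal stop.
toy = "MK*LV*"
print(apply_substitutions(toy, {3: "U"}))  # MKULV*
```

Whether a given '*' should become 'U' is a biological decision (SECIS motif,
gene function, annotation in the source GFF3); the sketch only shows the
bookkeeping once that decision has been made.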