[ensembl-dev] Changes in EMBL file format?

Matthew Laird lairdm at ebi.ac.uk
Thu Jun 15 09:37:31 BST 2017


Hi Fin,

The discussion about this change began in the context of Genbank files. 
However, since our Genbank and EMBL file dumpers share a common code base, 
the change rippled through to the EMBL files as well.

As Anne indicated, this stemmed from wanting to be more INSDC compliant. 
/note just didn't seem like the appropriate place to put the primary 
identifier for a record; it's more than just a "note". Unfortunately, in 
the INSDC standards for Genbank files there is no /transcript_id qualifier, 
despite some other sources using it. /standard_name seemed like the most 
appropriate choice of those allowed.

But yes, for CDS records, without a /transcript_id, how do we point 
to that record's parent? The parent transcript isn't the CDS's own 
/standard_name, so we made the difficult choice to stick with the /note 
field in this context. This does feel like an inconsistency, but we 
believe the benefit of a transcript having its primary, stable identifier 
more prominently part of the record outweighs this negative.
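
For anyone adapting a parser to this, here is a minimal Python sketch of 
reading the transcript identifier from both places: /standard_name on mRNA 
and misc_RNA features, and /note="transcript_id=..." on CDS features. It is 
only an illustration, not our dumper or any official tool. The sample 
lines and identifiers are made up, the function name transcript_ids is 
hypothetical, and it assumes each qualifier fits on a single FT line.

```python
import re

# Hypothetical excerpt of an EMBL feature table; the identifiers are made up.
SAMPLE = """\
FT   mRNA            join(100..200,300..400)
FT                   /gene="ENSG00000000001"
FT                   /standard_name="ENST00000000001"
FT   CDS             join(150..200,300..350)
FT                   /gene="ENSG00000000001"
FT                   /protein_id="ENSP00000000001"
FT                   /note="transcript_id=ENST00000000001"
"""

# A feature line has the feature key right after "FT" and three spaces.
FEATURE_RE = re.compile(r"^FT {3}(\S+)\s+\S")
# A qualifier line is "FT", whitespace, then /name=value (value optionally quoted).
QUALIFIER_RE = re.compile(r'^FT\s+/(\w+)="?([^"]*)"?\s*$')

NOTE_PREFIX = "transcript_id="


def transcript_ids(text):
    """Return (feature_key, transcript_id) pairs in file order.

    Assumption: every qualifier of interest sits on one line; wrapped
    qualifier values are not handled in this sketch.
    """
    results = []
    feature = None
    for line in text.splitlines():
        feature_match = FEATURE_RE.match(line)
        if feature_match:
            feature = feature_match.group(1)
            continue
        qualifier_match = QUALIFIER_RE.match(line)
        if qualifier_match is None or feature is None:
            continue
        name, value = qualifier_match.groups()
        if feature in ("mRNA", "misc_RNA") and name == "standard_name":
            # Transcript features carry the stable ID directly.
            results.append((feature, value))
        elif feature == "CDS" and name == "note" and value.startswith(NOTE_PREFIX):
            # CDS features still point at their parent transcript via /note.
            results.append((feature, value[len(NOTE_PREFIX):]))
    return results


if __name__ == "__main__":
    for feature_key, transcript_id in transcript_ids(SAMPLE):
        print(feature_key, transcript_id)
```

Handling both cases in one pass means the same code should also cope with 
older dumps, provided their transcript features used the same 
/note="transcript_id=..." form; that is worth checking against a real file.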

If you have any other questions or concerns, please do let us know.

On 15/06/17 05:25, Fin Swimmer wrote:
> Hello,
>
> I often export gene information from Ensembl in the EMBL file format. I
> realized that, since the last Ensembl update, the transcript IDs now
> have the key /standard_name in the mRNA or misc_RNA part, whereas in the
> CDS part /note="transcript_id= is still used.
>
> Is this change a bug or a feature?
>
> fin swimmer
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/

-- 
Matthew Laird - Ensembl Core Developer
The European Bioinformatics Institute (EMBL-EBI)
Wellcome Genome Campus
Hinxton, Cambridge
CB10 1SD, United Kingdom
Tel: +44-(0)1223-494274
Fax: +44-(0)1223-494468
http://www.ebi.ac.uk/
http://www.ensembl.org/



