[ensembl-dev] Conversion of ENA xref ids into working URLs
Dan Staines
dstaines at ebi.ac.uk
Fri Apr 22 10:27:04 BST 2016
Hi Dimitry,
> KE159678.1:CDS:7103..8458
>
> GG666297.1:CDS:complement(18707..20086)
These are provided as tracking IDs to indicate the piece of INSDC
annotation from which the Ensembl object was loaded. INSDC do not have
feature-level identifiers, so on the advice of ENA these strings are
constructed to provide at least some way to find the original piece of
data from which the Ensembl object was constructed.
These identifiers are of the form accession:feature_type:location e.g.
GG666297.1:CDS:complement(18707..20086)
which refers to this feature:
FT CDS complement(18707..20086)
FT /codon_start=1
FT /transl_table=11
FT /locus_tag="HMPREF0077_0851"
FT /product="ATPase/histidine kinase/DNA gyrase
B/HSP90 domain
...
FT /protein_id="EEI83065.1"
...
from this expanded CON (i.e. an entry composed of multiple sub-entries):
http://www.ebi.ac.uk/ena/data/view/GG666297&display=text&expanded=true
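As a rough sketch, a tracking ID of this form could be split into its three parts and turned into the expanded-entry URL above along these lines. The names parse_tracking_id and expanded_entry_url are only illustrative, not part of any Ensembl or ENA API, and the URL pattern is copied from the example link, assuming the version suffix (".1") is dropped as it is there.

```python
from typing import NamedTuple


class TrackingId(NamedTuple):
    accession: str      # e.g. "GG666297.1"
    feature_type: str   # e.g. "CDS"
    location: str       # e.g. "complement(18707..20086)"


def parse_tracking_id(tracking_id: str) -> TrackingId:
    """Split accession:feature_type:location (hypothetical helper)."""
    # Split on the first two colons only, so a location that itself
    # contains a colon is kept in one piece.
    parts = tracking_id.split(":", 2)
    if len(parts) != 3:
        raise ValueError(
            f"not of the form accession:feature_type:location: {tracking_id!r}"
        )
    return TrackingId(*parts)


def expanded_entry_url(tracking_id: str) -> str:
    """Build the expanded text view URL for the entry (pattern assumed
    from the example link, using the unversioned accession)."""
    accession = parse_tracking_id(tracking_id).accession
    unversioned = accession.split(".", 1)[0]
    return (
        "http://www.ebi.ac.uk/ena/data/view/"
        f"{unversioned}&display=text&expanded=true"
    )
```

This only gets you to the whole entry; finding the feature inside it is a separate step.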
I'm not aware of any way to resolve these automatically in ENA - there
is a REST service which can return entire entries in text or XML but it
would return the whole record, not just the feature.
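Since only the whole record comes back, one workaround would be to pick the feature out of the text yourself. The following is a minimal sketch, assuming the entry text has already been downloaded in EMBL flat-file format, that feature keys start in column 6 of FT lines, and that the location fits on a single line (long locations that wrap are not handled). find_feature is a made-up name for illustration.

```python
from typing import List


def find_feature(entry_text: str, feature_type: str, location: str) -> List[str]:
    """Return the FT lines of the first feature matching type and location."""
    block: List[str] = []
    for line in entry_text.splitlines():
        is_ft = line.startswith("FT")
        # A feature key starts in column 6; qualifier and continuation
        # lines have a blank in that column.
        is_key_line = is_ft and len(line) > 5 and line[5] != " "
        if block:
            # Stop at the next feature or at the end of the feature table.
            if not is_ft or is_key_line:
                break
            block.append(line)
        elif is_key_line and line[2:].split() == [feature_type, location]:
            block.append(line)
    return block
```

An empty list means no feature with exactly that type and location string was found.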
If you're only interested in CDS features, you can also use the
protein_id identifier (in this case EEI83065.1) which is an xref on the
Ensembl transcript to link to ENA e.g.
http://www.ebi.ac.uk/ena/data/view/EEI83065.1
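For illustration, the same link could be built from the /protein_id qualifier of a feature like this. It is only a sketch: protein_id_url is a hypothetical helper, the input is assumed to be the FT lines of one feature, and the URL pattern is taken from the example above.

```python
import re
from typing import Iterable, Optional


def protein_id_url(feature_lines: Iterable[str]) -> Optional[str]:
    """Return the ENA view URL for the feature's protein_id, if it has one."""
    for line in feature_lines:
        match = re.search(r'/protein_id="([^"]+)"', line)
        if match:
            return f"http://www.ebi.ac.uk/ena/data/view/{match.group(1)}"
    # Non-CDS features normally carry no protein_id.
    return None
```

If you already have the protein_id xref from the Ensembl transcript, only the final string formatting is needed.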
Regarding linking to ENA, my best advice is for you to contact ENA
directly at datasubs at ebi.ac.uk.
Sorry I can't be of more help.
Dan.
--
Dan Staines, PhD
Genomics Technology Infrastructure Coordinator
EMBL-EBI, Wellcome Trust Genome Campus
Cambridge CB10 1SD, UK
Tel: +44-(0)1223-492507