[ensembl-dev] Retrieving species annotations via REST API
Andy Yates
ayates at ebi.ac.uk
Thu Nov 7 15:44:32 GMT 2013
Hi Greg,
Glad to have another happy REST API user on board. You are right that there is a discrepancy in the REST API, but it's more a question of Ensembl names vs. the NCBI taxonomy. The /info/species endpoint describes what we know about a species using data from a core schema. In the next version that endpoint will include:
- The species name
- The taxonomic identifier
- The common name
- A display name
The /taxonomy/id endpoints are a binding to the NCBI taxonomy and reflect its contents. So when you say /taxonomy/id/canis_familiaris you are actually asking for any name in the NCBI taxonomy which matches canis_familiaris. That's why Ancestral_sequences comes back with nothing (since it's an Ensembl-ism). The name lookup is also limited to just the scientific name; it would be nicer if we allowed searching on any of the available names for a taxon node. I'll put that in as an issue to address.
To answer your question: not yet, but very soon you will be able to ask for a species, get its taxonomy identifier and then use this in the /taxonomy/id lookup to retrieve all information about that ID. Should you want to use the Ensembl names you can just skip that second lookup.
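A rough sketch of that two-step lookup in Python, for illustration only. It assumes the server lives at https://rest.ensembl.org, that /info/species returns a "species" list, and that each entry carries "name" and "taxon_id" fields; those field names and the helper names fetch_json and species_taxonomy are my assumptions here, not a confirmed response format, so check them against the live output before relying on this.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: base URL of the REST server; change it to the one you use.
SERVER = "https://rest.ensembl.org"


def fetch_json(path, server=SERVER):
    """GET a REST path and decode the JSON body."""
    request = Request(server + path, headers={"Content-Type": "application/json"})
    with urlopen(request) as response:
        return json.load(response)


def species_taxonomy(fetch=fetch_json):
    """Map each Ensembl species name to its NCBI taxonomy record.

    Species without a taxonomy identifier (e.g. Ensembl-only entries such
    as the ancestral sequences) map to None instead of raising an error.
    """
    result = {}
    # Step 1: ask Ensembl which species it knows about.
    for entry in fetch("/info/species")["species"]:
        name = entry["name"]
        taxon_id = entry.get("taxon_id")  # assumed field name
        if not taxon_id:
            result[name] = None
            continue
        # Step 2: look the identifier up in the NCBI taxonomy binding.
        result[name] = fetch("/taxonomy/id/" + quote(str(taxon_id)))
    return result
```

Calling species_taxonomy() with no arguments hits the server once per species, so it is worth caching the result. The fetch parameter is only there so the function can be exercised with canned data instead of live requests.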
As for duck missing the taxonomic identifier from its aliases, I'll chase that up now as it should have it AFAIK.
Cheers,
Andy
------------
Andrew Yates - Ensembl Core Software Project Leader
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
Tel: +44-(0)1223-492538
Fax: +44-(0)1223-494468
http://www.ensembl.org/
On 7 Nov 2013, at 14:21, Greg Slodkowicz <gregs at ebi.ac.uk> wrote:
> Hi,
> I'm using the REST API (it's great!) to retrieve some taxonomic information and I think I found a minor discrepancy:
>
> When I use the info/species endpoint, I get (among other things) entries for "canis_familiaris" and "Ancestral sequences". When I try to feed these to taxonomy/id/:id, I get an error (HTTP status code 400). While I'm not so surprised this happens with "Ancestral sequences", I'd expect the other one to match.
>
> I got it to work by using "canis lupus familiaris" but since I'm doing this for many species, ideally I'd like to have an automatic way of extracting species names.
>
> I noticed that records returned by info/species include an NCBI taxid as one of the aliases and I was trying to take advantage of that, but it turns out it's missing from some species (e.g. duck).
>
> Is there a foolproof way of retrieving *all* species in Ensembl and then fetching taxonomic annotations for them, using the REST API (or a Python library)?
>
> Best,
> Greg
>
>
> --
> Greg Slodkowicz
> PhD student, Nick Goldman group
> European Bioinformatics Institute (EMBL-EBI)
> _______________________________________________
> Dev mailing list Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/