[ensembl-dev] Gene ID <-> Gene Ontology mapping with REST or BioMart API

Thomas Maurel maurel at ebi.ac.uk
Tue Nov 18 13:05:36 GMT 2014

Dear Joël,

Please find below answers to your first two questions:
On 17 Nov 2014, at 21:09, Joel Fillon, Mr <joel.fillon at mcgill.ca> wrote:

> Hi Ensembl people,
> Given a list of gene IDs from one species, I would like to retrieve the associated GO ids programmatically.
> Species can belong to Ensembl e.g. Mus musculus or EnsemblGenomes e.g. Arabidopsis thaliana.
> I managed to access them using BioMart through R although the parameters differ between Ensembl and EnsemblGenomes.
> Ensembl:
> host: www.ensembl.org
> dataset: <short_scientific_name>_gene_ensembl (e.g. mmusculus_gene_ensembl)
> attributes: ensembl_gene_id, go_id
> EnsemblGenomes:
> host: <division>.ensembl.org (e.g. plants.ensembl.org)
> mart: <division>_mart_<release_number> (e.g. plants_mart_24)
> dataset: <short_scientific_name>_eg_gene (e.g. athaliana_eg_gene)
> attributes: ensembl_gene_id, go_accession
> 1. Are those parameters consistent across species within Ensembl and EnsemblGenomes e.g.
> if I want ids for Bos taurus, dataset will be btaurus_gene_ensembl and attributes ensembl_gene_id, go_id
> will be available?
Yes, the parameters you mention are consistent across species within Ensembl and within EnsemblGenomes.
Your Bos taurus example will work.
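
As an illustration only, here is a minimal Python sketch of the naming pattern described above. The helper name biomart_dataset is hypothetical, and the rule "first letter of the genus plus the species epithet" is inferred from the three examples in this thread (mmusculus, athaliana, btaurus); species with subspecies or strain names may not follow it, so it is worth checking the dataset list of the mart you are querying.

```python
def biomart_dataset(scientific_name: str, ensembl_genomes: bool = False) -> str:
    """Derive a BioMart dataset name from a 'Genus species' name.

    Assumption: the short name is the first letter of the genus followed by
    the species epithet, as in the examples quoted in this thread.
    """
    parts = scientific_name.strip().lower().split()
    if len(parts) != 2:
        raise ValueError(f"expected 'Genus species', got {scientific_name!r}")
    genus, species = parts
    short_name = genus[0] + species
    # Ensembl datasets end in _gene_ensembl, EnsemblGenomes ones in _eg_gene.
    suffix = "eg_gene" if ensembl_genomes else "gene_ensembl"
    return f"{short_name}_{suffix}"


if __name__ == "__main__":
    print(biomart_dataset("Bos taurus"))
    print(biomart_dataset("Arabidopsis thaliana", ensembl_genomes=True))
```

Names that are not exactly two words raise a ValueError rather than being guessed at, since the thread gives no example for them.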
> Or are they likely to be modified with the DB schema in the future and I shouldn't rely on them for a systematic automated solution?
We avoid changing the mart attribute internal names, but if we do, we will announce the changes on the following page: http://www.ensembl.org/info/website/news.html
The mart internal names are more stable than the DB schema, so you can rely on them for a systematic automated solution.
> 2. Is a go_id in Ensembl equivalent to a go_accession in EnsemblGenomes?
Yes, go_id in Ensembl corresponds to go_accession in EnsemblGenomes.
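
For a script that has to handle both sources, one possible approach is to pick the GO attribute by source and build the BioMart XML query from it. The sketch below is hedged: the function build_go_query and the GO_ATTRIBUTE table are hypothetical names, the XML layout follows the general BioMart query format, and the virtualSchemaName value is an assumption left as a parameter because it may differ between Ensembl and EnsemblGenomes marts. It only builds the query string and does not contact any server.

```python
from xml.sax.saxutils import quoteattr

# GO attribute name per source, as stated in this thread.
GO_ATTRIBUTE = {"ensembl": "go_id", "ensemblgenomes": "go_accession"}


def build_go_query(dataset: str, source: str, virtual_schema: str = "default") -> str:
    """Return a BioMart XML query asking for gene ID and GO accession.

    `source` is "ensembl" or "ensemblgenomes"; `virtual_schema` is an
    assumption and should be checked against the mart being queried.
    """
    if source not in GO_ATTRIBUTE:
        raise ValueError(f"unknown source {source!r}")
    attributes = ["ensembl_gene_id", GO_ATTRIBUTE[source]]
    # quoteattr escapes the value and wraps it in quotes.
    attr_xml = "".join(f"<Attribute name={quoteattr(a)}/>" for a in attributes)
    return (
        '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
        f"<Query virtualSchemaName={quoteattr(virtual_schema)} "
        'formatter="TSV" header="0" uniqueRows="1">'
        f'<Dataset name={quoteattr(dataset)} interface="default">'
        f"{attr_xml}</Dataset></Query>"
    )


if __name__ == "__main__":
    print(build_go_query("mmusculus_gene_ensembl", "ensembl"))
    print(build_go_query("athaliana_eg_gene", "ensemblgenomes"))
```

Keeping the attribute choice in one small table means downstream code can treat the second output column as "GO accession" regardless of which source it came from.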
> 3. Is there a better way to do this using REST API or other?
> From http://rest.ensembl.org/ , I can't find an Endpoint
> linking a gene ID to related Gene Ontologies
> GET xrefs/symbol/:species/:symbol and GET xrefs/name/:species/:name seem to link to external gene databases only.
> Also, for a list of 20000+ gene ids, I would need to use POST requests with reduced chunks I guess.
> 4. On a different note, sorry if this was answered before. I can't find a "Search" function on this mailing list:
> Am I missing it on:
> http://lists.ensembl.org/mailman/listinfo/dev
> or is search only available through Google or other with "site:http://lists.ensembl.org ..."?
> Thanks a lot for your help!
> Joël
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/

Hope this helps,
Thomas Maurel
Bioinformatician - Ensembl Production Team
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Cambridge CB10 1SD
United Kingdom
