[ensembl-dev] Gene ID <-> Gene Ontology mapping with REST or BioMart API

Joel Fillon, Mr joel.fillon at mcgill.ca
Mon Nov 17 21:09:26 GMT 2014

Hi Ensembl people,

Given a list of gene IDs from one species, I would like to retrieve the associated GO ids programmatically.
The species may belong to Ensembl (e.g. Mus musculus) or to EnsemblGenomes (e.g. Arabidopsis thaliana).

I managed to access them using BioMart through R, although the parameters differ between Ensembl (first block below) and EnsemblGenomes (second block below):

host: www.ensembl.org
dataset: <short_scientific_name>_gene_ensembl (e.g. mmusculus_gene_ensembl)
attributes: ensembl_gene_id, go_id

host: <division>.ensembl.org (e.g. plants.ensembl.org)
mart: <division>_mart_<release_number> (e.g. plants_mart_24)
dataset: <short_scientific_name>_eg_gene (e.g. athaliana_eg_gene)
attributes: ensembl_gene_id, go_accession
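
To make the two parameter sets above concrete, here is a minimal Python sketch of how I would derive them from a species name. The function names (short_name, biomart_params) are my own and hypothetical, and the naming rule (first letter of the genus plus the species epithet) is only inferred from the examples above; I have not confirmed that it holds for every species, which is part of what I am asking below.

```python
def short_name(scientific_name):
    """Turn 'Mus musculus' into 'mmusculus'.

    Assumption: the dataset prefix is the first letter of the genus
    followed by the species epithet. Only plain binomials are handled.
    """
    parts = scientific_name.strip().lower().split()
    if len(parts) != 2:
        raise ValueError("expected 'Genus species', got %r" % scientific_name)
    genus, species = parts
    return genus[0] + species


def biomart_params(scientific_name, division=None, release=None):
    """Return the BioMart parameters listed above as a dict.

    division=None means Ensembl; otherwise an EnsemblGenomes division
    such as 'plants', in which case a release number is also required.
    """
    name = short_name(scientific_name)
    if division is None:
        return {
            "host": "www.ensembl.org",
            "dataset": name + "_gene_ensembl",
            "attributes": ["ensembl_gene_id", "go_id"],
        }
    if release is None:
        raise ValueError("a release number is required for EnsemblGenomes")
    return {
        "host": division + ".ensembl.org",
        "mart": "%s_mart_%d" % (division, release),
        "dataset": name + "_eg_gene",
        "attributes": ["ensembl_gene_id", "go_accession"],
    }
```

For example, biomart_params("Arabidopsis thaliana", "plants", 24) gives the second block above, and biomart_params("Mus musculus") gives the first.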

1. Are those parameters consistent across species within Ensembl and within EnsemblGenomes? For example,
if I want IDs for Bos taurus, will the dataset be btaurus_gene_ensembl, and will the attributes
ensembl_gene_id and go_id be available?

Or are they likely to change with the DB schema in the future, so that I shouldn't rely on them for a systematic, automated solution?

2. Is a go_id in Ensembl equivalent to a go_accession in EnsemblGenomes?

3. Is there a better way to do this, using the REST API or something else?
From http://rest.ensembl.org/ , I can't find an endpoint
linking a gene ID to its related Gene Ontology terms.

GET xrefs/symbol/:species/:symbol and GET xrefs/name/:species/:name seem to link to external gene databases only.
Also, for a list of 20000+ gene IDs, I guess I would need to split the list into smaller chunks and send each chunk as a separate POST request.
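
The chunking I have in mind would look roughly like the Python sketch below. It only splits the list; it sends no request, since I don't know which endpoint (if any) applies. The helper name chunked and the batch size of 1000 are my own assumptions, not documented limits of the REST API.

```python
def chunked(ids, size=1000):
    """Yield consecutive slices of `ids`, each at most `size` items long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Placeholder identifiers standing in for a real gene ID list.
gene_ids = ["GENE%05d" % i for i in range(20500)]
batches = list(chunked(gene_ids, 1000))
```

Each element of batches would then go into the body of one POST request, with the results concatenated afterwards.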

4. On a different note, sorry if this was answered before. I can't find a "Search" function on this mailing list:
Am I missing it on:

or is search only available through Google or another search engine, with "site:http://lists.ensembl.org ..."?

Thanks a lot for your help!

