[ensembl-dev] Gene ID <-> Gene Ontology mapping with REST or BioMart API
Joel Fillon, Mr
joel.fillon at mcgill.ca
Mon Nov 17 21:09:26 GMT 2014
Hi Ensembl people,
Given a list of gene IDs from one species, I would like to retrieve the associated GO ids programmatically.
The species may belong to Ensembl (e.g. Mus musculus) or to EnsemblGenomes (e.g. Arabidopsis thaliana).
I managed to access the GO ids using BioMart through R, although the parameters differ between Ensembl and EnsemblGenomes.
Ensembl:
host: www.ensembl.org
mart: ENSEMBL_MART_ENSEMBL
dataset: <short_scientific_name>_gene_ensembl (e.g. mmusculus_gene_ensembl)
attributes: ensembl_gene_id, go_id
EnsemblGenomes:
host: <division>.ensembl.org (e.g. plants.ensembl.org)
mart: <division>_mart_<release_number> (e.g. plants_mart_24)
dataset: <short_scientific_name>_eg_gene (e.g. athaliana_eg_gene)
attributes: ensembl_gene_id, go_accession
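To make the two naming patterns above concrete, here is a minimal sketch in Python (not the R code I actually ran). The helper names `short_name` and `biomart_params` are hypothetical, and the rule "first letter of the genus plus the species epithet" is only inferred from the examples mmusculus and athaliana; whether it holds for every species is exactly what question 1 below asks.

```python
def short_name(scientific_name):
    """Build a short name such as 'mmusculus' from 'Mus musculus'.

    Assumption: first letter of the genus + species epithet, lowercased.
    This is inferred from the examples in this post, not from any
    documented guarantee.
    """
    genus, species = scientific_name.lower().split()[:2]
    return genus[0] + species


def biomart_params(scientific_name, division=None, release=None):
    """Return the BioMart parameters listed above as a plain dict.

    division=None means Ensembl; otherwise an EnsemblGenomes division
    such as 'plants', together with its release number (e.g. 24).
    """
    name = short_name(scientific_name)
    if division is None:
        return {
            "host": "www.ensembl.org",
            "mart": "ENSEMBL_MART_ENSEMBL",
            "dataset": f"{name}_gene_ensembl",
            "attributes": ["ensembl_gene_id", "go_id"],
        }
    if release is None:
        raise ValueError("EnsemblGenomes marts need a release number")
    return {
        "host": f"{division}.ensembl.org",
        "mart": f"{division}_mart_{release}",
        "dataset": f"{name}_eg_gene",
        "attributes": ["ensembl_gene_id", "go_accession"],
    }


print(biomart_params("Mus musculus"))
print(biomart_params("Arabidopsis thaliana", division="plants", release=24))
```

The sketch only assembles parameter values; it does not contact any server, so it says nothing about whether a given dataset or attribute actually exists.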
1. Are those parameters consistent across species within Ensembl and within EnsemblGenomes? For example,
if I want ids for Bos taurus, will the dataset be btaurus_gene_ensembl, and will the attributes ensembl_gene_id and go_id
be available?
Or are they likely to change with the DB schema in the future, so that I shouldn't rely on them for a systematic, automated solution?
2. Is a go_id in Ensembl equivalent to a go_accession in EnsemblGenomes?
3. Is there a better way to do this, using the REST API or something else?
From http://rest.ensembl.org/ , I can't find an endpoint
linking a gene ID to its related Gene Ontology terms.
GET xrefs/symbol/:species/:symbol and GET xrefs/name/:species/:name seem to link to external gene databases only.
Also, for a list of 20000+ gene ids, I guess I would need to split the list into smaller chunks and send each chunk as a separate POST request.
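The chunking I have in mind could look roughly like the Python sketch below. The chunk size of 1000 and the `{"ids": [...]}` JSON body are assumptions for illustration only, since I don't yet know which endpoint (if any) would accept such a request for GO terms or what its limit would be; `chunked` and `post_bodies` are hypothetical helper names.

```python
import json


def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def post_bodies(gene_ids, chunk_size=1000):
    """Return one JSON request body per chunk of gene ids.

    Assumption: the server takes a body of the form {"ids": [...]} and
    caps the number of ids per request; 1000 is a placeholder limit.
    """
    return [json.dumps({"ids": chunk}) for chunk in chunked(gene_ids, chunk_size)]


# Placeholder ids, not real gene identifiers.
gene_ids = [f"GENE{i:05d}" for i in range(20500)]
bodies = post_bodies(gene_ids)
print(len(bodies))
```

Each body would then be sent in its own POST request, ideally with a pause between requests to stay within whatever rate limit the server enforces.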
4. On a different note, sorry if this was answered before: I can't find a "Search" function for this mailing list.
Am I missing it on
http://lists.ensembl.org/mailman/listinfo/dev
or is search only available through Google or another search engine, using "site:http://lists.ensembl.org ..."?
Thanks a lot for your help!
Joël