[ensembl-dev] Gene ID <-> Gene Ontology mapping with REST or BioMart API

Joel Fillon, Mr joel.fillon at mcgill.ca
Tue Nov 18 23:13:27 GMT 2014


Hi Andy et al,

Regarding the REST API:

Is there a way to specify an archive in the REST endpoint, or does the REST server only work with the latest release?

I've got gene IDs from Ensembl 76 or EnsemblGenomes 23 and I'm trying:
http://rest.aug2014.archive.ensembl.org/xrefs/id/ENSG00000157764?external_db=GO;all_levels=1
http://rest.ensembl.org/archive/aug2014/xrefs/id/ENSG00000157764?external_db=GO;all_levels=1
http://rest.ensembl.org/archive/76/xrefs/id/ENSG00000157764?external_db=GO;all_levels=1

with no success.

Thanks,
Joël
________________________________________
From: dev-bounces at ensembl.org [dev-bounces at ensembl.org] on behalf of Andy Yates [ayates at ebi.ac.uk]
Sent: Tuesday, 18 November 2014 09:24
To: Ensembl developers list
Subject: Re: [ensembl-dev] Gene ID <-> Gene Ontology mapping with REST or BioMart API

Hi Joel

You're quite right that you can retrieve the Gene -> GO mappings via REST using the /xrefs/id endpoint, e.g.

http://rest.ensembl.org/xrefs/id/ENSG00000157764.json?external_db=GO;all_levels=1

The all_levels parameter is important because GO terms are linked to proteins, not genes. Setting all_levels=1 forces the REST API to descend through the transcripts and proteins belonging to the gene before sending back the results. That also means you will probably see duplicate GO terms returned, since multiple proteins linked to the same gene could be annotated with similar functions.
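
The duplicate GO terms mentioned above can be collapsed on the client side. The following is only a minimal Python sketch, not an official client: the helper names (xrefs_url, unique_go_ids, go_ids_for_gene) are made up for illustration, and it assumes each entry in the JSON response carries the GO accession under a "primary_id" key, which is worth checking against the actual response.

```python
import json
import urllib.request

REST_SERVER = "http://rest.ensembl.org"  # server named in this thread


def xrefs_url(gene_id, external_db="GO"):
    """Build the /xrefs/id URL in the same form as the example in this thread."""
    return (
        f"{REST_SERVER}/xrefs/id/{gene_id}.json"
        f"?external_db={external_db};all_levels=1"
    )


def unique_go_ids(xrefs):
    """Return the distinct accessions from a decoded /xrefs/id response.

    Assumption: each entry is a dict whose "primary_id" holds the accession.
    Entries without that key are skipped rather than raising an error.
    """
    return sorted({entry["primary_id"] for entry in xrefs if "primary_id" in entry})


def go_ids_for_gene(gene_id):
    """Fetch the GO xrefs for one gene and de-duplicate them (needs network)."""
    with urllib.request.urlopen(xrefs_url(gene_id)) as response:
        return unique_go_ids(json.load(response))
```

Calling go_ids_for_gene("ENSG00000157764") would then give one sorted list of accessions for the gene, regardless of how many of its proteins share the same annotation.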

As for sending 20K+ requests, this is fine so long as you are OK with respecting the rate limit and the 429 codes the server will send back should you go over it. If you send 15 requests per second (the max rate available) then you should be able to process 20K in ~20 minutes. I admit this will be slower than using BioMart.
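
For reference, 20,000 requests at 15 per second comes to about 1,333 seconds, or roughly 22 minutes. One way to stay under that rate and back off on a 429 is sketched below in Python. This is an illustration under assumptions, not a tested client: the function names are hypothetical, and it assumes the server sends a Retry-After header giving a number of seconds with a 429 response (it falls back to one second if the header is absent).

```python
import time
import urllib.error
import urllib.request

MAX_PER_SECOND = 15  # maximum rate quoted in this thread


def get_with_retry(url, opener=urllib.request.urlopen, sleep=time.sleep,
                   max_retries=5):
    """GET a URL, sleeping and retrying whenever the server answers 429."""
    for attempt in range(max_retries + 1):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Anything other than 429, or running out of retries, is re-raised.
            if err.code != 429 or attempt == max_retries:
                raise
            # Assumption: Retry-After is a number of seconds; default to 1.
            sleep(float(err.headers.get("Retry-After", 1)))


def fetch_all(urls, get=get_with_retry, sleep=time.sleep, clock=time.monotonic):
    """Fetch URLs one at a time, spacing them at least 1/MAX_PER_SECOND apart."""
    interval = 1.0 / MAX_PER_SECOND
    results = {}
    for url in urls:
        started = clock()
        results[url] = get(url)
        elapsed = clock() - started
        if elapsed < interval:
            sleep(interval - elapsed)  # pad out the rest of this time slot
    return results
```

The opener, sleep and clock arguments exist only so the logic can be exercised without touching the network; in normal use the defaults are enough.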

Andy

------------
Andrew Yates - Ensembl Support Coordinator
European Molecular Biology Laboratory
European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge
CB10 1SD, United Kingdom
Tel: +44-(0)1223-492538
Fax: +44-(0)1223-494468
Skype: andrewyatz
http://www.ensembl.org/

On 17 Nov 2014, at 21:09, "Joel Fillon, Mr" <joel.fillon at mcgill.ca> wrote:

> Hi Ensembl people,
>
> Given a list of gene IDs from one species, I would like to retrieve the associated GO ids programmatically.
> Species can belong to Ensembl e.g. Mus musculus or EnsemblGenomes e.g. Arabidopsis thaliana.
>
> I managed to access them using BioMart through R although the parameters differ between Ensembl and EnsemblGenomes.
>
> Ensembl:
> host: www.ensembl.org
> mart: ENSEMBL_MART_ENSEMBL
> dataset: <short_scientific_name>_gene_ensembl (e.g. mmusculus_gene_ensembl)
> attributes: ensembl_gene_id, go_id
>
> EnsemblGenomes:
> host: <division>.ensembl.org (e.g. plants.ensembl.org)
> mart: <division>_mart_<release_number> (e.g. plants_mart_24)
> dataset: <short_scientific_name>_eg_gene (e.g. athaliana_eg_gene)
> attributes: ensembl_gene_id, go_accession
>
>
> 1. Are those parameters consistent across species within Ensembl and EnsemblGenomes e.g.
> if I want ids for Bos taurus, dataset will be btaurus_gene_ensembl and attributes ensembl_gene_id, go_id
> will be available?
>
> Or are they likely to be modified with the DB schema in the future and I shouldn't rely on them for a systematic automated solution?
>
> 2. Is a go_id in Ensembl equivalent to a go_accession in EnsemblGenomes?
>
> 3. Is there a better way to do this using REST API or other?
> From http://rest.ensembl.org/ , I can't find an Endpoint
> linking a gene ID to related Gene Ontologies
>
> GET xrefs/symbol/:species/:symbol and GET xrefs/name/:species/:name seem to link to external gene databases only.
> Also, for a list of 20000+ gene ids, I would need to use POST requests with reduced chunks I guess.
>
> 4. On a different note, sorry if this was answered before. I can't find a "Search" function on this mailing list:
> Am I missing it on:
> http://lists.ensembl.org/mailman/listinfo/dev
>
> or is search only available through Google or other with "site:http://lists.ensembl.org ..."?
>
> Thanks a lot for your help!
>
> Joël
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/

