[ensembl-dev] Gene ID <-> Gene Ontology mapping with REST or BioMart API

Thomas Maurel maurel at ebi.ac.uk
Tue Nov 18 15:47:54 GMT 2014


Dear Joël,

In EnsemblGenomes Plants, GOSlim data is currently available only for Hordeum vulgare (plants mart 24).
In Ensembl, GOSlim data is available for all 69 of our species (ensembl mart 77).

Hope this helps,
Regards,
Thomas
On 18 Nov 2014, at 15:33, Joel Fillon, Mr <joel.fillon at mcgill.ca> wrote:

> Dear Thomas and Andy,
> 
> Thanks a lot for your quick answers, much appreciated!
> 
> I will try both the BioMart and REST solutions and see which one better suits my needs.
> 
> Another question:
> In BioMart, are goslim_goa_accession and goslim_goa_description available for EnsemblGenomes species
> or only for Ensembl species?
> I can't find them for athaliana_eg_gene dataset.
> 
> Thanks again
> Joël
> ________________________________________
> From: dev-bounces at ensembl.org [dev-bounces at ensembl.org] on behalf of Thomas Maurel [maurel at ebi.ac.uk]
> Sent: Tuesday, 18 November 2014 08:05
> To: Ensembl developers list
> Subject: Re: [ensembl-dev] Gene ID <-> Gene Ontology mapping with REST or BioMart API
> 
> Dear Joël,
> 
> Please find below answers to your first two questions:
> On 17 Nov 2014, at 21:09, Joel Fillon, Mr <joel.fillon at mcgill.ca> wrote:
> 
> Hi Ensembl people,
> 
> Given a list of gene IDs from one species, I would like to retrieve the associated GO ids programmatically.
> Species can belong to Ensembl e.g. Mus musculus or EnsemblGenomes e.g. Arabidopsis thaliana.
> 
> I managed to access them using BioMart through R although the parameters differ between Ensembl and EnsemblGenomes.
> 
> Ensembl:
> host: www.ensembl.org
> mart: ENSEMBL_MART_ENSEMBL
> dataset: <short_scientific_name>_gene_ensembl (e.g. mmusculus_gene_ensembl)
> attributes: ensembl_gene_id, go_id
> 
> EnsemblGenomes:
> host: <division>.ensembl.org (e.g. plants.ensembl.org)
> mart: <division>_mart_<release_number> (e.g. plants_mart_24)
> dataset: <short_scientific_name>_eg_gene (e.g. athaliana_eg_gene)
> attributes: ensembl_gene_id, go_accession
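The two parameter sets above can be captured in one small helper. This is only a sketch: the names `MartParams`, `short_name` and `mart_params` are hypothetical, and the short-name rule (first letter of the genus followed by the species epithet) is an assumption inferred from the examples in this thread (mmusculus, athaliana, btaurus). It may not hold for every species, so it is worth checking against the dataset list of the mart you target.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MartParams:
    """BioMart connection parameters as listed in the thread (hypothetical container)."""
    host: str
    mart: str
    dataset: str
    attributes: Tuple[str, ...]


def short_name(species: str) -> str:
    """'Mus musculus' -> 'mmusculus'. Assumed rule: genus initial + species epithet."""
    genus, epithet = species.lower().split()[:2]
    return genus[0] + epithet


def mart_params(species: str,
                division: Optional[str] = None,
                release: Optional[int] = None) -> MartParams:
    """Return Ensembl parameters when division is None, EnsemblGenomes ones otherwise."""
    name = short_name(species)
    if division is None:
        return MartParams(
            host="www.ensembl.org",
            mart="ENSEMBL_MART_ENSEMBL",
            dataset=f"{name}_gene_ensembl",
            attributes=("ensembl_gene_id", "go_id"),
        )
    if release is None:
        raise ValueError("an EnsemblGenomes mart name needs a release number")
    return MartParams(
        host=f"{division}.ensembl.org",
        mart=f"{division}_mart_{release}",
        dataset=f"{name}_eg_gene",
        attributes=("ensembl_gene_id", "go_accession"),
    )
```

For example, `mart_params("Arabidopsis thaliana", "plants", 24)` yields the plants_mart_24 / athaliana_eg_gene combination quoted above, and the resulting fields can be passed to whichever BioMart client is in use.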
> 
> 
> 1. Are those parameters consistent across species within Ensembl and EnsemblGenomes e.g.
> if I want ids for Bos taurus, dataset will be btaurus_gene_ensembl and attributes ensembl_gene_id, go_id
> will be available?
> You are right: the parameters you mention are consistent across species within Ensembl and within EnsemblGenomes.
> Your example on Bos taurus will work.
> 
> Or are they likely to be modified with the DB schema in the future and I shouldn't rely on them for a systematic automated solution?
> We avoid changing the mart attribute internal names, but if we do, we will announce the changes on the following page: http://www.ensembl.org/info/website/news.html
> The mart internal names are more stable than the DB schema, so you can rely on them for a systematic automated solution.
> 
> 2. Is a go_id in Ensembl equivalent to a go_accession in EnsemblGenomes?
> Yes, go_id in Ensembl corresponds to go_accession in EnsemblGenomes.
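Because the two attributes carry the same kind of value, downstream code can treat both marts alike once the right attribute name has been requested. The sketch below assumes the query result is tab-separated text with the gene ID in the first column and the GO accession in the second, without a header row. `GO_ATTRIBUTE` and `parse_gene_go_tsv` are hypothetical names, not part of any BioMart client.

```python
import csv
import io
from typing import Dict, Set

# Attribute to request for the GO accession, per source (from the thread).
GO_ATTRIBUTE = {
    "ensembl": "go_id",
    "ensemblgenomes": "go_accession",
}


def parse_gene_go_tsv(tsv_text: str) -> Dict[str, Set[str]]:
    """Group two-column TSV rows (gene ID, GO accession) into gene -> set of GO IDs.

    Genes that appear with an empty GO column are kept with an empty set,
    so that unannotated genes are not silently dropped.
    """
    mapping: Dict[str, Set[str]] = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row or not row[0].strip():
            continue  # skip blank lines
        gene_id = row[0].strip()
        go_id = row[1].strip() if len(row) > 1 else ""
        terms = mapping.setdefault(gene_id, set())
        if go_id:
            terms.add(go_id)
    return mapping
```

The same parser then serves results fetched with `GO_ATTRIBUTE["ensembl"]` or `GO_ATTRIBUTE["ensemblgenomes"]`, which keeps the difference between the two marts confined to the query itself.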
> 
> 3. Is there a better way to do this using REST API or other?
> On http://rest.ensembl.org/, I can't find an endpoint
> linking a gene ID to its related Gene Ontology terms.
> 
> GET xrefs/symbol/:species/:symbol and GET xrefs/name/:species/:name seem to link to external gene databases only.
> Also, for a list of 20000+ gene ids, I guess I would need to split the list into smaller chunks and send them as POST requests.
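Splitting a long ID list into POST-sized batches could look like the sketch below. The chunk size of 1000 and the `{"ids": [...]}` body shape are assumptions here (the latter mirrors the body used by the REST server's POST lookup/id endpoint). The thread does not establish which endpoint, if any, returns GO terms, so no request is actually sent. `chunked` and `post_bodies` are hypothetical helper names.

```python
import json
from typing import Iterator, List, Sequence


def chunked(ids: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of ids, each holding at most size items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def post_bodies(ids: Sequence[str], size: int = 1000) -> List[str]:
    """Build one JSON request body per chunk (assumed shape: {"ids": [...]})."""
    return [json.dumps({"ids": chunk}) for chunk in chunked(ids, size)]
```

Each body would then be sent as a separate POST request with a JSON content type, ideally with a short pause or retry logic between requests to stay within the server's rate limits.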
> 
> 4. On a different note, sorry if this was answered before. I can't find a "Search" function on this mailing list:
> Am I missing it on:
> http://lists.ensembl.org/mailman/listinfo/dev
> 
> or is search only available through Google or other with "site:http://lists.ensembl.org ..."?
> 
> Thanks a lot for your help!
> 
> Joël
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
> 
> Hope this helps,
> Regards,
> Thomas
> --
> Thomas Maurel
> Bioinformatician - Ensembl Production Team
> European Bioinformatics Institute (EMBL-EBI)
> European Molecular Biology Laboratory
> Wellcome Trust Genome Campus
> Hinxton
> Cambridge CB10 1SD
> United Kingdom
> 
> 

