[ensembl-dev] How to programmatically get ontology go terms for transcript?

Andy Yates ayates at ebi.ac.uk
Tue Apr 10 16:17:58 BST 2012


Hi James,

GO Slim terms are located in a secondary database: Ensembl calls this ensembl_ontology_RELEASE, and Ensembl Genomes calls it ensemblgenomes_ontology_RELEASE. We have a lot of examples of how to use it in our core checkout under:

ensembl/misc-scripts/ontology

We have a README explaining the design of the schema & API and then examples in the scripts directory.

As for links, the web code has a mechanism which associates a URL pattern with each external identifier. I doubt you will be able to script against this, so I would suggest a small lookup table in your own code to do the associations.
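
A minimal sketch of such a lookup is given here, written in Python for brevity rather than in the Perl of the Ensembl API. The database names, the URL patterns, and the helper name accession_url are all illustrative assumptions; none of them are taken from the Ensembl or Gramene web configuration, so substitute whatever your site actually links to.

```python
# Hypothetical lookup from external database name to a URL pattern.
# The keys and patterns below are assumptions for illustration only.
URL_PATTERNS = {
    "GO": "http://amigo.geneontology.org/amigo/term/{accession}",
    "PO": "http://example.org/plant_ontology/{accession}",
}


def accession_url(dbname, accession, patterns=URL_PATTERNS):
    """Return the URL for an accession, or None if the database is unknown."""
    pattern = patterns.get(dbname)
    if pattern is None:
        return None
    return pattern.format(accession=accession)


if __name__ == "__main__":
    # Example rows as (dbname, accession, term). In practice these values
    # would be read from the OntologyXref objects that get_all_DBLinks()
    # returns; here they are hard-coded so the sketch runs on its own.
    rows = [("GO", "GO:0005634", "nucleus"), ("XX", "XX:1", "unknown")]
    for dbname, accession, term in rows:
        url = accession_url(dbname, accession) or ""
        # Tab-delimited output, with an empty URL column for unknown databases.
        print("\t".join([dbname, accession, term, url]))
```

Returning None for an unknown database, rather than raising, keeps a tab-delimited dump going when an xref comes from a source the table does not cover.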

I hope this helps,

Andy

Andrew Yates                   Ensembl Core Software Project Leader
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensembl.org/

On 10 Apr 2012, at 15:27, Thomason, James wrote:

> Okay, good, I am getting somewhere. Hopefully I'm almost there.
> 
> I filtered out my call to get_all_DBLinks() to only look at the OntologyXref objects.
> 
> Comparing against these tables:
> http://dev.gramene.org/Arabidopsis_thaliana/Transcript/Ontology/Table?db=core;g=AT3G52430;oid=1;r=3:19431371-19434403;t=AT3G52430.1
> 
> it looks like the OntologyXref objects directly give me the accession, term, and evidence codes. So next questions are -
> 
> 1) How do I get those GO slim accessions? They don't appear to be OntologyXref objects, and I'm assuming they're somehow related. Trying the various get_* methods didn't seem to yield the results I wanted.
> 
> 2) How can I get the URLs that the accessions link to? For the non-slim ones, I see that for my purposes I could just hardwire to Gramene, but I'd really rather populate it correctly.
> 
> 3) How do I group the accessions together? Again, that page has multiple tables - descendant of biological process, cellular component, etc. Is it a matter of looking at each OntologyXref's get_all_masters() values and piecing together a tree from that?
> 
> Many thanks for the help and the patience. :-)
> 
> On Apr 10, 2012, at 2:52 AM, Andy Yates wrote:
> 
> Hi James,
> 
> If you are going to use the API to extract this information then you should look at Bio::EnsEMBL::OntologyXref, which extends DBEntry. The API automatically creates these objects when it encounters an object_xref link which also has an entry in the ontology_xref table. As Jan said, get_all_DBLinks() is the method to use and will return these OntologyXref objects.
> 
> All the best,
> 
> Andy
> 
> On 10 Apr 2012, at 00:08, "Thomason, James" <thomason at cshl.edu> wrote:
> 
> Well, that's progress, I guess. But what do I do with a Bio::EnsEMBL::DBEntry object?
> 
> Looking at them, they don't appear to contain any of the data in that ontology table. Do I need to hop to some additional objects? Nothing looks obvious to jump to next.
> 
> On Apr 9, 2012, at 5:15 PM, Jan Vogel wrote:
> 
> 
> Hi James,
> 
> check out the doxygen ensembl api doc:
> 
> http://uswest.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1Transcript.html#afbe0947fe458e2f2739f78852c292f7c
> 
> Bio::EnsEMBL::Transcript::get_all_DBLinks() is your friend - for Ensembl annotation on www.ensembl.org, this should return GO slim (http://www.geneontology.org/GO.slims.shtml#whatIs) annotations.
> Another way would be to use biomart.
> 
> I'm unsure if this is set up for the gramene website …
> 
> Hope  this helps,
> 
>      Jan Vogel
> 
> 
> On Apr 9, 2012, at 2:56 PM, Thomason, James wrote:
> 
> Hi all,
> 
> I'm completely stumped. I've been charged with programmatically extracting ontology GO terms from our Ensembl installation. A relevant link would be:
> 
> http://www.gramene.org/Arabidopsis_thaliana/Transcript/Ontology/Table?db=core;g=AT3G52430;oid=1;r=3:19431371-19434403;t=AT3G52430.1
> 
> I want to pull out everything inside that "Ontology Table" bit. But I'm utterly stumped as to how to go about doing it. I dug through the code enough to find that the page is generated through an EnsEMBL::Web::Component::Transcript::Go object, but I don't know how to instantiate one on the command line to get at the info. Presumably, since it's in the Web sub-tree, I really shouldn't be doing that on the command line anyway. Is there some way to link to that data through a Bio::EnsEMBL::Transcript object, perhaps? An xref or something?
> 
> For now, I basically just want to dump out that data in a tab delimited format, so I don't need anything fancy other than actually getting to it.
> 
> Any pointers in the right direction would be greatly appreciated.
> 
> Thanks,
> 
> --
> -Jim Thomason...
> 
> Scientific Informatics Developer @ The Ware Lab,
> a USDA-ARS Laboratory at Cold Spring Harbor Laboratory
> http://www.warelab.org/
> http://www.cshl.edu/
> 
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
> 
> 




