[ensembl-dev] problems on fetching GO terms for all genes of a bacteria using API

刘鹏飞 liupfskygre at gmail.com
Thu Jan 9 10:27:34 GMT 2014


Thanks Andy and Magali,
The problem is solved: I needed to import the ontology DB adaptor module before constructing the object:
use Bio::EnsEMBL::DBSQL::OntologyDBAdaptor;
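
For the archive, here is a minimal sketch of the working script, assuming the Ensembl and Ensembl Genomes Perl APIs are installed and the public server is reachable. It is the script quoted below with the missing module loaded and a guard for GO accessions that are not found in the ontology database. The helper name describe_term and taking the species name from the command line are my own additions for illustration, not part of the API.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Format one GO term and its ancestors as text lines.
# $term is any object with accession(), name() and ancestors(), or undef
# when the accession was not found in the ontology database.
# (describe_term is a hypothetical helper, not an Ensembl method.)
sub describe_term {
    my ($gene_id, $term_id, $term) = @_;
    my $name  = ($term && $term->name) ? $term->name : '-';
    my @lines = ("$gene_id:$term_id ($name)");
    return @lines unless $term;    # no hierarchy without a term object
    foreach my $ancestor (@{ $term->ancestors() }) {
        push @lines, "\t" . $ancestor->accession . " (" . $ancestor->name . ")";
    }
    return @lines;
}

# The live part only runs when a species name is given on the command line.
if (@ARGV) {
    my ($species) = @ARGV;    # e.g. methanocella_conradii_hz254

    # Loaded at run time; for these object-oriented modules this has the
    # same effect as a "use" line at the top of the script.
    require Bio::EnsEMBL::LookUp;
    require Bio::EnsEMBL::DBSQL::OntologyDBAdaptor;    # the module that was missing

    my $lookup = Bio::EnsEMBL::LookUp->new(
        -URL      => "http://bacteria.ensembl.org/registry.json",
        -NO_CACHE => 1,
    );
    my ($dba) = @{ $lookup->get_by_name_exact($species) };
    die "No database found for $species\n" unless $dba;

    my $ontology_dba = Bio::EnsEMBL::DBSQL::OntologyDBAdaptor->new(
        -HOST    => 'mysql.ebi.ac.uk',
        -USER    => 'anonymous',
        -PORT    => '4157',
        -group   => 'ontology',
        -dbname  => 'ensemblgenomes_ontology_21_74',
        -species => 'multi',
    );
    my $goada = $ontology_dba->get_adaptor('OntologyTerm');
    die "No OntologyTerm adaptor available\n" unless $goada;

    foreach my $gene (@{ $dba->get_GeneAdaptor()->fetch_all() }) {
        foreach my $link (@{ $gene->get_all_DBLinks }) {
            next unless $link->database eq 'GO';
            my $term = $goada->fetch_by_accession($link->display_id);
            print "$_\n"
                for describe_term($gene->stable_id, $link->display_id, $term);
        }
    }
}
```

Run it as, for example, perl hz254_go.pl methanocella_conradii_hz254 (the file name is arbitrary). Without an argument it only defines the helper and exits.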



2014/1/8 刘鹏飞 <liupfskygre at gmail.com>

> Hi Magali,
> Sorry, the approach you suggested above did not work.
> Using the following,
> # get adaptor for ontology
> my $ontology_dba = Bio::EnsEMBL::DBSQL::OntologyDBAdaptor->new(
>  -HOST => 'mysql.ebi.ac.uk',
>  -USER => 'anonymous',
>  -PORT => '4157',
>  -group   => 'ontology',
>  -dbname => 'ensemblgenomes_ontology_21_74',
>  -species => 'multi' );
> my $goada = $ontology_dba->get_adaptor('OntologyTerm');
> I still got the same output as before:
>
> Can't locate object method "new" via package
> "Bio::EnsEMBL::DBSQL::OntologyDBAdaptor" at /home/liupf/hz254_2.pl line
> 21.
>
> If I change OntologyDBAdaptor to DBAdaptor:
> my $ontology_dba = Bio::EnsEMBL::DBSQL::DBAdaptor->new(
>
> the output is:
> Can't call method "fetch_by_accession" on an undefined value at
> /home/liupf/hz254_2.pl line 38.
>
> >>>>>>>>>>>>>>>>>my full code listing>>>>>>>>>>>>
> #!/usr/bin/perl
> # methanocella_conradii_hz254
> use strict;
> use warnings;
> use Bio::EnsEMBL::LookUp;
>
> # load the lookup from the main Ensembl Bacteria public server
> my $lookup = Bio::EnsEMBL::LookUp->new(
>   -URL => "http://bacteria.ensembl.org/registry.json",
>   -NO_CACHE => 1
> );
> # find the correct database adaptor using a unique name
> my ($dba) = @{$lookup->get_by_name_exact(
>   'methanocella_conradii_hz254'
> )};
> my $genes = $dba->get_GeneAdaptor()->fetch_all(); # where is the get_GeneAdaptor() documentation?
> # test
>
> print "Found ".scalar @$genes." genes for ".$dba->species()."\n";
> # get adaptor for ontology
> my $ontology_dba = Bio::EnsEMBL::DBSQL::DBAdaptor->new(
>
>  -HOST => 'mysql.ebi.ac.uk',
>  -USER => 'anonymous',
>  -PORT => '4157',
>  -group   => 'ontology',
>  -dbname => 'ensemblgenomes_ontology_21_74',
>  -species => 'multi' );
> my $goada = $ontology_dba->get_adaptor('OntologyTerm');
>
> # get GO information
> foreach my $gene (@$genes) {
>   foreach my $link (@{ $gene->get_all_DBLinks }) {
>     if ($link->database eq "GO") {
>       my $term_id   = $link->display_id;
>       my $term_name = '-';
>       my $term      = $goada->fetch_by_accession($term_id);
>       if ($term and $term->name) {
>         $term_name = $term->name;
>       }
>       print $gene->stable_id.":$term_id ($term_name)\n";
>       # fetch complete GO hierarchy
>       foreach my $ancestor_term (@{$term->ancestors()}) {
>         print "\t". $ancestor_term->accession." (".$ancestor_term->name.")\n";
>       }
>     }
>   }
> }
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> Is there something else wrong in it that causes the output above?
> Thanks!
>
>
> 2014/1/7 <mr6 at ebi.ac.uk>
>
>> Hi Pengfei,
>>
>> Replies above
>>
>> > Hi Magali,
>> > Thanks for your quick reply. I followed your instructions and modified the
>> > line as follows:
>> > # get adaptor for ontology
>> > my $ontology_dba=Bio::EnsEMBL::DBSQL::OntologyDBAdaptor->new(
>> > -HOST=>"mysql.ebi.ac.uk",
>> > -USER=>'anonymous',
>> > -PORT=>'4157',
>> > -group =>'ontology',
>> > -dbname=>'ensemblgenomes_ontology_21_74',
>> > -species=>'multi');
>> >
>> > # One more question: where can I find the information, such as host, user
>> > and dbname, that I need to get the data object or adaptor I want?
>>
>> I used the information provided in the Ensembl Bacteria documentation:
>>
>> http://bacteria.ensembl.org/info/data/accessing_ensembl_bacteria.html#advanced-use
>> The default host for Ensembl Genomes databases is mysql.ebi.ac.uk, with the
>> connection details as mentioned.
>> An ontology database is normally called
>> ensemblgenomes_ontology_egrelease_erelease
>> where egrelease is the Ensembl Genomes release version (here 21)
>> and erelease is the corresponding Ensembl release version (here 74).
>> In Ensembl, the corresponding database is called ensembl_ontology_74.
>> If you have MySQL installed, you can log directly onto the
>> Ensembl Genomes server to find the exact name of the database you are
>> looking for.
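
As a small illustration of the naming scheme just described, here is a sketch that builds the database name from the two release numbers. The function name ontology_db_name is made up. The optional second part is an assumption on my side: it uses DBI with DBD::mysql (not the Ensembl API) and the anonymous account with an empty password to list the matching databases on the server.

```perl
use strict;
use warnings;

# Build the ontology database name from the Ensembl Genomes release and
# the Ensembl release (hypothetical helper, not part of the API).
sub ontology_db_name {
    my ($eg_release, $e_release) = @_;
    return "ensemblgenomes_ontology_${eg_release}_${e_release}";
}

print ontology_db_name(21, 74), "\n";    # ensemblgenomes_ontology_21_74

# Optional live check: pass any argument to list the ontology databases
# that actually exist on the public server (needs DBI and DBD::mysql).
if (@ARGV) {
    require DBI;
    my $dbh = DBI->connect(
        'DBI:mysql:host=mysql.ebi.ac.uk;port=4157',
        'anonymous', '',                  # assumed: no password needed
        { RaiseError => 1 },
    );
    my $names = $dbh->selectcol_arrayref(
        "SHOW DATABASES LIKE 'ensemblgenomes_ontology%'");
    print "$_\n" for @$names;
    $dbh->disconnect;
}
```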
>>
>> >
>> > # now use the DBAdaptor to call get_adaptor
>> > my $goada=$ontology_dba->get_adaptor('Multi','Ontology','OntologyTerm');
>> > # your reply has $goada=$registry, but I think it should be
>> > # $goada=$ontology_dba, right?
>>
>> Sorry about the confusion.
>> We normally use registry objects to connect to databases, but the registry
>> does not cope well with multi-species databases like the bacterial ones,
>> hence the use of the lookup and direct DBAdaptors.
>> The correct syntax should have been:
>> my $ontology_dba = Bio::EnsEMBL::DBSQL::OntologyDBAdaptor->new(
>>  -HOST => 'mysql.ebi.ac.uk',
>>  -USER => 'anonymous',
>>  -PORT => '4157',
>>  -group   => 'ontology',
>>  -dbname => 'ensemblgenomes_ontology_21_74',
>>  -species => 'multi' );
>>
>> my $goada = $ontology_dba->get_adaptor('OntologyTerm');
>>
>> Hopefully, this should also solve the issue below.
>>
>> >
>> > the output:
>> > Can't locate object method "new" via package
>> > "Bio::EnsEMBL::DBSQL::OntologyDBAdaptor" at /home/liupf/hz254_2.pl line
>> > 21.
>> >
>> > I checked the Doxygen documentation for the new() method of
>> > OntologyDBAdaptor, but the examples returned were all for
>> > Bio::EnsEMBL::DBSQL::DBAdaptor::new(), so I thought new() was no longer
>> > supported by OntologyDBAdaptor and also tried
>> > my $ontology_dba=Bio::EnsEMBL::DBSQL::DBAdaptor->new(.....
>> >
>> > Unfortunately, the output was:
>> > Can't call method "fetch_by_accession" on an undefined value at
>> > /home/liupf/hz254_2.pl line 37.
>> >
>> > ### Confusion about the Ensembl API
>> > To fetch data with the API, you need to connect to the right database with
>> > the corresponding DBAdaptor, and then use the right object adaptor and
>> > methods. Is my understanding right?
>>
>> That is correct.
>>
>> > What confuses me is this:
>> > my genes are in the bacteria database and I can fetch them, but if the gene
>> > ontology terms are in another database, how are the two connected? Does
>> > that mean I need a separate DBAdaptor for each of them?
>>
>> The bacterial database contains the genes and all related information.
>> It thus contains ontology terms attached to translations and genes.
>> It does not however contain the definition for each ontology term, nor its
>> relationships with other terms, like descendants and ancestors.
>> This additional information is stored separately in the ontology database.
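
Following this explanation, the two-column table (gene, GO term) asked for at the start of the thread should only need the bacterial database, since the GO accessions are stored there as cross-references. Below is a minimal sketch under that assumption; go_table_rows is a hypothetical helper, and skipping repeated accessions is my own choice because the same term may be linked to a gene more than once.

```perl
use strict;
use warnings;

# Return one "gene<TAB>GO accession" row per distinct GO cross-reference.
# $gene is any object with stable_id() and get_all_DBLinks(), as used in
# the script in this thread (go_table_rows itself is not an API method).
sub go_table_rows {
    my ($gene) = @_;
    my %seen;
    my @rows;
    foreach my $link (@{ $gene->get_all_DBLinks }) {
        next unless $link->database eq 'GO';
        my $accession = $link->display_id;
        next if $seen{$accession}++;    # skip repeated accessions
        push @rows, join("\t", $gene->stable_id, $accession);
    }
    return @rows;
}

# The live part only runs when a species name is given on the command line.
if (@ARGV) {
    my ($species) = @ARGV;    # e.g. methanocella_conradii_hz254
    require Bio::EnsEMBL::LookUp;
    my $lookup = Bio::EnsEMBL::LookUp->new(
        -URL      => "http://bacteria.ensembl.org/registry.json",
        -NO_CACHE => 1,
    );
    my ($dba) = @{ $lookup->get_by_name_exact($species) };
    die "No database found for $species\n" unless $dba;

    foreach my $gene (@{ $dba->get_GeneAdaptor()->fetch_all() }) {
        print "$_\n" for go_table_rows($gene);
    }
}
```

The ontology database is then only needed when the term names or the ancestor hierarchy are wanted as well.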
>>
>> > I hope these questions are not bothering you too much!
>> > ###
>> > Thank you very much!
>>
>>
>> Hope this helps,
>> Magali
>>
>> >
>> >
>> >
>> > 2014/1/7 <mr6 at ebi.ac.uk>
>> >
>> >> Hi Pengfei,
>> >>
>> >> Calling get_GeneAdaptor is equivalent to calling get_adaptor('Gene').
>> >> More documentation can be found here:
>> >>
>> >>
>> http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1DBSQL_1_1DBAdaptor.html#a2a1ee81ecb9507fc5ea7bdf39be97bf9
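
On why get_GeneAdaptor() is hard to find as a documented method: as far as I understand, DBAdaptor does not define one method per adaptor type but catches get_<Type>Adaptor calls in AUTOLOAD and forwards them to get_adaptor. The toy class below only illustrates that Perl mechanism. ToyDBAdaptor and its return values are made up and are not Ensembl code.

```perl
use strict;
use warnings;

package ToyDBAdaptor;

our $AUTOLOAD;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Stand-in for the real lookup: just reports which type was requested.
sub get_adaptor {
    my ($self, $type) = @_;
    return "adaptor for $type";
}

# Catch calls to undefined methods named get_<Type>Adaptor and forward
# them to get_adaptor('<Type>').
sub AUTOLOAD {
    my ($self) = @_;
    my $name = $AUTOLOAD;
    return if $name =~ /::DESTROY$/;    # ignore object destruction
    if ($name =~ /::get_(\w+)Adaptor$/) {
        return $self->get_adaptor($1);
    }
    die "Can't locate object method \"$name\"\n";
}

package main;

my $toy = ToyDBAdaptor->new;
print $toy->get_GeneAdaptor(), "\n";      # adaptor for Gene
print $toy->get_adaptor('Gene'), "\n";    # adaptor for Gene
```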
>> >>
>> >> As for the undefined value message you are getting:
>> >> by calling get_adaptor on $dba, you are attempting to get an object
>> >> adaptor defined in the context of your bacteria database.
>> >> Ontologies are stored separately in their own database,
>> >> ensembl_ontology.
>> >>
>> >> The easiest way to access the ontology database would be as follows:
>> >> my $ontology_dba = Bio::EnsEMBL::DBSQL::OntologyDBAdaptor->new(
>> >> -HOST => 'mysql.ebi.ac.uk',
>> >> -USER => 'anonymous',
>> >> -PORT => '4157',
>> >> -group   => 'ontology',
>> >> -dbname => 'ensemblgenomes_ontology_21_74',
>> >> -species => 'multi' );
>> >>
>> >> my $goada = $registry->get_adaptor( 'Multi', 'Ontology', 'OntologyTerm'
>> >> );
>> >>
>> >> You should then be able to call fetch_by_accession on $goada for a given
>> >> GO accession.
>> >>
>> >>
>> >> Hope that helps,
>> >> Magali
>> >>
>> >> > Hi all,
>> >> >   I am new to the API. I am trying to use it to get all GO terms for each
>> >> > gene of an archaeon (Methanocella conradii HZ254), and want to get a table
>> >> > with two columns, one for the gene name and the other for the
>> >> > corresponding GO term.
>> >> >
>> >> > Following the API instructions and the modifications needed to use the API
>> >> > for bacteria, I used the following code to do the job:
>> >> > # load the lookup from the main Ensembl Bacteria public server
>> >> > my $lookup = Bio::EnsEMBL::LookUp->new(
>> >> >   -URL => "http://bacteria.ensembl.org/registry.json",
>> >> >   -NO_CACHE => 1
>> >> > );
>> >> > # find the correct database adaptor using a unique name
>> >> > my ($dba) = @{$lookup->get_by_name_exact(
>> >> >   'methanocella_conradii_hz254'
>> >> > )};
>> >> > # get adaptor for ontology
>> >> > my $goada=$dba->get_adaptor('Multi','Ontology','OntologyTerm');
>> >> > my $genes = $dba->get_GeneAdaptor()->fetch_all(); # where is the get_GeneAdaptor() documentation?
>> >> > # ###test####
>> >> > print "Found ".scalar @$genes." genes for ".$dba->species()."\n";
>> >> >
>> >> > # get GO information (modified from kokocinsky.net ensembl coding)
>> >> > foreach my $gene (@$genes) {
>> >> >   my $links = $gene->get_all_DBLinks();
>> >> >   foreach my $link (@$links) {
>> >> >     if ($link->database eq "GO") {
>> >> >       my $term_id   = $link->display_id;
>> >> >       my $term_name = '-';
>> >> >       my $term      = $goada->fetch_by_accession($term_id);
>> >> >       if ($term and $term->name) {
>> >> >         $term_name = $term->name;
>> >> >       }
>> >> >       print $gene->stable_id.":$term_id ($term_name)\n";
>> >> >       # fetch complete GO hierarchy
>> >> >       foreach my $ancestor_term (@{$term->ancestors()}) {
>> >> >         print "\t". $ancestor_term->accession." (".$ancestor_term->name.")\n";
>> >> >       }
>> >> >     }
>> >> >   }
>> >> > }
>> >> >
>> >> > It works well up to the "get GO information" step.
>> >> > The output was as follows:
>> >> > Can't call method "fetch_by_accession" on an undefined value at
>> >> > /home/liupf/hz254.pl line 27.
>> >> > 1. I do not understand the use of 'get_GeneAdaptor'; I could not find
>> >> > documentation on this syntax.
>> >> > 2. Please give me some suggestions on how to accomplish my task.
>> >> >
>> >> > Thank you all!
>> >> >
>> >> > $ perl ~/ApiVersion.pl
>> >> > The API version used is 74
>> >> >
>> >> > --
>> >> > Pengfei Liu, PhD Candidate
>> >> >
>> >> > Lab of Microbial Ecology
>> >> > College of Resources and Environmental Sciences
>> >> > China Agricultural University
>> >> > No.2 Yuanmingyuanxilu, Beijing, 100193
>> >> > P.R. China
>> >> >
>> >> > Tel: +86-10-62731358
>> >> > Fax: +86-10-62731016
>> >> >
>> >> > E-mail: liupfskygre at gmail.com
>> >> > _______________________________________________
>> >> > Dev mailing list    Dev at ensembl.org
>> >> > Posting guidelines and subscribe/unsubscribe info:
>> >> > http://lists.ensembl.org/mailman/listinfo/dev
>> >> > Ensembl Blog: http://www.ensembl.info/
>> >> >
>> >>
>> >>
>> >
>> >
>> >
>>
>>
>
>
>


