[ensembl-dev] a question about GO annotation

Nathan Johnson njohnson at ebi.ac.uk
Thu Sep 13 10:22:54 BST 2012


Also, I don't believe Script 1 matches 'histone' at all: its regex is /transcription|chromatin/, whereas Script 2 uses /transcription|chromatin|histone/.
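
A minimal sketch of that point, using the two patterns exactly as they appear in the scripts below. The term names are only illustrative strings typed in by hand, not values fetched from the ontology database, and matches() is a helper made up for this example:

```perl
use strict;
use warnings;

# Patterns copied from Script 1 and Script 2 in the original post.
my $script1_re = qr/transcription|chromatin/;
my $script2_re = qr/transcription|chromatin|histone/;

# Hypothetical helper: returns 1 if the name matches the pattern, else 0.
sub matches {
    my ($re, $name) = @_;
    return $name =~ $re ? 1 : 0;
}

# Illustrative term names (assumption: chosen by hand for this example).
for my $name ('histone acetylation', 'chromatin binding', 'DNA repair') {
    printf "%-20s script1=%d script2=%d\n",
        $name, matches($script1_re, $name), matches($script2_re, $name);
}
```

A name that matches only on 'histone' is kept by Script 2's pattern and skipped by Script 1's, so the two lists would differ even without any hierarchy effect.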

Nath


On 13 Sep 2012, at 09:19, Andreas Kusalananda Kähäri wrote:

> Hi Mei,
> 
> In Script 1, you fetch all genes and select the ones that are directly
> associated with GO terms having names matching "transcription",
> "chromatin", or "histone".
> 
> In Script 2, you fetch all genes that are directly associated with GO
> terms having names matching "transcription", "chromatin", or "histone",
> *or* with any of the child terms of those GO terms.  The child terms
> might or might not be named "transcription", "chromatin", or "histone".
> 
> When you call fetch_all_by_GOTerm(), the API takes the GO hierarchy into
> account and fetches all genes associated with the term itself or with any
> of its child terms.  This is also mentioned in the documentation of that
> method.
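
As a rough sketch of the difference Andreas describes, here is a toy version that does not call the Ensembl API at all. Everything in it is an assumption made for illustration: the accessions T1 and T2, the parent/child link between them, the gene names, and the helper subs.

```perl
use strict;
use warnings;

# Toy ontology: accessions, names and the parent/child link are hypothetical.
my %term = (
    T1 => { name => 'chromatin organization', children => ['T2'] },
    T2 => { name => 'nucleosome assembly',    children => []     },
);

# Direct annotations only, i.e. what a gene's own GO xrefs would record.
my %direct = ( geneA => ['T1'], geneB => ['T2'] );

my $re = qr/transcription|chromatin|histone/;

# Return a term together with all of its descendants.
sub with_descendants {
    my ($acc) = @_;
    return ( $acc, map { with_descendants($_) } @{ $term{$acc}{children} } );
}

# Script 1 style: keep genes whose directly attached term name matches.
sub genes_direct {
    my @hits = grep {
        my $gene = $_;
        scalar grep { $term{$_}{name} =~ $re } @{ $direct{$gene} };
    } keys %direct;
    my @sorted = sort @hits;
    return @sorted;
}

# Script 2 style: for each matching term, also accept genes attached
# to any descendant of that term, whatever the descendant is called.
sub genes_hierarchical {
    my %wanted;
    for my $acc ( grep { $term{$_}{name} =~ $re } keys %term ) {
        $wanted{$_} = 1 for with_descendants($acc);
    }
    my @hits = grep {
        my $gene = $_;
        scalar grep { $wanted{$_} } @{ $direct{$gene} };
    } keys %direct;
    my @sorted = sort @hits;
    return @sorted;
}

print "direct:       @{[ genes_direct() ]}\n";        # geneA
print "hierarchical: @{[ genes_hierarchical() ]}\n";  # geneA geneB
```

In this toy data geneB is attached only to the child term, whose name does not match the pattern, so it appears in the hierarchical list but not in the direct one.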
> 
> 
> Cheers,
> Andreas
> 
> 
> 
> On Thu, Sep 13, 2012 at 12:20:34AM +0800, JiangMei wrote:
>> 
>> Hi All. Sorry to bother you.
>> 
>> I wrote two scripts to fetch genes annotated with specific GO terms. The scripts are shown below:
>> 
>> Script 1:
>> # Store target genes in @genelist
>> my @genelist;
>> 
>> use Bio::EnsEMBL::Registry;
>> my $registry = 'Bio::EnsEMBL::Registry';
>> $registry->load_registry_from_db(
>>      -host       => 'ensembldb.ensembl.org',
>>      -user       => 'anonymous',
>>      -db_version => '67');
>> my $go_adaptor   = $registry->get_adaptor( 'Multi', 'Ontology', 'GOTerm' );
>> my $gene_adaptor = $registry->get_adaptor( 'drosophila melanogaster', 'Core', 'Gene' );
>> 
>> # Walk every gene and test the description of each directly attached GO xref
>> for my $gene (@{ $gene_adaptor->fetch_all }) {
>>      my @db_links = @{ $gene->get_all_DBLinks('GO') };
>>      for my $dbe (@db_links) {
>>            my $go_name = $dbe->description;
>>            push @genelist, $gene->stable_id if $go_name =~ /transcription|chromatin/;
>>      }
>> }
>> 
>> Script 2:
>> # Store target genes in @genelist
>> my @genelist;
>> 
>> use Bio::EnsEMBL::Registry;
>> my $registry = 'Bio::EnsEMBL::Registry';
>> $registry->load_registry_from_db(
>>      -host       => 'ensembldb.ensembl.org',
>>      -user       => 'anonymous',
>>      -db_version => '67');
>> my $go_adaptor   = $registry->get_adaptor( 'Multi', 'Ontology', 'GOTerm' );
>> my $gene_adaptor = $registry->get_adaptor( 'drosophila melanogaster', 'Core', 'Gene' );
>> 
>> # Walk every GO term whose name matches, then fetch the genes for that term
>> for my $term (@{ $go_adaptor->fetch_all }) {
>>      my $name = $term->name;
>>      my $acc  = $term->accession;
>>      if ( ( $acc =~ /^GO:/ ) && ( $name =~ /transcription|chromatin|histone/ ) ) {
>>         for my $gene (@{ $gene_adaptor->fetch_all_by_GOTerm($term) }) {
>>               push @genelist, $gene->stable_id;
>>         }
>>      }
>> }
>> 
>> Basically, Script 1 fetches all the genes, then gets the GO annotations for each gene. If a GO annotation matches the regular expression, the gene ID is pushed to @genelist. Script 2 fetches all the GO terms and, if a GO term matches the regular expression, pushes the IDs of its genes to @genelist. The two scripts were supposed to produce the same gene list. However, they produced different lists. Does anyone know the reason? Is there anything wrong in the scripts?
>> 
>> I hope you can help. Thanks a bunch! I really appreciate it.
>> 
>> Best, Mei
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
> 
> 
> -- 
> Andreas Kusalananda Kähäri
> Ensembl Gene Annotation Team
> 
> Sent from the tips of my fingers

Nathan Johnson
Senior Scientific Programmer
Ensembl Regulation
European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD

http://www.ensembl.info/
http://twitter.com/#!/ensembl
