[ensembl-dev] a question about GO annotation
Nathan Johnson
njohnson at ebi.ac.uk
Thu Sep 13 10:22:54 BST 2012
Also, I don't believe Script 1 matches 'histone'; its pattern is only /transcription|chromatin/.
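
To make that concrete, here is a minimal sketch that runs the two patterns from the scripts over a few sample term names. The names in @names are illustrative strings I picked for the example, not output from the API.

```perl
use strict;
use warnings;

# Patterns copied from the two scripts in this thread.
my $script1_re = qr/transcription|chromatin/;
my $script2_re = qr/transcription|chromatin|histone/;

# Sample term names, chosen only for illustration (an assumption).
my @names = ( 'histone acetylation', 'chromatin remodeling', 'DNA repair' );

# Names that Script 2's pattern accepts but Script 1's pattern rejects.
my @only_in_script2 =
    grep { $_ !~ $script1_re && $_ =~ $script2_re } @names;

print "Matched by Script 2's pattern only: @only_in_script2\n";
```

Any gene annotated only with a 'histone' term would therefore be missing from Script 1's list, regardless of how the GO hierarchy is handled.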
Nath
On 13 Sep 2012, at 09:19, Andreas Kusalananda Kähäri wrote:
> Hi Mei,
>
> In Script 1, you fetch all genes and select the ones that are directly
> associated with GO terms having names matching "transcription",
> "chromatin", or "histone".
>
> In Script 2, you fetch all genes that are directly associated with GO
> terms having names matching "transcription", "chromatin", or "histone",
> *or* with any of the child terms of those GO terms. The child terms
> might or might not be named "transcription", "chromatin", or "histone".
>
> When you call fetch_all_by_GOTerm(), the API takes the GO hierarchy into
> account and fetches all genes associated with the term itself, or with
> any of its child terms. This is also mentioned in the documentation of
> that method.
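
The effect of the hierarchy can be sketched with a toy model. This does not use the Ensembl API; the %children ontology, the %annotations table, the gene IDs, and the with_descendants() helper are all hypothetical and only mimic the two selection strategies described above.

```perl
use strict;
use warnings;

# Toy ontology: term name => names of its direct child terms (hypothetical).
my %children = (
    'chromatin organization' => ['nucleosome assembly'],
    'nucleosome assembly'    => [],
    'DNA repair'             => [],
);

# Toy direct annotations: gene ID => term names (hypothetical IDs).
my %annotations = (
    geneA => ['chromatin organization'],
    geneB => ['nucleosome assembly'],
    geneC => ['DNA repair'],
);

my $re = qr/transcription|chromatin|histone/;

# Return a term together with all of its descendants.
sub with_descendants {
    my ($term) = @_;
    return ( $term, map { with_descendants($_) } @{ $children{$term} || [] } );
}

# Script 1 style: keep genes whose own annotation names match the pattern.
my @direct = grep {
    my $gene = $_;
    grep { $_ =~ $re } @{ $annotations{$gene} };
} sort keys %annotations;

# Script 2 style: for each matching term, also accept genes annotated
# with any descendant of that term, whatever the descendant is called.
my %seen;
for my $term ( grep { $_ =~ $re } sort keys %children ) {
    my %wanted = map { $_ => 1 } with_descendants($term);
    for my $gene ( sort keys %annotations ) {
        $seen{$gene} = 1 if grep { $wanted{$_} } @{ $annotations{$gene} };
    }
}
my @via_hierarchy = sort keys %seen;

print "Direct name match:  @direct\n";
print "Including children: @via_hierarchy\n";
```

In this toy data, geneB is picked up only by the second strategy, because its term is a child of a matching term but does not itself match the pattern.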
>
>
> Cheers,
> Andreas
>
>
>
> On Thu, Sep 13, 2012 at 12:20:34AM +0800, JiangMei wrote:
>>
>> Hi All. Sorry to bother you.
>>
>> I wrote two scripts to fetch genes annotated with specific GO terms. The scripts are shown below:
>>
>> Script 1:
>> # Store target genes in @genelist
>> my @genelist;
>>
>> use Bio::EnsEMBL::Registry;
>> my $registry = 'Bio::EnsEMBL::Registry';
>> $registry->load_registry_from_db(
>>     -host       => 'ensembldb.ensembl.org',
>>     -user       => 'anonymous',
>>     -db_version => '67',
>> );
>> my $go_adaptor   = $registry->get_adaptor( 'Multi', 'Ontology', 'GOTerm' );
>> my $gene_adaptor = $registry->get_adaptor( 'drosophila melanogaster', 'Core', 'Gene' );
>>
>> for $gene ( @{ $gene_adaptor->fetch_all } ) {
>>     my @db_links = @{ $gene->get_all_DBLinks('GO') };
>>     for $dbe (@db_links) {
>>         my $go_name = $dbe->description;
>>         push @genelist, $gene->stable_id if $go_name =~ /transcription|chromatin/;
>>     }
>> }
>>
>> Script 2:
>> # Store target genes in @genelist
>> my @genelist;
>>
>> use Bio::EnsEMBL::Registry;
>> my $registry = 'Bio::EnsEMBL::Registry';
>> $registry->load_registry_from_db(
>>     -host       => 'ensembldb.ensembl.org',
>>     -user       => 'anonymous',
>>     -db_version => '67',
>> );
>> my $go_adaptor   = $registry->get_adaptor( 'Multi', 'Ontology', 'GOTerm' );
>> my $gene_adaptor = $registry->get_adaptor( 'drosophila melanogaster', 'Core', 'Gene' );
>>
>> for $term ( @{ $go_adaptor->fetch_all } ) {
>>     my $name = $term->name;
>>     my $acc  = $term->accession;
>>     if ( ( $acc =~ /^GO:/ ) && ( $name =~ /transcription|chromatin|histone/ ) ) {
>>         for $gene ( @{ $gene_adaptor->fetch_all_by_GOTerm($term) } ) {
>>             push @genelist, $gene->stable_id;
>>         }
>>     }
>> }
>>
>> Basically, Script 1 fetches all the genes, then gets the GO annotations for each gene. If a GO term matches the regular expression, the gene ID is pushed to @genelist. Script 2 fetches all the GO terms and, if a term matches the regular expression, pushes the IDs of its genes to @genelist. The two scripts were supposed to produce the same gene list, but they produce different lists. Does anyone know the reason? Is there anything wrong in the scripts?
>>
>> I hope you can help. Thanks a bunch! I really appreciate it.
>>
>> Best, Mei
>>
>
>> _______________________________________________
>> Dev mailing list Dev at ensembl.org
>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>
>
> --
> Andreas Kusalananda Kähäri
> Ensembl Gene Annotation Team
>
> Sent from the tips of my fingers
Nathan Johnson
Senior Scientific Programmer
Ensembl Regulation
European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
http://www.ensembl.info/
http://twitter.com/#!/ensembl