[ensembl-dev] Extracting all GOslim-goa (or GO) annotations for every ENSG.

Elena Grassi grassi.e at gmail.com
Sat Jul 6 09:10:38 BST 2013


I'm sorry to bother you, but I'm puzzled (again) by the comparison
between the results of the query, the v1 script and the script with
ancestors (apart from the difference in time: the query is almost
instantaneous while the Perl scripts are really slow).

Number of lines produced by the first script, by the version that calls
fetch_all_by_parent_term, and by the SQL query:
data@tungsteno:/rogue/bioinfotree/prj/expr_evol/local/src$ sort goslim_api.tsv | uniq | wc -l
216713
data@tungsteno:/rogue/bioinfotree/prj/expr_evol/local/src$ sort goslim_inclusive_api.tsv | uniq | wc -l
216713
data@tungsteno:/rogue/bioinfotree/prj/expr_evol/local/src$ sort goslim_api.tsv | uniq | wc -l
216713

They are all the same!
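
Equal line counts alone do not show that the outputs contain the same lines. Below is a minimal sketch for comparing the contents directly, assuming a POSIX-like shell with sort, comm and mktemp available. The function name diff_unique_lines is hypothetical and not part of any Ensembl tool.

```shell
# Hypothetical helper: print the lines that occur in only one of two files,
# ignoring line order and duplicate lines.
# No output means both files hold exactly the same set of lines.
diff_unique_lines() {
    a=$(mktemp)
    b=$(mktemp)
    # comm needs sorted input; sort -u also drops duplicates,
    # like "sort | uniq" in the commands above.
    LC_ALL=C sort -u "$1" > "$a"
    LC_ALL=C sort -u "$2" > "$b"
    # -3 hides lines common to both files. Lines found only in the
    # second file are printed with a leading tab.
    LC_ALL=C comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# Example call (file names taken from the commands above):
#   diff_unique_lines goslim_api.tsv goslim_inclusive_api.tsv
```

If the call prints nothing, the two files are identical as sets of lines, which would mean the second script added no rows beyond those of the first.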

script1 relevant part:

        foreach my $gene (@$genes) {
                print_DBEntries($gene->get_all_DBLinks('goslim_goa'), $gene->stable_id());
        }

sub print_DBEntries
{
    my $db_entries = shift;
    my $gene = shift;

    foreach my $dbe ( @{$db_entries} ) {
        print $gene . "\t" . $dbe->display_id() . "\t" . $dbe->description() . "\n";
    }
}

script2:
        foreach my $gene (@$genes) {
                my $goslims = $gene->get_all_DBLinks('goslim_goa');
                foreach my $goslim (@$goslims) {
                        bless $goslim, 'Bio::EnsEMBL::OntologyTerm';
                        if (!defined($ancestors{$goslim})) {
                                my $goa = $registry->get_adaptor( 'Multi', 'Ontology', 'OntologyTerm' );
                                $ancestors{$goslim} = $goa->fetch_all_by_descendant_term($goslim, 'goslim_goa');
                        }
                        bless $goslim, 'Bio::EnsEMBL::DBEntry';
                        print $gene->stable_id() . "\t" . $goslim->display_id() . "\t" . $goslim->description() . "\n";
                        foreach my $ancestor (@{ $ancestors{$goslim} }) {
                                print $gene->stable_id() . "\t" . $ancestor->display_id() . "\t" . $ancestor->description() . "\n";
                        }
                }

        }

SQL:
select g.stable_id, x.dbprimary_acc, x.description
from xref as x, object_xref as o, transcript as t, translation as tr, gene as g
where x.external_db_id = "12700"
  AND o.xref_id = x.xref_id
  AND g.gene_id = t.gene_id
  AND t.transcript_id = tr.transcript_id
  AND tr.translation_id = o.ensembl_id;

Thanks,
E.



