[ensembl-dev] searching for genes by description field?

Kieron Taylor ktaylor at ebi.ac.uk
Fri Mar 15 17:01:26 GMT 2013


Hi Adam,

Sorry that it has taken a month to get around to it, but I have now 
implemented fetch_all_by_description on the Gene Adaptor. It returns a 
list reference to all gene objects with descriptions matching the given 
string.

It has been checked into CVS head, and will be officially available as 
of our next release.
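
As a rough usage sketch (connection details are borrowed from Ian's
example quoted below; the species and search string are placeholders,
and the % wildcards assume the method matches with SQL LIKE, as in
Adam's original request):

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';

# Public Ensembl server, as used in the example quoted below.
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

# 'human' is used here purely for illustration.
my $gene_adaptor = $registry->get_adaptor( 'human', 'core', 'gene' );

# Assumption: SQL LIKE wildcards (%) are honoured in the search string.
my $genes = $gene_adaptor->fetch_all_by_description('%breast cancer%');

# The method returns a list reference, so dereference it to iterate.
foreach my $gene ( @{$genes} ) {
    printf "%s\t%s\n", $gene->stable_id, $gene->description;
}
```

This needs an API checkout that already contains the new method.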

Regards,

Kieron

-- 
Kieron Taylor PhD.
Ensembl Core team
EBI

On 14/02/2013 12:08, Adam Witney wrote:
>
> Thanks, I am working with bacteria, so the number of genes and therefore
> memory usage is not such a problem, but I will give your method a try
> also (I hadn't really got involved with sending bare SQL queries yet, so
> it's interesting to see how that can be done).
>
> Thanks again
>
> Adam
>
> On 13/02/2013 11:15, Ian Longden wrote:
>> The problem with this method (it works fine though) is that it needs to
>> get all the genes, which will take up a lot of memory.
>> An alternative is to use the db connection itself to query the
>> database directly, fetching only those genes you want.
>> Here is an example:
>> -----------------------------------------------------------------------------------------------------------------
>>
>> use strict;
>> use warnings;
>> use Bio::EnsEMBL::Registry;
>>
>> my $reg = "Bio::EnsEMBL::Registry";
>>
>> my $species    = "human";
>> my $search_str = "%breast cancer%";    # SQL LIKE pattern
>>
>> $reg->load_registry_from_db(
>>      -host => 'ensembldb.ensembl.org',
>>      -user => 'anonymous');
>>
>> my $core_ga = $reg->get_adaptor($species, "core", "gene");
>>
>> # Select only the IDs of genes whose description matches the pattern
>> my $sql = 'SELECT gene_id FROM gene WHERE description LIKE ?';
>> my $sth = $core_ga->dbc->prepare($sql)
>>     or die "Could not prepare $sql for core database";
>> $sth->execute($search_str) or die $sth->errstr();
>>
>> # Fetch a full Gene object for each matching ID
>> while (my ($gene_id) = $sth->fetchrow_array()) {
>>    my $gene = $core_ga->fetch_by_dbID($gene_id);
>>    print $gene->stable_id."\t".$gene->description."\n";
>> }
>> -----------------------------------------------------------------------------------------------------------------
>>
>> On Wed, Feb 13, 2013 at 11:09 AM, Adam Witney <awitney at sgul.ac.uk> wrote:
>>
>>
>>     Thanks Andy, in the meantime, I just realised, of course I can just
>>     do this:
>>
>>     my $genes = $dba->get_GeneAdaptor()->fetch_all();
>>
>>     foreach my $gene ( @{$genes} ) {
>>              if ( $gene->description && $gene->description =~ m/$string/ ) {
>>                      # do stuff
>>              }
>>     }
>>
>>     Thanks again
>>
>>     Adam
>>
>>
>>     On 13/02/2013 10:30, Andy Yates wrote:
>>
>>         Hi Adam,
>>
>>         Currently we do not support this. The actual code which
>>         implements this search is in the DBEntryAdaptor and would
>>         require edits in there. I can see a point in having this kind of
>>         functionality so we will put it on our to-do list.
>>
>>         Andy
>>
>>         On 13 Feb 2013, at 10:24, Adam Witney <awitney at sgul.ac.uk> wrote:
>>
>>
>>             Hi,
>>
>>             Is there a method to search the Gene description field? I am
>>             currently using fetch_all_by_external_name, but this only
>>             works if the gene has been annotated with gyrA:
>>
>>             my $genes =
>>               $dba->get_GeneAdaptor()->fetch_all_by_external_name('gyrA');
>>
>>             What I would like to be able to do is search the data
>>             returned by the description method ($gene->description). Is
>>             this possible? For example, something like this:
>>
>>             my $genes =
>>               $dba->get_GeneAdaptor()->fetch_all_by_description('%gyrase%');
>>
>>             Thanks
>>
>>             Adam
>>