[ensembl-dev] ensembl_id in DBEntry

Andy Yates ayates at ebi.ac.uk
Mon Mar 4 09:39:25 GMT 2013


Hi Nicole,

The issue here is that DBEntryAdaptor serves a dual purpose. The first, which you have used previously, is to retrieve Xrefs attached to an Ensembl object. This is indicated by the ensembl_id and ensembl_object_type keys in the resulting object hash (they come from the join between xref and object_xref). However, when you go in by source you are asking for all instances in the xref table without the join to object_xref, which is why you do not get the ensembl_id key in the hash. The API does not support fetching the DBEntries linked to all Genes by a source.

Your available options are:

1) Loop through all Genes and use fetch_all_by_Gene (as you said, very expensive)

2) Use BioMart, which has already de-normalised the entries and so will be very quick

3) Write and execute custom SQL against the database
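
As a rough sketch of option 3, something like the following could build the gene-based hash in a single query. It assumes the core schema tables xref, object_xref and external_db with the columns named in the SQL, so please check these against your release. The helper name xrefs_by_ensembl_id is made up for illustration and is not part of the API; $dbh is assumed to be a DBI handle, for example the one returned by $dbentry_adaptor->dbc->db_handle.

```perl
use strict;
use warnings;

# Assumed core-schema tables and columns (verify against your release):
#   xref(xref_id, external_db_id, dbprimary_acc, display_label)
#   object_xref(xref_id, ensembl_id, ensembl_object_type)
#   external_db(external_db_id, db_name)
my $SQL = <<'SQL';
SELECT ox.ensembl_id, x.dbprimary_acc, x.display_label
FROM   xref x
JOIN   external_db edb ON edb.external_db_id = x.external_db_id
JOIN   object_xref ox  ON ox.xref_id = x.xref_id
WHERE  edb.db_name = ?
AND    ox.ensembl_object_type = ?
SQL

# Hypothetical helper: returns a hash ref keyed on ensembl_id, where each
# value is an array ref of { primary_id, display_id } hashes.
sub xrefs_by_ensembl_id {
  my ($dbh, $source, $object_type) = @_;

  my $sth = $dbh->prepare($SQL);
  $sth->execute($source, $object_type);

  my %by_id;
  while (my ($ensembl_id, $acc, $label) = $sth->fetchrow_array()) {
    # push autovivifies the array ref for a previously unseen ensembl_id
    push @{ $by_id{$ensembl_id} }, { primary_id => $acc, display_id => $label };
  }
  $sth->finish();

  return \%by_id;
}
```

It might be called as xrefs_by_ensembl_id($dbentry_adaptor->dbc->db_handle, 'EntrezGene', 'Gene'). Note that the ensembl_id column holds the internal identifier of the linked object rather than its stable ID, so an additional join to the gene table would be needed if you want to key the hash on stable IDs.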

We will consider whether this kind of query can be supported through the API, but I cannot guarantee when (or if) that will be completed.

Best regards,

Andy

Andrew Yates                   Ensembl Core Software Project Leader
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensembl.org/

On 1 Mar 2013, at 21:34, Nicole Washington <nlwashington at lbl.gov> wrote:

> Hi,
> 
> I want to fetch all DBEntries by a given source, say EntrezGene, and then locally make a gene-based hash of the entries.   My reason is that repeated calls to fetch_all_by_Gene are very expensive time-wise, particularly when fetching for all genes in a genome.
> 
> In order to do this, I need the ensembl ids linked to the DBEntry.  However, the objects I get from fetch_all_by_source don't seem to be delivering this info.
> 
> I'm using r70 of the ensembl API.
> 
> Here's a bit of my code:
> 
>  my $dbentries_by_source = {};
>  my %dbentries_by_ensembl_id = ();  # a hash, so initialise with () rather than {}
> 
> 
>  print STDOUT "Fetching xrefs...\n";
>  my $xref_count=0;
> 
>  foreach my $xref (@xrefs_to_fetch) {
>    my $dbes = $dbentry_adaptor->fetch_all_by_source($xref);
>    print STDOUT scalar(@$dbes) . " found.  Sorting...";
>    $dbentries_by_source->{$xref} = $dbes;
>    foreach my $dbe (@$dbes) {
>      print STDOUT $dbe->ensembl_id() . ", ";
>      if (!defined ($dbentries_by_ensembl_id{$dbe->ensembl_id()})) {
>        @{$dbentries_by_ensembl_id{$dbe->ensembl_id()}} = ();
>        print "c";  # feedback for making reference hash
>      }
>      push(@{$dbentries_by_ensembl_id{$dbe->ensembl_id()}},$dbe);
>      print "r"; # feedback for adding a reference element
>    }
>  }
> 
> Below you'll find what the Dumper of the first object returned is...you'll notice that there's no "ensembl_id" key-value pair in the DBEntry object.  
> 
> Am I going about this the wrong way?  Any hints?  Thanks in advance...
> 
> Nicole
> 
> $VAR1 = bless( {
>                 'priority' => '250',
>                 'adaptor' => bless( {
>                                       '_is_multispecies' => '',
>                                       'db' => bless( {
>                                                        'seq_region_cache' => bless( {
>                                                                                       'id_cache' => {
>                                                                                                       '27527' => [
>                                                                                                                    '27527',
>                                                                                                                    '5',
>                                                                                                                    '2',
>                                                                                                                    '180915260'
>                                                                                                                  ],
>                                                                                                       '27526' => [
>                                                                                                                    '27526',
>                                                                                                                    '19',
>                                                                                                                    '2',
>                                                                                                                    '59128983'
>                                                                                                                  ],
>                                                                                                       '27525' => [
>                                                                                                                    '27525',
>                                                                                                                    '10',
>                                                                                                                    '2',
>                                                                                                                    '135534747'
>                                                                                                                  ],
>                                                                                                       '27524' => [
>                                                                                                                    '27524',
>                                                                                                                    '4',
>                                                                                                                    '2',
>                                                                                                                    '191154276'
>                                                                                                                  ],
>                                                                                                       '27523' => [
>                                                                                                                    '27523',
>                                                                                                                    '8',
>                                                                                                                    '2',
>                                                                                                                    '146364022'
>                                                                                                                  ],
>                                                                                                       '27522' => [
>                                                                                                                    '27522',
>                                                                                                                    '20',
>                                                                                                                    '2',
>                                                                                                                    '63025520'
>                                                                                                                  ],
>                                                                                                       '27521' => [
>                                                                                                                    '27521',
>                                                                                                                    '15',
>                                                                                                                    '2',
>                                                                                                                    '102531392'
>                                                                                                                  ],
>                                                                                                       '27520' => [
>                                                                                                                    '27520',
>                                                                                                                    '14',
>                                                                                                                    '2',
>                                                                                                                    '107349540'
>                                                                                                                  ],
>                                                                                                       '27519' => [
>                                                                                                                    '27519',
>                                                                                                                    '12',
>                                                                                                                    '2',
>                                                                                                                    '133851895'
>                                                                                                                  ],
>                                                                                                       '27518' => [
>                                                                                                                    '27518',
>                                                                                                                    '9',
>                                                                                                                    '2',
>                                                                                                                    '141213431'
>                                                                                                                  ],
>                                                                                                       '27517' => [
>                                                                                                                    '27517',
>                                                                                                                    '3',
>                                                                                                                    '2',
>                                                                                                                    '198022430'
>                                                                                                                  ],
>                                                                                                       '27515' => [
>                                                                                                                    '27515',
>                                                                                                                    '6',
>                                                                                                                    '2',
>                                                                                                                    '171115067'
>                                                                                                                  ],
>                                                                                                       '27514' => [
>                                                                                                                    '27514',
>                                                                                                                    '16',
>                                                                                                                    '2',
>                                                                                                                    '90354753'
>                                                                                                                  ],
>                                                                                                       '27513' => [
>                                                                                                                    '27513',
>                                                                                                                    '13',
>                                                                                                                    '2',
>                                                                                                                    '115169878'
>                                                                                                                  ],
>                                                                                                       '27512' => [
>                                                                                                                    '27512',
>                                                                                                                    '18',
>                                                                                                                    '2',
>                                                                                                                    '78077248'
>                                                                                                                  ],
>                                                                                                       '27511' => [
>                                                                                                                    '27511',
>                                                                                                                    '1',
>                                                                                                                    '2',
>                                                                                                                    '249250621'
>                                                                                                                  ],
>                                                                                                       '27510' => [
>                                                                                                                    '27510',
>                                                                                                                    '22',
>                                                                                                                    '2',
>                                                                                                                    '51304566'
>                                                                                                                  ],
>                                                                                                       '27509' => [
>                                                                                                                    '27509',
>                                                                                                                    '17',
>                                                                                                                    '2',
>                                                                                                                    '81195210'
>                                                                                                                  ],
>                                                                                                       '27508' => [
>                                                                                                                    '27508',
>                                                                                                                    '2',
>                                                                                                                    '2',
>                                                                                                                    '243199373'
>                                                                                                                  ],
>                                                                                                       '27507' => [
>                                                                                                                    '27507',
>                                                                                                                    'Y',
>                                                                                                                    '2',
>                                                                                                                    '59373566'
>                                                                                                                  ],
>                                                                                                       '27516' => [
>                                                                                                                    '27516',
>                                                                                                                    'X',
>                                                                                                                    '2',
>                                                                                                                    '155270560'
>                                                                                                                  ],
>                                                                                                       '27506' => [
>                                                                                                                    '27506',
>                                                                                                                    '7',
>                                                                                                                    '2',
>                                                                                                                    '159138663'
>                                                                                                                  ],
>                                                                                                       '27505' => [
>                                                                                                                    '27505',
>                                                                                                                    '21',
>                                                                                                                    '2',
>                                                                                                                    '48129895'
>                                                                                                                  ],
>                                                                                                       '27504' => [
>                                                                                                                    '27504',
>                                                                                                                    '11',
>                                                                                                                    '2',
>                                                                                                                    '135006516'
>                                                                                                                  ],
>                                                                                                       '100965601' => [
>                                                                                                                        '100965601',
>                                                                                                                        'MT',
>                                                                                                                        '2',
>                                                                                                                        '16569'
>                                                                                                                      ]
>                                                                                                     },
>                                                                                       'name_cache' => {
>                                                                                                         '5:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27527'},
>                                                                                                         '19:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27526'},
>                                                                                                         '10:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27525'},
>                                                                                                         '4:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27524'},
>                                                                                                         '8:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27523'},
>                                                                                                         '20:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27522'},
>                                                                                                         '15:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27521'},
>                                                                                                         '14:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27520'},
>                                                                                                         '12:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27519'},
>                                                                                                         '9:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27518'},
>                                                                                                         '3:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27517'},
>                                                                                                         'X:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27516'},
>                                                                                                         '6:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27515'},
>                                                                                                         '16:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27514'},
>                                                                                                         '13:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27513'},
>                                                                                                         '18:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27512'},
>                                                                                                         '1:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27511'},
>                                                                                                         '22:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27510'},
>                                                                                                         '17:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27509'},
>                                                                                                         '2:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27508'},
>                                                                                                         '7:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27506'},
>                                                                                                         '21:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27505'},
>                                                                                                         '11:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27504'},
>                                                                                                         'MT:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'100965601'},
>                                                                                                         'Y:2' => $VAR1->{'adaptor'}{'db'}{'seq_region_cache'}{'id_cache'}{'27507'}
>                                                                                                       }
>                                                                                     }, 'Bio::EnsEMBL::Utils::SeqRegionCache' ),
>                                                        '_is_multispecies' => '',
>                                                        '_dbc' => bless( {
>                                                                           '_username' => 'anonymous',
>                                                                           'connected86253' => 1,
>                                                                           '_timeout' => 0,
>                                                                           '_host' => 'ensembldb.ensembl.org',
>                                                                           '_port' => '5306',
>                                                                           '_query_count' => 10,
>                                                                           '_driver' => 'mysql',
>                                                                           '_dbname' => 'homo_sapiens_core_70_37',
>                                                                           'db_handle86253' => bless( {}, 'DBI::db' )
>                                                                         }, 'Bio::EnsEMBL::DBSQL::DBConnection' ),
>                                                        '_species' => 'homo_sapiens',
>                                                        '_group' => 'core',
>                                                        '_species_id' => 1
>                                                      }, 'Bio::EnsEMBL::DBSQL::DBAdaptor' ),
>                                       'dbc' => $VAR1->{'adaptor'}{'db'}{'_dbc'},
>                                       'species_id' => 1
>                                     }, 'Bio::EnsEMBL::DBSQL::DBEntryAdaptor' ),
>                 'display_id' => 'A1BG',
>                 'primary_id' => '1',
>                 'version' => '0',
>                 'description' => 'alpha-1-B glycoprotein',
>                 'dbname' => 'EntrezGene',
>                 'dbID' => '936659',
>                 'synonyms' => [
>                                 'A1B'
>                               ],
>                 'info_text' => '',
>                 'info_type' => 'DEPENDENT',
>                 'type' => 'MISC',
>                 'db_display_name' => 'EntrezGene'
>               }, 'Bio::EnsEMBL::DBEntry' );
> 
> 
> 




