[ensembl-dev] Problem with locator in LoadMembers.pm (ensembl-compara)

Matthieu Muffato muffato at ebi.ac.uk
Tue Jan 30 12:56:02 GMT 2018


Hi Francesco,

As you have realised, we very rarely use non-Ensembl species in the
pipeline, and you are running into everything that has broken as a
result of other developments.
Basically, the answer is often to comment out the line that is causing
the error. In this case, I would comment out the two lines that define
and use $was_connected and give it a try.

Matthieu

On 25/01/18 16:49, Francesco Lamanna wrote:
> Hi Mateus,
> 
> I am using release/91 of GenomeDB.pm, which does not have the
> Bio::EnsEMBL::Compara::Utils::CoreDBAdaptor->pool_one_DBConnection($dba);
> call at line 184.
> 
> Anyway, if I comment out lines 181 and 182, I get the following error:
> 
> mysql> SELECT * FROM msg;
> | analysis_id | logic_name             | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg | is_error |
> |           3 | copy_table_from_master |              1 |      4 |       3 |         3 | 2018-01-25 17:19:47 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows | 0 |
> |           3 | copy_table_from_master |              2 |      5 |       4 |         4 | 2018-01-25 17:23:03 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows | 0 |
> |           5 | load_genomedb          |              3 |      9 |       7 |         7 | 2018-01-25 17:24:19 |     0 | RUN          | Can't call method "connected" on an undefined value at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/GenomeDB.pm line 186. | 1 |
> |           5 | load_genomedb          |              4 |     10 |       7 |         7 | 2018-01-25 17:24:21 |     0 | RUN          | Can't call method "connected" on an undefined value at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/GenomeDB.pm line 186. | 1 |
> |           5 | load_genomedb          |              5 |      9 |       6 |         6 | 2018-01-25 17:24:26 |     1 | RUN          | Can't call method "connected" on an undefined value at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/GenomeDB.pm line 186. | 1 |
> |           5 | load_genomedb          |              6 |     10 |       6 |         6 | 2018-01-25 17:24:27 |     1 | RUN          | Can't call method "connected" on an undefined value at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/GenomeDB.pm line 186. | 1 |
> |           5 | load_genomedb          |              7 |      9 |       8 |         8 | 2018-01-25 17:25:19 |     2 | RUN          | Can't call method "connected" on an undefined value at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/GenomeDB.pm line 186. | 1 |
> |           5 | load_genomedb          |              8 |     10 |       8 |         8 | 2018-01-25 17:25:20 |     2 | RUN          | Can't call method "connected" on an undefined value at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/GenomeDB.pm line 186. | 1 |
> 
> It looks like GenomeDB is now missing something.
> 
> Additionally, if I run the analysis in debug mode, I get these two warnings:
> 
> ParamWarning: value for param('genome_component') is used before having been initialized!
> ParamWarning: value for param('master_genome_db') is used before having been initialized!
> Worker 10 [ Role 10 , load_genomedb(5), Job 10 ] Fatal : Can't call method "connected" on an undefined value at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/GenomeDB.pm line 186.
> 
> Thanks,
> Francesco.
> 
> 
> 2018-01-25 15:39 GMT+01:00 Mateus Patricio <mateus at ebi.ac.uk>:
> 
>     Hi Francesco,
> 
>     After discussing this issue with my colleagues, could you please
>     test something?
> 
>     In the file:
>     modules/Bio/EnsEMBL/Compara/GenomeDB.pm
> 
>     Could you please comment out lines 181, 182 and 184?
> 
>     Just like this:
> 
>     #assert_ref($dba, 'Bio::EnsEMBL::DBSQL::DBAdaptor', 'db_adaptor');
>     #throw('$db_adaptor must refer to a Core database') unless $dba->group eq 'core';
>     $self->{'_db_adaptor'} = $dba;
>     #Bio::EnsEMBL::Compara::Utils::CoreDBAdaptor->pool_one_DBConnection($dba);
> 
>     I hope that helps.
> 
>     Cheers,
> 
>     Mateus.
> 
> 
> 
>>     On 23 Jan 2018, at 15:39, Francesco Lamanna
>>     <francesco.lamanna at gmail.com> wrote:
>>
>>     Hi Mateus,
>>
>>     I have changed the order of some entries in the JSON file; it
>>     now looks like this:
>>
>>     [
>>     {
>>         "production_name" : "branchiostoma_floridae",
>>         "taxonomy_id"        : "7739",
>>         "cds_fasta"             :
>>     "/home/hd/hd_hd/hd_cc141/Genomes/amphioxus/NCBI/amphioxus_longest_CDS.fa",
>>         "prot_fasta"            :
>>     "/home/hd/hd_hd/hd_cc141/Genomes/amphioxus/NCBI/amphioxus_longest_proteins.fa",
>>         "gene_coord_gff"   :
>>     "/home/hd/hd_hd/hd_cc141/Genomes/amphioxus/NCBI/GCF_000003815.1_Version_2_genomic.gff",
>>         "source"                 : "refseq",
>>     },
>>     {
>>         "production_name" : "petromyzon_marinus",
>>         "taxonomy_id"        : "7757",
>>         "cds_fasta"             :
>>     "/home/hd/hd_hd/hd_cc141/Lamprey_annotation/germ_final/PMZ_v3.1_final/sea_lamprey_CDS_no_monoexon.fa",
>>         "prot_fasta"            :
>>     "/home/hd/hd_hd/hd_cc141/Lamprey_annotation/germ_final/PMZ_v3.1_final/sea_lamprey_proteins_no_monoexon.fa",
>>         "gene_coord_gff"   :
>>     "/home/hd/hd_hd/hd_cc141/Lamprey_annotation/germ_final/PMZ_v3.1_final/PMZ_v3.1_genes.gtf",
>>         "source"                 : "augustus_maker",
>>     },
>>     ]
>>
>>     but I get a different error message:
>>
>>     Worker 10 [ Role 10 , load_genomedb(5), Job 9 ] -> FETCH_INPUT
>>     Worker 10 [ Role 10 , load_genomedb(5), Job 9 ] -> RUN
>>     ParamWarning: value for param('genome_component') is used before having been initialized!
>>     ParamWarning: value for param('master_genome_db') is used before having been initialized!
>>     Worker 10 [ Role 10 , load_genomedb(5), Job 9 ] Fatal :
>>     -------------------- EXCEPTION --------------------
>>     MSG: db_adaptor's type 'Bio::EnsEMBL::Compara::GenomeMF' is not an ISA of 'Bio::EnsEMBL::DBSQL::DBAdaptor'
>>     STACK Bio::EnsEMBL::Utils::Scalar::assert_ref_pp /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl/modules/Bio/EnsEMBL/Utils/Scalar.pm:231
>>     STACK Bio::EnsEMBL::Compara::GenomeDB::db_adaptor /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/GenomeDB.pm:181
>>     STACK Bio::EnsEMBL::Compara::GenomeDB::new_from_DBAdaptor /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/GenomeDB.pm:146
>>     STACK Bio::EnsEMBL::Compara::RunnableDB::LoadOneGenomeDB::create_genome_db /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/LoadOneGenomeDB.pm:204
>>     STACK Bio::EnsEMBL::Compara::RunnableDB::LoadOneGenomeDB::run /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/LoadOneGenomeDB.pm:186
>>     STACK (eval) /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Process.pm:140
>>     STACK Bio::EnsEMBL::Hive::Process::life_cycle /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Process.pm:127
>>     STACK (eval) /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Worker.pm:688
>>     STACK Bio::EnsEMBL::Hive::Worker::run_one_batch /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Worker.pm:674
>>     STACK Bio::EnsEMBL::Hive::Worker::run /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Worker.pm:486
>>     STACK (eval) /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Scripts/RunWorker.pm:88
>>     STACK Bio::EnsEMBL::Hive::Scripts::RunWorker::runWorker /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Scripts/RunWorker.pm:94
>>     STACK main::main /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/scripts/runWorker.pl:140
>>     STACK toplevel /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/scripts/runWorker.pl:24
>>     Date (localtime)    = Tue Jan 23 16:29:39 2018
>>     Ensembl API version = 91
>>     ---------------------------------------------------
>>
>>     Do you have any idea?
>>
>>     Cheers,
>>     Francesco
>>
>>     2018-01-23 13:43 GMT+01:00 Mateus Patricio <mateus at ebi.ac.uk>:
>>
>>         Hi Francesco,
>>
>>         Could you please check whether your non_ensembl_species.json
>>         file is formatted like the one below:
>>
>>         [
>>         {
>>             "production_name" : "species_1",
>>             "taxonomy_id"     : "1234",
>>             "cds_fasta"       : "species_1.cds.fa",
>>             "prot_fasta"      : "species_1.prot.fa",
>>             "gene_coord_gff"  : "species_1.gff",
>>             "source"          : "your_source",
>>         },
>>         {
>>             "production_name" : "species_2",
>>             "taxonomy_id"     : "4321",
>>             "cds_fasta"       : "species_2.cds.fa",
>>             "prot_fasta"      : "species_2.prot.fa",
>>             "gene_coord_gff"  : "species_2.gff",
>>             "source"          : "your_source",
>>         },
>>         ]
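
As a quick sanity check on a file shaped like the template above, the sketch below parses it and reports entries that lack any of the six keys shown. It is only an illustration and not part of ensembl-compara: the names `load_relaxed` and `missing_keys` are made up here, and treating all six keys as mandatory is an assumption. The template ends each entry with a trailing comma, which Python's strict `json` module rejects, so the sketch strips those commas before parsing.

```python
import json
import re

# Keys that appear in the template; assuming here that every entry needs all of them.
TEMPLATE_KEYS = {
    "production_name", "taxonomy_id", "cds_fasta",
    "prot_fasta", "gene_coord_gff", "source",
}


def load_relaxed(text):
    """Parse JSON text after dropping commas that directly precede '}' or ']'.

    This is a crude regex: it would also alter a string value that happens
    to contain ',}' or ',]', which is acceptable for a quick check only.
    """
    cleaned = re.sub(r",(\s*[}\]])", r"\1", text)
    return json.loads(cleaned)


def missing_keys(entries):
    """Return {position: sorted list of absent keys}, counting entries from 1."""
    problems = {}
    for position, entry in enumerate(entries, start=1):
        absent = TEMPLATE_KEYS - set(entry)
        if absent:
            problems[position] = sorted(absent)
    return problems
```

Calling `missing_keys(load_relaxed(open("non_ensembl_species.json").read()))` returns an empty dictionary when every entry carries all six keys.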
>>
>>         Cheers,
>>
>>         Mateus.
>>
>>
>>>         On 23 Jan 2018, at 10:03, Francesco Lamanna
>>>         <francesco.lamanna at gmail.com> wrote:
>>>
>>>         Hi all,
>>>
>>>         I am trying to build a members db using a set of core and
>>>         custom genome assemblies. I have set up the locator column
>>>         for both the core and custom databases in the genome_db table
>>>         of the master database. However, when I run the LoadMembers
>>>         pipeline, the load_genomedb analysis fails to load the custom
>>>         genomes:
>>>
>>>         mysql> SELECT * FROM msg;
>>>         | analysis_id | logic_name             | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg | is_error |
>>>         |           3 | copy_table_from_master |              1 |      4 |       3 |         3 | 2018-01-22 18:00:51 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows | 0 |
>>>         |           3 | copy_table_from_master |              2 |      5 |       4 |         4 | 2018-01-22 18:03:12 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows | 0 |
>>>         |           5 | load_genomedb          |              3 |     10 |       7 |         7 | 2018-01-22 18:05:17 |     0 | FETCH_INPUT  | Sorry, could not figure out how to make a DBConnection object out of 'Bio::EnsEMBL::Compara::GenomeMF/filename=/home/hd/hd_hd/hd_cc141/Vertebrates_project/Ensembl_compara/non_ensembl_species.json;index=2' at /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Utils.pm line 366. | 1 |
>>>         |           5 | load_genomedb          |              4 |     10 |       7 |         7 | 2018-01-22 18:05:18 |     1 | FETCH_INPUT  | Sorry, could not figure out how to make a DBConnection object out of 'Bio::EnsEMBL::Compara::GenomeMF/filename=/home/hd/hd_hd/hd_cc141/Vertebrates_project/Ensembl_compara/non_ensembl_species.json;index=2' at /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Utils.pm line 366. | 1 |
>>>         |           5 | load_genomedb          |              5 |      9 |       6 |         6 | 2018-01-22 18:05:19 |     0 | FETCH_INPUT  | Sorry, could not figure out how to make a DBConnection object out of 'Bio::EnsEMBL::Compara::GenomeMF/filename=/home/hd/hd_hd/hd_cc141/Vertebrates_project/Ensembl_compara/non_ensembl_species.json;index=1' at /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Utils.pm line 366. | 1 |
>>>         |           5 | load_genomedb          |              6 |      9 |       6 |         6 | 2018-01-22 18:05:20 |     1 | FETCH_INPUT  | Sorry, could not figure out how to make a DBConnection object out of 'Bio::EnsEMBL::Compara::GenomeMF/filename=/home/hd/hd_hd/hd_cc141/Vertebrates_project/Ensembl_compara/non_ensembl_species.json;index=1' at /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Utils.pm line 366. | 1 |
>>>         |           5 | load_genomedb          |              7 |      9 |       8 |         8 | 2018-01-22 18:06:08 |     2 | FETCH_INPUT  | Sorry, could not figure out how to make a DBConnection object out of 'Bio::EnsEMBL::Compara::GenomeMF/filename=/home/hd/hd_hd/hd_cc141/Vertebrates_project/Ensembl_compara/non_ensembl_species.json;index=1' at /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Utils.pm line 366. | 1 |
>>>         |           5 | load_genomedb          |              8 |     10 |       8 |         8 | 2018-01-22 18:06:08 |     2 | FETCH_INPUT  | Sorry, could not figure out how to make a DBConnection object out of 'Bio::EnsEMBL::Compara::GenomeMF/filename=/home/hd/hd_hd/hd_cc141/Vertebrates_project/Ensembl_compara/non_ensembl_species.json;index=2' at /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Utils.pm line 366. | 1 |
>>>         |           5 | load_genomedb          |              9 |      9 |      10 |        10 | 2018-01-22 18:08:49 |     1 | FETCH_INPUT  | Sorry, could not figure out how to make a DBConnection object out of 'Bio::EnsEMBL::Compara::GenomeMF/filename=/home/hd/hd_hd/hd_cc141/Vertebrates_project/Ensembl_compara/non_ensembl_species.json;index=1' at /beegfs/home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-hive/modules/Bio/EnsEMBL/Hive/Utils.pm line 366. | 1 |
>>>         9 rows in set (0.00 sec)
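
The locator strings in the error messages above have the visible shape `Module/key=value;key=value`. As a rough illustration of that shape only, the sketch below splits such a string into its module name and parameters. The function name `parse_locator` is hypothetical, and this is not how ensembl-hive or ensembl-compara parse locators.

```python
def parse_locator(locator):
    """Split 'Module/key=value;key=value' into (module, params).

    Only the first '/' separates the module from the parameters, so
    values that are file paths keep their own slashes.
    """
    module, _, param_text = locator.partition("/")
    params = {}
    for pair in param_text.split(";"):
        if not pair:
            continue  # tolerate an empty segment such as a trailing ';'
        key, _, value = pair.partition("=")
        params[key] = value
    return module, params
```

Applied to one of the locators in the table, this would give the module `Bio::EnsEMBL::Compara::GenomeMF` with a `filename` and an `index` parameter, both as strings.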
>>>
>>>         Any suggestion would be highly appreciated.
>>>
>>>         Cheers,
>>>         Francesco
>>>         _______________________________________________
>>>         Dev mailing list    Dev at ensembl.org
>>>         Posting guidelines and subscribe/unsubscribe info:
>>>         http://lists.ensembl.org/mailman/listinfo/dev
>>>         Ensembl Blog: http://www.ensembl.info/

-- 
Matthieu Muffato, Ph.D.
Ensembl Compara and TreeFam Project Leader
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus, Hinxton
Cambridge, CB10 1SD, United Kingdom
Room  A3-145
Phone + 44 (0) 1223 49 4631
Fax   + 44 (0) 1223 49 4468


