[ensembl-dev] Problem with genome_member_copy (Compara).
muffato
muffato at ebi.ac.uk
Mon Jan 15 16:52:58 GMT 2018
Hi there,
I think I know what's going on. See how there are two spaces between
"The" and "species-set" in "The  species-set could not be found in the
master database"? The content of the set is supposed to appear there,
but it is an empty string, probably because the species-set the
pipeline needs is itself empty. In other words, the pipeline may
require an empty species-set to be present in the master database too.
Give it a try by creating a new row in species_set_header with
(17, 'empty', 0, NULL, NULL) and rerunning the job.
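
A minimal sketch of that insert, run against an in-memory SQLite stand-in rather than the real MySQL master database. The column names are taken from the species_set_header output quoted further down in this thread; the helper name add_empty_species_set and the cut-down two-row table are assumptions made purely for illustration. On the real master database, the same INSERT statement would be run once through the mysql client.

```python
import sqlite3

# Column names follow the species_set_header output quoted in this thread.
# SQLite is only a stand-in so that the sketch can run anywhere.
INSERT_EMPTY_SET = (
    "INSERT INTO species_set_header "
    "(species_set_id, name, size, first_release, last_release) "
    "VALUES (?, 'empty', 0, NULL, NULL)"
)


def add_empty_species_set(conn, species_set_id):
    """Add a size-0 header row unless one already exists (hypothetical helper)."""
    existing = conn.execute(
        "SELECT species_set_id FROM species_set_header WHERE size = 0"
    ).fetchone()
    if existing is not None:
        return existing[0]
    conn.execute(INSERT_EMPTY_SET, (species_set_id,))
    conn.commit()
    return species_set_id


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE species_set_header ("
        "species_set_id INTEGER PRIMARY KEY, name TEXT, size INTEGER, "
        "first_release INTEGER, last_release INTEGER)"
    )
    # Two of the sixteen rows from the thread, enough to show the idea.
    conn.executemany(
        "INSERT INTO species_set_header VALUES (?, ?, ?, NULL, NULL)",
        [(1, "H.sap-G.gal", 2), (16, "H.sap-G.gal-B.flo-P.mar-C.mil", 5)],
    )
    new_id = add_empty_species_set(conn, 17)
    rows = conn.execute(
        "SELECT species_set_id, name, size FROM species_set_header "
        "ORDER BY species_set_id"
    ).fetchall()
    return new_id, rows
```

Checking for an existing size-0 row first keeps the helper safe to call twice; no row is added to species_set, since an empty set has no members.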
Matthieu
On 2018-01-15 16:28, Wasiu Akanni wrote:
> I would suggest setting the first release to the ensembl code version
> that you are using.
>
> On 15/01/2018 16:22, Francesco Lamanna wrote:
>
>> Hi Wasiu,
>>
>> this is the output of the species_set_header table:
>>
>> mysql> SELECT * FROM species_set_header;
>> +----------------+-------------------------------+------+---------------+--------------+
>> | species_set_id | name                          | size | first_release | last_release |
>> +----------------+-------------------------------+------+---------------+--------------+
>> |              1 | H.sap-G.gal                   |    2 |          NULL |         NULL |
>> |              2 | H.sap-B.flo                   |    2 |          NULL |         NULL |
>> |              3 | H.sap-P.mar                   |    2 |          NULL |         NULL |
>> |              4 | H.sap-C.mil                   |    2 |          NULL |         NULL |
>> |              5 | G.gal-B.flo                   |    2 |          NULL |         NULL |
>> |              6 | G.gal-P.mar                   |    2 |          NULL |         NULL |
>> |              7 | G.gal-C.mil                   |    2 |          NULL |         NULL |
>> |              8 | B.flo-P.mar                   |    2 |          NULL |         NULL |
>> |              9 | B.flo-C.mil                   |    2 |          NULL |         NULL |
>> |             10 | P.mar-C.mil                   |    2 |          NULL |         NULL |
>> |             11 | H.sap                         |    1 |          NULL |         NULL |
>> |             12 | G.gal                         |    1 |          NULL |         NULL |
>> |             13 | B.flo                         |    1 |          NULL |         NULL |
>> |             14 | P.mar                         |    1 |          NULL |         NULL |
>> |             15 | C.mil                         |    1 |          NULL |         NULL |
>> |             16 | H.sap-G.gal-B.flo-P.mar-C.mil |    5 |          NULL |         NULL |
>> +----------------+-------------------------------+------+---------------+--------------+
>> 16 rows in set (0.05 sec)
>>
>> It looks ok
>>
>> 2018-01-15 17:19 GMT+01:00 Wasiu Akanni <waakanni at ebi.ac.uk>:
>>
>> Hi Francesco,
>>
>> Have you checked the species_set_header table?
>>
>> On 15/01/2018 14:39, Francesco Lamanna wrote:
>>
>> Hi Matthieu,
>>
>> thank you for your suggestion, it fixed the problem.
>>
>> However, I still get the following error:
>>
>> mysql> SELECT * FROM msg;
>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------+----------+
>> | analysis_id | logic_name                    | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                        | is_error |
>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------+----------+
>> |           9 | copy_ncbi_table               |              1 |      5 |       3 |         3 | 2018-01-15 14:49:29 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows          |        0 |
>> |           9 | copy_ncbi_table               |              2 |      6 |       4 |         4 | 2018-01-15 14:50:36 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows          |        0 |
>> |          10 | populate_method_links_from_db |              3 |      7 |       6 |         6 | 2018-01-15 14:52:06 |     0 | WRITE_OUTPUT | Successfully copied 19 'method_link' rows                  |        0 |
>> |          16 | create_mlss_ss                |              4 |     10 |       9 |         9 | 2018-01-15 14:55:10 |     0 | WRITE_OUTPUT | The  species-set could not be found in the master database |        1 |
>> |          16 | create_mlss_ss                |              5 |     10 |      10 |        10 | 2018-01-15 14:56:11 |     1 | WRITE_OUTPUT | The  species-set could not be found in the master database |        1 |
>> |          16 | create_mlss_ss                |              6 |     10 |      11 |        11 | 2018-01-15 14:57:12 |     2 | WRITE_OUTPUT | The  species-set could not be found in the master database |        1 |
>> |          16 | create_mlss_ss                |              7 |     10 |      12 |        12 | 2018-01-15 14:58:21 |     3 | WRITE_OUTPUT | The  species-set could not be found in the master database |        1 |
>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------+----------+
>> 7 rows in set (0.01 sec)
>>
>> But my species_set in the master_db is not empty:
>>
>> mysql> SELECT * FROM species_set;
>> +----------------+--------------+
>> | species_set_id | genome_db_id |
>> +----------------+--------------+
>> | 1 | 1 |
>> | 1 | 2 |
>> | 2 | 1 |
>> | 2 | 3 |
>> | 3 | 1 |
>> | 3 | 4 |
>> | 4 | 1 |
>> | 4 | 5 |
>> | 5 | 2 |
>> | 5 | 3 |
>> | 6 | 2 |
>> | 6 | 4 |
>> | 7 | 2 |
>> | 7 | 5 |
>> | 8 | 3 |
>> | 8 | 4 |
>> | 9 | 3 |
>> | 9 | 5 |
>> | 10 | 4 |
>> | 10 | 5 |
>> | 11 | 1 |
>> | 12 | 2 |
>> | 13 | 3 |
>> | 14 | 4 |
>> | 15 | 5 |
>> | 16 | 1 |
>> | 16 | 2 |
>> | 16 | 3 |
>> | 16 | 4 |
>> | 16 | 5 |
>> +----------------+--------------+
>> 30 rows in set (0.00 sec)
>>
>> I am quite puzzled by this error.
>>
>> Cheers,
>> Francesco.
>>
>> 2018-01-12 17:48 GMT+01:00 Matthieu Muffato <muffato at ebi.ac.uk>:
>> Hi Francesco
>>
>> Homoeologues are only used when running on plants (which have
>> polyploid genomes), but the pipeline configuration is shared and
>> expects this method_link to be present.
>>
>> This is how it looks in the Ensembl Plants database. You can insert
>> this row in your database and it should work.
>>
>> ensro at mysql-eg-publicsql.ebi.ac.uk:4157/ensembl_compara_plants_38_91
>> [6] [Fri Jan 12 16:46:32 2018] > SELECT * FROM method_link WHERE
>> method_link_id = 206;
>> +----------------+----------------------+-------------------+
>> | method_link_id | type | class |
>> +----------------+----------------------+-------------------+
>> | 206 | ENSEMBL_HOMOEOLOGUES | Homology.homology |
>> +----------------+----------------------+-------------------+
>>
>> Regards,
>> Matthieu
>>
>> On 12/01/18 14:01, Francesco Lamanna wrote:
>> Hi Mateus,
>>
>> I was able to solve this problem by commenting out the following
>> line in LoadMembers_conf.pm:
>>
>> die "The master dabase must be defined with a collection" if $self->o('master_db') and not $self->o('collection');
>>
>> The member_db is now correctly set up.
>>
>> However, when I run the Protein trees pipeline I get the following
>> error message:
>>
>> mysql> SELECT * FROM msg;
>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>> | analysis_id | logic_name                    | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                                                                                                                                             | is_error |
>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>> |           9 | copy_ncbi_table               |              1 |      5 |       3 |         3 | 2018-01-12 14:21:39 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows                                                                                                                               |        0 |
>> |           9 | copy_ncbi_table               |              2 |      6 |       4 |         4 | 2018-01-12 14:25:27 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows                                                                                                                               |        0 |
>> |          10 | populate_method_links_from_db |              3 |      7 |       6 |         6 | 2018-01-12 14:26:54 |     0 | WRITE_OUTPUT | Successfully copied 18 'method_link' rows                                                                                                                                       |        0 |
>> |          16 | create_mlss_ss                |              4 |     10 |       9 |         9 | 2018-01-12 14:29:56 |     0 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>> |          16 | create_mlss_ss                |              5 |     10 |      10 |        10 | 2018-01-12 14:30:57 |     1 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>> |          16 | create_mlss_ss                |              6 |     10 |      11 |        11 | 2018-01-12 14:31:58 |     2 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>> |          16 | create_mlss_ss                |              7 |     10 |      12 |        12 | 2018-01-12 14:33:00 |     3 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>> |          16 | create_mlss_ss                |              8 |     10 |      13 |        13 | 2018-01-12 14:38:36 |     1 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>> 8 rows in set (0.01 sec)
>>
>> I wasn't aware of this method_link.
>>
>> Do you know how I can fix this?
>>
>> Thanks,
>> Francesco.
>>
>> 2018-01-11 13:28 GMT+01:00 Francesco Lamanna <francesco.lamanna at gmail.com>:
>>
>> Hi Mateus,
>>
>> if I try to initialize the LoadMembers pipeline without
>> “--collection ensembl”, I get the following error:
>>
>> The following options are missing:
>> {'collection'}
>>
>> I also tried to set 'collection' => undef in the conf file, but I
>> get another error:
>>
>> The master dabase must be defined with a collection at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/PipeConfig/LoadMembers_conf.pm line 190.
>>
>> Cheers,
>> Francesco
>>
>> 2018-01-11 12:01 GMT+01:00 Mateus Patricio <mateus at ebi.ac.uk>:
>>
>> Hi Francesco,
>>
>> In your case the solution should be to start the LoadMembers
>> pipeline without the option “--collection ensembl”.
>>
>> If you start the pipeline without it, it should use all the
>> current species in your master database.
>>
>> In Ensembl we have different collections that are used for
>> different purposes, and the default one is “ensembl”.
>>
>> Please let me know if this works.
>>
>> Cheers,
>>
>> Mateus.
>>
>> On 11 Jan 2018, at 10:13, Francesco Lamanna <francesco.lamanna at gmail.com> wrote:
>>
>> Hi Mateus,
>>
>> many thanks for your answer.
>>
>> I am trying to launch the LoadMembers pipeline in order to
>> make a member_db, but I get the following error:
>>
>> mysql> SELECT * from msg;
>> +-------------+------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>> | analysis_id | logic_name             | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                                                                                                                                    | is_error |
>> +-------------+------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>> |           3 | copy_table_from_master |              1 |      4 |       3 |         3 | 2018-01-11 11:04:02 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows                                                                                                                      |        0 |
>> |           3 | copy_table_from_master |              2 |      5 |       4 |         4 | 2018-01-11 11:05:46 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows                                                                                                                      |        0 |
>> |           4 | load_genomedb_factory  |              3 |      3 |       5 |         5 | 2018-01-11 11:07:06 |     0 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>> |           4 | load_genomedb_factory  |              4 |      3 |       6 |         6 | 2018-01-11 11:08:00 |     1 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>> |           4 | load_genomedb_factory  |              5 |      3 |       7 |         7 | 2018-01-11 11:08:25 |     2 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>> |           4 | load_genomedb_factory  |              6 |      3 |       8 |         8 | 2018-01-11 11:09:25 |     3 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>> +-------------+------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>> 6
>>
>> Cheers,
>> Francesco.
>>
>> 2018-01-10 17:31 GMT+01:00 Mateus Patricio <mateus at ebi.ac.uk>:
>>
>> Hi Francesco
>>
>> The protein tree pipeline reuses the genes and sequence members
>> from the 'reuse_db' parameter, which in this case should point to
>> a members database.
>>
>> This members database can be created by running the LoadMembers
>> pipeline.
>>
>> You can initiate the pipeline with the following command line:
>>
>> init_pipeline.pl Bio::EnsEMBL::Compara::PipeConfig::EBI::Ensembl::LoadMembers_conf --collection ensembl
>>
>> Then you should point the reuse_db parameter to this database in
>> your Protein Tree config file:
>>
>> 'reuse_db' => 'mysql://ensro@host:port/database',
>>
>> Please do let me know if you have further questions.
>>
>> Cheers,
>>
>> Mateus.
>>
>> On 10 Jan 2018, at 16:06, Francesco Lamanna <francesco.lamanna at gmail.com> wrote:
>>
>> Hi all,
>>
>> when I try to run the protein tree pipeline (v91) using
>> the core Human and Chicken genomes I get the following
>> error message:
>>
>> mysql> SELECT * from msg;
>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------+----------+
>> | analysis_id | logic_name                    | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                                                | is_error |
>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------+----------+
>> |           9 | copy_ncbi_table               |              1 |      5 |       3 |         4 | 2018-01-10 16:40:13 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows                                  |        0 |
>> |           9 | copy_ncbi_table               |              2 |      6 |       4 |         3 | 2018-01-10 16:42:04 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows                                  |        0 |
>> |          10 | populate_method_links_from_db |              3 |      7 |       6 |         6 | 2018-01-10 16:43:44 |     0 | WRITE_OUTPUT | Successfully copied 18 'method_link' rows                                          |        0 |
>> |          25 | genome_member_copy            |              4 |     13 |      11 |        11 | 2018-01-10 16:47:49 |     0 | FETCH_INPUT  | ParamError: value for param_required('reuse_db') is required and has to be defined |        1 |
>> |          25 | genome_member_copy            |              5 |     14 |      12 |        12 | 2018-01-10 16:47:49 |     0 | FETCH_INPUT  | ParamError: value for param_required('reuse_db') is required and has to be defined |        1 |
>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------+----------+
>>
>> I have no clue about what to put in the 'reuse_db'
>> parameter (nor could I find any information in the
>> compara docs).
>>
>> Can anyone please help me to solve this issue?
>>
>> Thanks,
>> Francesco.
>> _______________________________________________
>> Dev mailing list Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev [8]
>> Ensembl Blog: http://www.ensembl.info/
>>
>
> --
> Matthieu Muffato, Ph.D.
> Ensembl Compara and TreeFam Project Leader
> European Bioinformatics Institute (EMBL-EBI)
> European Molecular Biology Laboratory
> Wellcome Trust Genome Campus, Hinxton
> Cambridge, CB10 1SD, United Kingdom
> Room A3-145
> Phone + 44 (0) 1223 49 4631 [9]
> Fax + 44 (0) 1223 49 4468 [10]
>
> --
> WASIU AJENIFUJA AKANNI
> Developer
>
> Compara group
> EMBL-EBI
> Phone: + 44 (0) 1223 494 237 [11]
> Room A3145, West building | Wellcome Trust Genome Campus | Hinxton |
> Cambridge | CB10 1SD | UK
> Links:
> ------
> [1] http://H.sap-C.mil
> [2] http://G.gal-C.mil
> [3] http://B.flo-C.mil
> [4] http://P.mar-C.mil
> [5] http://H.sap-G.gal-B.flo-P.mar-C.mil
> [6]
> http://ensro@mysql-eg-publicsql.ebi.ac.uk:4157/ensembl_compara_plants_38_91
> [7] http://init_pipeline.pl
> [8] http://lists.ensembl.org/mailman/listinfo/dev
> [9] tel:%2B%2044%20%280%29%201223%2049%204631
> [10] tel:%2B%2044%20%280%29%201223%2049%204468
> [11] tel:+44%201223%20494237