[ensembl-dev] Problem with genome_member_copy (Compara).
Francesco Lamanna
francesco.lamanna at gmail.com
Mon Jan 15 14:39:38 GMT 2018
Hi Matthieu,
thank you for your suggestion; it fixed the problem.
However, I now get the following error:
mysql> SELECT * FROM msg;
+-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------+----------+
| analysis_id | logic_name                    | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                        | is_error |
+-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------+----------+
|           9 | copy_ncbi_table               |              1 |      5 |       3 |         3 | 2018-01-15 14:49:29 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows          |        0 |
|           9 | copy_ncbi_table               |              2 |      6 |       4 |         4 | 2018-01-15 14:50:36 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows          |        0 |
|          10 | populate_method_links_from_db |              3 |      7 |       6 |         6 | 2018-01-15 14:52:06 |     0 | WRITE_OUTPUT | Successfully copied 19 'method_link' rows                  |        0 |
|          16 | create_mlss_ss                |              4 |     10 |       9 |         9 | 2018-01-15 14:55:10 |     0 | WRITE_OUTPUT | The species-set could not be found in the master database  |        1 |
|          16 | create_mlss_ss                |              5 |     10 |      10 |        10 | 2018-01-15 14:56:11 |     1 | WRITE_OUTPUT | The species-set could not be found in the master database  |        1 |
|          16 | create_mlss_ss                |              6 |     10 |      11 |        11 | 2018-01-15 14:57:12 |     2 | WRITE_OUTPUT | The species-set could not be found in the master database  |        1 |
|          16 | create_mlss_ss                |              7 |     10 |      12 |        12 | 2018-01-15 14:58:21 |     3 | WRITE_OUTPUT | The species-set could not be found in the master database  |        1 |
+-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------+----------+
7 rows in set (0.01 sec)
But the species_set table in my master_db is not empty:
mysql> SELECT * FROM species_set;
+----------------+--------------+
| species_set_id | genome_db_id |
+----------------+--------------+
|              1 |            1 |
|              1 |            2 |
|              2 |            1 |
|              2 |            3 |
|              3 |            1 |
|              3 |            4 |
|              4 |            1 |
|              4 |            5 |
|              5 |            2 |
|              5 |            3 |
|              6 |            2 |
|              6 |            4 |
|              7 |            2 |
|              7 |            5 |
|              8 |            3 |
|              8 |            4 |
|              9 |            3 |
|              9 |            5 |
|             10 |            4 |
|             10 |            5 |
|             11 |            1 |
|             12 |            2 |
|             13 |            3 |
|             14 |            4 |
|             15 |            5 |
|             16 |            1 |
|             16 |            2 |
|             16 |            3 |
|             16 |            4 |
|             16 |            5 |
+----------------+--------------+
30 rows in set (0.00 sec)
I am quite puzzled by this error.
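
A non-empty table only shows that some species sets exist; the error presumably concerns one particular combination of genome_dbs. The sketch below is one way to check, from the rows listed above, which species_set_id (if any) contains exactly a given list of genome_db_ids. It is an illustration only: find_species_sets is a hypothetical helper, the in-memory SQLite table merely stands in for the master database, and the idea that create_mlss_ss needs an exact match is an assumption, not something confirmed in this thread.

```python
import sqlite3

# (species_set_id, genome_db_id) pairs copied from the SELECT output above.
ROWS = [
    (1, 1), (1, 2), (2, 1), (2, 3), (3, 1), (3, 4), (4, 1), (4, 5),
    (5, 2), (5, 3), (6, 2), (6, 4), (7, 2), (7, 5), (8, 3), (8, 4),
    (9, 3), (9, 5), (10, 4), (10, 5), (11, 1), (12, 2), (13, 3),
    (14, 4), (15, 5), (16, 1), (16, 2), (16, 3), (16, 4), (16, 5),
]


def find_species_sets(rows, genome_db_ids):
    """Return the species_set_ids whose members are exactly genome_db_ids.

    Hypothetical helper for illustration; it is not part of Ensembl Compara.
    """
    wanted = sorted(set(genome_db_ids))
    conn = sqlite3.connect(":memory:")  # throwaway stand-in for the master db
    try:
        conn.execute(
            "CREATE TABLE species_set "
            "(species_set_id INTEGER, genome_db_id INTEGER)"
        )
        conn.executemany("INSERT INTO species_set VALUES (?, ?)", rows)
        placeholders = ",".join("?" for _ in wanted)
        query = (
            "SELECT species_set_id FROM species_set "
            "GROUP BY species_set_id "
            # the set has as many distinct members as were requested ...
            "HAVING COUNT(DISTINCT genome_db_id) = ? "
            # ... and none of its members falls outside the requested ids
            f"AND SUM(genome_db_id NOT IN ({placeholders})) = 0 "
            "ORDER BY species_set_id"
        )
        cursor = conn.execute(query, [len(wanted), *wanted])
        return [row[0] for row in cursor]
    finally:
        conn.close()


if __name__ == "__main__":
    print(find_species_sets(ROWS, [1, 2]))           # -> [1]
    print(find_species_sets(ROWS, [1, 2, 3, 4, 5]))  # -> [16]
```

If the genome_db_ids the pipeline asks for differ from those stored in the master database, a lookup of this kind would come back empty even though the table is populated.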
Cheers,
Francesco.
2018-01-12 17:48 GMT+01:00 Matthieu Muffato <muffato at ebi.ac.uk>:
> Hi Francesco
>
> Homoeologues are only used when running on plants (which have
> polyploid genomes), but the pipeline configuration is shared and expects
> this method_link to be present.
>
> This is how it looks in the Ensembl Plants database. You can insert this row
> into your database and it should work:
>
> ensro at mysql-eg-publicsql.ebi.ac.uk:4157/ensembl_compara_plants_38_91 [Fri Jan 12 16:46:32 2018] > SELECT * FROM method_link WHERE method_link_id = 206;
> +----------------+----------------------+-------------------+
> | method_link_id | type                 | class             |
> +----------------+----------------------+-------------------+
> |            206 | ENSEMBL_HOMOEOLOGUES | Homology.homology |
> +----------------+----------------------+-------------------+
>
> Regards,
> Matthieu
>
> On 12/01/18 14:01, Francesco Lamanna wrote:
>
>> Hi Mateus,
>>
>> I could solve this problem by commenting out the following line in
>> LoadMembers_conf.pm:
>>
>> die "The master dabase must be defined with a collection" if $self->o('master_db') and not $self->o('collection');
>>
>> The member_db is now correctly set up.
>>
>> However, when I run the Protein trees pipeline I get the following error
>> message:
>>
>> mysql> SELECT * FROM msg;
>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>> | analysis_id | logic_name                    | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                                                                                                                                             | is_error |
>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>> |           9 | copy_ncbi_table               |              1 |      5 |       3 |         3 | 2018-01-12 14:21:39 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows                                                                                                                               |        0 |
>> |           9 | copy_ncbi_table               |              2 |      6 |       4 |         4 | 2018-01-12 14:25:27 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows                                                                                                                               |        0 |
>> |          10 | populate_method_links_from_db |              3 |      7 |       6 |         6 | 2018-01-12 14:26:54 |     0 | WRITE_OUTPUT | Successfully copied 18 'method_link' rows                                                                                                                                       |        0 |
>> |          16 | create_mlss_ss                |              4 |     10 |       9 |         9 | 2018-01-12 14:29:56 |     0 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>> |          16 | create_mlss_ss                |              5 |     10 |      10 |        10 | 2018-01-12 14:30:57 |     1 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>> |          16 | create_mlss_ss                |              6 |     10 |      11 |        11 | 2018-01-12 14:31:58 |     2 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>> |          16 | create_mlss_ss                |              7 |     10 |      12 |        12 | 2018-01-12 14:33:00 |     3 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>> |          16 | create_mlss_ss                |              8 |     10 |      13 |        13 | 2018-01-12 14:38:36 |     1 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>> 8 rows in set (0.01 sec)
>>
>> I wasn't aware of this method_link.
>>
>> Do you know how I can fix this?
>>
>> Thanks,
>> Francesco.
>>
>> 2018-01-11 13:28 GMT+01:00 Francesco Lamanna <francesco.lamanna at gmail.com>:
>>
>> Hi Mateus,
>>
>> if I try to initialize the LoadMembers pipeline without
>> “--collection ensembl”, I get the following error:
>>
>> The following options are missing:
>> {'collection'}
>>
>> I also tried to set 'collection' => undef, in the conf file, but I
>> get another error:
>>
>> The master dabase must be defined with a collection at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/PipeConfig/LoadMembers_conf.pm line 190.
>>
>> Cheers,
>> Francesco
>>
>> 2018-01-11 12:01 GMT+01:00 Mateus Patricio <mateus at ebi.ac.uk>:
>>
>> Hi Francesco,
>>
>> In your case the solution should be to start the LoadMembers
>> pipeline without the option “--collection ensembl”.
>>
>> If you start the pipeline without it, it should use all the
>> current species in your master database.
>>
>> In Ensembl we have different collections that are used for
>> different purposes, and the default one is “ensembl”.
>>
>> Please let me know if this works.
>>
>> Cheers,
>>
>> Mateus.
>>
>>
>>> On 11 Jan 2018, at 10:13, Francesco Lamanna <francesco.lamanna at gmail.com> wrote:
>>>
>>> Hi Mateus,
>>>
>>> many thanks for your answer.
>>>
>>> I am trying to launch the LoadMembers pipeline in order to
>>> make a member_db, but I get the following error:
>>>
>>> mysql> SELECT * from msg;
>>> +-------------+------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>> | analysis_id | logic_name             | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                                                                                                                                    | is_error |
>>> +-------------+------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>> |           3 | copy_table_from_master |              1 |      4 |       3 |         3 | 2018-01-11 11:04:02 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows                                                                                                                      |        0 |
>>> |           3 | copy_table_from_master |              2 |      5 |       4 |         4 | 2018-01-11 11:05:46 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows                                                                                                                      |        0 |
>>> |           4 | load_genomedb_factory  |              3 |      3 |       5 |         5 | 2018-01-11 11:07:06 |     0 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>>> |           4 | load_genomedb_factory  |              4 |      3 |       6 |         6 | 2018-01-11 11:08:00 |     1 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>>> |           4 | load_genomedb_factory  |              5 |      3 |       7 |         7 | 2018-01-11 11:08:25 |     2 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>>> |           4 | load_genomedb_factory  |              6 |      3 |       8 |         8 | 2018-01-11 11:09:25 |     3 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>>> +-------------+------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>> 6
>>>
>>> Cheers,
>>> Francesco.
>>>
>>> 2018-01-10 17:31 GMT+01:00 Mateus Patricio <mateus at ebi.ac.uk>:
>>>
>>> Hi Francesco
>>>
>>> The protein tree pipeline reuses the genes and sequence
>>> members from the 'reuse_db' parameter, which in this case
>>> should point to a members database.
>>>
>>> This members database can be created by running the
>>> LoadMembers pipeline.
>>>
>>> You can initialise the pipeline with the following command line:
>>>
>>> init_pipeline.pl Bio::EnsEMBL::Compara::PipeConfig::EBI::Ensembl::LoadMembers_conf --collection ensembl
>>>
>>> Then you should point the parameter reuse_db to this
>>> database in your Protein Tree config file:
>>>
>>> 'reuse_db' => 'mysql://ensro@host:port/database',
>>>
>>> Please do let me know if you have further questions.
>>>
>>> Cheers,
>>>
>>> Mateus.
>>>
>>>
>>>> On 10 Jan 2018, at 16:06, Francesco Lamanna <francesco.lamanna at gmail.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> when I try to run the protein tree pipeline (v91) using
>>>> the core Human and Chicken genomes I get the following
>>>> error message:
>>>>
>>>> mysql> SELECT * from msg;
>>>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------+----------+
>>>> | analysis_id | logic_name                    | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                                                | is_error |
>>>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------+----------+
>>>> |           9 | copy_ncbi_table               |              1 |      5 |       3 |         4 | 2018-01-10 16:40:13 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows                                  |        0 |
>>>> |           9 | copy_ncbi_table               |              2 |      6 |       4 |         3 | 2018-01-10 16:42:04 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows                                  |        0 |
>>>> |          10 | populate_method_links_from_db |              3 |      7 |       6 |         6 | 2018-01-10 16:43:44 |     0 | WRITE_OUTPUT | Successfully copied 18 'method_link' rows                                          |        0 |
>>>> |          25 | genome_member_copy            |              4 |     13 |      11 |        11 | 2018-01-10 16:47:49 |     0 | FETCH_INPUT  | ParamError: value for param_required('reuse_db') is required and has to be defined |        1 |
>>>> |          25 | genome_member_copy            |              5 |     14 |      12 |        12 | 2018-01-10 16:47:49 |     0 | FETCH_INPUT  | ParamError: value for param_required('reuse_db') is required and has to be defined |        1 |
>>>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------+----------+
>>>>
>>>> I have no clue about what to put in the 'reuse_db'
>>>> parameter (nor could I find any information in the
>>>> compara docs).
>>>>
>>>> Can anyone please help me to solve this issue?
>>>>
>>>> Thanks,
>>>> Francesco.
>>>> _______________________________________________
>>>> Dev mailing list Dev at ensembl.org
>>>> Posting guidelines and subscribe/unsubscribe info:
>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>> Ensembl Blog: http://www.ensembl.info/
>>>>
> --
> Matthieu Muffato, Ph.D.
> Ensembl Compara and TreeFam Project Leader
> European Bioinformatics Institute (EMBL-EBI)
> European Molecular Biology Laboratory
> Wellcome Trust Genome Campus, Hinxton
> Cambridge, CB10 1SD, United Kingdom
> Room A3-145
> Phone + 44 (0) 1223 49 4631
> Fax + 44 (0) 1223 49 4468
>