[ensembl-dev] Problem with genome_member_copy (Compara).

Francesco Lamanna francesco.lamanna at gmail.com
Mon Jan 15 16:22:34 GMT 2018


Hi Wasiu,

this is the output of the species_set_header table:

mysql> SELECT * FROM species_set_header;
+----------------+-------------------------------+------+---------------+--------------+
| species_set_id | name                          | size | first_release | last_release |
+----------------+-------------------------------+------+---------------+--------------+
|              1 | H.sap-G.gal                   |    2 |          NULL |         NULL |
|              2 | H.sap-B.flo                   |    2 |          NULL |         NULL |
|              3 | H.sap-P.mar                   |    2 |          NULL |         NULL |
|              4 | H.sap-C.mil                   |    2 |          NULL |         NULL |
|              5 | G.gal-B.flo                   |    2 |          NULL |         NULL |
|              6 | G.gal-P.mar                   |    2 |          NULL |         NULL |
|              7 | G.gal-C.mil                   |    2 |          NULL |         NULL |
|              8 | B.flo-P.mar                   |    2 |          NULL |         NULL |
|              9 | B.flo-C.mil                   |    2 |          NULL |         NULL |
|             10 | P.mar-C.mil                   |    2 |          NULL |         NULL |
|             11 | H.sap                         |    1 |          NULL |         NULL |
|             12 | G.gal                         |    1 |          NULL |         NULL |
|             13 | B.flo                         |    1 |          NULL |         NULL |
|             14 | P.mar                         |    1 |          NULL |         NULL |
|             15 | C.mil                         |    1 |          NULL |         NULL |
|             16 | H.sap-G.gal-B.flo-P.mar-C.mil |    5 |          NULL |         NULL |
+----------------+-------------------------------+------+---------------+--------------+
16 rows in set (0.05 sec)

It looks OK to me.
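
One way to double-check by hand whether a given combination of genomes is present is to look for a species_set_id whose members are exactly those genome_db_ids. Below is a minimal sketch of that check, not how the pipeline itself performs the lookup. It uses an in-memory SQLite database as a stand-in for the MySQL master database, loaded with the species_set rows quoted further down in this thread. The helper names (build_demo_db, find_species_set_ids) are hypothetical, and the assumption that genome_db_id 1 is H.sap and 2 is G.gal is read off the set names above (sets 11 and 12).

```python
import sqlite3

# species_set_id -> genome_db_ids, copied from the species_set table in this thread
MEMBERS = {
    1: [1, 2], 2: [1, 3], 3: [1, 4], 4: [1, 5], 5: [2, 3],
    6: [2, 4], 7: [2, 5], 8: [3, 4], 9: [3, 5], 10: [4, 5],
    11: [1], 12: [2], 13: [3], 14: [4], 15: [5],
    16: [1, 2, 3, 4, 5],
}


def build_demo_db():
    """Create an in-memory SQLite copy of the species_set table (stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE species_set (species_set_id INTEGER, genome_db_id INTEGER)"
    )
    rows = [(ss_id, gdb_id) for ss_id, gdbs in MEMBERS.items() for gdb_id in gdbs]
    conn.executemany("INSERT INTO species_set VALUES (?, ?)", rows)
    return conn


def find_species_set_ids(conn, genome_db_ids):
    """Return the species_set_ids whose members are exactly genome_db_ids."""
    wanted = sorted(set(genome_db_ids))
    if not wanted:
        return []
    placeholders = ",".join("?" for _ in wanted)
    # A set matches when it has the right size and every member is in the wanted list.
    query = f"""
        SELECT species_set_id
        FROM species_set
        GROUP BY species_set_id
        HAVING COUNT(*) = ?
           AND SUM(genome_db_id IN ({placeholders})) = ?
        ORDER BY species_set_id
    """
    params = [len(wanted), *wanted, len(wanted)]
    return [row[0] for row in conn.execute(query, params)]


conn = build_demo_db()
# Human + chicken, assuming genome_db_ids 1 and 2; prints [1] for the table above
print(find_species_set_ids(conn, [1, 2]))
```

An empty list from this kind of query would mean that no species set with exactly those members exists, even though the table itself is not empty.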

2018-01-15 17:19 GMT+01:00 Wasiu Akanni <waakanni at ebi.ac.uk>:

> Hi Francesco,
>
> Have you checked the species_set_header table?
>
> On 15/01/2018 14:39, Francesco Lamanna wrote:
>
> Hi Matthieu,
>
> thank you for your suggestion; it fixed the problem.
>
> However, I now get the following error:
>
> mysql> SELECT * FROM msg;
> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------+----------+
> | analysis_id | logic_name                    | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                        | is_error |
> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------+----------+
> |           9 | copy_ncbi_table               |              1 |      5 |       3 |         3 | 2018-01-15 14:49:29 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows          |        0 |
> |           9 | copy_ncbi_table               |              2 |      6 |       4 |         4 | 2018-01-15 14:50:36 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows          |        0 |
> |          10 | populate_method_links_from_db |              3 |      7 |       6 |         6 | 2018-01-15 14:52:06 |     0 | WRITE_OUTPUT | Successfully copied 19 'method_link' rows                  |        0 |
> |          16 | create_mlss_ss                |              4 |     10 |       9 |         9 | 2018-01-15 14:55:10 |     0 | WRITE_OUTPUT | The species-set could not be found in the master database  |        1 |
> |          16 | create_mlss_ss                |              5 |     10 |      10 |        10 | 2018-01-15 14:56:11 |     1 | WRITE_OUTPUT | The species-set could not be found in the master database  |        1 |
> |          16 | create_mlss_ss                |              6 |     10 |      11 |        11 | 2018-01-15 14:57:12 |     2 | WRITE_OUTPUT | The species-set could not be found in the master database  |        1 |
> |          16 | create_mlss_ss                |              7 |     10 |      12 |        12 | 2018-01-15 14:58:21 |     3 | WRITE_OUTPUT | The species-set could not be found in the master database  |        1 |
> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------+----------+
> 7 rows in set (0.01 sec)
>
> But my species_set in the master_db is not empty:
>
> mysql> SELECT * FROM species_set;
> +----------------+--------------+
> | species_set_id | genome_db_id |
> +----------------+--------------+
> |              1 |            1 |
> |              1 |            2 |
> |              2 |            1 |
> |              2 |            3 |
> |              3 |            1 |
> |              3 |            4 |
> |              4 |            1 |
> |              4 |            5 |
> |              5 |            2 |
> |              5 |            3 |
> |              6 |            2 |
> |              6 |            4 |
> |              7 |            2 |
> |              7 |            5 |
> |              8 |            3 |
> |              8 |            4 |
> |              9 |            3 |
> |              9 |            5 |
> |             10 |            4 |
> |             10 |            5 |
> |             11 |            1 |
> |             12 |            2 |
> |             13 |            3 |
> |             14 |            4 |
> |             15 |            5 |
> |             16 |            1 |
> |             16 |            2 |
> |             16 |            3 |
> |             16 |            4 |
> |             16 |            5 |
> +----------------+--------------+
> 30 rows in set (0.00 sec)
>
> I am quite puzzled by this error.
>
> Cheers,
> Francesco.
>
> 2018-01-12 17:48 GMT+01:00 Matthieu Muffato <muffato at ebi.ac.uk>:
>
>> Hi Francesco
>>
>> Homoeologues are only used when running on plant genomes (which can be
>> polyploid), but the pipeline configuration is shared and expects
>> this method_link to be present.
>>
>> This is how it looks in the Ensembl Plants database. You can insert this
>> row in your database and it should work:
>>
>> ensro at mysql-eg-publicsql.ebi.ac.uk:4157/ensembl_compara_plants_38_91
>> [Fri Jan 12 16:46:32 2018] > SELECT * FROM method_link WHERE method_link_id = 206;
>> +----------------+----------------------+-------------------+
>> | method_link_id | type                 | class             |
>> +----------------+----------------------+-------------------+
>> |            206 | ENSEMBL_HOMOEOLOGUES | Homology.homology |
>> +----------------+----------------------+-------------------+
>>
>> Regards,
>> Matthieu
>>
>> On 12/01/18 14:01, Francesco Lamanna wrote:
>>
>>> Hi Mateus,
>>>
>>> I could solve this problem by commenting out the line: "die "The master
>>> dabase must be defined with a collection" if $self->o('master_db') and not
>>> $self->o('collection');"
>>>
>>> in LoadMembers_conf.pm. The member_db is now correctly set up.
>>>
>>> However, when I run the Protein trees pipeline I get the following error
>>> message
>>>
>>> mysql> SELECT * FROM msg;
>>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>> | analysis_id | logic_name                    | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                                                                                                                                             | is_error |
>>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>> |           9 | copy_ncbi_table               |              1 |      5 |       3 |         3 | 2018-01-12 14:21:39 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows                                                                                                                               |        0 |
>>> |           9 | copy_ncbi_table               |              2 |      6 |       4 |         4 | 2018-01-12 14:25:27 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows                                                                                                                               |        0 |
>>> |          10 | populate_method_links_from_db |              3 |      7 |       6 |         6 | 2018-01-12 14:26:54 |     0 | WRITE_OUTPUT | Successfully copied 18 'method_link' rows                                                                                                                                       |        0 |
>>> |          16 | create_mlss_ss                |              4 |     10 |       9 |         9 | 2018-01-12 14:29:56 |     0 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>>> |          16 | create_mlss_ss                |              5 |     10 |      10 |        10 | 2018-01-12 14:30:57 |     1 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>>> |          16 | create_mlss_ss                |              6 |     10 |      11 |        11 | 2018-01-12 14:31:58 |     2 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>>> |          16 | create_mlss_ss                |              7 |     10 |      12 |        12 | 2018-01-12 14:33:00 |     3 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>>> |          16 | create_mlss_ss                |              8 |     10 |      13 |        13 | 2018-01-12 14:38:36 |     1 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>> 8 rows in set (0.01 sec)
>>>
>>> I wasn't aware of this method_link.
>>>
>>> Do you know how I can fix this?
>>>
>>> Thanks,
>>> Francesco.
>>>
>>> 2018-01-11 13:28 GMT+01:00 Francesco Lamanna <francesco.lamanna at gmail.com>:
>>>
>>>     Hi Mateus,
>>>
>>>     if I try to initialize the LoadMembers pipeline without
>>>     “--collection ensembl”, I get the following error:
>>>
>>>     The following options are missing:
>>>          {'collection'}
>>>
>>>     I also tried to set 'collection' => undef, in the conf file, but I
>>>     get another error:
>>>
>>>     The master dabase must be defined with a collection at
>>>     /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/PipeConfig/LoadMembers_conf.pm
>>>     line 190.
>>>
>>>     Cheers,
>>>     Francesco
>>>
>>>     2018-01-11 12:01 GMT+01:00 Mateus Patricio <mateus at ebi.ac.uk>:
>>>
>>>         Hi Francesco,
>>>
>>>         In your case the solution should be starting the LoadMembers
>>>         without the option “--collection ensembl”.
>>>
>>>         If you start the pipeline without it, it should use all the
>>>         current species in your master database.
>>>
>>>         In Ensembl we have different collections that are used for
>>>         different purposes, and the default one is “ensembl”.
>>>
>>>         Please let me know if this works.
>>>
>>>         Cheers,
>>>
>>>         Mateus.
>>>
>>>
>>>         On 11 Jan 2018, at 10:13, Francesco Lamanna <francesco.lamanna at gmail.com> wrote:
>>>>
>>>>         Hi Mateus,
>>>>
>>>>         many thanks for your answer.
>>>>
>>>>         I am trying to launch the LoadMembers pipeline in order to
>>>>         make a member_db, but I get the following error:
>>>>
>>>>         mysql> SELECT * from msg;
>>>>         +-------------+------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>>>         | analysis_id | logic_name             | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                                                                                                                                    | is_error |
>>>>         +-------------+------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>>>         |           3 | copy_table_from_master |              1 |      4 |       3 |         3 | 2018-01-11 11:04:02 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows                                                                                                                      |        0 |
>>>>         |           3 | copy_table_from_master |              2 |      5 |       4 |         4 | 2018-01-11 11:05:46 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows                                                                                                                      |        0 |
>>>>         |           4 | load_genomedb_factory  |              3 |      3 |       5 |         5 | 2018-01-11 11:07:06 |     0 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>>>>         |           4 | load_genomedb_factory  |              4 |      3 |       6 |         6 | 2018-01-11 11:08:00 |     1 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>>>>         |           4 | load_genomedb_factory  |              5 |      3 |       7 |         7 | 2018-01-11 11:08:25 |     2 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>>>>         |           4 | load_genomedb_factory  |              6 |      3 |       8 |         8 | 2018-01-11 11:09:25 |     3 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>>>>         +-------------+------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>>>         6
>>>>
>>>>         Cheers,
>>>>         Francesco.
>>>>
>>>>         2018-01-10 17:31 GMT+01:00 Mateus Patricio <mateus at ebi.ac.uk>:
>>>>
>>>>             Hi Francesco
>>>>
>>>>             The protein tree pipeline reuses the genes and sequence
>>>>             members from the 'reuse_db' parameter, which in this case
>>>>             should point to a members database.
>>>>
>>>>             This members database can be created by running the
>>>>             LoadMembers pipeline.
>>>>
>>>>             You can initiate the pipeline with the following command line:
>>>>
>>>>             init_pipeline.pl Bio::EnsEMBL::Compara::PipeConfig::EBI::Ensembl::LoadMembers_conf --collection ensembl
>>>>
>>>>             Then you should point the parameter reuse_db to this
>>>>             database on your Protein Tree config file.
>>>>
>>>>             'reuse_db'   => 'mysql://ensro@host:port/database',
>>>>
>>>>             Please do let me know if you have further questions.
>>>>
>>>>             Cheers,
>>>>
>>>>             Mateus.
>>>>
>>>>
>>>>             On 10 Jan 2018, at 16:06, Francesco Lamanna <francesco.lamanna at gmail.com> wrote:
>>>>>
>>>>>             Hi all,
>>>>>
>>>>>             when I try to run the protein tree pipeline (v91) using
>>>>>             the core Human and Chicken genomes I get the following
>>>>>             error message:
>>>>>
>>>>>             mysql> SELECT * from msg;
>>>>>             +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------+----------+
>>>>>             | analysis_id | logic_name                    | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                                                | is_error |
>>>>>             +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------+----------+
>>>>>             |           9 | copy_ncbi_table               |              1 |      5 |       3 |         4 | 2018-01-10 16:40:13 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows                                  |        0 |
>>>>>             |           9 | copy_ncbi_table               |              2 |      6 |       4 |         3 | 2018-01-10 16:42:04 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows                                  |        0 |
>>>>>             |          10 | populate_method_links_from_db |              3 |      7 |       6 |         6 | 2018-01-10 16:43:44 |     0 | WRITE_OUTPUT | Successfully copied 18 'method_link' rows                                          |        0 |
>>>>>             |          25 | genome_member_copy            |              4 |     13 |      11 |        11 | 2018-01-10 16:47:49 |     0 | FETCH_INPUT  | ParamError: value for param_required('reuse_db') is required and has to be defined |        1 |
>>>>>             |          25 | genome_member_copy            |              5 |     14 |      12 |        12 | 2018-01-10 16:47:49 |     0 | FETCH_INPUT  | ParamError: value for param_required('reuse_db') is required and has to be defined |        1 |
>>>>>             +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------+----------+
>>>>>
>>>>>             I have no clue about what to put in the 'reuse_db'
>>>>>             parameter (nor could I find any information in the
>>>>>             compara docs).
>>>>>
>>>>>             Can anyone please help me to solve this issue?
>>>>>
>>>>>             Thanks,
>>>>>             Francesco.
>>>>>             _______________________________________________
>>>>>             Dev mailing list Dev at ensembl.org <mailto:Dev at ensembl.org>
>>>>>             Posting guidelines and subscribe/unsubscribe info:
>>>>>             http://lists.ensembl.org/mailman/listinfo/dev
>>>>>             <http://lists.ensembl.org/mailman/listinfo/dev>
>>>>>             Ensembl Blog: http://www.ensembl.info/
>>>>>
>> --
>> Matthieu Muffato, Ph.D.
>> Ensembl Compara and TreeFam Project Leader
>> European Bioinformatics Institute (EMBL-EBI)
>> European Molecular Biology Laboratory
>> Wellcome Trust Genome Campus, Hinxton
>> Cambridge, CB10 1SD, United Kingdom
>> Room  A3-145
>> Phone + 44 (0) 1223 49 4631
>> Fax   + 44 (0) 1223 49 4468
>>
>
> --
> *Wasiu Ajenifuja Akanni*
> Developer
>
> Compara group
> EMBL-EBI
> Phone: + 44 (0) 1223 494 237
> Room A3145, West building | Wellcome Trust Genome Campus | Hinxton |
> Cambridge | CB10 1SD | UK