[ensembl-dev] Problem with genome_member_copy (Compara).
Francesco Lamanna
francesco.lamanna at gmail.com
Mon Jan 15 16:22:34 GMT 2018
Hi Wasiu,
this is the output of the species_set_header table:
mysql> SELECT * FROM species_set_header;
+----------------+-------------------------------+------+---------------+--------------+
| species_set_id | name                          | size | first_release | last_release |
+----------------+-------------------------------+------+---------------+--------------+
|              1 | H.sap-G.gal                   |    2 |          NULL |         NULL |
|              2 | H.sap-B.flo                   |    2 |          NULL |         NULL |
|              3 | H.sap-P.mar                   |    2 |          NULL |         NULL |
|              4 | H.sap-C.mil                   |    2 |          NULL |         NULL |
|              5 | G.gal-B.flo                   |    2 |          NULL |         NULL |
|              6 | G.gal-P.mar                   |    2 |          NULL |         NULL |
|              7 | G.gal-C.mil                   |    2 |          NULL |         NULL |
|              8 | B.flo-P.mar                   |    2 |          NULL |         NULL |
|              9 | B.flo-C.mil                   |    2 |          NULL |         NULL |
|             10 | P.mar-C.mil                   |    2 |          NULL |         NULL |
|             11 | H.sap                         |    1 |          NULL |         NULL |
|             12 | G.gal                         |    1 |          NULL |         NULL |
|             13 | B.flo                         |    1 |          NULL |         NULL |
|             14 | P.mar                         |    1 |          NULL |         NULL |
|             15 | C.mil                         |    1 |          NULL |         NULL |
|             16 | H.sap-G.gal-B.flo-P.mar-C.mil |    5 |          NULL |         NULL |
+----------------+-------------------------------+------+---------------+--------------+
16 rows in set (0.05 sec)
It looks ok
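
One way to go beyond eyeballing the two tables is to check that the `size` recorded in species_set_header matches the number of genome_db_id rows that species_set holds for the same species_set_id. The following is only a minimal sketch under stated assumptions: it loads the rows posted in this thread into an in-memory SQLite database as a stand-in for the MySQL master database, it leaves out the first_release and last_release columns, and the helper names `build_db` and `find_size_mismatches` are made up for illustration (they are not part of ensembl-compara). An empty result only shows that the two tables agree with each other; it does not explain the create_mlss_ss error.

```python
import sqlite3

# species_set_header rows (species_set_id, name, size) as posted in this thread.
HEADERS = [
    (1, "H.sap-G.gal", 2), (2, "H.sap-B.flo", 2), (3, "H.sap-P.mar", 2),
    (4, "H.sap-C.mil", 2), (5, "G.gal-B.flo", 2), (6, "G.gal-P.mar", 2),
    (7, "G.gal-C.mil", 2), (8, "B.flo-P.mar", 2), (9, "B.flo-C.mil", 2),
    (10, "P.mar-C.mil", 2), (11, "H.sap", 1), (12, "G.gal", 1),
    (13, "B.flo", 1), (14, "P.mar", 1), (15, "C.mil", 1),
    (16, "H.sap-G.gal-B.flo-P.mar-C.mil", 5),
]

# species_set rows from the quoted message: species_set_id -> genome_db_ids.
MEMBERS = {
    1: [1, 2], 2: [1, 3], 3: [1, 4], 4: [1, 5], 5: [2, 3], 6: [2, 4],
    7: [2, 5], 8: [3, 4], 9: [3, 5], 10: [4, 5], 11: [1], 12: [2],
    13: [3], 14: [4], 15: [5], 16: [1, 2, 3, 4, 5],
}


def build_db():
    """Load the posted rows into an in-memory SQLite stand-in (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE species_set_header "
        "(species_set_id INTEGER PRIMARY KEY, name TEXT, size INTEGER)"
    )
    conn.execute(
        "CREATE TABLE species_set (species_set_id INTEGER, genome_db_id INTEGER)"
    )
    conn.executemany("INSERT INTO species_set_header VALUES (?, ?, ?)", HEADERS)
    conn.executemany(
        "INSERT INTO species_set VALUES (?, ?)",
        [(ss_id, gdb_id) for ss_id, gdb_ids in MEMBERS.items() for gdb_id in gdb_ids],
    )
    return conn


def find_size_mismatches(conn):
    """Return (species_set_id, name, size, member_count) rows where size != count."""
    query = """
        SELECT h.species_set_id, h.name, h.size, COUNT(s.genome_db_id)
        FROM species_set_header h
        LEFT JOIN species_set s ON s.species_set_id = h.species_set_id
        GROUP BY h.species_set_id, h.name, h.size
        HAVING h.size <> COUNT(s.genome_db_id)
        ORDER BY h.species_set_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(find_size_mismatches(build_db()) or "no size mismatches")
```

The SELECT inside `find_size_mismatches` uses only standard SQL, so it should also be usable directly against the MySQL master database.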
2018-01-15 17:19 GMT+01:00 Wasiu Akanni <waakanni at ebi.ac.uk>:
> Hi Francesco,
>
> Have you checked the species_set_header table?
>
> On 15/01/2018 14:39, Francesco Lamanna wrote:
>
> Hi Matthieu,
>
> thank you for your suggestion, it fixed the problem.
>
> However, I still get the following error:
>
> mysql> SELECT * FROM msg;
> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+-----------------------------------------------------------+----------+
> | analysis_id | logic_name                    | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                       | is_error |
> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+-----------------------------------------------------------+----------+
> |           9 | copy_ncbi_table               |              1 |      5 |       3 |         3 | 2018-01-15 14:49:29 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows         |        0 |
> |           9 | copy_ncbi_table               |              2 |      6 |       4 |         4 | 2018-01-15 14:50:36 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows         |        0 |
> |          10 | populate_method_links_from_db |              3 |      7 |       6 |         6 | 2018-01-15 14:52:06 |     0 | WRITE_OUTPUT | Successfully copied 19 'method_link' rows                 |        0 |
> |          16 | create_mlss_ss                |              4 |     10 |       9 |         9 | 2018-01-15 14:55:10 |     0 | WRITE_OUTPUT | The species-set could not be found in the master database |        1 |
> |          16 | create_mlss_ss                |              5 |     10 |      10 |        10 | 2018-01-15 14:56:11 |     1 | WRITE_OUTPUT | The species-set could not be found in the master database |        1 |
> |          16 | create_mlss_ss                |              6 |     10 |      11 |        11 | 2018-01-15 14:57:12 |     2 | WRITE_OUTPUT | The species-set could not be found in the master database |        1 |
> |          16 | create_mlss_ss                |              7 |     10 |      12 |        12 | 2018-01-15 14:58:21 |     3 | WRITE_OUTPUT | The species-set could not be found in the master database |        1 |
> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+-----------------------------------------------------------+----------+
> 7 rows in set (0.01 sec)
>
> But my species_set in the master_db is not empty:
>
> mysql> SELECT * FROM species_set;
> +----------------+--------------+
> | species_set_id | genome_db_id |
> +----------------+--------------+
> |              1 |            1 |
> |              1 |            2 |
> |              2 |            1 |
> |              2 |            3 |
> |              3 |            1 |
> |              3 |            4 |
> |              4 |            1 |
> |              4 |            5 |
> |              5 |            2 |
> |              5 |            3 |
> |              6 |            2 |
> |              6 |            4 |
> |              7 |            2 |
> |              7 |            5 |
> |              8 |            3 |
> |              8 |            4 |
> |              9 |            3 |
> |              9 |            5 |
> |             10 |            4 |
> |             10 |            5 |
> |             11 |            1 |
> |             12 |            2 |
> |             13 |            3 |
> |             14 |            4 |
> |             15 |            5 |
> |             16 |            1 |
> |             16 |            2 |
> |             16 |            3 |
> |             16 |            4 |
> |             16 |            5 |
> +----------------+--------------+
> 30 rows in set (0.00 sec)
>
> I am quite puzzled by this error.
>
> Cheers,
> Francesco.
>
> 2018-01-12 17:48 GMT+01:00 Matthieu Muffato <muffato at ebi.ac.uk>:
>
>> Hi Francesco
>>
>> Homoeologues are only used when running on plant genomes (which are
>> polyploid), but the pipeline configuration is shared and expects
>> this method_link to be present.
>>
>> This is how it looks in the Ensembl Plants database. You can insert this
>> row in your database and it should work.
>>
>> ensro at mysql-eg-publicsql.ebi.ac.uk:4157/ensembl_compara_plants_38_91 [Fri Jan 12 16:46:32 2018] > SELECT * FROM method_link WHERE method_link_id = 206;
>> +----------------+----------------------+-------------------+
>> | method_link_id | type                 | class             |
>> +----------------+----------------------+-------------------+
>> |            206 | ENSEMBL_HOMOEOLOGUES | Homology.homology |
>> +----------------+----------------------+-------------------+
>>
>> Regards,
>> Matthieu
>>
>> On 12/01/18 14:01, Francesco Lamanna wrote:
>>
>>> Hi Mateus,
>>>
>>> I solved this problem by commenting out the following line in
>>> LoadMembers_conf.pm:
>>>
>>> die "The master dabase must be defined with a collection" if $self->o('master_db') and not $self->o('collection');
>>>
>>> The member_db is now correctly set up.
>>>
>>> However, when I run the Protein trees pipeline I get the following error
>>> message:
>>>
>>> mysql> SELECT * FROM msg;
>>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>> | analysis_id | logic_name                    | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                                                                                                                                             | is_error |
>>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>> |           9 | copy_ncbi_table               |              1 |      5 |       3 |         3 | 2018-01-12 14:21:39 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows                                                                                                                               |        0 |
>>> |           9 | copy_ncbi_table               |              2 |      6 |       4 |         4 | 2018-01-12 14:25:27 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows                                                                                                                               |        0 |
>>> |          10 | populate_method_links_from_db |              3 |      7 |       6 |         6 | 2018-01-12 14:26:54 |     0 | WRITE_OUTPUT | Successfully copied 18 'method_link' rows                                                                                                                                       |        0 |
>>> |          16 | create_mlss_ss                |              4 |     10 |       9 |         9 | 2018-01-12 14:29:56 |     0 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>>> |          16 | create_mlss_ss                |              5 |     10 |      10 |        10 | 2018-01-12 14:30:57 |     1 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>>> |          16 | create_mlss_ss                |              6 |     10 |      11 |        11 | 2018-01-12 14:31:58 |     2 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>>> |          16 | create_mlss_ss                |              7 |     10 |      12 |        12 | 2018-01-12 14:33:00 |     3 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>>> |          16 | create_mlss_ss                |              8 |     10 |      13 |        13 | 2018-01-12 14:38:36 |     1 | FETCH_INPUT  | Cannot find the method_link 'ENSEMBL_HOMOEOLOGUES' at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/PrepareSpeciesSetsMLSS.pm line 70. |        1 |
>>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>> 8 rows in set (0.01 sec)
>>>
>>> I wasn't aware of this method_link.
>>>
>>> Do you know how I can fix this?
>>>
>>> Thanks,
>>> Francesco.
>>>
>>> 2018-01-11 13:28 GMT+01:00 Francesco Lamanna <francesco.lamanna at gmail.com>:
>>>
>>> Hi Mateus,
>>>
>>> if I try to initialize the LoadMembers pipeline without
>>> “--collection ensembl”, I get the following error:
>>>
>>> The following options are missing:
>>> {'collection'}
>>>
>>> I also tried to set 'collection' => undef, in the conf file, but I
>>> get another error:
>>>
>>> The master dabase must be defined with a collection at
>>> /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/
>>> EnsEMBL/Compara/PipeConfig/LoadMembers_conf.pm
>>> line 190.
>>>
>>> Cheers,
>>> Francesco
>>>
>>> 2018-01-11 12:01 GMT+01:00 Mateus Patricio <mateus at ebi.ac.uk>:
>>>
>>> Hi Francesco,
>>>
>>> In your case the solution should be starting the LoadMembers
>>> without the option “--collection ensembl”.
>>>
>>> If you start the pipeline without it, it should use all the
>>> current species in your master database.
>>>
>>> In Ensembl we have different collections that are used for
>>> different purposes, and the default one is “ensembl”.
>>>
>>> Please let me know if this works.
>>>
>>> Cheers,
>>>
>>> Mateus.
>>>
>>>
>>> On 11 Jan 2018, at 10:13, Francesco Lamanna <francesco.lamanna at gmail.com> wrote:
>>>>
>>>> Hi Mateus,
>>>>
>>>> many thanks for your answer.
>>>>
>>>> I am trying to launch the LoadMembers pipeline in order to
>>>> make a member_db, but I get the following error:
>>>>
>>>> mysql> SELECT * from msg;
>>>> +-------------+------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>>> | analysis_id | logic_name             | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                                                                                                                                    | is_error |
>>>> +-------------+------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>>> |           3 | copy_table_from_master |              1 |      4 |       3 |         3 | 2018-01-11 11:04:02 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows                                                                                                                      |        0 |
>>>> |           3 | copy_table_from_master |              2 |      5 |       4 |         4 | 2018-01-11 11:05:46 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows                                                                                                                      |        0 |
>>>> |           4 | load_genomedb_factory  |              3 |      3 |       5 |         5 | 2018-01-11 11:07:06 |     0 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>>>> |           4 | load_genomedb_factory  |              4 |      3 |       6 |         6 | 2018-01-11 11:08:00 |     1 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>>>> |           4 | load_genomedb_factory  |              5 |      3 |       7 |         7 | 2018-01-11 11:08:25 |     2 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>>>> |           4 | load_genomedb_factory  |              6 |      3 |       8 |         8 | 2018-01-11 11:09:25 |     3 | FETCH_INPUT  | Could not fetch collection ss with name=ensembl at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/GenomeDBFactory.pm line 106. |        1 |
>>>> +-------------+------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
>>>> 6
>>>>
>>>> Cheers,
>>>> Francesco.
>>>>
>>>> 2018-01-10 17:31 GMT+01:00 Mateus Patricio <mateus at ebi.ac.uk>:
>>>>
>>>> Hi Francesco
>>>>
>>>> The protein tree pipeline reuses the genes and sequence
>>>> members from the 'reuse_db' parameter, which in this case
>>>> should point to a members database.
>>>>
>>>> This members database can be created by running the
>>>> LoadMembers pipeline.
>>>>
>>>> You can initiate the pipeline with the following command line:
>>>>
>>>> init_pipeline.pl Bio::EnsEMBL::Compara::PipeConfig::EBI::Ensembl::LoadMembers_conf --collection ensembl
>>>>
>>>> Then you should point the parameter reuse_db to this
>>>> database on your Protein Tree config file.
>>>>
>>>> 'reuse_db' => 'mysql://ensro@host:port/database',
>>>>
>>>> Please do let me know if you have further questions.
>>>>
>>>> Cheers,
>>>>
>>>> Mateus.
>>>>
>>>>
>>>> On 10 Jan 2018, at 16:06, Francesco Lamanna <francesco.lamanna at gmail.com> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> when I try to run the protein tree pipeline (v91) using
>>>>> the core Human and Chicken genomes I get the following
>>>>> error message:
>>>>>
>>>>> mysql> SELECT * from msg;
>>>>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------+----------+
>>>>> | analysis_id | logic_name                    | log_message_id | job_id | role_id | worker_id | when_logged         | retry | status       | msg                                                                                | is_error |
>>>>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------+----------+
>>>>> |           9 | copy_ncbi_table               |              1 |      5 |       3 |         4 | 2018-01-10 16:40:13 |     0 | WRITE_OUTPUT | Successfully copied 1646504 'ncbi_taxa_node' rows                                  |        0 |
>>>>> |           9 | copy_ncbi_table               |              2 |      6 |       4 |         3 | 2018-01-10 16:42:04 |     0 | WRITE_OUTPUT | Successfully copied 2504391 'ncbi_taxa_name' rows                                  |        0 |
>>>>> |          10 | populate_method_links_from_db |              3 |      7 |       6 |         6 | 2018-01-10 16:43:44 |     0 | WRITE_OUTPUT | Successfully copied 18 'method_link' rows                                          |        0 |
>>>>> |          25 | genome_member_copy            |              4 |     13 |      11 |        11 | 2018-01-10 16:47:49 |     0 | FETCH_INPUT  | ParamError: value for param_required('reuse_db') is required and has to be defined |        1 |
>>>>> |          25 | genome_member_copy            |              5 |     14 |      12 |        12 | 2018-01-10 16:47:49 |     0 | FETCH_INPUT  | ParamError: value for param_required('reuse_db') is required and has to be defined |        1 |
>>>>> +-------------+-------------------------------+----------------+--------+---------+-----------+---------------------+-------+--------------+------------------------------------------------------------------------------------+----------+
>>>>>
>>>>> I have no clue about what to put in the 'reuse_db'
>>>>> parameter (nor could I find any information in the
>>>>> compara docs).
>>>>>
>>>>> Can anyone please help me to solve this issue?
>>>>>
>>>>> Thanks,
>>>>> Francesco.
>>>>> _______________________________________________
>>>>> Dev mailing list Dev at ensembl.org
>>>>> Posting guidelines and subscribe/unsubscribe info:
>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>> Ensembl Blog: http://www.ensembl.info/
>>>>>
>>
>> --
>> Matthieu Muffato, Ph.D.
>> Ensembl Compara and TreeFam Project Leader
>> European Bioinformatics Institute (EMBL-EBI)
>> European Molecular Biology Laboratory
>> Wellcome Trust Genome Campus, Hinxton
>> Cambridge, CB10 1SD, United Kingdom
>> Room A3-145
>> Phone + 44 (0) 1223 49 4631
>> Fax + 44 (0) 1223 49 4468
>>
>
> --
> *Wasiu Ajenifuja Akanni*
> Developer
>
> Compara group
> EMBL-EBI
> Phone: + 44 (0) 1223 494 237
> Room A3145, West building | Wellcome Trust Genome Campus | Hinxton |
> Cambridge | CB10 1SD | UK
>