[ensembl-dev] Problem with load_genomedb in compara

Mateus Patricio mateus at ebi.ac.uk
Mon Dec 18 11:28:18 GMT 2017


Hi Francesco,

Could you please try adding the following to the locator field in the genome_db table for each of your custom genomes?

> Bio::EnsEMBL::Compara::GenomeMF/filename=/home/hd/hd_hd/hd_cc141/Vertebrates_project/Ensembl_compara/non_ensembl_species.json;index=1


Please match the index (index=1, index=2, and so on) to the order in which the genomes are declared in your file.

If your json file looks like this:

[
{
    "production_name" : "species1",
    "taxonomy_id"     : "8508",
    "cds_fasta"       : "species1.cds.fa",
    "prot_fasta"      : "species1.prot.fa",
    "gene_coord_gff"  : "species1.gff",
    "source"          : "augustus_maker"
},

{
    "production_name" : "species2",
    "taxonomy_id"     : "8496",
    "cds_fasta"       : "species2.cds.fa",
    "prot_fasta"      : "species2.prot.fa",
    "gene_coord_gff"  : "species2.gff",
    "source"          : "refseq"
}
]

You could add this info to the genome_db table:

UPDATE genome_db SET locator = 'Bio::EnsEMBL::Compara::GenomeMF/filename=/home/hd/hd_hd/hd_cc141/Vertebrates_project/Ensembl_compara/non_ensembl_species.json;index=1' WHERE name = 'species1';
UPDATE genome_db SET locator = 'Bio::EnsEMBL::Compara::GenomeMF/filename=/home/hd/hd_hd/hd_cc141/Vertebrates_project/Ensembl_compara/non_ensembl_species.json;index=2' WHERE name = 'species2';
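
If you have more than a couple of custom genomes, it may be easier to generate these statements than to type them. Below is a rough Python sketch, not part of Compara itself. It assumes the json file is a list of genome entries, that genome_db.name matches each entry's "production_name", that the rows already exist in genome_db (hence UPDATE), and that the locator fields are separated by ";". The function name locator_updates is just something I made up for illustration.

```python
import json

# Module named in the locator; everything after the "/" is key=value pairs.
LOCATOR_PREFIX = "Bio::EnsEMBL::Compara::GenomeMF"


def locator_updates(json_path, genomes=None):
    """Return one UPDATE statement per genome, numbered in file order.

    json_path is written into the locator as-is. If genomes is None, the
    entries are read from json_path (assumed to hold a JSON list).
    """
    if genomes is None:
        with open(json_path) as handle:
            genomes = json.load(handle)

    statements = []
    # index starts at 1 and follows the order of declaration in the file
    for index, genome in enumerate(genomes, start=1):
        locator = f"{LOCATOR_PREFIX}/filename={json_path};index={index}"
        statements.append(
            f"UPDATE genome_db SET locator = '{locator}' "
            f"WHERE name = '{genome['production_name']}';"
        )
    return statements


if __name__ == "__main__":
    # In-memory example; pass only the path to read the real file instead.
    example = [{"production_name": "species1"}, {"production_name": "species2"}]
    for statement in locator_updates("/path/to/non_ensembl_species.json", example):
        print(statement)
```

Please check the printed statements against your genome_db table before running them on the master database.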

I hope that helps.

Cheers,

Mateus.


> On 14 Dec 2017, at 17:17, Francesco Lamanna <francesco.lamanna at gmail.com> wrote:
> 
> Hi all,
> 
> I am trying to run the Compara protein-tree pipeline (v91) on a set of two core ensembl genomes and three custom genomes (stored locally) using a master database.
> 
> when I run the beekeeper.pl script, load_genomedb fails to find my local genomes (but it correctly loads the ensembl core genomes), and gives this kind of error message:
> 
> Could not find species_name='petromyzon_marinus', assembly_name='germline_final' on the servers provided, please investigate at /home/hd/hd_hd/hd_cc141/EnsEMBL/ensembl-compara/modules/Bio/EnsEMBL/Compara/RunnableDB/LoadOneGenomeDB.pm line 179.
> 
> the information regarding the local genomes is stored in a .json file and loaded in the genome_db table of the master database. The .json file is invoked in pipeline conf script using:
> 
> 'curr_file_sources_locs'  => [ '/home/hd/hd_hd/hd_cc141/Vertebrates_project/Ensembl_compara/non_ensembl_species.json' ]
> 
> Thank you for your help,
> Francesco.
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
