[ensembl-dev] empty SpeciesFactory in FASTA pipeline

Lel Eory lel.eory at roslin.ed.ac.uk
Fri Dec 19 10:49:21 GMT 2014


The beekeeper output is as follows:

beekeeper.pl -url $HIVEURL/ensadmin_fasta_dump_78 -run
Pipeline name: fasta_dump_78
Default meadow: LOCAL/MYHOST


Beekeeper : loop #1 ======================================================
GarbageCollector:       Checking for lost Workers...
GarbageCollector:       [Queen:] out of 0 Workers that haven't checked in during the last 5 seconds...
ScheduleSpecies            ( 1)       READY jobs(Sem:0, Rdy:1, InProg:0, Done+Pass:0, Fail:0)=1 Ave_msec:0, workers(Running:0, Reqired:1)   h.cap:-  a.cap:-  (sync'd 0 sec ago)
DumpDNA                    ( 2)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:10  a.cap:-  (sync'd 0 sec ago)
DumpGenes                  ( 3)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:10  a.cap:-  (sync'd 0 sec ago)
ConcatFiles                ( 4)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:-  a.cap:-  (sync'd 0 sec ago)
PrimaryAssembly            ( 5)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:-  a.cap:-  (sync'd 0 sec ago)
CopyDNA                    ( 6)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
BlastDNAIndex              ( 7)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:10  a.cap:-  (sync'd 0 sec ago)
BlastPepIndex              ( 8)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
BlastGeneIndex             ( 9)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
BlatDNAIndex               (10)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
BlatSmDNAIndex             (11)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
NcbiBlastDNAIndex          (12)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:10  a.cap:-  (sync'd 0 sec ago)
NcbiBlastPepIndex          (13)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
NcbiBlastGeneIndex         (14)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
SCPBlast                   (15)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:3  a.cap:-  (sync'd 0 sec ago)
ChecksumGeneratorFactory   (16)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:-  a.cap:-  (sync'd 0 sec ago)
ChecksumGenerator          (17)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:10  a.cap:-  (sync'd 0 sec ago)
Notify                     (18)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:-  a.cap:-  (sync'd 0 sec ago)
total over 18 analyses :   0.00% complete (< 0.00 CPU_hrs) (1 to_do + 0 done + 0 failed = 1 total)

===== Stats of active Roles as recorded in the pipeline database: ======
          ======= TOTAL ======= : 0 active Roles

Scheduler : Discarded 17 analyses because they do not need any Workers.
Scheduler : ScheduleSpecies            ( 1)       READY jobs(Sem:0, Rdy:1, InProg:0, Done+Pass:0, Fail:0)=1 Ave_msec:0, workers(Running:0, Reqired:1)   h.cap:-  a.cap:-  (sync'd 0 sec ago)
Scheduler : Before checking the Valley for pending jobs, the Scheduler allocated 1 x LOCAL:default extra workers for 'ScheduleSpecies' [1.0000 hive_load remaining]
Scheduler : I recommend submitting 1 x LOCAL:default workers

Beekeeper : submitting 1 workers (rc_name=default) to LOCAL/MYHOST
Executing [ LOCAL/MYHOST ] $ENS_CVS_ROOT_DIR/ensembl-hive/scripts/runWorker.pl -url $HIVEURL/ensadmin_fasta_dump_78 -rc_name default &
Beekeeper : stopped looping because the number of loops was limited by 1 and this limit expired
Beekeeper: dbc 0 disconnect cycles
Worker 1 [ UNSPECIALIZED ] resource_class_id=1, meadow=LOCAL/MYHOST, process=16051 at MYHOST.roslin.ed.ac.uk, last_check_in=NEVER, batch_size=UNSPECIALIZED, job_limit=NONE, life_span=3600, worker_log_dir=STDOUT/STDERR
Worker 1 [ UNSPECIALIZED ] Found 18 analyses matching '%' pattern
Worker 1 [ UNSPECIALIZED ] specializing to ScheduleSpecies(1)
Worker 1 [ Role 1 , ScheduleSpecies(1) ] Job 1 : complete
Worker 1 [ Role 1 , ScheduleSpecies(1) ] Having completed 1 jobs the Worker exits : NO_WORK



On 12/19/2014 10:29 AM, Andrew Yates wrote:
> And when you set it going what do you see?
>
>> On 19 Dec 2014, at 09:45, Lel Eory <lel.eory at roslin.ed.ac.uk> wrote:
>>
>> Hi Andy,
>>
>> I ran init_pipeline with -run_all 1 and pasted the output below.
>>
>> Thanks,
>> Lel
>>
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>
>> Running the command:
>>         db_cmd.pl -url $HIVEURL/ensadmin_fasta_dump_78 -sql 'DROP DATABASE IF EXISTS'
>> Done.
>>
>> Running the command:
>>         db_cmd.pl -url $HIVEURL/ensadmin_fasta_dump_78 -sql 'CREATE DATABASE'
>> Done.
>>
>> Running the command:
>>         db_cmd.pl -url $HIVEURL/ensadmin_fasta_dump_78 <$ENSEMBL_CVS_ROOT_DIR/ensembl-hive/sql/tables.mysql
>> Done.
>>
>> Running the command:
>>         db_cmd.pl -url $HIVEURL/ensadmin_fasta_dump_78 <$ENSEMBL_CVS_ROOT_DIR/ensembl-hive/sql/foreign_keys.sql
>> Done.
>>
>> Running the command:
>>         db_cmd.pl -url $HIVEURL/ensadmin_fasta_dump_78 <$ENSEMBL_CVS_ROOT_DIR/ensembl-hive/sql/procedures.mysql
>> Done.
>>
>> Adding hive_meta table entries ...
>> Created a new naked entry {"meta_key" => "hive_auto_rebalance_semaphores","meta_value" => 0}
>> Created a new naked entry {"meta_key" => "hive_sql_schema_version","meta_value" => 62}
>> Created a new naked entry {"meta_key" => "hive_use_param_stack","meta_value" => 0}
>> Created a new naked entry {"meta_key" => "hive_pipeline_name","meta_value" => "fasta_dump_78"}
>> Done.
>>
>> Adding pipeline-wide parameters ...
>> Created a new naked entry {"param_name" => "previous_release","param_value" => 77}
>> Created a new naked entry {"param_name" => "db_types","param_value" => "[]"}
>> Created a new naked entry {"param_name" => "release","param_value" => 78}
>> Created a new naked entry {"param_name" => "base_path","param_value" => "\"/home/ensadmin/tmp_tools_data\""}
>> Done.
>>
>> Adding Resources ...
>>         NB:'default' resource class is not in the database (did you forget to inherit from SUPER::resource_classes ?) - creating it for you
>> Created a new ResourceClass[]: default
>> Created a new ResourceClass[]: dump
>> Created a new ResourceDescription: (dump, LSF)->("-q long -M1000 -R"select[mem>1000] rusage[mem=1000]"", "")
>> Created a new ResourceClass[]: indexing
>> Created a new ResourceDescription: (indexing, LSF)->("-q normal -M3000 -R"select[mem>3000] rusage[mem=3000]"", "")
>> Done.
>>
>> Adding Analyses ...
>> Created a new Analysis[]: ScheduleSpecies->(Bio::EnsEMBL::Production::Pipeline::FASTA::ReuseSpeciesFactory, {"force_species" => [],"ftp_dir" => "","run_all" => 0,"sequence_type_list" => [],"species" => []}, default)
>> Created a new ScheduleSpecies            ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:-  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: DumpDNA->(Bio::EnsEMBL::Production::Pipeline::FASTA::DumpFile, {"process_logic_names" => [],"skip_logic_names" => []}, dump)
>> Created a new DumpDNA                    ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:10  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: DumpGenes->(Bio::EnsEMBL::Production::Pipeline::FASTA::DumpFile, {"process_logic_names" => [],"skip_logic_names" => []}, dump)
>> Created a new DumpGenes                  ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:10  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: ConcatFiles->(Bio::EnsEMBL::Production::Pipeline::FASTA::ConcatFiles, {}, default)
>> Created a new ConcatFiles                ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:-  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: PrimaryAssembly->(Bio::EnsEMBL::Production::Pipeline::FASTA::CreatePrimaryAssembly, {}, default)
>> Created a new PrimaryAssembly            ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:-  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: CopyDNA->(Bio::EnsEMBL::Production::Pipeline::FASTA::CopyDNA, {"ftp_dir" => ""}, default)
>> Created a new CopyDNA                    ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: BlastDNAIndex->(Bio::EnsEMBL::Production::Pipeline::FASTA::WuBlastIndexer, {"index_masked_files" => 0,"molecule" => "dna","program" => "xdformat","skip" => 0,"type" => "genomic"}, indexing)
>> Created a new BlastDNAIndex              ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:10  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: BlastPepIndex->(Bio::EnsEMBL::Production::Pipeline::FASTA::WuBlastIndexer, {"molecule" => "pep","program" => "xdformat","skip" => 0,"type" => "genes"}, default)
>> Created a new BlastPepIndex              ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: BlastGeneIndex->(Bio::EnsEMBL::Production::Pipeline::FASTA::WuBlastIndexer, {"molecule" => "dna","program" => "xdformat","skip" => 0,"type" => "genes"}, default)
>> Created a new BlastGeneIndex             ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: BlatDNAIndex->(Bio::EnsEMBL::Production::Pipeline::FASTA::BlatIndexer, {"index" => "dna","index_masked_files" => 1,"port_offset" => 30000,"program" => "faToTwoBit","skip" => 0}, indexing)
>> Created a new BlatDNAIndex               ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: BlatSmDNAIndex->(Bio::EnsEMBL::Production::Pipeline::FASTA::BlatIndexer, {"index" => "dna_sm","index_masked_files" => 1,"port_offset" => 30000,"program" => "faToTwoBit","skip" => 0}, indexing)
>> Created a new BlatSmDNAIndex             ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: NcbiBlastDNAIndex->(Bio::EnsEMBL::Production::Pipeline::FASTA::NcbiBlastIndexer, {"index_masked_files" => 0,"molecule" => "dna","program" => "makeblastdb","skip" => 0,"type" => "genomic"}, indexing)
>> Created a new NcbiBlastDNAIndex          ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:10  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: NcbiBlastPepIndex->(Bio::EnsEMBL::Production::Pipeline::FASTA::NcbiBlastIndexer, {"molecule" => "pep","program" => "makeblastdb","skip" => 0,"type" => "genes"}, default)
>> Created a new NcbiBlastPepIndex          ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: NcbiBlastGeneIndex->(Bio::EnsEMBL::Production::Pipeline::FASTA::NcbiBlastIndexer, {"molecule" => "dna","program" => "makeblastdb","skip" => 0,"type" => "genes"}, default)
>> Created a new NcbiBlastGeneIndex         ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:5  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: SCPBlast->(Bio::EnsEMBL::Production::Pipeline::FASTA::SCPBlast, {"genes_dir" => "","genomic_dir" => "","no_scp" => 1,"scp_identity" => "","scp_user" => "ensadmin","target_servers" => []}, default)
>> Created a new SCPBlast                   ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:3  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: ChecksumGeneratorFactory->(Bio::EnsEMBL::Production::Pipeline::FASTA::FindDirs, {"column_names" => ["dir"],"fan_branch_code" => 2}, default)
>> Created a new ChecksumGeneratorFactory   ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:-  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: ChecksumGenerator->(Bio::EnsEMBL::Production::Pipeline::ChecksumGenerator, {}, default)
>> Created a new ChecksumGenerator          ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:10  a.cap:-  (sync'd 0 sec ago)
>> Created a new Analysis[]: Notify->(Bio::EnsEMBL::Production::Pipeline::FASTA::EmailSummary, {"email" => "ensadmin\@sanger.ac.uk","subject" => "fasta_dump_78 has finished"}, default)
>> Created a new Notify                     ( 0)       EMPTY jobs(Sem:0, Rdy:0, InProg:0, Done+Pass:0, Fail:0)=0 Ave_msec:0, workers(Running:0, Reqired:0)   h.cap:-  a.cap:-  (sync'd 0 sec ago)
>> Done.
>>
>> Adding Control and Dataflow Rules ...
>> Created a new DataflowRule[]: ScheduleSpecies --#4--> CopyDNA
>> Created a new DataflowRule[]: ScheduleSpecies --#1--> Notify
>> Created a new DataflowRule[]: ScheduleSpecies --#3--> DumpGenes
>> Created a new DataflowRule[]: ScheduleSpecies --#2--> DumpDNA
>> Created a new DataflowRule[]: ScheduleSpecies --#5--> ChecksumGeneratorFactory
>> Created a new DataflowRule[]: DumpDNA --#1--> ConcatFiles
>> Created a new AnalysisCtrlRule: DumpDNA ---| DumpGenes
>> Created a new DataflowRule[]: DumpGenes --#3--> BlastGeneIndex
>> Created a new DataflowRule[]: DumpGenes --#3--> NcbiBlastGeneIndex
>> Created a new DataflowRule[]: DumpGenes --#2--> BlastPepIndex
>> Created a new DataflowRule[]: DumpGenes --#2--> NcbiBlastPepIndex
>> Created a new DataflowRule[]: ConcatFiles --#1--> PrimaryAssembly
>> Created a new DataflowRule[]: ConcatFiles --#1--> BlatSmDNAIndex
>> Created a new DataflowRule[]: ConcatFiles --#1--> BlatDNAIndex
>> Created a new DataflowRule[]: ConcatFiles --#1--> BlastDNAIndex
>> Created a new DataflowRule[]: ConcatFiles --#1--> NcbiBlastDNAIndex
>> Created a new AnalysisCtrlRule: DumpDNA ---| PrimaryAssembly
>> Created a new DataflowRule[]: BlastPepIndex --#1--> SCPBlast
>> Created a new DataflowRule[]: BlastGeneIndex --#1--> SCPBlast
>> Created a new AnalysisCtrlRule: DumpDNA ---| SCPBlast
>> Created a new AnalysisCtrlRule: DumpGenes ---| SCPBlast
>> Created a new AnalysisCtrlRule: PrimaryAssembly ---| SCPBlast
>> Created a new AnalysisCtrlRule: BlastDNAIndex ---| SCPBlast
>> Created a new AnalysisCtrlRule: BlastGeneIndex ---| SCPBlast
>> Created a new AnalysisCtrlRule: BlastPepIndex ---| SCPBlast
>> Created a new AnalysisCtrlRule: DumpDNA ---| ChecksumGeneratorFactory
>> Created a new AnalysisCtrlRule: DumpGenes ---| ChecksumGeneratorFactory
>> Created a new AnalysisCtrlRule: DumpGenes ---| ChecksumGeneratorFactory
>> Created a new AnalysisCtrlRule: PrimaryAssembly ---| ChecksumGeneratorFactory
>> Created a new AnalysisCtrlRule: BlastDNAIndex ---| ChecksumGeneratorFactory
>> Created a new AnalysisCtrlRule: BlastGeneIndex ---| ChecksumGeneratorFactory
>> Created a new AnalysisCtrlRule: BlastPepIndex ---| ChecksumGeneratorFactory
>> Created a new DataflowRule[]: ChecksumGeneratorFactory --#2--> ChecksumGenerator WITH TEMPLATE: {"dir" => "#dir#"}
>> Created a new AnalysisCtrlRule: SCPBlast ---| Notify
>> Created a new AnalysisCtrlRule: ChecksumGenerator ---| Notify
>> Done.
>>
>> On 12/19/2014 09:11 AM, Andrew Yates wrote:
>>> Hey Lel
>>>
>>> What does the pipeline do when you set -run_all 1 when initialising?
>>>
>>> Andy
>>>
>>> ------------
>>> Andrew Yates - Ensembl Support Coordinator
>>> European Molecular Biology Laboratory
>>> European Bioinformatics Institute
>>> Wellcome Trust Genome Campus
>>> Hinxton, Cambridge
>>> CB10 1SD, United Kingdom
>>> Tel: +44-(0)1223-492538
>>> Fax: +44-(0)1223-494468
>>> Skype: andrewyatz
>>> http://www.ensembl.org/
>>>
>>>> On 19 Dec 2014, at 09:09, Anne Lyle <annelyle at ebi.ac.uk> wrote:
>>>>
>>>> I’ll have to let the core team answer that one - I’m just a humble web developer :)
>>>>
>>>> Cheers
>>>>
>>>> Anne
>>>>
>>>>
>>>> On 18 Dec 2014, at 17:21, Lel Eory <lel.eory at roslin.ed.ac.uk> wrote:
>>>>
>>>>> Hi Anne,
>>>>>
>>>>> I ran:
>>>>> init_pipeline.pl Bio::EnsEMBL::Production::Pipeline::PipeConfig::FASTA_conf -host=$DBHOST -user=$DBUSER -password=$DBPASS -registry /PATH2/reg.pm -base_path /PATH2/tmp_tools_data -no_scp 1
>>>>>
>>>>> Unfortunately beekeeper still skips the analysis steps:
>>>>> ~~~~~~~~
>>>>> Scheduler : Discarded 17 analyses because they do not need any Workers.
>>>>> ~~~~~~~~
>>>>>
>>>>>
>>>>> Then for the next beekeeper run the message is:
>>>>> ~~~~~~~~
>>>>> Scheduler : Discarded 17 analyses because they do not need any Workers.
>>>>> Scheduler : Analysis 'Notify' is BLOCKED, safe-synching it...
>>>>> Scheduler : Safe-sync of Analysis 'Notify' succeeded.
>>>>> Scheduler : Analysis 'Notify' is still BLOCKED, skipping it.
>>>>> ~~~~~~~~
>>>>>
>>>>> Can it still be an issue with the production db?
>>>>>
>>>>> Thank you,
>>>>> Lel
>>>>>
>>>>>
>>>>> On 12/18/2014 04:22 PM, Anne Lyle wrote:
>>>>>> Hi Lel
>>>>>>
>>>>>> The changelog and changelog_tables are populated “manually” by our developers via a web interface - they’re mainly needed for the website news, though we’ve hooked other processes into them to avoid duplication of effort. As Magali says, you should run your pipeline to ignore the changelog tables.
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> Anne
>>>>>>
>>>>>>
>>>>>> On 18 Dec 2014, at 16:12, Lel Eory <lel.eory at roslin.ed.ac.uk> wrote:
>>>>>>
>>>>>>> Hi Mag,
>>>>>>>
>>>>>>> Thank you for the detailed reply, I much appreciate it. From the query it is clear that I do not have changelog and changelog_species populated with information relevant to my species, although the species table is populated.
>>>>>>> Do you run a script to populate these tables?
>>>>>>> If not, I will look into the ensembl_production db and figure out what information I need to add.
>>>>>>>
>>>>>>> Thanks again,
>>>>>>> Lel
>>>>>>>
>>>>>>>
>>>>>>> On 12/18/2014 11:11 AM, mr6 at ebi.ac.uk wrote:
>>>>>>>> Hi Lel,
>>>>>>>>
>>>>>>>> This pipeline relies heavily on the production database.
>>>>>>>>
>>>>>>>> As we only want to dump fasta files for species which have changed for the
>>>>>>>> release, the ScheduleSpecies module checks in the production database if
>>>>>>>> there have been any changes declared for this species, for this release.
>>>>>>>> If there is no declaration, the species is skipped.
>>>>>>>>
>>>>>>>> To circumvent this, you should be able to use the -run_all option, as this
>>>>>>>> is used to tell the pipeline to ignore the declarations and just run
>>>>>>>> everything. It needs an argument though, so you would need to add
>>>>>>>> -run_all 1 to your init_pipeline command line.
>>>>>>>>
>>>>>>>> The -species argument is meant to filter out a species or list of species.
>>>>>>>> So if you specify -species homo_sapiens, it will only take into account
>>>>>>>> human when deciding whether or not it should dump data. This will still
>>>>>>>> check in the production database if there is anything to dump for that
>>>>>>>> species though.
>>>>>>>>
>>>>>>>> The -force_species argument will allow you to skip the production database
>>>>>>>> check for a species or list of species.
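>>>>>>>>
>>>>>>>> For example, to force the dump for a single species regardless of the
>>>>>>>> changelog, something along these lines should work (the species name is
>>>>>>>> just a placeholder, and the exact list syntax may vary between hive
>>>>>>>> versions):
>>>>>>>>
>>>>>>>> init_pipeline.pl Bio::EnsEMBL::Production::Pipeline::PipeConfig::FASTA_conf -user write_user -password ****** -host=my_local_host -no_scp 1 -base_path my_base_path -registry my_registry.conf -force_species '["gallus_gallus"]'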
>>>>>>>>
>>>>>>>> So running the following command line should work for you without needing
>>>>>>>> to add anything in the production database:
>>>>>>>>
>>>>>>>> init_pipeline.pl Bio::EnsEMBL::Production::Pipeline::PipeConfig::FASTA_conf -user write_user -password ****** -host=my_local_host -no_scp 1 -base_path my_base_path -registry my_registry.conf -run_all 1
>>>>>>>>
>>>>>>>> If you want to try to get the expected data in the production database,
>>>>>>>> this is the check that decides whether a species needs data dumping or
>>>>>>>> not:
>>>>>>>>
>>>>>>>> select count(*)
>>>>>>>> from changelog c
>>>>>>>> join changelog_species cs using (changelog_id)
>>>>>>>> join species s using (species_id)
>>>>>>>> where c.release_id = 78
>>>>>>>> and (c.assembly = 'Y' or c.repeat_masking = 'Y')
>>>>>>>> and c.status = 'handed_over'
>>>>>>>> and s.production_name = 'species_name'
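>>>>>>>>
>>>>>>>> To run that check by hand against the production database you can use the
>>>>>>>> plain mysql client, e.g. (host, user and species below are placeholders):
>>>>>>>>
>>>>>>>> mysql -h my_prod_host -u my_read_user -p ensembl_production -e "select count(*) from changelog c join changelog_species cs using (changelog_id) join species s using (species_id) where c.release_id = 78 and (c.assembly = 'Y' or c.repeat_masking = 'Y') and c.status = 'handed_over' and s.production_name = 'my_species'"
>>>>>>>>
>>>>>>>> A count of 0 means the species will be skipped unless you use -run_all 1
>>>>>>>> or -force_species.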
>>>>>>>>
>>>>>>>>
>>>>>>>> Let me know if that helps,
>>>>>>>> mag
>>>>>>>>
>>>>>>>>> Hello Developers,
>>>>>>>>>
>>>>>>>>> I am trying to run the FASTA pipeline per the document
>>>>>>>>> https://github.com/Ensembl/ensembl-production/blob/release/78/docs/fasta.textile
>>>>>>>>> My registry file is set up as suggested.
>>>>>>>>> I ran init_pipeline either with -run_all or with -species defined, but
>>>>>>>>> then beekeeper skips the analyses, as no species are defined for the pipeline.
>>>>>>>>>
>>>>>>>>> From beekeeper:
>>>>>>>>> Scheduler : Discarded 17 analyses because they do not need any Workers.
>>>>>>>>>
>>>>>>>>> ....
>>>>>>>>> Worker 1 [ UNSPECIALIZED ] specializing to ScheduleSpecies(1)
>>>>>>>>>
>>>>>>>>> -------------------- WARNING ----------------------
>>>>>>>>> MSG: acanthisitta_chloris is not a valid species name (check DB and API
>>>>>>>>> version)
>>>>>>>>> FILE: Bio/EnsEMBL/Registry.pm LINE: 1200
>>>>>>>>> CALLED BY: Production/Pipeline/SpeciesFactory.pm  LINE: 85
>>>>>>>>> Date (localtime)    = Thu Dec 18 10:25:53 2014
>>>>>>>>> Ensembl API version = 78
>>>>>>>>> ---------------------------------------------------
>>>>>>>>>
>>>>>>>>> The problem is most likely related to my production database, as I can
>>>>>>>>> list all the core databases using my registry file and these are present
>>>>>>>>> for release 78. Can someone suggest a way to check what is wrong and why
>>>>>>>>> SpeciesFactory does not generate the list of species beekeeper needs?
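>>>>>>>>>
>>>>>>>>> For reference, listing the core databases the Registry sees from the
>>>>>>>>> registry file can be done with a one-liner along these lines (just a
>>>>>>>>> sketch; the path is my registry file):
>>>>>>>>>
>>>>>>>>> perl -MBio::EnsEMBL::Registry -e 'Bio::EnsEMBL::Registry->load_all("/PATH2/reg.pm"); print map { $_->species . "\n" } @{ Bio::EnsEMBL::Registry->get_all_DBAdaptors(-GROUP => "core") };'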
>>>>>>>>>
>>>>>>>>> Thank you.
>>>>>>>>>
>>>>>>>>> Best wishes,
>>>>>>>>> Lel
>>>>>>>>>


---------------------------------------------
Lel Eory, PhD
The Roslin Institute
University of Edinburgh Easter Bush Campus
Midlothian EH25 9RG
Scotland UK
Phone: +44 131 6519212


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.




