[ensembl-dev] Problem running gene build pipeline

Marc Hoeppner mphoeppner at gmail.com
Tue May 14 11:26:32 BST 2013


Hi!

I have been trying to complete a successful run with the gene build 
pipeline and am still stuck on the raw computes. I have tried running it 
in 'Local' mode as well as LSF (more on that below). In both cases, the 
rulemanager throws a bunch of errors I can't pin down. The pipeline is 
currently set up on a 48-core AMD machine, and I am using the latest 
checkout of branch 71. The local files for the pipeline configuration 
live in the sub-folder configs/pipeline-configs/modules. All relevant 
folders are in my PERL5LIB. I made what I think to be the relevant 
modifications to BatchQueue.pm, Databases.pm and so on.

-> I have tested the individual modules I want to run using the 
test_RunnableDB script and all of them worked.

If I run the rulemanager on the full thing, though, or even on the subset 
of one module, it acts up. Here is my syntax:

perl /opt/bioinformatics/ensembl/ensembl-pipeline/scripts/rulemanager.pl 
-dbhost localhost -dbuser username -dbpass password -analysis 
RepeatMasker -submission_limit 10 -submission_number 10 -once -unlock

This throws a bunch of errors right up front, starting with


1)
'Use of uninitialized value $max_retry in numeric le (<=) at 
/data2/ensembl-test/test2/configs/pipeline-configs/modules/Bio/EnsEMBL/Pipeline/Job.pm 
line 1029.'

And that comes for every job (i.e. thousands for a full genome). 
Apparently it doesn't like me limiting the run to a smaller subset? Is 
that normal..?

2) Every submission (locally as well as LSF) then comes with the 
following errors:

---------------------------------------------------
Use of uninitialized value in string eq at 
/data2/ensembl-test/test2/configs/pipeline-configs/modules/Bio/EnsEMBL/Pipeline/Job.pm 
line 1047.
Use of uninitialized value in string eq at 
/data2/ensembl-test/test2/configs/pipeline-configs/modules/Bio/EnsEMBL/Pipeline/Job.pm 
line 1047.
Job: Null submission ID for the following, but continuing: 878
Use of uninitialized value in numeric ge (>=) at 
/data2/ensembl-test/test2/configs/pipeline-configs/modules/Bio/EnsEMBL/Pipeline/Job.pm 
line 541.
Use of uninitialized value $this_runner in -x at 
/data2/ensembl-test/test2/configs/pipeline-configs/modules/Bio/EnsEMBL/Pipeline/Job.pm 
line 335.

3) The individual error messages for each RepeatMasker run then are:

-------------------- EXCEPTION --------------------
MSG: Problems creating runnable RepeatMasker for 
contig:vcelegans_test:chrIV_124:1:50000:1 [Can't locate RepeatMasker.pm 
in @INC (@INC contains: /opt/bioinformatics/ensembl/ensembl/modules 
/opt/bioinformatics/ensembl/ensembl-compara/modules 
/opt/bioinformatics/ensembl/ensembl-variation/modules 
/opt/bioinformatics/ensembl/ensembl-functgenomics/modules 
/opt/bioinformatics/ensembl/ensembl-analysis/modules 
/opt/bioinformatics/ensembl/ensembl-pipeline/scripts 
/opt/bioinformatics/ensembl/ensembl-killllist/modules 
/data2/ensembl-test/test2/configs/pipeline-configs/modules 
/usr/bin/tRNAscan_SE /usr/bin/tRNAscan-SE/ /etc/perl 
/usr/local/lib/perl/5.14.2 /usr/local/share/perl/5.14.2 /usr/lib/perl5 
/usr/share/perl5 /usr/lib/perl/5.14 /usr/share/perl/5.14 
/usr/local/lib/site_perl .) at 
/data2/ensembl-test/test2/configs/pipeline-configs/modules/Bio/EnsEMBL/Pipeline/Job.pm 
line 641.
]

This last part is what bugs me the most, since RepeatMasker.pm 
is definitely in 
/opt/bioinformatics/ensembl-71/ensembl-analysis/modules/Bio/EnsEMBL/Analysis/RunnableDB/RepeatMasker.pm
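
As a rough illustration of the lookup behind that error, the sketch below 
mimics how Perl turns a module name into a relative path (Foo::Bar becomes 
Foo/Bar.pm) and searches a list of directories for it. The function name 
find_module is hypothetical, and it only walks PERL5LIB (or a path list 
passed as the second argument), not the full @INC that Perl builds at 
startup, so treat it as an approximation rather than what the pipeline 
itself does.

```shell
#!/bin/sh
# find_module MODULE [SEARCH_PATH]   (hypothetical helper, for illustration)
# Prints the first file that would match MODULE in the colon-separated
# SEARCH_PATH (defaults to $PERL5LIB). Returns 1 if nothing is found.
find_module() {
    module=$1
    search_path=${2:-${PERL5LIB:-}}
    # Perl maps Foo::Bar to the relative path Foo/Bar.pm
    rel_path=$(printf '%s' "$module" | sed 's|::|/|g').pm
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        if [ -f "$dir/$rel_path" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$rel_path"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

Called as find_module Bio::EnsEMBL::Analysis::RunnableDB::RepeatMasker, it 
looks for Bio/EnsEMBL/Analysis/RunnableDB/RepeatMasker.pm under each 
directory, whereas find_module RepeatMasker only looks for a top-level 
RepeatMasker.pm, which is the file name the exception above reports.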

So I am pretty much at a loss here.

Any sort of helpful advice would be greatly appreciated!

Cheers,
Marc

P.S.: Regarding LSF - although probably unrelated to my issues - I 
should say that I am using openlava instead, which is an open source 
fork of the original Platform LSF and should feature the same sort of 
functionality and binaries/commands. I haven't tried the grid engine yet, though.




