[ensembl-dev] dust-ing
Lel Eory
lel.eory at ed.ac.uk
Fri Aug 30 10:23:03 BST 2013
Hi Bronwen,
>> The cvs_checkout/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/Config/BatchQueue.pm.example file does not contain a configuration section for logic_name 'dust'. Could someone give me the configuration for dust? (Grid engine parameters would be a bonus.)
> I have added this to the example file:
>
> {
> # this example uses the new 'memory' option, which is an alternative to specifying memory
> # in the resource requirements. Each time a job is retried, the next element in the memory array is used
> logic_name => 'dust',
> batch_size => 500, # calculate as approx. number of toplevel slices / 20
> memory => ['700MB', '1500MB'],
> retry_batch_size => 1, # assuming only a few jobs need retrying, e.g. fewer than 10
> retries => 3,
> },
Thanks for the example, I managed to set up the pipeline based on this.
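For anyone adapting the batch_size value to another assembly, here is a minimal sketch of the rule of thumb from the comment above (number of toplevel slices divided by 20). The function name dust_batch_size and the floor of 1 are my own assumptions for illustration, not part of the pipeline code.

```perl
use strict;
use warnings;

# Hypothetical helper: batch_size is roughly (number of toplevel slices) / 20,
# as suggested in the config comment. Never return less than 1 (assumption).
sub dust_batch_size {
    my ($num_toplevel_slices) = @_;
    my $size = int($num_toplevel_slices / 20);
    return $size < 1 ? 1 : $size;
}

# 10,000 toplevel slices gives the batch_size of 500 used in the example config.
print dust_batch_size(10_000), "\n";
```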
>> If not, could someone print out the detailed help from tcdust, if such help exists, and e-mail it back, so that I can understand the various parameters the program accepts?
>>
>> From Dust.pm I assume the output goes to STDOUT and has the format of START..END - where START/END is the start and end coordinates of the low-complexity region - is this correct?
> Yes, that sounds right, after looking here
> ensembl-analysis/modules/Bio/EnsEMBL/Analysis/Runnable/Dust.pm
> in the parse_results method.
Dustmasker from the BLAST+ package (v2.2.28) ran OK once I changed
the parsing from the START..END format used by tcdust to the
START - END format returned by dustmasker.
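In case it helps others, here is a rough sketch of the kind of parsing change involved. It is not the actual code from parse_results in Dust.pm; the function name parse_interval is made up for illustration, and I am assuming that any line that is not an interval (for example a ">" sequence header or a blank line) can simply be skipped. Coordinate conventions may differ between the two programs and should be checked before storing features.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one line of masker output.
# Accepts "START..END" (tcdust style) and "START - END" (dustmasker style).
# Returns (start, end), or an empty list for any other line.
sub parse_interval {
    my ($line) = @_;
    if ($line =~ /^\s*(\d+)\s*(?:\.\.|-)\s*(\d+)\s*$/) {
        return ($1, $2);
    }
    return;    # header lines, blank lines, anything unrecognised
}

# Usage: both styles yield the same pair of coordinates.
for my $line ('12..345', '12 - 345', '>seq1') {
    my @coords = parse_interval($line);
    print @coords ? "start=$coords[0] end=$coords[1]\n" : "skipped: $line\n";
}
```

Accepting both separators in one pattern means the same runnable can be pointed at either program without further edits.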
Cheers,
Lel