[ensembl-dev] dust-ing

Lel Eory lel.eory at ed.ac.uk
Fri Aug 2 09:50:26 BST 2013


Dear Developers,

I am trying to identify low-complexity regions for some species by
running the Ensembl 'dust' analysis pipeline.

The
cvs_checkout/ensembl-pipeline/modules/Bio/EnsEMBL/Pipeline/Config/BatchQueue.pm.example
file does not contain a configuration section for the logic_name 'dust'.
Could someone send me the configuration that corresponds to dust? (Grid
Engine parameters would be a bonus.)

The cvs_checkout/ensembl-doc/pipeline_docs/the_raw_computes.txt file
says that the source for dust comes from the NCBI BLAST suite (see the
module description, line 245), and the "Analysis conf" section names the
program as tcdust (same file, line 573), which is consistent with the
analysis tables in the databases. Is tcdust available for download
anywhere? (The NCBI BLAST+ package only has dustmasker, not tcdust.)

If not, could someone print the detailed help from tcdust, if such help
exists, and e-mail it to me, so that I can understand the various
parameters the program accepts?

From Dust.pm I assume the output goes to STDOUT in the format
START..END, where START and END are the start and end coordinates of a
low-complexity region. Is this correct?
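
To make that assumption concrete, here is a minimal Python sketch of how
I would read such output. It assumes each region appears on its own line
as START..END, which is only my guess from Dust.pm and not confirmed
tcdust behaviour. The function name parse_dust_regions is hypothetical
and not part of Ensembl.

```python
import re

# Assumed line format: "START..END" (e.g. "120..185").
# This is a guess based on Dust.pm, not documented tcdust behaviour.
_REGION = re.compile(r"(\d+)\.\.(\d+)")


def parse_dust_regions(text):
    """Return a list of (start, end) integer tuples found in the output text."""
    regions = []
    for line in text.splitlines():
        match = _REGION.search(line)
        if match is None:
            continue  # skip blank lines or anything that is not a region
        regions.append((int(match.group(1)), int(match.group(2))))
    return regions
```

If the real output carries extra fields or a header line, the regular
expression would need adjusting accordingly.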

Many thanks,
Lel

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.




