[ensembl-dev] two LastZ.pms

Kathryn Beal kbeal at ebi.ac.uk
Wed May 2 09:48:55 BST 2012


Hi Sharon,
We have converted all the Compara pipelines to use the new init_pipeline.pl system. In the process, there were major modifications to many of the associated modules. All the modules associated with the new pipelines can be found in Bio::EnsEMBL::Compara::RunnableDB. I have just removed all the modules in Bio::EnsEMBL::Compara::Production::GenomicAlignBlock, as these either have replacements or are no longer supported. This affects HEAD, but the change was made after branching for the upcoming Ensembl 67 release.

Thank you for pointing out the problem with the lastz.conf file. I have corrected it so that both entries now refer to 'Bio::EnsEMBL::Compara::RunnableDB::PairAligner::LastZ':

  { TYPE => PAIR_ALIGNER,
    'method_link' => [1001, 'LASTZ_RAW'],
    'analysis_template' => {
        '-program'       => 'lastz',
        '-parameters'    => "{method_link=>'LASTZ_RAW',options=>'T=1 L=3000 H=2200 O=400 E=30'}",
        '-module'        => 'Bio::EnsEMBL::Compara::RunnableDB::PairAligner::LastZ',
    },
    'non_reference_collection_name'   => 'mouse all',
    'reference_collection_name'  => 'human all',
  },

  { TYPE => PAIR_ALIGNER,
    'method_link' => [1001, 'LASTZ_RAW'],
    'analysis_template' => {
        '-program'       => 'lastz',
        '-parameters'    => "{method_link=>'LASTZ_RAW',options=>'T=1 L=3000 H=2200 O=400 E=30'}",
        '-module'        => 'Bio::EnsEMBL::Compara::RunnableDB::PairAligner::LastZ',
    },
    'non_reference_collection_name'   => 'rat all',
    'reference_collection_name'  => 'human all',
  },
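
If you have local copies of older configuration files, a small script along these lines could flag any '-module' entries that still point at the removed namespace. This is only a sketch: the helper name find_stale_modules and the script name check_conf.pl are made up for illustration and are not part of ensembl-compara.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Namespace removed from HEAD, as described above.
my $OLD_NS = 'Bio::EnsEMBL::Compara::Production::GenomicAlignBlock';

# Hypothetical helper (not part of ensembl-compara): return
# [line_number, line] pairs for every line of $text that still
# references a module under the removed namespace.
sub find_stale_modules {
    my ($text) = @_;
    my @hits;
    my $n = 0;
    for my $line (split /\n/, $text) {
        $n++;
        push @hits, [$n, $line] if $line =~ /\Q$OLD_NS\E::\w+/;
    }
    return @hits;
}

# Usage: perl check_conf.pl path/to/your.conf
if (@ARGV) {
    local $/;    # slurp the whole file at once
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $text = <$fh>;
    close $fh;
    printf "%d: %s\n", @$_ for find_stale_modules($text);
}
```

Any line it reports would need its '-module' value changed to a module under Bio::EnsEMBL::Compara::RunnableDB, as in the corrected entries above.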

Cheers
Kathryn


> Hi Ensembl Compara Developers,
> 
> I have just found that there are two LastZ.pm modules in the ensembl-compara package (v66). Both are used in the example configuration file
> ensembl-compara/modules/Bio/EnsEMBL/Compara/PipeConfig/Example/lastz.conf
> 
> { TYPE => PAIR_ALIGNER,
>    'method_link' => [1001, 'LASTZ_RAW'],
>    'analysis_template' => {
>        '-program'       => 'lastz',
>        '-parameters'    => "{method_link=>'LASTZ_RAW',options=>'T=1 L=3000 H=2200 O=400 E=30'}",
>        '-module'        => 'Bio::EnsEMBL::Compara::Production::GenomicAlignBlock::LastZ',
>    },
>    'non_reference_collection_name'   => 'mouse all',
>    'reference_collection_name'  => 'human all',
>  },
> 
>  { TYPE => PAIR_ALIGNER,
>    'method_link' => [1001, 'LASTZ_RAW'],
>    'analysis_template' => {
>        '-program'       => 'lastz',
>        '-parameters'    => "{method_link=>'LASTZ_RAW',options=>'T=1 L=3000 H=2200 O=400 E=30'}",
>        '-module'        => 'Bio::EnsEMBL::Compara::RunnableDB::PairAligner::LastZ',
>    },
>    'non_reference_collection_name'   => 'rat all',
>    'reference_collection_name'  => 'human all',
>  },
> 
> 
> I am wondering what the differences are between these two LastZ.pm modules and, if they were created for different uses, when to use which.
> 
> Thanks,
> 
> 
> Sharon
> 
> On Apr 25, 2012, at 4:24 AM, Javier Herrero wrote:
> 
> Dear Alison
> 
> The branch lengths combine two estimates: one based on 4D sites and one based on the number of million years since the divergence of two species (the data are taken from www.timetree.org).
> 
> At UCSC and Ensembl, we have agreed to use the same species tree for the alignments. However, we do not work exactly with the same set of species. You can find more information here: http://genomewiki.ucsc.edu/index.php/Human/hg19/GRCh37_46-way_multiple_alignment#4D_sites_branch_length_calculations
> 
> Additional species are included in the middle of an existing branch. An initial estimate of their branch lengths is based on the number of million years since the divergence of two species, the data being taken from www.timetree.org.
> 
> I hope this helps
> 
> Javier
> 
> On 24/04/12 17:23, Alison Wright wrote:
> 
> Hi,
> 
> I have accessed the Ensembl Species Tree generated for use in Phylowidget by the Compara team (which includes all the current species for the main Ensembl website, plus a few additional mammalian species of interest).
> 
> However, I can't find out what the branch lengths represent (number of substitutions per site? PAM units?) or how they were calculated.
> 
> Thanks very much for any help.
> 
> Alison Wright
> 
> 
> 
> 
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org<mailto:Dev at ensembl.org>
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
> 
> 
> 
> --
> Javier Herrero, PhD
> Ensembl Coordinator and Ensembl Compara Project Leader
> European Bioinformatics Institute (EMBL-EBI)
> Wellcome Trust Genome Campus, Hinxton
> Cambridge - CB10 1SD - UK
> 
