[ensembl-dev] two LastZ.pms

Wei, Xuehong weix at cshl.edu
Tue May 1 20:47:55 BST 2012


Hi Ensembl Compara Developers,

I have just found that there are two LastZ.pm modules in the ensembl-compara package (v66). Both are used in the example configuration file:
ensembl-compara/modules/Bio/EnsEMBL/Compara/PipeConfig/Example/lastz.conf

  { TYPE => PAIR_ALIGNER,
    'method_link' => [1001, 'LASTZ_RAW'],
    'analysis_template' => {
        '-program'    => 'lastz',
        '-parameters' => "{method_link=>'LASTZ_RAW',options=>'T=1 L=3000 H=2200 O=400 E=30'}",
        '-module'     => 'Bio::EnsEMBL::Compara::Production::GenomicAlignBlock::LastZ',
    },
    'non_reference_collection_name' => 'mouse all',
    'reference_collection_name'     => 'human all',
  },

  { TYPE => PAIR_ALIGNER,
    'method_link' => [1001, 'LASTZ_RAW'],
    'analysis_template' => {
        '-program'    => 'lastz',
        '-parameters' => "{method_link=>'LASTZ_RAW',options=>'T=1 L=3000 H=2200 O=400 E=30'}",
        '-module'     => 'Bio::EnsEMBL::Compara::RunnableDB::PairAligner::LastZ',
    },
    'non_reference_collection_name' => 'rat all',
    'reference_collection_name'     => 'human all',
  },


I am wondering what the differences are between these two LastZ.pm modules and, if they were created for different uses, when to use which.

Thanks,


Sharon

On Apr 25, 2012, at 4:24 AM, Javier Herrero wrote:

Dear Alison

The branch lengths combine two estimates: one based on 4D sites and one based on the number of million years since two species diverged (the data are taken from www.timetree.org<http://www.timetree.org/>).

At UCSC and Ensembl, we have agreed to use the same species tree for the alignments. However, we do not work exactly with the same set of species. You can find more information here: http://genomewiki.ucsc.edu/index.php/Human/hg19/GRCh37_46-way_multiple_alignment#4D_sites_branch_length_calculations

Additional species are inserted in the middle of an existing branch. The initial estimate of their branch lengths is based on the number of million years since the two species diverged, with the data taken from www.timetree.org<http://www.timetree.org/>.
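As a rough illustration only, inserting a species in the middle of an existing branch could look like the sketch below. It is not the actual Compara procedure: the helper name split_branch, the assumption of a constant substitution rate along the branch, and all the numbers are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of ensembl-compara.
# Splits an existing leaf branch at the point where a new species diverges,
# assuming (for this sketch only) a constant substitution rate along the branch.
sub split_branch {
    my ($branch_length, $branch_age_mya, $divergence_mya) = @_;

    die "divergence time must fall inside the branch\n"
        unless $divergence_mya > 0 && $divergence_mya < $branch_age_mya;

    my $rate  = $branch_length / $branch_age_mya;   # substitutions per site per million years
    my $lower = $rate * $divergence_mya;            # new node down to the existing species
    my $upper = $branch_length - $lower;            # parent node down to the new node

    return (
        upper       => $upper,
        lower       => $lower,
        new_species => $lower,   # initial estimate: same rate, same elapsed time
    );
}

# Made-up numbers, purely illustrative: a branch of length 0.12 spanning
# 30 million years, with the new species diverging 10 million years ago.
my %parts = split_branch(0.12, 30, 10);
printf "%-12s %.4f\n", $_, $parts{$_} for qw(upper lower new_species);
```

A time-based value like this would only be a starting point; the 4D-site estimate mentioned above is a separate calculation that this sketch does not attempt.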

I hope this helps

Javier

On 24/04/12 17:23, Alison Wright wrote:

Hi,

I have accessed the Ensembl Species Tree generated for use in Phylowidget by the Compara team (which includes all the current species for the main Ensembl website, plus a few additional mammalian species of interest).

However, I can't find out what the branch lengths represent (number of substitutions per site? PAM units?) or how they were calculated.

Thanks very much for any help.

Alison Wright





_______________________________________________
Dev mailing list    Dev at ensembl.org<mailto:Dev at ensembl.org>
List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/



--
Javier Herrero, PhD
Ensembl Coordinator and Ensembl Compara Project Leader
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge - CB10 1SD - UK
