[ensembl-dev] Multiple alignment for more than 2 species

Matthieu Muffato muffato at ebi.ac.uk
Wed Sep 10 09:56:24 BST 2014


Dear Hunter,

The code in the tutorial indeed only works for pairwise alignments, for
which we use LastZ.
We use a different method for the multiple alignments, EPO:
http://www.ensembl.org/info/genome/compara/analyses.html#epo

You can refer to our workshop exercises (in docs/workshop/) for an 
example of how to load multiple alignments:
https://github.com/Ensembl/ensembl-compara/blob/release/76/docs/workshop/API_workshop_exercises/GenomicAlignBlock_2.pl
In that script, first replace
   fetch_by_dbID("619");
with
   fetch_by_method_link_type_species_set_name("EPO", "mammals");
The pair ("EPO", "mammals") can be found in the header of each table on
the page at the first link. You could use ("EPO", "primates") if you're
only interested in primates, etc.

Note that multiple alignments can only be loaded as a whole (in this 
case: the 16 mammals): you can't ask for only 3 species in the set.
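
As a rough sketch of how this could look for your region (this is not
the workshop script itself): the code below loads the whole
("EPO", "mammals") alignment over the slice from your script and then
prints only the rows for the species you care about. The filtering is
done on the client side after loading, which is my own illustration. The
public server settings and the genome_db names in %wanted
(homo_sapiens, mus_musculus, canis_familiaris) are assumptions; please
check them against your release.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Assumption: the public Ensembl MySQL server with anonymous access.
Bio::EnsEMBL::Registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

# Assumption: genome_db names of the species to keep after loading.
my %wanted = map { $_ => 1 } qw(homo_sapiens mus_musculus canis_familiaris);

# The multiple alignment is identified by the ("EPO", "mammals") pair.
my $mlss_adaptor = Bio::EnsEMBL::Registry->get_adaptor(
    'Multi', 'compara', 'MethodLinkSpeciesSet');
my $mlss = $mlss_adaptor->fetch_by_method_link_type_species_set_name(
    'EPO', 'mammals');
die "Could not find the EPO mammals alignment\n" unless $mlss;

# Same region as in the question: human chromosome 14, 75000000-75010000.
my $slice_adaptor = Bio::EnsEMBL::Registry->get_adaptor(
    'human', 'core', 'Slice');
my $slice = $slice_adaptor->fetch_by_region(
    'chromosome', '14', 75000000, 75010000);
die "Could not fetch the query slice\n" unless $slice;

# Fetch every alignment block of the full set that overlaps the slice.
my $gab_adaptor = Bio::EnsEMBL::Registry->get_adaptor(
    'Multi', 'compara', 'GenomicAlignBlock');
my $blocks = $gab_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice(
    $mlss, $slice);

foreach my $block (@$blocks) {
    foreach my $genomic_align (@{ $block->get_all_GenomicAligns }) {
        my $name = $genomic_align->genome_db->name;
        next unless $wanted{$name};    # skip species we are not interested in
        printf "%s %s:%d-%d\n%s\n",
            $name,
            $genomic_align->dnafrag->name,
            $genomic_align->dnafrag_start,
            $genomic_align->dnafrag_end,
            $genomic_align->aligned_sequence;
    }
    print "\n";
}
```

Since the rows come from the full alignment, the kept sequences may
contain columns that are gaps in all of the selected species.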

Hope this helps,
Matthieu

On 10/09/14 09:42, Dazong Zhou wrote:
> Hi,
>
> I have problems with multiple sequence alignment for more than 2 species
> using the sample code at:
> http://www.ensembl.org/info/docs/api/compara/compara_tutorial.html (the
> code below "Here is an example script with all of this:").
>
> The sample runs well with the defaults, but fails when I add more species.
> That is to say, the code below works when changing "human:mouse" to
> "human:dog" but returns no data when changing to "human:mouse:dog".
>
>
> my $species = "human";
> my $coord_system = "chromosome";
> my $seq_region = "14";
> my $seq_region_start = 75000000;
> my $seq_region_end = 75010000;
> my $alignment_type = "LASTZ_NET";
> my $set_of_species = "human:mouse";
> my $output_file = undef;
> my $output_format = "clustalw";
> my $help;
>
> GetOptions(
>      "help" => \$help,
>      "species=s" => \$species,
>      "coord_system=s" => \$coord_system,
>      "seq_region=s" => \$seq_region,
>      "seq_region_start=i" => \$seq_region_start,
>      "seq_region_end=i" => \$seq_region_end,
>      "alignment_type=s" => \$alignment_type,
>      "set_of_species=s" => \$set_of_species,
>      "output_format=s" => \$output_format,
>      "output_file=s" => \$output_file);
>
> Thanks,
> Hunter

-- 
Matthieu Muffato, Ph.D.
Ensembl Compara Project Leader
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus, Hinxton
Cambridge, CB10 1SD, United Kingdom
Room  A3-145
Phone + 44 (0) 1223 49 4631
Fax   + 44 (0) 1223 49 4468



