[ensembl-dev] Fetching EPO multiple alignment

Kathryn Beal kbeal at ebi.ac.uk
Wed Jun 6 15:29:34 BST 2012


Hi Fenix,
The DumpFakeMultiAlign.pl script takes pairwise alignments and attempts to build a multiple alignment from them, so it does not return the EPO alignments themselves. If you want the EPO alignments, you can either take them directly from the FTP site:

ftp://ftp.ensembl.org/pub/release-67/emf/ensembl-compara/epo_12_eutherian/

or use the DumpMultiAlign.pl script, which is in the same directory as the DumpFakeMultiAlign.pl script, i.e. in ensembl-compara/scripts/dumps/
To run it, for example:

 perl DumpMultiAlign.pl --compara_url mysql://anonymous@ensembldb:5306/ensembl_compara_67 --species human --seq_region 3 --seq_region_start 15112000 --seq_region_end 15113000 --alignment_type EPO --set_of_species mammals --restrict --output_file my_output_file

Using --compara_url to define the location of the database removes the need to set up a registry configuration file. The --set_of_species option can be given a species_set_name (here, mammals). The list of appropriate names can be found at http://www.ensembl.org/info/docs/compara/analyses.html in the description of the current multiple alignments.
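As a rough sketch only, the same call can be written with each option on its own line, which makes it easier to change the region or the species set. The values are the ones from the command above; the variable names and the DRY_RUN switch are made up for this illustration and are not options of DumpMultiAlign.pl. The host name is kept as written above; if the short name does not resolve for you, the public server is normally reached as ensembldb.ensembl.org.

```shell
#!/bin/sh
# Sketch only: all values are copied from the one-line command above.
COMPARA_URL="mysql://anonymous@ensembldb:5306/ensembl_compara_67"
SPECIES="human"
SEQ_REGION="3"
REGION_START=15112000
REGION_END=15113000
SPECIES_SET="mammals"          # a species_set_name, see analyses.html
OUTPUT_FILE="my_output_file"

# DRY_RUN is a hypothetical switch for this sketch, not a script option.
# With the default of 1 the command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Build the argument list in the positional parameters.
set -- perl DumpMultiAlign.pl \
    --compara_url "$COMPARA_URL" \
    --species "$SPECIES" \
    --seq_region "$SEQ_REGION" \
    --seq_region_start "$REGION_START" \
    --seq_region_end "$REGION_END" \
    --alignment_type EPO \
    --set_of_species "$SPECIES_SET" \
    --restrict \
    --output_file "$OUTPUT_FILE"

if [ "$DRY_RUN" = "1" ]; then
    echo "$*"      # show what would be run
else
    "$@"           # run it from ensembl-compara/scripts/dumps/
fi
```

Run it as is to check the command line, then with DRY_RUN=0 from the scripts/dumps directory to produce the output file.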

The script prints a couple of warning messages, which you can ignore. I will attend to these shortly and will put an update on HEAD. We intend to remove the DumpFakeMultiAlign.pl script since it is now very outdated.

Cheers
Kathryn


> Dear all,
> 
> I used DumpFakeMultiAlign.pl to try to fetch the EPO MSA.
> My first question is:
> how do I set the registry_conf, or can I set the species on the command line without aliases?
> 
> I use "Bio::EnsEMBL::Registry->load_registry_from_url" to set my registry_conf.
> Then I run the command
> perl DumpFakeMultiAlign.pl --species human --seq_region 3 --seq_region_start 15112000 --seq_region_end 15113000 --alignment_type EPO --set_of_species human:chimp:gorilla:orangutan:macaque:marmoset:mouse:rat:cow:pig:cat:horse
> 
> but I encounter an error:
> 
> ------------------ DEPRECATED ---------------------
> Deprecated method call in file DumpFakeMultiAlign.pl line 275.
> Method Bio::EnsEMBL::DBSQL::MetaContainer::get_Species is deprecated.
> Call is deprecated. Use $self->get_common_name() / $self->get_classification() / $self->get_scientific_name() instead
> Ensembl API version = 67
> ---------------------------------------------------
> 
> -------------------- EXCEPTION --------------------
> MSG: No matches found for name 'Homininae Homo sapiens' and assembly '--undef--'
> STACK Bio::EnsEMBL::Compara::DBSQL::GenomeDBAdaptor::fetch_by_name_assembly /usr/src/ensembl-compara/modules//Bio/EnsEMBL/Compara/DBSQL/GenomeDBAdaptor.pm:131
> STACK toplevel DumpFakeMultiAlign.pl:278
> Ensembl API version = 67
> ---------------------------------------------------
> 
> I tried to read the manuals and Q&A, but so far they are not easy for me to understand.
> Please give me some hints or a solution.
> 
> Sincerely,
> 
> Fenix




