[ensembl-dev] species tree for EnsemblGenomes compara databases

Matthieu Muffato muffato at ebi.ac.uk
Tue Jan 17 13:14:40 GMT 2012


Dear Alexandra

Since release 65, the Compara pipelines have stored their species 
tree in the "meta" table. However, this step is done manually, and the 
information was not passed on to Ensembl Genomes. This is why none 
of the eg databases contain the species tree used for the protein tree 
pipeline. The only exception is ensembl_compara_protists_12_65, which 
is a patched version of the eg11 database (no new species in eg12) and 
thus still contains the species tree at the previous location.
We will make sure that the tree is copied over in future releases.

Meanwhile, you can use the script 
ensembl-compara/scripts/taxonomy/taxonTreeTool.pl to recreate and print 
the species tree (from the NCBI taxonomy) exactly as it is done in the 
protein tree pipeline. You can run it on each ensembl / ensemblgenomes 
database with this command line:

perl taxonTreeTool.pl \
  -url mysql://anonymous@mysql.ebi.ac.uk:4157/ensembl_compara_metazoa_12_65 \
  -taxa_compara -mini

(Only the last line of the output is in the same format as the 
"species_tree_string" tag.)
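
If you only want the tree itself, something like the following could be 
used to filter the script's output. This is a rough sketch, not part of 
ensembl-compara: the two function names are made up for illustration, 
and it assumes the tree is the last non-empty line of the output and 
that the starred IDs are the taxa you are interested in (as in the 
protists string quoted below).

```shell
# Hypothetical helper (not part of ensembl-compara):
# keep only the last non-empty line read from stdin.
species_tree_string() {
    awk 'NF { last = $0 } END { print last }'
}

# Hypothetical helper: print the taxon IDs that carry a trailing "*",
# one per line. Assumption: the star marks the taxa of interest,
# as it appears to in the quoted protists tree.
starred_taxa() {
    grep -oE '[0-9]+\*' | tr -d '*'
}

# Possible usage (needs network access to the public MySQL server):
#   perl taxonTreeTool.pl \
#     -url mysql://anonymous@mysql.ebi.ac.uk:4157/ensembl_compara_metazoa_12_65 \
#     -taxa_compara -mini | species_tree_string > species_tree.nwk
#   starred_taxa < species_tree.nwk
```

Taking the last non-empty line rather than a plain "tail -n 1" is only a 
precaution in case the output ends with blank lines.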

Best,
Matthieu

On 13/01/12 12:18, alexandra louis wrote:
> Hi everybody,
> is there a way to get the species tree that has been used to build the
> protein trees for each of the EnsemblGenomes compara databases?
>
> I found the "species_tree_string" tag in the table protein_tree_tag for
> ensembl_compara_protists_12_65
>
> species_tree_string
> (((2850*,35128*)2836,((164328*,67593*,403677*)4783,65071*)4762)33634,(5702*,5664*)5654,((5855*,5851*)418103,(5825*,5823*)418101,36329*)5820,44689*)2759;
>
>
> but, I can't find anything for the metazoa or plants.
>
> any clues?
>
> thanks a lot,
>
> Alex
>
>



