[ensembl-dev] species tree for EnsemblGenomes compara databases
Matthieu Muffato
muffato at ebi.ac.uk
Tue Jan 17 13:14:40 GMT 2012
Dear Alexandra,
Since release 65, the Compara pipelines have stored their species
tree in the "meta" table. However, this step is done manually and the
information has not been passed on to Ensembl Genomes, which is why none
of the eg databases contain the species tree used for the protein tree
pipeline. The only exception is ensembl_compara_protists_12_65, which
is a patched version of the eg11 database (no new species in eg12) and
thus still contains the species tree at the previous location.
We will make sure that the tree is copied over in future releases.
Meanwhile, you can use the script
ensembl-compara/scripts/taxonomy/taxonTreeTool.pl to recreate and print
the species tree (from the NCBI taxonomy) exactly as it is done in the
protein tree pipeline. You can run it on each ensembl / ensemblgenomes
database with this command line:
    perl taxonTreeTool.pl -url mysql://anonymous@mysql.ebi.ac.uk:4157/ensembl_compara_metazoa_12_65 -taxa_compara -mini
(Only the last line of the output is in the same format as the "species_tree_string".)
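
If you want to post-process that last line, something like the following Python sketch might help. It is only an illustration: the helper names (last_line, leaf_taxon_ids, internal_taxon_ids) are made up for this example, the first line of sample_output is a placeholder because the script's other output lines are not shown here, and it assumes that a trailing "*" marks a leaf taxon, as it does on every leaf of the protists string quoted below.

```python
import re

# Tree string copied from the "species_tree_string" tag quoted in this thread.
TREE = ("(((2850*,35128*)2836,((164328*,67593*,403677*)4783,65071*)4762)33634,"
        "(5702*,5664*)5654,((5855*,5851*)418103,(5825*,5823*)418101,36329*)5820,"
        "44689*)2759;")

# Hypothetical stand-in for the script's output: only the last line is real data.
sample_output = "placeholder for the earlier output lines\n" + TREE + "\n"


def last_line(tool_output):
    """Return the last non-empty line of the captured output."""
    lines = [ln.strip() for ln in tool_output.splitlines() if ln.strip()]
    return lines[-1]


def leaf_taxon_ids(newick):
    """Collect the numeric labels followed by '*' (assumed to be leaves)."""
    return [int(m) for m in re.findall(r"(\d+)\*", newick)]


def internal_taxon_ids(newick):
    """Collect the numeric labels that directly follow a ')' (internal nodes)."""
    return [int(m) for m in re.findall(r"\)(\d+)", newick)]


tree = last_line(sample_output)
print("leaf taxon ids:    ", leaf_taxon_ids(tree))
print("internal taxon ids:", internal_taxon_ids(tree))
```

The regular expressions only read the labels and do not rebuild the tree topology, so a proper Newick parser would be a better choice if you need the branching structure.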
Best,
Matthieu
On 13/01/12 12:18, alexandra louis wrote:
> Hi everybody,
> is there a way to get the species tree that has been used to build the
> protein trees for each EnsemblGenome compara databases?
>
> I found the "species_tree_string" tag in the table protein_tree_tag for
> ensembl_compara_protists_12_65
>
> species_tree_string
> (((2850*,35128*)2836,((164328*,67593*,403677*)4783,65071*)4762)33634,(5702*,5664*)5654,((5855*,5851*)418103,(5825*,5823*)418101,36329*)5820,44689*)2759;
>
>
> but, I can't find anything for the metazoa or plants.
>
> any clues?
>
> thanks a lot,
>
> Alex
>
>