[ensembl-dev] Use of Coelomata in NCBI Taxonomy

Matthieu Muffato muffato at ebi.ac.uk
Tue Dec 18 15:18:32 GMT 2012


I don't know exactly how you apply the filter, but yes, it probably 
needs to be updated (for example, by using "Coelomata" instead of 
"Bilateria").

There are also ways of filtering homologies / tree nodes automatically. 
For example, the NCBITaxon adaptor allows you to:
  - fetch taxon objects from their scientific names (fetch_node_by_name)
  - fetch the last common ancestor of a list of taxa 
(fetch_first_shared_ancestor_indexed)
Any NCBITaxon object also has a has_ancestor($anc) method, which returns 
true if $anc is an ancestor of the current object.
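
As an illustration only, here is a minimal Python sketch of what these 
two operations compute. It is not the Ensembl Perl API: it works on a toy 
child-to-parent map copied from the e69 tree quoted below, with the 
levels elided as "..." simply collapsed. The names lineage, has_ancestor 
and first_shared_ancestor are stand-ins chosen for this sketch, and 
treating has_ancestor as a strict test (a node is not its own ancestor) 
is an assumption.

```python
# Toy child -> parent map for the e69 topology.
# Assumption: levels written as "..." in the email are collapsed.
PARENT = {
    "Homo sapiens": "Chordata",
    "Chordata": "Deuterostomia",
    "Deuterostomia": "Coelomata",
    "Drosophila melanogaster": "Panarthropoda",
    "Panarthropoda": "Ecdysozoa",
    "Caenorhabditis elegans": "Nematoda",
    "Nematoda": "Ecdysozoa",
    "Ecdysozoa": "Protostomia",
    "Protostomia": "Coelomata",
    "Coelomata": "Bilateria",
    "Bilateria": "Metazoa",
}


def lineage(taxon):
    """Return [taxon, parent, grandparent, ..., root]."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def has_ancestor(taxon, anc):
    """True if anc is a strict ancestor of taxon (assumed semantics)."""
    return anc in lineage(taxon)[1:]


def first_shared_ancestor(taxa):
    """Deepest node that lies on the lineage of every taxon in the list."""
    others = [set(lineage(t)) for t in taxa[1:]]
    for node in lineage(taxa[0]):
        if all(node in s for s in others):
            return node
    return None
```

With this toy tree, the shared ancestor of human, fly and worm is 
"Coelomata", while fly and worm alone meet at "Ecdysozoa".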

You can combine these methods to check, for any homology, whether it is 
under the Bilateria node. Even better, "Bilateria" can be defined by a 
list of species rather than by name: the node to test against is then 
the last common ancestor of those species. The script should then be 
able to cope with other taxonomy changes (if any).
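
A rough sketch of that idea, again in plain Python over toy data rather 
than the Ensembl Perl API: the clade is defined by a list of reference 
species, its root is recomputed from whichever taxonomy is loaded, and a 
homology's taxon passes the filter if it is that root or lies below it. 
The two parent maps mirror the e68 and e69 trees quoted below (elided 
"..." levels collapsed); OLD, NEW, REFERENCE, clade_root and is_under are 
hypothetical names used only for this example.

```python
# Child -> parent maps for the two topologies described in the thread.
# Assumption: levels written as "..." in the email are collapsed.
OLD = {  # before the NCBI update (e68)
    "Homo sapiens": "Chordata", "Chordata": "Deuterostomia",
    "Deuterostomia": "Coelomata",
    "Drosophila melanogaster": "Panarthropoda",
    "Panarthropoda": "Protostomia", "Protostomia": "Coelomata",
    "Coelomata": "Bilateria",
    "Caenorhabditis elegans": "Nematoda",
    "Nematoda": "Pseudocoelomata", "Pseudocoelomata": "Bilateria",
    "Bilateria": "Metazoa",
}
NEW = {  # after the NCBI update (e69)
    "Homo sapiens": "Chordata", "Chordata": "Deuterostomia",
    "Deuterostomia": "Coelomata",
    "Drosophila melanogaster": "Panarthropoda",
    "Panarthropoda": "Ecdysozoa",
    "Caenorhabditis elegans": "Nematoda", "Nematoda": "Ecdysozoa",
    "Ecdysozoa": "Protostomia", "Protostomia": "Coelomata",
    "Coelomata": "Bilateria", "Bilateria": "Metazoa",
}

# The clade of interest, defined by species instead of by a taxon name.
REFERENCE = ["Homo sapiens", "Drosophila melanogaster",
             "Caenorhabditis elegans"]


def lineage(parent, taxon):
    """Return [taxon, parent, ..., root] in the given taxonomy."""
    path = [taxon]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def clade_root(parent, species):
    """Last common ancestor of the species in the given taxonomy."""
    others = [set(lineage(parent, s)) for s in species[1:]]
    for node in lineage(parent, species[0]):
        if all(node in s for s in others):
            return node
    return None


def is_under(parent, taxon, species):
    """True if taxon is the clade root or one of its descendants."""
    return clade_root(parent, species) in lineage(parent, taxon)
```

The computed root is "Bilateria" in the old tree and "Coelomata" in the 
new one, yet a taxon such as "Chordata" passes the filter in both and 
"Metazoa" fails in both, without the script naming either root.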

Matthieu

On 18/12/12 14:59, Moretti Sébastien wrote:
> So this means we can no longer filter on Bilateria in Ensembl Compara?
> Do we need to use Coelomata as the Vertebrate/Nematoda/Arthropoda root?
>
>> Hi Sébastien
>>
>> By default, the gene-tree pipeline is building a species tree from the
>> NCBI taxonomy. This stands for Ensembl (vertebrates) and some
>> divisions of Ensembl Genomes (for other divisions, the feature is
>> overridden by a custom species tree).
>>
>> The NCBI updated the taxonomy at the base of the animal kingdom
>> between e68 and e69. The tree now looks like:
>> (Metazoa > ... > Bilateria > ) Coelomata
>>    > Deuterostomia ( > Chordata > ... > Homo sapiens)
>>    > Protostomia > Ecdysozoa
>>             > Panarthropoda ( > ... > Drosophila melanogaster)
>>             > Nematoda ( > ... > Caenorhabditis elegans)
>> instead of:
>> (Metazoa > ... >) Bilateria
>>    > Pseudocoelomata > Nematoda ( > ... > Caenorhabditis elegans)
>>    > Coelomata
>>             > Protostomia ( > Panarthropoda > ... > Drosophila melanogaster)
>>             > Deuterostomia ( > Chordata > ... > Homo sapiens)
>>
>> In the e69 Compara database, you can see homologies linked to
>> "Coelomata" and "Ecdysozoa", but not to "Metazoa" or "Bilateria". In
>> Ensembl Genomes, the change has also been included between versions 15
>> and 16, but there are more species covering this part of the tree of
>> life. As a result, you can still find homologies linked to
>> "Bilateria", "Metazoa", etc.
>>
>> We thought about publishing a blog post about it, but did not find
>> any exciting consequences.
>>
>> Best regards,
>> Matthieu
>>
>> On 17/12/12 13:55, Moretti Sébastien wrote:
>>> Hi
>>>
>>> The documentation links to
>>> http://www.phylowidget.org/full/index.html?tree=http://tinyurl.com/ensembltree&useBranchLengths=true&minTextSize=7,
>>>
>>> but this does not seem to be the tree used. You seem to be using the
>>> NCBI Taxonomy.
>>>
>>> Moreover, NCBI taxonomy includes Bilateria and Metazoa, which we do not
>>> find in Ensembl Vertebrata. We do find them in Ensembl Metazoa.
>>>
>>> Could you please clarify these points for Ensembl Compara users?
>


-- 
Matthieu Muffato, Ph.D.
Ensembl Developer - Comparative Genomics
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge, CB10 1SD, United Kingdom



