[ensembl-dev] Gene Tree ambiguous nodes

Matthieu Muffato muffato at ebi.ac.uk
Tue May 17 10:57:19 BST 2011


Hello Sébastien

In the database, the value is stored in the protein_tree_tag table (tag:
"duplication_confidence_score"). It can be retrieved with the following
API method: $node->get_tagvalue("duplication_confidence_score");
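As a rough sketch of how this could be used to list the ambiguous nodes of a
tree, assuming each node provides get_tagvalue() and a children() method that
returns an array reference (find_ambiguous_nodes and MockNode are hypothetical
names; MockNode only stands in for a real tree node so that the example runs
without a database):

```perl
use strict;
use warnings;

# Recursively collect the nodes whose duplication confidence score is 0.
# Assumption: nodes without the tag return undef or an empty string.
sub find_ambiguous_nodes {
    my ($node) = @_;
    my @found;
    my $score = $node->get_tagvalue('duplication_confidence_score');
    push @found, $node if defined $score && $score ne '' && $score == 0;
    push @found, find_ambiguous_nodes($_) for @{ $node->children };
    return @found;
}

# Hypothetical stand-in for a tree node, for illustration only.
package MockNode;

sub new {
    my ($class, %args) = @_;
    my $self = {
        tags     => $args{tags}     || {},
        children => $args{children} || [],
    };
    return bless $self, $class;
}
sub get_tagvalue { my ($self, $tag) = @_; return $self->{tags}{$tag}; }
sub children     { my ($self) = @_; return $self->{children}; }

package main;

# A well-supported duplication with one ambiguous child and one untagged leaf.
my $root = MockNode->new(
    tags     => { duplication_confidence_score => 0.5 },
    children => [
        MockNode->new(tags => { duplication_confidence_score => 0 }),
        MockNode->new(),
    ],
);

my @ambiguous = find_ambiguous_nodes($root);
print scalar(@ambiguous), " ambiguous node(s)\n";
```

With a real tree, $root would be the root node fetched through the Compara
API instead of a MockNode.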

Regards,
Matthieu Muffato

>> Hello Sébastien
>>
>> An ambiguous node is a duplication node with a duplication confidence
>> score of 0. It means that the two resulting copies of the duplication
>> cannot be found at the same time in the same species. There is indeed a
>> correlation with the bootstrap value, but the latter isn't used in the
>> definition.
>
> Do you know where this duplication confidence score is stored in the
> compara database ?
> Or how to access it through the ensembl API ?
>
>> Right now, at most one lost taxon id is stored in the database. So the
>> API cannot help you to retrieve the full information; you'll have to
>> rebuild the list of lost taxa by comparing the gene trees to the species
>> tree.
>
> Okay.
>
>> Regards
>> Matthieu Muffato
>>
>>> At the same time, do you store lost taxa in trees (NHX) as TreeFam does?
>>>
>>>> Hi
>>>>
>>>> I wonder how ambiguous nodes are defined in gene trees.
>>>> Nothing seems to be attached to the D flag in NHX labels for
>>>> ambiguous nodes.
>>>>
>>>> Ambiguous nodes seem to be related to bootstrap values.
>>>> Is that true?
>>>>
>>>> If so, what bootstrap threshold do you use to define a node as
>>>> ambiguous?
>>>>
>>>> Regards
>
> --
> Sébastien Moretti
> Department of Ecology and Evolution,
> Biophore, University of Lausanne,
> CH-1015 Lausanne, Switzerland
> Tel.: +41 (21) 692 4221/4079
> http://bioinfo.unil.ch/
>





