[ensembl-dev] Gene Tree ambiguous nodes

Sébastien Moretti sebastien.moretti at unil.ch
Tue May 17 12:59:36 BST 2011


Thanks Matthieu

What is the threshold used to define a node as ambiguous?
The value range seems to be between 0 and 1.

> Hello Sébastien
>
> In the database, the value is stored in the protein_tree_tag table (tag:
> "duplication_confidence_score"). It can be retrieved with the following
> API method: $node->get_tagvalue("duplication_confidence_score");
>
> Regards,
> Matthieu Muffato
>
>>> Hello Sébastien
>>>
>>> An ambiguous node is a duplication node with a duplication confidence
>>> score of 0. It means that the two resulting copies of the duplication
>>> cannot be found at the same time in the same species. There is indeed a
>>> correlation with the bootstrap value, but the latter isn't used in the
>>> definition.
>>
>> Do you know where this duplication confidence score is stored in the
>> Compara database?
>> Or how to access it through the Ensembl API?
>>
>>> Right now, at most one lost taxon id is stored in the database. So the
>>> API cannot help you to retrieve the full information; you'll have to
>>> rebuild the list of lost taxa by comparing the gene trees to the species
>>> tree.
>>
>> Okay.
>>
>>> Regards
>>> Matthieu Muffato
>>>
>>>> At the same time, do you store lost taxa in trees (NHX) as TreeFam does?
>>>>
>>>>> Hi
>>>>>
>>>>> I wonder how ambiguous nodes are defined in gene trees.
>>>>> Nothing seems to be attached to the D flag in NHX labels for ambiguous
>>>>> nodes.
>>>>>
>>>>>
>>>>> Ambiguous nodes seem to be related to bootstrap values.
>>>>> Is that true?
>>>>>
>>>>> If so, what is the bootstrap threshold you use to define a node as
>>>>> ambiguous?
>>>>>
>>>>> Regards
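
The definition quoted above (an ambiguous node is a duplication node with a duplication confidence score of 0) can be illustrated with a small, self-contained sketch. It is written in Python rather than against the Ensembl Perl API, and it rests on an assumption that is not stated in this thread: that the score is the fraction of species shared by the two child subtrees of the duplication node (intersection over union). The function names are hypothetical and only serve the illustration.

```python
def duplication_confidence_score(left_species, right_species):
    """Species overlap between the two child subtrees of a duplication node.

    ASSUMPTION (not stated in the thread): the score is computed as
    |intersection| / |union| of the two species sets, which gives a
    value between 0 and 1.
    """
    left, right = set(left_species), set(right_species)
    union = left | right
    if not union:
        raise ValueError("both subtrees are empty")
    return len(left & right) / len(union)


def is_ambiguous(is_duplication, score):
    """Rule quoted in the thread: a duplication node whose score is 0."""
    return bool(is_duplication) and score == 0


# Both copies are present in human and mouse: the duplication is supported.
supported = duplication_confidence_score(
    {"human", "mouse"}, {"human", "mouse", "rat"}
)

# No species carries both copies: the score is 0, so the node is ambiguous.
unsupported = duplication_confidence_score({"human"}, {"mouse"})

print(supported, is_ambiguous(True, supported))
print(unsupported, is_ambiguous(True, unsupported))
```

Under this reading there is no intermediate threshold: any score above 0 means at least one species carries both copies, and only a score of exactly 0 marks the node as ambiguous.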

-- 
Sébastien Moretti
Department of Ecology and Evolution,
Biophore, University of Lausanne,
CH-1015 Lausanne, Switzerland
Tel.: +41 (21) 692 4221/4079
http://bioinfo.unil.ch/



