[ensembl-dev] protein tree id and tagvalue

Matthieu Muffato muffato at ebi.ac.uk
Wed Jul 4 11:57:36 BST 2012


On Wed 04 Jul 2012 10:51:44 BST, Moretti Sébastien wrote:
> So now I use GeneTree adaptor, this way, with tree_type = tree (not
> clusters), and member_type = protein (not ncrna):
>
> my $tree_adaptor = $reg->get_adaptor('Multi', 'compara', 'GeneTree');
> my @children = @{$tree_adaptor->fetch_all(-tree_type => 'tree',
> -member_type => 'protein')};
>
> for my $tree (@children){
>     my $node_id   = $tree->root->node_id;
>     my $tax_level = $tree->root->get_tagvalue('taxon_name');
> }
>
>
>
> Now my script crashes later on.
> I check the tree root taxon_name to see if this is the taxonomic level
> I expect. If this is Bilateria instead of Euteleostomi I want to
> navigate in the tree until I reach a Euteleostomi node.
>
> I used a recursive function that uses my $tree object with the children
> method
> $tree->children();
> But children is not a method of Bio::EnsEMBL::Compara::GeneTree
> and $tree->root->children() does not seem to do the right thing.
>
> How could I do that?

children() is indeed not defined in GeneTree but in GeneTreeNode. It
returns an arrayref of GeneTreeNode objects. You can call node_id() and
get_tagvalue(...) on each of them.
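
The recursive search from the question can be sketched as follows. This is
only a minimal sketch, assuming each node answers children() (an arrayref)
and get_tagvalue('taxon_name') as described above. The helper name
find_nodes_by_taxon and the MockNode class are hypothetical: MockNode only
stands in for GeneTreeNode so that the snippet runs without a database
connection.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the topmost nodes whose 'taxon_name' tag
# equals $wanted. It works on any object that provides children()
# (returning an arrayref) and get_tagvalue().
sub find_nodes_by_taxon {
    my ($node, $wanted) = @_;
    my $name = $node->get_tagvalue('taxon_name');
    # Stop descending as soon as the expected taxonomic level is reached.
    return ($node) if defined $name && $name eq $wanted;
    # children() returns an arrayref, so dereference it before iterating.
    return map { find_nodes_by_taxon($_, $wanted) } @{ $node->children };
}

# Stand-in for GeneTreeNode (an assumption, for illustration only).
{
    package MockNode;
    sub new {
        my ($class, $id, $taxon, @kids) = @_;
        return bless { id => $id, taxon => $taxon, kids => [@kids] }, $class;
    }
    sub node_id      { $_[0]{id} }
    sub children     { $_[0]{kids} }
    sub get_tagvalue { $_[1] eq 'taxon_name' ? $_[0]{taxon} : undef }
}

my $root = MockNode->new(1, 'Bilateria',
    MockNode->new(2, 'Euteleostomi', MockNode->new(4, 'Homo sapiens')),
    MockNode->new(3, 'Ecdysozoa'),
);

my @hits = find_nodes_by_taxon($root, 'Euteleostomi');
print join(',', map { $_->node_id } @hits), "\n";
```

With the real API, the same helper would presumably be called as
find_nodes_by_taxon($tree->root, 'Euteleostomi'), since the root of a
GeneTree is a GeneTreeNode.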

>
> An additional question:
> Now the right way to use release_tree()
> is $tree->root->release_tree; ?

Yes, you can still use GeneTreeNode::release_tree(). I have not yet
found a nice (transparent) way of cleaning up the memory.
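
One way to keep memory bounded in the meantime is to release each tree as
soon as it has been processed. A minimal sketch, assuming $trees is the
arrayref returned by fetch_all(); for_each_tree is a hypothetical helper
name, not part of the API.

```perl
use strict;
use warnings;

# Hypothetical helper: run $callback on each tree, then free its nodes.
sub for_each_tree {
    my ($trees, $callback) = @_;
    for my $tree (@$trees) {
        $callback->($tree);
        # release_tree() is called on the root node, not on the GeneTree.
        $tree->root->release_tree;
    }
    return;
}
```

Any node kept from a tree should not be used once the callback has
returned, because the tree has been released by then.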

Regards,
Matthieu

> Regards
> Sébastien
>
>>> I am a bit confused.
>>> What is the best between ProteinTree and GeneTree ?
>>>
>>> Am I right saying that ProteinTree has to be associated with
>>> fetch_all()
>>> , and GeneTree with fetch_all(-tree_type => 'tree', -member_type =>
>>> 'protein') ?
>>
>> Hi Sébastien
>>
>> The two adaptors are designed to return different kinds of objects.
>>
>> Methods from the ProteinTreeAdaptor return nodes (root nodes, internal
>> nodes, leaves). It is a specialized version of GeneTreeNodeAdaptor to
>> only keep "protein" nodes in the result and discard "ncrna" nodes.
>>
>> Methods from the GeneTreeAdaptor return a GeneTree object.
>>
>> In the past, we did not have a GeneTree object, and a tree was represented
>> by its root node. Because of that, fetch_all in the ProteinTreeAdaptor had
>> this special behaviour: it only returned root nodes and not internal
>> nodes / leaves (in Ensembl, the fetch_all() method of an adaptor usually
>> returns everything).
>>
>> To fetch all trees, both adaptors only have a fetch_all() method.
>> The ProteinTreeAdaptor one does not have any arguments and returns all
>> the root nodes. This includes trees, super-trees, and the clusterset
>> (the artificial tree that connects them all). You will have to filter
>> the result. The GeneTreeAdaptor one has more arguments and you can
>> select a type of tree, a type of member, etc. The ProteinTreeAdaptor
>> will actually be deprecated in e68. So I encourage you to use
>> GeneTreeAdaptor instead.
>>
>>> And that an adaptor with ProteinTree will require
>>> $tree->tree->get_tagvalue('taxon_name')
>>> but an adaptor with GeneTree will require
>>> $tree->root->get_tagvalue('taxon_name') ???
>>
>> In your example, the first variable name is a bit confusing.
>> With ProteinTreeAdaptor, you fetch $root, and you can call
>> $root->get_tagvalue('taxon_name').
>> With GeneTreeAdaptor, you fetch $tree. Then, as you said, you can call
>> $tree->root->get_tagvalue('taxon_name').
>>
>> Regards,
>> Matthieu
>


-- 
Matthieu Muffato, Ph.D.
Ensembl Developer - Comparative Genomics
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge, CB10 1SD, United Kingdom



