[ensembl-dev] find_leaf_by_name in ensembl Metazoa

Matthieu Muffato muffato at ebi.ac.uk
Tue Dec 18 10:51:59 GMT 2012


Hi Sébastien

It works for this gene, but I'm not sure it would work for all the 
genes of that species, and the rule may well be different for other species.

A safer way is to use the API to translate gene IDs to protein IDs. We are 
actually only interested in the protein used to build the gene tree (the 
"canonical" protein), not in all the possible translations.

my $gene_member =
    $member_adaptor->fetch_by_source_stable_id('ENSEMBLGENE', 'ADAR011299');
my $peptide_member = $gene_member->get_canonical_Member();

$tree->find_leaf_by_name($peptide_member->stable_id);

This should work for any species, on both Ensembl and Ensembl Genomes.
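
Since find_leaf_by_name returns undef when the name is not in the tree, it may be worth guarding each step of the lookup. Below is a minimal sketch of that idea. The helper name find_leaf_for_gene is hypothetical (it is not part of the Ensembl API), and it assumes that $member_adaptor and $tree have already been obtained as in your script and that each call returns undef when nothing is found.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the Ensembl API: map a gene stable ID to
# the tree leaf of its canonical protein. Assumes $member_adaptor is a Compara
# member adaptor and $tree is a gene tree, as in the snippet above.
sub find_leaf_for_gene {
    my ($member_adaptor, $tree, $gene_stable_id) = @_;

    # Gene ID -> gene member (assumed to be undef if the ID is unknown)
    my $gene_member = $member_adaptor->fetch_by_source_stable_id(
        'ENSEMBLGENE', $gene_stable_id);
    return undef unless defined $gene_member;

    # Gene member -> the canonical protein used to build the gene tree
    my $peptide_member = $gene_member->get_canonical_Member();
    return undef unless defined $peptide_member;

    # Protein ID -> leaf (undef if the protein is not in this tree)
    return $tree->find_leaf_by_name($peptide_member->stable_id);
}
```

It could then be called like this, checking the result before using it:

    my $leaf = find_leaf_for_gene($member_adaptor, $tree, 'ADAR011299');
    print $leaf->node_id, "\n" if defined $leaf;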

Matthieu

On 18/12/12 10:43, Moretti Sébastien wrote:
> Hi Matthieu
>
> you mean I can do something like
>      $tree->find_leaf_by_name($seq_name)
>          || $tree->find_leaf_by_name($seq_name.'-PA');
>
> In other words, adding -PA in all failed cases will fix my problem ?
>
>> Hi Sébastien
>>
>> This happens because the names of the gene tree leaves are protein IDs
>> and ADAR011299 is a gene ID. In your case, it should work with
>> ADAR011299-PA.
>> metazoa.ensembl.org/Anopheles_darlingi/Gene/Compara_Tree?db=core;g=ADAR011299
>>
>> For some species, gene IDs and protein IDs are often identical, which
>> can be quite confusing.
>>
>> Best regards,
>> Matthieu
>>
>> On 18/12/12 10:12, Moretti Sébastien wrote:
>>> Hi
>>>
>>> with ensembl API 69, or previous APIs, I used to get a leaf object with
>>> this function:
>>>      my $leaf = $tree->find_leaf_by_name($seq_name);
>>>      print $leaf->node_id;
>>>
>>> I've never got problems with ensembl vertebrate data.
>>>
>>>
>>> I tried the same script with Ensembl Metazoa and got this error message:
>>>      Can't call method "node_id" on an undefined value
>>> or
>>>      Use of uninitialized value $leaf in concatenation
>>> $leaf appears to be undefined, find_leaf_by_name returns undef.
>>>
>>>
>>> Do you have an explanation for this ?
>>> Regards
>>>
>>> e.g.
>>>      my $leaf = $tree->find_leaf_by_name('ADAR011299');
>>>      in EMGT00050000000001 gene family


-- 
Matthieu Muffato, Ph.D.
Ensembl Developer - Comparative Genomics
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge, CB10 1SD, United Kingdom



