[ensembl-dev] find_leaf_by_name in ensembl Metazoa

Daniel Lawson lawson at ebi.ac.uk
Tue Dec 18 10:51:34 GMT 2012


Hi Sebastien,

I think appending -PA will solve some of the failed cases.

The issue with nomenclature for genes, transcripts & translations will
always be there. When you code against the Ensembl project databases
(vertebrate species) all of the identifiers are consistent as they are all
assigned by the Ensembl team. When you code against the Ensembl Genomes
(Metazoa or other divisions) you will be working with identifiers assigned
by many different groups/projects who do not necessarily use the same
nomenclature. The -RA, -PA system originated with the Drosophila community
and is used for most insect/arthropod genomes.

You may want to look up the translation (protein) ID for each gene ID before
searching the gene tree for the leaf. That would be the scalable solution.
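
A minimal sketch of that lookup, assuming the tree leaves are named after the
gene's canonical translation (an assumption; check it for your species). It
uses the core API methods canonical_transcript, translation and stable_id,
falls back to the gene ID itself, and returns undef rather than dying when
nothing matches. MockNode, MockTree and the node_id value 42 are hypothetical
stand-ins so the example runs without a database connection; in real code you
would pass a Compara gene tree and a core Gene object instead.

```perl
use strict;
use warnings;

# Return the tree leaf for a gene, or undef if no candidate name matches.
# $tree : any object with find_leaf_by_name (e.g. a Compara gene tree)
# $gene : any object with stable_id and canonical_transcript (e.g. a core Gene)
sub find_leaf_for_gene {
    my ($tree, $gene) = @_;

    # Candidate names: the translation ID first, then the gene ID itself,
    # since some species use identical gene and protein IDs.
    my @candidates;
    my $transcript  = $gene->canonical_transcript;
    my $translation = $transcript ? $transcript->translation : undef;
    push @candidates, $translation->stable_id if $translation;
    push @candidates, $gene->stable_id;

    for my $name (@candidates) {
        my $leaf = $tree->find_leaf_by_name($name);
        return $leaf if defined $leaf;
    }
    return;    # caller must check for undef before calling node_id
}

# --- Hypothetical stand-ins so the sketch runs without a database ---
{
    package MockNode;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub stable_id            { $_[0]{stable_id} }
    sub node_id              { $_[0]{node_id} }
    sub translation          { $_[0]{translation} }
    sub canonical_transcript { $_[0]{canonical_transcript} }

    package MockTree;
    sub new { my ($class, %leaves) = @_; return bless {%leaves}, $class }
    sub find_leaf_by_name { $_[0]{ $_[1] } }
}

# Illustrative data only: the leaf is stored under the protein ID.
my $tree = MockTree->new('ADAR011299-PA' => MockNode->new(node_id => 42));
my $gene = MockNode->new(
    stable_id            => 'ADAR011299',
    canonical_transcript => MockNode->new(
        translation => MockNode->new(stable_id => 'ADAR011299-PA'),
    ),
);

my $leaf = find_leaf_for_gene($tree, $gene);
print defined $leaf ? $leaf->node_id : 'no leaf found', "\n";
```

Checking "defined $leaf" before calling node_id avoids the 'Can't call method
"node_id" on an undefined value' error for genes that have no leaf in the tree.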

regards
Dan




On 18 December 2012 10:43, Moretti Sébastien <sebastien.moretti at unil.ch> wrote:

> Hi Matthieu
>
> you mean I can do something like
>     $tree->find_leaf_by_name($seq_name) ||
>     $tree->find_leaf_by_name($seq_name.'-PA');
>
> In other words, adding -PA in all failed cases will fix my problem ?
>
>
>> Hi Sébastien
>>
>> This happens because the names of the gene tree leaves are protein IDs
>> and ADAR011299 is a gene ID. In your case, it should work with
>> ADAR011299-PA.
>> http://metazoa.ensembl.org/Anopheles_darlingi/Gene/Compara_Tree?db=core;g=ADAR011299
>>
>>
>> For some species, gene IDs and protein IDs are often identical, which
>> can be quite confusing.
>>
>> Best regards,
>> Matthieu
>>
>> On 18/12/12 10:12, Moretti Sébastien wrote:
>>
>>> Hi
>>>
>>> with ensembl API 69, or previous APIs, I used to get a leaf object with
>>> this function:
>>>      my $leaf = $tree->find_leaf_by_name($seq_name);
>>>      print $leaf->node_id;
>>>
>>> I never had problems with Ensembl vertebrate data.
>>>
>>>
>>> I tried the same script with Ensembl Metazoa and got this error message:
>>>      Can't call method "node_id" on an undefined value
>>> or
>>>      Use of uninitialized value $leaf in concatenation
>>> $leaf appears to be undefined, find_leaf_by_name returns undef.
>>>
>>>
>>> Do you have an explanation for this ?
>>> Regards
>>>
>>> e.g.
>>>      my $leaf = $tree->find_leaf_by_name('ADAR011299');
>>>      in EMGT00050000000001 gene family
>>>
>> --
> Sébastien Moretti
> Department of Ecology and Evolution,
> Biophore, University of Lausanne,
> CH-1015 Lausanne, Switzerland
> Tel.: +41 (21) 692 4221/4079
> http://selectome.unil.ch/ http://bgee.unil.ch/
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info:
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>



-- 
Ensembl Genomes | VectorBase | i5K insect genome initiative