[ensembl-dev] Bio::LocatableSeq::end warning message

Andy Yates ayates at ebi.ac.uk
Tue Mar 8 17:05:29 GMT 2011


Hi Chris,

That's fine; the logic should shout a bit more when incorrect values are passed into the system. What concerns me more is that a module is trimming the final stop codon, which makes the coordinates given to Bio::LocatableSeq incorrect. I've gone through the Compara side of things & cannot see anything that trims stops in the calls used.

Can you think of any default behaviour being inherited from PrimarySeq which could cause this?
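
For anyone following along, here is a minimal sketch of the arithmetic behind the warning. It assumes the expected end is derived as start plus the number of non-gap residues minus one; implied_end is a hypothetical helper written for illustration, not a BioPerl method, and the real check in Bio::LocatableSeq may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): the end coordinate implied by
# an aligned sequence string, counting every character that is not a gap.
sub implied_end {
    my ($aligned, $start) = @_;
    my $residues = () = $aligned =~ /[^-.]/g;    # count non-gap characters
    return $start + $residues - 1;
}

# Toy alignment row: 8 residues and 3 gap columns.
my $row      = 'MSMY--GFEA-';
my $start    = 1;
my $declared = 9;    # one more than the residues present, as if a trimmed stop were still counted
my $implied  = implied_end($row, $start);

print "declared end $declared, implied end $implied\n" if $declared != $implied;
```

If the stop is dropped from the sequence string but still counted in the coordinates, the declared end would be one higher than the implied end, which would match the 354 versus 353 in the reported warning.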

Cheers,

Andy

On 8 Mar 2011, at 16:26, Chris Fields wrote:

> (Speaking from the bioperl end)
> 
> IIRC this was due to a logic issue in Bio::Range (which LocatableSeq inherits) that had been around for a long time in bioperl and led to silent bugs elsewhere in the code (not warning when bad start/end was given).  
> 
> The last word I heard on the subject, the Ensembl folks will likely tell you that they only support bioperl 1.2.3.  The bioperl devs are more than happy to help push that up to the latest release if possible, soon to be v1.6.2.
> 
> chris
> 
> On Mar 8, 2011, at 10:14 AM, Hans-Rudolf Hotz wrote:
> 
>>> I'm wondering if it might depend on the fact that I don't have the
>>> recommended Bioperl installation on this machine? (running the 1.6.1
>>> currently)
>> 
>> 
>> Yes, this looks very similar to the BioPerl 1.6.1 versus BioPerl 1.2.3 issue in the "getConstrainedElements.pl" example script (taken from: /ensembl-compara/scripts/examples/) I tried to report a while ago.
>> 
>> 
>> Regards, Hans
>> 
>> 
>> 
>> 
>> On 03/08/2011 04:54 PM, Giuseppe G. wrote:
>>> Hi Andy,
>>> 
>>> thanks for your reply. On my machine your script produces the warnings.
>>> The following one does as well:
>>> 
>>> ----------------------
>>> use strict;
>>> use warnings;
>>> use Bio::EnsEMBL::Registry;
>>> 
>>> my $ID = 'PF13_0197a';
>>> my $organism = 'Plasmodium falciparum';
>>> my $registry = 'Bio::EnsEMBL::Registry';
>>> 
>>> $registry->load_registry_from_multiple_dbs(
>>> {
>>> -host => 'ensembldb.ensembl.org',
>>> -user => 'anonymous',
>>> -verbose => 1,
>>> },
>>> {
>>> -host => 'mysql.ebi.ac.uk',
>>> -user => 'anonymous',
>>> -port => 4157,
>>> -verbose => 1,
>>> }
>>> );
>>> 
>>> my $member_adaptor = $registry->get_adaptor('pan_homology', 'compara',
>>> 'Member');
>>> my $homology_adaptor = $registry->get_adaptor('pan_homology', 'compara',
>>> 'Homology');
>>> 
>>> my $member = $member_adaptor->fetch_by_source_stable_id('ENSEMBLGENE',
>>> $ID);
>>> my $homologies = $homology_adaptor->fetch_all_by_Member($member);
>>> foreach my $h (@{$homologies}) {
>>> my $sa = $h->get_SimpleAlign();
>>> warn $sa->overall_percentage_identity();
>>> }
>>> ----------------------------------------------
>>> 
>>> I'm wondering if it might depend on the fact that I don't have the
>>> recommended Bioperl installation on this machine? (running the 1.6.1
>>> currently)
>>> 
>>> Best,
>>> Giuseppe
>>> 
>>> 
>>> On 08/03/11 15:00, Andy Yates wrote:
>>>> Hi Giuseppe,
>>>> 
>>>> I've just gone & run the following query in an attempt to replicate
>>>> your issue:
>>>> 
>>>> use strict;
>>>> use warnings;
>>>> use Bio::EnsEMBL::Registry;
>>>> Bio::EnsEMBL::Registry->load_registry_from_db(
>>>> -HOST => 'mysql.ebi.ac.uk',-PORT => 4157, -USER => 'anonymous',
>>>> -DB_VERSION => 61
>>>> );
>>>> my $dba = Bio::EnsEMBL::Registry->get_DBAdaptor('pan_homology',
>>>> 'compara');
>>>> my $ha = $dba->get_HomologyAdaptor();
>>>> my $ma = $dba->get_MemberAdaptor();
>>>> my $stable_id = 'ESTEXT_FGENESH1_KG.C_4000003';
>>>> my $member = $ma->fetch_by_source_stable_id('ENSEMBLGENE', $stable_id);
>>>> my $homologies = $ha->fetch_all_by_Member($member);
>>>> foreach my $h (@{$homologies}) {
>>>> my $sa = $h->get_SimpleAlign();
>>>> warn $sa->overall_percentage_identity();
>>>> }
>>>> 
>>>> However the issue did not reappear. Can you confirm if the above
>>>> script works on your setup or elaborate a bit more on the problem
>>>> you're seeing.
>>>> 
>>>> Andy
>>>> 
>>>> On 8 Mar 2011, at 12:23, Giuseppe G. wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> I'm on version 61 and using the pan taxonomic database. Have you seen
>>>>> this kind of warning before:
>>>>> 
>>>>> 
>>>>> --------------------- WARNING ---------------------
>>>>> MSG: In sequence ESTEXT_FGENESH1_KG.C_4000003 residue count gives end
>>>>> value 353.
>>>>> Overriding value [354] with value 353 for Bio::LocatableSeq::end().
>>>>> MSMYGFEALNFNVDGGYLEAIVRGYRSGLLTSADYNNLCQCETLDDIKMHLGATDYGPYLANEPSPLHTATIVEKCTQKLVDEYNHMLTQATEPLSTFLEYITYGHMIDNVVLIVTGTLHERDVHELLEKCHPLGMFDSIASLAV---AQNMRELYRLVLVDTPLAPYFSECITSED----------LDDMNIEIMRNTLYKAYLEDFYRFCQKLGGATSTIMCDLLAFEADRRAVNITINSIGTELTR---DDRRKLYSKFGILYPYGHEELAACDDFDAVRGAMEKYPPYQAIFSKLS-FGES--------------QMLDKAFYEEEVKRLILSFEQQFHYAVFFAYMRLREQETRNLMWISECVAQNQKSRIHDGIVMTF----
>>>>> 
>>>>> ---------------------------------------------------
>>>>> 
>>>>> I've just seen it so not sure what part of the code it comes from. I
>>>>> see the warning refers to Bio::LocatableSeq. The only bioperl method
>>>>> I'm using in this code is overall_percentage_identity() from
>>>>> SimpleAlign. I use it to get an OPI value for the homologous sequences as
>>>>> follows:
>>>>> 
>>>>> my $pairwise_alignment_from_multiple = $homology->get_SimpleAlign;
>>>>> 
>>>>> $opi = $pairwise_alignment_from_multiple->overall_percentage_identity;
>>>>> 
>>>>> The warnings are not there when version is < 60. Any ideas?
>>>>> 
>>>>> Best,
>>>>> Giuseppe
>>>>> 
>>>>> --
>>>>> 
>>>>> The University of Edinburgh is a charitable body, registered in
>>>>> Scotland, with registration number SC005336.
>>>>> 
>>>>> _______________________________________________
>>>>> Dev mailing list
>>>>> Dev at ensembl.org
>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>> 
>>> 
>> 
> 

-- 
Andrew Yates                   Ensembl Genomes Engineer
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/







