[ensembl-dev] Bio::LocatableSeq::end warning message

Andy Yates ayates at ebi.ac.uk
Tue Mar 8 16:10:35 GMT 2011


Hi Giuseppe,

A quick investigation into that particular protein's sequence (ESTEXT_FGENESH1_KG.C_4000003) shows that it has a final stop codon, which should not be present in sequences stored in the compara schema. My guess is that an API (probably BioPerl) is trimming that stop, which gives a sequence length of 353 rather than 354, the two values reported in the warning.
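
To illustrate the kind of off-by-one described above, here is a minimal sketch in plain Perl. It is not BioPerl's actual implementation: the helper names (residue_count, check_end), the toy sequence and the assumption that '-' and '.' are the only gap symbols are all made up for illustration. It only shows how a stored end coordinate that includes a trailing stop ('*') disagrees with a residue count taken after the stop has been dropped.

```perl
use strict;
use warnings;

# Count residues in an aligned string, ignoring gap characters.
# Assumption: '-' and '.' are the only gap symbols.
sub residue_count {
    my ($aligned) = @_;
    (my $ungapped = $aligned) =~ s/[-.]//g;
    return length $ungapped;
}

# Hypothetical helper: compare a stored end coordinate with the end
# implied by the start coordinate and the residue count.
sub check_end {
    my ($aligned, $start, $stored_end) = @_;
    my $computed_end = $start + residue_count($aligned) - 1;
    return $computed_end == $stored_end
        ? 'ok'
        : "mismatch: stored $stored_end, counted $computed_end";
}

# Toy aligned sequence (not the real protein) with a trailing stop.
my $with_stop = 'MSMY--GFEAL*';

# The same sequence after the stop has been trimmed.
(my $without_stop = $with_stop) =~ s/\*$//;

# The stored end (10) was computed with the stop included.
print check_end($with_stop,    1, 10), "\n";
print check_end($without_stop, 1, 10), "\n";
```

With the stop included, the count agrees with the stored end; once the stop is trimmed, the count is one lower, which is the same shape of disagreement the warning reports.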

The reason the warning does not appear in releases before 61 is that this gene is from Physcomitrella patens, which came into Ensembl Genomes in release 8.

My gut feeling is that you can ignore the BioPerl message, as it is only a warning and your OPI value should be unaffected. I'll see what can be done about future releases.

Best regards,

Andy

On 8 Mar 2011, at 15:54, Giuseppe G. wrote:

> Hi Andy,
> 
> thanks for your reply. On my machine your script produces the warnings. The following one does as well:
> 
> ----------------------
> use strict;
> use warnings;
> use Bio::EnsEMBL::Registry;
> 
> my $ID = 'PF13_0197a';
> my $organism = 'Plasmodium falciparum';
> my $registry = 'Bio::EnsEMBL::Registry';
> 
> $registry->load_registry_from_multiple_dbs(
>     {
>         -host    => 'ensembldb.ensembl.org',
>         -user    => 'anonymous',
>         -verbose => 1,
>     },
>     {
>         -host    => 'mysql.ebi.ac.uk',
>         -user    => 'anonymous',
>         -port    => 4157,
>         -verbose => 1,
>     }
> );
> 
> my $member_adaptor = $registry->get_adaptor('pan_homology', 'compara', 'Member');
> my $homology_adaptor = $registry->get_adaptor('pan_homology', 'compara', 'Homology');
> 
> my $member = $member_adaptor->fetch_by_source_stable_id('ENSEMBLGENE', $ID);
> my $homologies = $homology_adaptor->fetch_all_by_Member($member);
> foreach my $h (@{$homologies}) {
>     my $sa = $h->get_SimpleAlign();
>     warn $sa->overall_percentage_identity();
> }
> ----------------------------------------------
> 
> I'm wondering whether it is because I don't have the recommended BioPerl installation on this machine (currently running 1.6.1)?
> 
> Best,
> Giuseppe
> 
> 
> On 08/03/11 15:00, Andy Yates wrote:
>> Hi Giuseppe,
>> 
>> I've just gone and run the following query in an attempt to replicate your issue:
>> 
>> use strict;
>> use warnings;
>> use Bio::EnsEMBL::Registry;
>> Bio::EnsEMBL::Registry->load_registry_from_db(
>>   -HOST => 'mysql.ebi.ac.uk', -PORT => 4157, -USER => 'anonymous',
>>   -DB_VERSION => 61
>> );
>> my $dba = Bio::EnsEMBL::Registry->get_DBAdaptor('pan_homology', 'compara');
>> my $ha = $dba->get_HomologyAdaptor();
>> my $ma = $dba->get_MemberAdaptor();
>> my $stable_id = 'ESTEXT_FGENESH1_KG.C_4000003';
>> my $member = $ma->fetch_by_source_stable_id('ENSEMBLGENE', $stable_id);
>> my $homologies = $ha->fetch_all_by_Member($member);
>> foreach my $h (@{$homologies}) {
>>   my $sa = $h->get_SimpleAlign();
>>   warn $sa->overall_percentage_identity();
>> }
>> 
>> However, the issue did not reappear. Can you confirm whether the above script works on your setup, or elaborate a bit more on the problem you're seeing?
>> 
>> Andy
>> 
>> On 8 Mar 2011, at 12:23, Giuseppe G. wrote:
>> 
>>> Hi,
>>> 
>>> I'm on version 61 and using the pan taxonomic database. Have you seen this kind of warning before:
>>> 
>>> 
>>> --------------------- WARNING ---------------------
>>> MSG: In sequence ESTEXT_FGENESH1_KG.C_4000003 residue count gives end value 353.
>>> Overriding value [354] with value 353 for Bio::LocatableSeq::end().
>>> MSMYGFEALNFNVDGGYLEAIVRGYRSGLLTSADYNNLCQCETLDDIKMHLGATDYGPYLANEPSPLHTATIVEKCTQKLVDEYNHMLTQATEPLSTFLEYITYGHMIDNVVLIVTGTLHERDVHELLEKCHPLGMFDSIASLAV---AQNMRELYRLVLVDTPLAPYFSECITSED----------LDDMNIEIMRNTLYKAYLEDFYRFCQKLGGATSTIMCDLLAFEADRRAVNITINSIGTELTR---DDRRKLYSKFGILYPYGHEELAACDDFDAVRGAMEKYPPYQAIFSKLS-FGES--------------QMLDKAFYEEEVKRLILSFEQQFHYAVFFAYMRLREQETRNLMWISECVAQNQKSRIHDGIVMTF----
>>> ---------------------------------------------------
>>> 
>>> I've only just seen it, so I'm not sure which part of the code it comes from. I see the warning refers to Bio::LocatableSeq. The only BioPerl method I'm using in this code is overall_percentage_identity() from SimpleAlign. I use it to get an OPI value for the homologous sequences as follows:
>>> 
>>> my $pairwise_alignment_from_multiple = $homology->get_SimpleAlign;
>>> 
>>> $opi = $pairwise_alignment_from_multiple->overall_percentage_identity;
>>> 
>>> The warnings are not there when the version is < 60. Any ideas?
>>> 
>>> Best,
>>> Giuseppe
>>> 
>>> --
>>> 
>>> The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
>>> 
>>> _______________________________________________
>>> Dev mailing list
>>> Dev at ensembl.org
>>> http://lists.ensembl.org/mailman/listinfo/dev
>> 
> 

-- 
Andrew Yates                   Ensembl Genomes Engineer
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/







