[ensembl-dev] VEP 79 API problems

Guillermo Marco Puche guillermo.marco at sistemasgenomicos.com
Thu Jun 4 17:02:12 BST 2015


Hello Will,

The machine showing the wrong behavior runs CentOS 5.4 and Perl v5.8.8 built for 
x86_64-linux-thread-multi.

So should I completely remove this line?

    foreach my $line(split /\r|\R/) {


I was thinking about just removing \R from the regex.
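
For context, \R (generic newline) was only added to Perl's regex engine in 5.10, so on 5.8.8 it is taken as a literal R, which would explain the splitting on capital R. Below is a minimal sketch of a split that avoids \R, assuming the input only uses \n, \r\n or \r line endings (\R also matches a few rarer separators that are not handled here). The helper name split_lines is hypothetical and not part of VEP.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of VEP): split text into lines without \R.
# \r\n is listed first so a Windows line ending counts as one separator
# instead of being split twice.
sub split_lines {
    my ($text) = @_;
    return split /\r\n|\r|\n/, $text;
}

# A capital R inside a field must survive the split untouched.
my @lines = split_lines("chr1\t100\tR1\r\nchr2\t200\tRR\n");
print scalar(@lines), " lines\n";
print "$_\n" for @lines;
```

In the VEP script itself the equivalent change would be to replace the pattern in the foreach line with /\r\n|\r|\n/, which behaves the same on old and new Perl versions.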

Regards,
Guillermo.

On 04/06/15 17:49, Will McLaren wrote:
> Thanks
>
> I had forgotten about that change. You could just edit the script and 
> change or even remove the regexp:
>
> foreach my $line(($_)) {
>
> What's your Perl version and system architecture? I'm surprised this 
> has not caught anyone else out.
>
> Will
>
> On 4 June 2015 at 14:47, Guillermo Marco Puche 
> <guillermo.marco at sistemasgenomicos.com 
> <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>
>     Hi Will,
>
>     I've been comparing variant_effect_predictor script from version
>     75 vs 79.
>     After adding a few prints inside VEP.pm I've spotted the bug.
>     However, I cannot resolve it.
>
>     These lines are new between 75 and 79 in the VEP script (lines 175 and 176):
>
>         # split again to avoid Windows character nonsense
>         foreach my $line(split /\r|\R/) {
>
>
>
>     I've checked that the script splits the line each time it finds a
>     capital R in the VCF file, treating it as a Windows newline
>     character. I can't reproduce it in the virtual machine since it's a
>     fresh Linux install. In my work environment I'm getting this kind
>     of bug, so I guess it has something to do with file encoding or
>     locale? Has anyone else experienced this?
>
>     Now I know where the error is but I have no idea how to solve it.
>
>     Regards,
>     Guillermo.
>
>
>     On 04/06/15 15:16, Will McLaren wrote:
>>
>>     Sorry Guillermo, I'm running out of ideas.
>>
>>     Does the test unit run OK?
>>
>>     perl ensembl-tools/scripts/variant_effect_predictor/t/variant_effect_predictor.t
>>
>>     Will
>>
>>     On 4 Jun 2015 12:27, "Guillermo Marco Puche"
>>     <guillermo.marco at sistemasgenomicos.com
>>     <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>>
>>         Hi Will,
>>
>>         I'm getting the exact same error with example_GRCh37.vcf:
>>
>>         ERROR: Could not detect input file format
>>
>>
>>         I've made a test script as you suggested with the following
>>         code and I don't get any error:
>>
>>         #!/usr/bin/env perl
>>
>>         use strict;
>>         use Bio::EnsEMBL::Variation::Utils::VEP qw(detect_format);
>>
>>
>>         Regards,
>>         Guillermo.
>>
>>         On 04/06/15 13:12, Will McLaren wrote:
>>>
>>>         Hi again
>>>
>>>         If the script is not detecting the input format then it is
>>>         almost certainly an issue with the input file. There's very
>>>         little code that gets run to detect the format, and it's all
>>>         internal to the VEP code.
>>>
>>>         You could write a short script to test the method, just
>>>         import detect_format from Bio::EnsEMBL::Variation::Utils::VEP
>>>
>>>         Does it detect the example_GRCh37.vcf format correctly?
>>>
>>>         The file you shared on Dropbox works fine for me on my Mac
>>>         and a Linux box.
>>>
>>>         Will
>>>
>>>         On 4 Jun 2015 10:44, "Guillermo Marco Puche"
>>>         <guillermo.marco at sistemasgenomicos.com
>>>         <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>>>
>>>             Hi again Will,
>>>
>>>             I'm trying with the latest Ensembl 80.
>>>             If I don't specify -format vcf I get the following error:
>>>
>>>             perl ensembl-tools/scripts/variant_effect_predictor/variant_effect_predictor.pl -i input.vcf -database --force_overwrite
>>>             2015-06-04 11:36:59 - Starting...
>>>             ERROR: Could not detect input file format
>>>
>>>
>>>             If I force the format with -format vcf I get all the
>>>             errors (see error log attached). I'm using the same
>>>             input.vcf file I posted yesterday.
>>>             Just to rule out the VCF as the cause, I've installed a fresh
>>>             Linux on a virtual machine and just cloned and set up
>>>             Ensembl and BioPerl. On the fresh Linux install I was only
>>>             asked to install the MySQL Perl module (I installed it via
>>>             CPAN).
>>>             It works like a charm.
>>>
>>>             I've ruled out a problem with the input VCF because
>>>             I'm using exactly the same input in both environments
>>>             (and the same one you used to test it yesterday).
>>>
>>>             So my question is: does the VEP script use any other
>>>             library, environment variable or tool which may be
>>>             interfering?
>>>
>>>             Best regards,
>>>             Guillermo.
>>>
>>>
>>>             On 04/06/15 09:32, Guillermo Marco Puche wrote:
>>>>             Hi again Will,
>>>>
>>>>             I've completely cleaned the PERL5LIB environment variable. I've
>>>>             been switching between BioPerl 1.2.3 and BioPerl
>>>>             1.6.1 and got the same warnings/errors.
>>>>             I've cloned the whole 79 API again, as you suggested, into a
>>>>             new tmp location and included it in $PERL5LIB.
>>>>
>>>>             echo $PERL5LIB
>>>>             /share/apps/local/bioperl-live:/share/gluster/tests/gmarco/tmp/ensembl/modules:/share/gluster/tests/gmarco/tmp/ensembl-funcgen/modules:/share/gluster/tests/gmarco/tmp/ensembl-variation/modules
>>>>
>>>>             ll /share/gluster/tests/gmarco/tmp
>>>>             total 20
>>>>             drwxrwxr-x  8 gmarco users 4096 jun 4 08:44 ensembl
>>>>             drwxrwxr-x  8 gmarco users  146 jun 4 08:46 ensembl-funcgen
>>>>             drwxrwxr-x  5 gmarco users   64 jun 4 08:43 ensembl-tools
>>>>             drwxrwxr-x 10 gmarco users 4096 jun 4 08:45 ensembl-variation
>>>>
>>>>             perl ensembl-tools/scripts/variant_effect_predictor/variant_effect_predictor.pl -i input.vcf -database --force_overwrite
>>>>             2015-06-04 09:29:13 - Starting...
>>>>             ERROR: Could not detect input file format
>>>>
>>>>             If I use the flags -format vcf -vcf then I
>>>>             start getting all those errors (see yesterday's log).
>>>>
>>>>             Is there any other Perl lib or requirement I could be
>>>>             missing? As I said, it's very weird: I have no problems
>>>>             with the local Ensembl 75 API.
>>>>
>>>>             Best regards,
>>>>             Guillermo.
>>>>
>>>>             On 03/06/15 18:14, Will McLaren wrote:
>>>>>             Hi again,
>>>>>
>>>>>             I can't recreate the problem with that input file I'm
>>>>>             afraid, either on my normal setup or scrubbing
>>>>>             PERL5LIB and starting from scratch.
>>>>>
>>>>>             See commands I used and input below.
>>>>>
>>>>>             Perhaps you haven't got release/79 of ensembl-tools too?
>>>>>
>>>>>             Have you tried running the installer from within
>>>>>             ensembl-tools/scripts/variant_effect_predictor? This
>>>>>             shouldn't affect your PERL5LIB or other git checkouts.
>>>>>
>>>>>             Will
>>>>>
>>>>>             ===================
>>>>>
>>>>>             mkdir ~/src/tmp
>>>>>             cd ~/src/tmp
>>>>>             git clone --branch release/79 https://github.com/Ensembl/ensembl-tools.git
>>>>>             git clone --branch release/79 https://github.com/Ensembl/ensembl.git
>>>>>             git clone --branch release/79 https://github.com/Ensembl/ensembl-variation.git
>>>>>             git clone --branch release/79 https://github.com/Ensembl/ensembl-funcgen.git
>>>>>             export PERL5LIB=ensembl/modules:ensembl-variation/modules:ensembl-funcgen/modules:/Users/will/src/bioperl-1.2.3/:/Users/will/src/lib/perl5/
>>>>>             perl ensembl-tools/scripts/variant_effect_predictor/variant_effect_predictor.pl -i ~/Downloads/input.vcf -database
>>>>>             2015-06-03 17:09:54 - Starting...
>>>>>             2015-06-03 17:09:54 - Detected format of input file as vcf
>>>>>             2015-06-03 17:09:54 - Read 1 variants into buffer
>>>>>             2015-06-03 17:09:54 - Reading transcript data from cache and/or database
>>>>>             [================================================================================================================================]
>>>>>              [ 100% ]
>>>>>             2015-06-03 17:10:00 - Retrieved 7 transcripts (0 mem, 0 cached, 7 DB, 0 duplicates)
>>>>>             2015-06-03 17:10:00 - Analyzing chromosome 1
>>>>>             2015-06-03 17:10:00 - Analyzing variants
>>>>>             [================================================================================================================================]
>>>>>              [ 100% ]
>>>>>             2015-06-03 17:10:00 - Calculating consequences
>>>>>             2015-06-03 17:10:00 - Processed 1 total variants (0 vars/sec, 0 vars/sec total)
>>>>>             2015-06-03 17:10:00 - Wrote stats summary to variant_effect_output.txt_summary.html
>>>>>             2015-06-03 17:10:00 - Finished!
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>             On 3 June 2015 at 16:51, Guillermo Marco Puche
>>>>>             <guillermo.marco at sistemasgenomicos.com
>>>>>             <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>>>>>
>>>>>                 Hi Will,
>>>>>
>>>>>                 I've been checking and I can't see any unintended
>>>>>                 whitespace or problems with tabs.
>>>>>                 I have no issues with the old VEP 75 script and API.
>>>>>                 I've updated the BioPerl lib in the $PERL5LIB variable
>>>>>                 from 1.2.3 to 1.6.1 (I didn't see this change
>>>>>                 before, sorry); however, I'm still getting all those
>>>>>                 errors.
>>>>>
>>>>>                 Here's a link where you can download the VCF I'm
>>>>>                 using as input:
>>>>>                 https://www.dropbox.com/sh/felwyoo5kl2mgty/AAC177Digqy-_mEmyk9WvmYba/input.vcf?dl=0
>>>>>
>>>>>                 Thank you.
>>>>>
>>>>>                 Best regards,
>>>>>                 Guille.
>>>>>
>>>>>
>>>>>                 On 03/06/15 17:30, Will McLaren wrote:
>>>>>>                 Hi Guille,
>>>>>>
>>>>>>                 It looks to me like your input is not being
>>>>>>                 parsed properly.
>>>>>>
>>>>>>                 Check the formatting of your input VCF; double
>>>>>>                 check that it is valid VCF, and that you haven't
>>>>>>                 got any unintended whitespace on any of the lines.
>>>>>>
>>>>>>                 If you still have an issue, can you send a line
>>>>>>                 or two of the input that recreates these issues?
>>>>>>
>>>>>>                 Thanks
>>>>>>
>>>>>>                 Will McLaren
>>>>>>                 Ensembl Variation
>>>>>>
>>>>>>
>>>>>>                 On 3 June 2015 at 16:16, Guillermo Marco Puche
>>>>>>                 <guillermo.marco at sistemasgenomicos.com
>>>>>>                 <mailto:guillermo.marco at sistemasgenomicos.com>>
>>>>>>                 wrote:
>>>>>>
>>>>>>                     Dear devs,
>>>>>>
>>>>>>                     I'm trying ensembl 79 VEP.
>>>>>>
>>>>>>                     This is my dummy input VCF:
>>>>>>                     http://pastebin.com/kFKWH50q#
>>>>>>
>>>>>>                     I've cloned and installed the API from GitHub as
>>>>>>                     always (this step is repeated for variation,
>>>>>>                     funcgen and compara):
>>>>>>
>>>>>>                       * git clone --branch release/79
>>>>>>                         https://github.com/Ensembl/ensembl.git
>>>>>>                         ensembl_79
>>>>>>
>>>>>>                     PERL5LIB env variable is correctly pointing
>>>>>>                     to the cloned API:
>>>>>>
>>>>>>                       * echo $PERL5LIB
>>>>>>                         /share/apps/local/bioperl-live:/share/apps/src/ensembl_79/modules:/share/apps/src/ensembl_79-compara/modules:/share/apps/src/ensembl_79-variation/modules:/share/apps/src/ensembl_79-functgenomics/modules
>>>>>>
>>>>>>                     However, I'm getting a lot of errors I really
>>>>>>                     don't understand. It looks like a problem with
>>>>>>                     my API installation. If I change the
>>>>>>                     $PERL5LIB variable to point to the 75 API
>>>>>>                     (the previous version I was using), I can't
>>>>>>                     reproduce the errors; the VEP script works with
>>>>>>                     this old 75 version.
>>>>>>
>>>>>>                     I've been reading the docs again and I can't
>>>>>>                     see any additional Perl library requirement.
>>>>>>
>>>>>>                     Here's the error log:
>>>>>>                     http://pastebin.com/VvQrkEQZ
>>>>>>
>>>>>>
>>>>>>                     Thank you!
>>>>>>
>>>>>>                     Best regards,
>>>>>>                     Guille.
>>>>>>
>>>>>>
>>>>>>                     _______________________________________________
>>>>>>                     Dev mailing list Dev at ensembl.org
>>>>>>                     <mailto:Dev at ensembl.org>
>>>>>>                     Posting guidelines and subscribe/unsubscribe
>>>>>>                     info:
>>>>>>                     http://lists.ensembl.org/mailman/listinfo/dev
>>>>>>                     Ensembl Blog: http://www.ensembl.info/
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>
