[ensembl-dev] VEP 79 API problems

Guillermo Marco Puche guillermo.marco at sistemasgenomicos.com
Thu Jun 4 19:09:32 BST 2015


Yes, I think it's clear that the Perl version is the problem. I'll remove \R from
this line in the script until I can update Perl in my work
environment.
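
A minimal sketch of that interim change, assuming the only goal is to split on LF, CR or CRLF line endings. The perlre page for 5.8.8 does not document \R, and the splits at every capital R reported further down this thread are consistent with it being read as a literal R there. The helper name split_lines and the sample record are made up for illustration and are not part of VEP:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of VEP): split a chunk of input into lines
# on CRLF, a lone CR, or LF, without relying on \R (documented from Perl 5.10).
sub split_lines {
    my ($chunk) = @_;
    return split /\r\n?|\n/, $chunk;
}

# Made-up VCF-like record containing a capital R and a Windows line ending.
my $chunk = "1\t100\trs1\tA\tG\t.\t.\tSVTYPE=R\r\n";

# Portable pattern: the capital R is left alone.
my @ok = split_lines($chunk);

# Emulation of the behaviour reported on 5.8.8: the pattern acts as if it
# contained a literal capital R, so the record is cut inside the INFO column.
my @broken = split /\r|R/, $chunk;

print "portable split: ", scalar(@ok), " field(s)\n";
print "literal-R split: ", scalar(@broken), " field(s)\n";
```

The pattern only uses \r and \n, so it should behave the same on old and new Perl versions.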

As always, thank you for your fantastic support.

Best regards,
Guillermo.

On 04/06/2015 at 18:19, Healy, Matthew wrote:
>
> In the regex documentation for Perl 5.8.8 there is no mention of \R 
> (there is of course \r lowercase), so the Perl version probably is the 
> issue:
>
> http://perldoc.perl.org/5.8.8/perlre.html
>
> From: dev-bounces at ensembl.org [mailto:dev-bounces at ensembl.org] On
> Behalf Of Will McLaren
> Sent: 04 June, 2015 12:12 PM
> To: Ensembl developers list
> Subject: Re: [ensembl-dev] VEP 79 API problems
>
> I'm wondering if 5.8.8 has different regex handling to newer Perl 
> versions. Someone else in the Ensembl team may know better than me on 
> this one.
>
> I believe the Ensembl project now recommends at least 5.10 (according 
> to http://www.ensembl.org/info/docs/api/api_installation.html at 
> least); most people in the wild use 5.14 or 5.16 AFAIK.
>
> If you can possibly try a newer version of Perl this may solve your 
> issues. Perlbrew is a nice way to manage different versions and module 
> sets http://perlbrew.pl/
>
> Will
>
> On 4 June 2015 at 17:02, Guillermo Marco Puche 
> <guillermo.marco at sistemasgenomicos.com 
> <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>
> Hello Will,
>
> The machine showing the wrong behaviour runs CentOS 5.4 and Perl v5.8.8, built for
> x86_64-linux-thread-multi.
>
> So should I completely remove the regexp from this line?
>
>   foreach my $line(split /\r|\R/) {
>
> I was thinking about just removing \R from the regex.
>
> Regards,
> Guillermo.
>
> On 04/06/15 17:49, Will McLaren wrote:
>
>     Thanks
>
>     I had forgotten about that change. You could just edit the script
>     and change or even remove the regexp:
>
>     foreach my $line(($_)) {
>
>     What's your Perl version and system architecture? I'm surprised
>     this has not caught anyone else out.
>
>     Will
>
>     On 4 June 2015 at 14:47, Guillermo Marco Puche
>     <guillermo.marco at sistemasgenomicos.com
>     <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>
>     Hi Will,
>
>     I've been comparing the variant_effect_predictor script from version
>     75 with the one from 79.
>     After adding a few prints inside VEP.pm I've spotted the bug.
>     However, I cannot resolve it.
>
>     These lines are new from 75 to 79 in the VEP script (175 and 176):
>
>          # split again to avoid Windows character nonsense
>          foreach my $line(split /\r|\R/) {
>
>
>
>     I've checked that the script splits the line each time it finds a
>     capital R in the VCF file, treating it as a Windows newline
>     character. I can't reproduce it in the virtual machine since it's a
>     fresh Linux install. In my work environment I do get this
>     bug, so I guess it has something to do with file encoding or
>     locale? Has anyone else experienced this?
>
>     Now I know where the error is, but I have no idea how to solve it.
>
>     Regards,
>     Guillermo.
>
>     On 04/06/15 15:16, Will McLaren wrote:
>
>         Sorry Guillermo, I'm running out of ideas.
>
>         Does the test unit run OK?
>
>         perl
>         ensembl-tools/scripts/variant_effect_predictor/t/variant_effect_predictor.t
>
>         Will
>
>         On 4 Jun 2015 12:27, "Guillermo Marco Puche"
>         <guillermo.marco at sistemasgenomicos.com
>         <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>
>         Hi Will,
>
>         I'm getting the exact same error with example_GRCh37.vcf:
>
>         ERROR: Could not detect input file format
>
>
>         I've made a test script as you suggested with the following code
>         and I don't get any error:
>
>         #!/usr/bin/env perl
>         use strict;
>         use Bio::EnsEMBL::Variation::Utils::VEP qw(detect_format);
>
>
>         Regards,
>         Guillermo.
>
>         On 04/06/15 13:12, Will McLaren wrote:
>
>             Hi again
>
>             If the script is not detecting the input format then it is
>             almost certainly an issue with the input file. There's
>             very little code that gets run to detect the format, and
>             it's all internal to the VEP code.
>
>             You could write a short script to test the method, just
>             import detect_format from Bio::EnsEMBL::Variation::Utils::VEP.
>
>             Does it detect the example_GRCh37.vcf format correctly?
>
>             The file you shared on Dropbox works fine for me on my Mac
>             and a Linux box.
>
>             Will
>
>             On 4 Jun 2015 10:44, "Guillermo Marco Puche"
>             <guillermo.marco at sistemasgenomicos.com
>             <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>
>             Hi again Will,
>
>             I'm trying with the latest Ensembl 80.
>             If I don't specify -format vcf I get the following error:
>
>             perl ensembl-tools/scripts/variant_effect_predictor/variant_effect_predictor.pl -i input.vcf -database --force_overwrite
>
>             2015-06-04 11:36:59 - Starting...
>
>             ERROR: Could not detect input file format
>
>
>             If I force the format with -format vcf I get all the errors
>             (see error log attached). I'm using the same input.vcf
>             file I posted yesterday.
>             Just to rule out the VCF as the cause, I've installed a fresh Linux
>             on a virtual machine and just cloned and set up Ensembl and
>             BioPerl. On the fresh Linux install I was only asked to
>             install the MySQL Perl module (I installed it via CPAN).
>             It works without problems there.
>
>             I'm ruling out a problem with the input VCF because I'm
>             using exactly the same input in the two environments (and the
>             same one you used to test it yesterday).
>
>             So my question is: Does VEP script use any other library,
>             environment variable or tool which may be interfering?
>
>             Best regards,
>             Guillermo.
>
>             On 04/06/15 09:32, Guillermo Marco Puche wrote:
>
>                 Hi again Will,
>
>                 I've completely cleaned the PERL5LIB environment variable. I've
>                 been testing, switching between bioperl 1.2.3 and
>                 bioperl 1.6.1, and got the same warnings/errors.
>                 I've cloned the whole 79 API again, like you suggested, in a
>                 new tmp location and included it in $PERL5LIB.
>
>                 echo $PERL5LIB
>
>                 /share/apps/local/bioperl-live:/share/gluster/tests/gmarco/tmp/ensembl/modules:/share/gluster/tests/gmarco/tmp/ensembl-funcgen/modules:/share/gluster/tests/gmarco/tmp/ensembl-variation/modules
>
>                 ll /share/gluster/tests/gmarco/tmp
>
>                 total 20
>                 drwxrwxr-x  8 gmarco users 4096 jun  4 08:44 ensembl
>                 drwxrwxr-x  8 gmarco users  146 jun  4 08:46 ensembl-funcgen
>                 drwxrwxr-x  5 gmarco users   64 jun  4 08:43 ensembl-tools
>                 drwxrwxr-x 10 gmarco users 4096 jun  4 08:45 ensembl-variation
>
>                 perl ensembl-tools/scripts/variant_effect_predictor/variant_effect_predictor.pl -i input.vcf -database --force_overwrite
>
>                 2015-06-04 09:29:13 - Starting...
>                 ERROR: Could not detect input file format
>
>                 If I use the flags -format vcf -vcf then I
>                 start getting all those errors (see yesterday's log).
>
>                 Is there any other Perl lib or requirement I could be
>                 missing? As I said, it's very weird, since I have no problems
>                 with the local Ensembl 75 API.
>
>                 Best regards,
>                 Guillermo.
>
>                 On 03/06/15 18:14, Will McLaren wrote:
>
>                     Hi again,
>
>                     I can't recreate the problem with that input file
>                     I'm afraid, either on my normal setup or scrubbing
>                     PERL5LIB and starting from scratch.
>
>                     See commands I used and input below.
>
>                     Perhaps you haven't got release/79 of
>                     ensembl-tools too?
>
>                     Have you tried running the installer from within
>                     ensembl-tools/scripts/variant_effect_predictor?
>                     This shouldn't affect your PERL5LIB or other git
>                     checkouts.
>
>                     Will
>
>                     ===================
>
>                     mkdir ~/src/tmp
>                     cd ~/src/tmp
>                     git clone --branch release/79 https://github.com/Ensembl/ensembl-tools.git
>                     git clone --branch release/79 https://github.com/Ensembl/ensembl.git
>                     git clone --branch release/79 https://github.com/Ensembl/ensembl-variation.git
>                     git clone --branch release/79 https://github.com/Ensembl/ensembl-funcgen.git
>                     export PERL5LIB=ensembl/modules:ensembl-variation/modules:ensembl-funcgen/modules:/Users/will/src/bioperl-1.2.3/:/Users/will/src/lib/perl5/
>
>                     perl ensembl-tools/scripts/variant_effect_predictor/variant_effect_predictor.pl -i ~/Downloads/input.vcf -database
>
>                     2015-06-03 17:09:54 - Starting...
>                     2015-06-03 17:09:54 - Detected format of input file as vcf
>                     2015-06-03 17:09:54 - Read 1 variants into buffer
>                     2015-06-03 17:09:54 - Reading transcript data from cache and/or database
>
>                     [================================================================================================================================]
>                      [ 100% ]
>
>                     2015-06-03 17:10:00 - Retrieved 7 transcripts (0 mem, 0 cached, 7 DB, 0 duplicates)
>                     2015-06-03 17:10:00 - Analyzing chromosome 1
>                     2015-06-03 17:10:00 - Analyzing variants
>
>                     [================================================================================================================================]
>                      [ 100% ]
>
>                     2015-06-03 17:10:00 - Calculating consequences
>                     2015-06-03 17:10:00 - Processed 1 total variants (0 vars/sec, 0 vars/sec total)
>                     2015-06-03 17:10:00 - Wrote stats summary to variant_effect_output.txt_summary.html
>                     2015-06-03 17:10:00 - Finished!
>
>                     On 3 June 2015 at 16:51, Guillermo Marco Puche
>                     <guillermo.marco at sistemasgenomicos.com
>                     <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>
>                     Hi Will,
>
>                     I've been checking and I can't see any unintended
>                     whitespace or problems with tabs.
>                     I have no issues with the old VEP 75 script and API.
>                     I've updated the BioPerl lib in the $PERL5LIB variable
>                     from 1.2.3 to 1.6.1 (I hadn't noticed this change
>                     before, sorry); however, I'm still getting all those
>                     errors.
>
>                     Here's a link where you can download the VCF I'm
>                     using as input:
>                     https://www.dropbox.com/sh/felwyoo5kl2mgty/AAC177Digqy-_mEmyk9WvmYba/input.vcf?dl=0
>
>                     Thank you.
>
>                     Best regards,
>                     Guille.
>
>                     On 03/06/15 17:30, Will McLaren wrote:
>
>                         Hi Guille,
>
>
>                         It looks to me like your input is not being
>                         parsed properly.
>
>                         Check the formatting of your input VCF; double
>                         check that it is valid VCF, and that you
>                         haven't got any unintended whitespace on any
>                         of the lines.
>
>                         If you still have an issue, can you send a
>                         line or two of the input that recreates these
>                         issues?
>
>                         Thanks
>
>                         Will McLaren
>
>                         Ensembl Variation
>
>                         On 3 June 2015 at 16:16, Guillermo Marco Puche
>                         <guillermo.marco at sistemasgenomicos.com
>                         <mailto:guillermo.marco at sistemasgenomicos.com>> wrote:
>
>                         Dear devs,
>
>                         I'm trying ensembl 79 VEP.
>
>                         This is my dummy input VCF:
>                         http://pastebin.com/kFKWH50q
>
>                         I've cloned and installed the API from GitHub as
>                         always (this step is repeated for variation,
>                         funcgen and compara):
>
>                         git clone --branch release/79 https://github.com/Ensembl/ensembl.git ensembl_79
>
>                         The PERL5LIB env variable is correctly pointing to
>                         the cloned API:
>
>                         echo $PERL5LIB
>                         /share/apps/local/bioperl-live:/share/apps/src/ensembl_79/modules:/share/apps/src/ensembl_79-compara/modules:/share/apps/src/ensembl_79-variation/modules:/share/apps/src/ensembl_79-functgenomics/modules
>
>                         However, I'm getting a lot of errors I really
>                         don't understand. It looks like a problem with my API
>                         installation. If I change the $PERL5LIB
>                         variable to point to the 75 API (the previous version
>                         I was using) I can't reproduce the errors: the VEP
>                         script works with the old 75 version.
>
>                         I've been reading the docs again and I can't
>                         see any additional Perl library requirement.
>
>                         Here's the error log: http://pastebin.com/VvQrkEQZ
>
>
>                         Thank you!
>
>                         Best regards,
>                         Guille.
>
>
>                         _______________________________________________
>                         Dev mailing list Dev at ensembl.org
>                         <mailto:Dev at ensembl.org>
>                         Posting guidelines and subscribe/unsubscribe
>                         info:
>                         http://lists.ensembl.org/mailman/listinfo/dev
>                         Ensembl Blog: http://www.ensembl.info/
>
>
