[ensembl-dev] VEP 79 API problems

Guillermo Marco Puche guillermo.marco at sistemasgenomicos.com
Thu Jun 4 10:44:42 BST 2015


Hi again Will,

I'm now trying with the latest Ensembl 80.
If I don't specify *-format vcf* I get the following error:

perl ensembl-tools/scripts/variant_effect_predictor/variant_effect_predictor.pl -i input.vcf -database --force_overwrite
2015-06-04 11:36:59 - Starting...
ERROR: Could not detect input file format


If I force the format with *-format vcf* I get all the errors (see the 
attached error log). I'm using the same input.vcf file I posted yesterday.
To rule out the VCF as the cause, I installed a fresh Linux on a virtual 
machine and just cloned and set up Ensembl and BioPerl. On the fresh Linux 
install I was only asked to install the MySQL Perl module (I installed it 
via CPAN).
There it works like a charm.

I have ruled out a problem with the input VCF because I'm using exactly 
the same input in both environments (the same file you used to test 
yesterday).

So my question is: Does VEP script use any other library, environment 
variable or tool which may be interfering?
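
A minimal sketch of how the two environments could be compared, assuming a POSIX shell and perl on both machines. The helper names list_perl5lib, which_module and show_perl_env are hypothetical and not part of VEP or the Ensembl API; Bio::EnsEMBL::Registry is only used as an example of a module the Ensembl core API provides.

```shell
#!/bin/sh
# Hypothetical helpers for comparing two Perl environments.

# Print each PERL5LIB entry on its own line and flag directories that do not exist.
list_perl5lib() {
    # $1: colon-separated path list (defaults to $PERL5LIB)
    printf '%s\n' "${1-$PERL5LIB}" | tr ':' '\n' | while IFS= read -r dir; do
        [ -n "$dir" ] || continue
        if [ -d "$dir" ]; then
            printf 'ok       %s\n' "$dir"
        else
            printf 'MISSING  %s\n' "$dir"
        fi
    done
}

# Print the file Perl actually loads for a module name such as Bio::EnsEMBL::Registry.
which_module() {
    file=$(printf '%s' "$1" | sed 's|::|/|g').pm
    perl -e 'require $ARGV[0]; print $INC{$ARGV[0]}, "\n"' "$file"
}

# Print Perl-related and locale-related environment variables, sorted.
show_perl_env() {
    env | grep -E '^(PERL|LANG=|LC_)' | sort || true
}
```

Running list_perl5lib, which_module Bio::EnsEMBL::Registry and show_perl_env on both the working and the failing machine, saving the output to files, and comparing them with diff may show a stale checkout or an extra variable on one side. It is only a starting point and does not identify the cause by itself.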

Best regards,
Guillermo.


On 04/06/15 09:32, Guillermo Marco Puche wrote:
> Hi again Will,
>
> I've completely cleaned the PERL5LIB environment variable. I've been 
> switching between bioperl 1.2.3 and bioperl 1.6.1 and got the same 
> warnings/errors.
> I've cloned the whole 79 API again, as you suggested, into a new tmp 
> location and included it in $PERL5LIB.
>
> *echo $PERL5LIB*
> /share/apps/local/bioperl-live:/share/gluster/tests/gmarco/tmp/ensembl/modules:/share/gluster/tests/gmarco/tmp/ensembl-funcgen/modules:/share/gluster/tests/gmarco/tmp/ensembl-variation/modules
>
> *ll /share/gluster/tests/gmarco/tmp*
> total 20
> drwxrwxr-x  8 gmarco users 4096 jun  4 08:44 ensembl
> drwxrwxr-x  8 gmarco users  146 jun  4 08:46 ensembl-funcgen
> drwxrwxr-x  5 gmarco users   64 jun  4 08:43 ensembl-tools
> drwxrwxr-x 10 gmarco users 4096 jun  4 08:45 ensembl-variation
>
> *perl ensembl-tools/scripts/variant_effect_predictor/variant_effect_predictor.pl -i input.vcf -database --force_overwrite*
> 2015-06-04 09:29:13 - Starting...
> ERROR: Could not detect input file format
>
> If I use the flags *-format vcf* *-vcf* then I start getting 
> all those errors (see yesterday's log).
>
> Is there any other Perl lib or requirement I could be missing? As I 
> said, it's very odd: I have no problems with the local Ensembl 75 API.
>
> Best regards,
> Guillermo.
>
> On 03/06/15 18:14, Will McLaren wrote:
>> Hi again,
>>
>> I can't recreate the problem with that input file I'm afraid, either 
>> on my normal setup or scrubbing PERL5LIB and starting from scratch.
>>
>> See commands I used and input below.
>>
>> Perhaps you haven't got release/79 of ensembl-tools too?
>>
>> Have you tried running the installer from within 
>> ensembl-tools/scripts/variant_effect_predictor? This shouldn't affect 
>> your PERL5LIB or other git checkouts.
>>
>> Will
>>
>> ===================
>>
>> mkdir ~/src/tmp
>> cd ~/src/tmp
>> git clone --branch release/79 https://github.com/Ensembl/ensembl-tools.git
>> git clone --branch release/79 https://github.com/Ensembl/ensembl.git
>> git clone --branch release/79 https://github.com/Ensembl/ensembl-variation.git
>> git clone --branch release/79 https://github.com/Ensembl/ensembl-funcgen.git
>> export PERL5LIB=ensembl/modules:ensembl-variation/modules:ensembl-funcgen/modules:/Users/will/src/bioperl-1.2.3/:/Users/will/src/lib/perl5/
>> perl ensembl-tools/scripts/variant_effect_predictor/variant_effect_predictor.pl -i ~/Downloads/input.vcf -database
>> 2015-06-03 17:09:54 - Starting...
>> 2015-06-03 17:09:54 - Detected format of input file as vcf
>> 2015-06-03 17:09:54 - Read 1 variants into buffer
>> 2015-06-03 17:09:54 - Reading transcript data from cache and/or database
>> [================================================================================================================================]  [ 100% ]
>> 2015-06-03 17:10:00 - Retrieved 7 transcripts (0 mem, 0 cached, 7 DB, 0 duplicates)
>> 2015-06-03 17:10:00 - Analyzing chromosome 1
>> 2015-06-03 17:10:00 - Analyzing variants
>> [================================================================================================================================]  [ 100% ]
>> 2015-06-03 17:10:00 - Calculating consequences
>> 2015-06-03 17:10:00 - Processed 1 total variants (0 vars/sec, 0 vars/sec total)
>> 2015-06-03 17:10:00 - Wrote stats summary to variant_effect_output.txt_summary.html
>> 2015-06-03 17:10:00 - Finished!
>>
>>
>>
>>
>> On 3 June 2015 at 16:51, Guillermo Marco Puche <guillermo.marco at sistemasgenomicos.com> wrote:
>>
>>     Hi Will,
>>
>>     I've been checking and I can't see any unintended whitespace or
>>     problem with tabulations.
>>     I have no issues with the old VEP 75 script and API. I've updated the
>>     BioPerl lib in the $PERL5LIB variable from 1.2.3 to 1.6.1 (sorry, I
>>     hadn't noticed this change before); however, I'm still getting all
>>     those errors.
>>
>>     Here's a link where you can download the VCF I'm using as input:
>>     https://www.dropbox.com/sh/felwyoo5kl2mgty/AAC177Digqy-_mEmyk9WvmYba/input.vcf?dl=0
>>
>>     Thank you.
>>
>>     Best regards,
>>     Guille.
>>
>>
>>     On 03/06/15 17:30, Will McLaren wrote:
>>>     Hi Guille,
>>>
>>>     It looks to me like your input is not being parsed properly.
>>>
>>>     Check the formatting of your input VCF; double check that it is
>>>     valid VCF, and that you haven't got any unintended whitespace on
>>>     any of the lines.
>>>
>>>     If you still have an issue, can you send a line or two of the
>>>     input that recreates these issues?
>>>
>>>     Thanks
>>>
>>>     Will McLaren
>>>     Ensembl Variation
>>>
>>>
>>>     On 3 June 2015 at 16:16, Guillermo Marco Puche <guillermo.marco at sistemasgenomicos.com> wrote:
>>>
>>>         Dear devs,
>>>
>>>         I'm trying ensembl 79 VEP.
>>>
>>>         This is my dummy input VCF: http://pastebin.com/kFKWH50q#
>>>
>>>         I've cloned and installed the API from GitHub as always (this
>>>         step is repeated for variation, funcgen and compara):
>>>
>>>           * git clone --branch release/79
>>>             https://github.com/Ensembl/ensembl.git ensembl_79
>>>
>>>         PERL5LIB env variable is correctly pointing to the cloned API:
>>>
>>>           * echo $PERL5LIB
>>>             /share/apps/local/bioperl-live:/share/apps/src/ensembl_79/modules:/share/apps/src/ensembl_79-compara/modules:/share/apps/src/ensembl_79-variation/modules:/share/apps/src/ensembl_79-functgenomics/modules
>>>
>>>         However, I'm getting a lot of errors that I really don't
>>>         understand. It looks like a problem with my API installation.
>>>         If I change the $PERL5LIB variable to point to the 75 API (the
>>>         previous version I was using) I can't reproduce the errors: the
>>>         VEP script works with this old 75 version.
>>>
>>>         I've been reading the docs again and I can't see any
>>>         additional Perl library requirement.
>>>
>>>         Here's the error log: http://pastebin.com/VvQrkEQZ
>>>
>>>
>>>         Thank you!
>>>
>>>         Best regards,
>>>         Guille.
>>>
>>>
>>>         _______________________________________________
>>>         Dev mailing list Dev at ensembl.org <mailto:Dev at ensembl.org>
>>>         Posting guidelines and subscribe/unsubscribe info:
>>>         http://lists.ensembl.org/mailman/listinfo/dev
>>>         Ensembl Blog: http://www.ensembl.info/


-------------- next part --------------
perl ensembl-tools/scripts/variant_effect_predictor/variant_effect_predictor.pl -i input.vcf -database --force_overwrite -format vcf
2015-06-04 11:25:16 - Starting...
Argument "from" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 5.

WARNING: Start from or end 3 coordinate invalid on line 6

WARNING: Invalid input formatting on line 7

WARNING: Invalid input formatting on line 11

WARNING: Invalid input formatting on line 18

WARNING: Invalid input formatting on line 20
Argument "From" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 16.

WARNING: Start From or end 3 coordinate invalid on line 22

WARNING: Invalid input formatting on line 23
Argument "of" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 18.

WARNING: Start of or end 5 coordinate invalid on line 26

WARNING: Invalid input formatting on line 28

WARNING: Invalid input formatting on line 30
Argument "from" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 20.

WARNING: Start from or end 3 coordinate invalid on line 31

WARNING: Invalid input formatting on line 32

WARNING: Invalid input formatting on line 34
Argument "of" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 21.

WARNING: Start of or end 10 coordinate invalid on line 35
Argument "is" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 22.

WARNING: Start is or end 4 coordinate invalid on line 37

WARNING: Invalid input formatting on line 44
Argument "depths" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 28.

WARNING: Start depths or end 2 coordinate invalid on line 45
Argument "read" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 29.

WARNING: Start read or end 5 coordinate invalid on line 47

WARNING: Invalid input formatting on line 49

WARNING: Invalid input formatting on line 51
Argument "Phred-scaled" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 32.

WARNING: Start Phred-scaled or end 2 coordinate invalid on line 53

WARNING: Invalid input formatting on line 55

WARNING: Invalid input formatting on line 56
Argument "frequencies" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 33.

WARNING: Start frequencies or end 2 coordinate invalid on line 57

WARNING: Invalid input formatting on line 59

WARNING: Invalid input formatting on line 60
Argument "Depth" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 34.

WARNING: Start Depth or end 7 coordinate invalid on line 61

WARNING: Invalid input formatting on line 63
Argument "of" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 35.
Argument "og" isn't numeric in numeric ge (>=) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 562, <GEN0> line 35.

WARNING: Start og or end 4 coordinate invalid on line 64
Argument "from" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 36.
Argument "fron" isn't numeric in numeric ge (>=) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 562, <GEN0> line 36.

WARNING: Start fron or end 4 coordinate invalid on line 66

WARNING: Invalid input formatting on line 68
Argument "quality" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 37.

WARNING: Start quality or end 19 coordinate invalid on line 69
Argument "quality" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 38.

WARNING: Start quality or end 17 coordinate invalid on line 71

WARNING: Invalid input formatting on line 73
Argument "of" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 39.

WARNING: Start of or end 4 coordinate invalid on line 74

WARNING: Invalid input formatting on line 76

WARNING: Invalid input formatting on line 77
Argument "of" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 40.

WARNING: Start of or end 4 coordinate invalid on line 78
Argument "of" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 41.

WARNING: Start of or end 4 coordinate invalid on line 80

WARNING: Invalid input formatting on line 82
Argument "of" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 42.

WARNING: Start of or end 4 coordinate invalid on line 83

WARNING: Invalid input formatting on line 85
Argument "than" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 44.

WARNING: Start than or end 1 coordinate invalid on line 87
Argument "artifact" isn't numeric in addition (+) at /share/gluster/tests/gmarco/tmp_80/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 515, <GEN0> line 45.

WARNING: Start artifact or end 1 coordinate invalid on line 89

WARNING: Invalid input formatting on line 91

WARNING: Invalid input formatting on line 93

WARNING: Invalid input formatting on line 120

WARNING: Invalid input formatting on line 121

WARNING: Invalid input formatting on line 122

WARNING: Invalid input formatting on line 123

WARNING: Invalid input formatting on line 125

WARNING: Invalid input formatting on line 126

WARNING: Invalid input formatting on line 127

WARNING: Invalid input formatting on line 128

WARNING: Invalid input formatting on line 129

WARNING: Invalid input formatting on line 130

WARNING: Invalid input formatting on line 131

WARNING: Invalid input formatting on line 132

WARNING: Invalid input formatting on line 133
2015-06-04 11:25:17 - Read 1 variants into buffer
2015-06-04 11:25:17 - Reading transcript data from cache and/or database
[========================================================================================================================================================================================================]  [ 100% ]
2015-06-04 11:25:59 - Retrieved 7 transcripts (0 mem, 0 cached, 7 DB, 0 duplicates)
2015-06-04 11:25:59 - Analyzing chromosome 1
2015-06-04 11:25:59 - Analyzing variants
[========================================================================================================================================================================================================]  [ 100% ]
2015-06-04 11:25:59 - Calculating consequences
2015-06-04 11:25:59 - Processed 1 total variants (0 vars/sec, 0 vars/sec total)
2015-06-04 11:25:59 - Wrote stats summary to variant_effect_output.txt_summary.html
2015-06-04 11:25:59 - See variant_effect_output.txt_warnings.txt for details of 59 warnings
2015-06-04 11:25:59 - Finished!
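
Several warnings in the log above quote words such as "from", "of" and "Phred-scaled" as start coordinates, which suggests that descriptive text rather than a position is ending up in a coordinate field. A minimal sketch, assuming a POSIX shell with tr and awk, of a file check that can be run on each machine to rule out line-ending or tab problems in input.vcf. The function name check_vcf is hypothetical and not part of VEP; the check only looks at carriage returns and at the eight fixed tab-separated VCF columns, so it is not a full validator.

```shell
#!/bin/sh
# Hypothetical helper: quick sanity check of a VCF file.
check_vcf() {
    file=$1
    # Count carriage-return characters (left by Windows or old Mac line endings).
    cr=$(tr -cd '\r' < "$file" | wc -c | tr -d ' ')
    # Count non-header, non-empty lines with fewer than 8 tab-separated columns.
    bad=$(awk -F '\t' '!/^#/ && NF > 0 && NF < 8 { n++ } END { print n + 0 }' "$file")
    printf 'carriage_returns=%s short_data_lines=%s\n' "$cr" "$bad"
    # Succeed only if both counts are zero.
    [ "$cr" -eq 0 ] && [ "$bad" -eq 0 ]
}
```

For example, check_vcf input.vcf prints the two counts and returns a non-zero status if either is above zero. If the same file passes on both machines, the difference is more likely to be in the environment than in the input.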

