[ensembl-dev] vep 2.3 issues

Will McLaren wm2 at ebi.ac.uk
Mon Jan 30 11:02:45 GMT 2012


Hi Hardip,

Did you try deleting the adaptors.gz file as I described in a previous email?

Will
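
For reference, the renaming step mentioned above could be sketched in shell as follows. This is only a minimal sketch and not part of VEP: the function name backup_adaptors is hypothetical, and the default directory is the one given in the quoted message below (~/.vep/homo_sapiens/65/). Adjust it if your cache lives elsewhere.

```shell
# Hypothetical helper (not part of VEP): rename adaptors.gz to adaptors.gz.bak
# so that the script has to reload the database adaptors on its next start.
# The default directory is taken from the thread; pass another one as $1.
backup_adaptors() {
    cache_dir="${1:-$HOME/.vep/homo_sapiens/65}"
    if [ -f "$cache_dir/adaptors.gz" ]; then
        # Note: mv overwrites any existing adaptors.gz.bak
        mv "$cache_dir/adaptors.gz" "$cache_dir/adaptors.gz.bak"
        echo "renamed"
    else
        echo "not found"
    fi
}
```

Calling `backup_adaptors` with no argument acts on the default directory. Renaming rather than deleting keeps a backup that can be moved back if the change makes no difference.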

On 27 January 2012 21:19, Hardip Patel <hardip.patel at anu.edu.au> wrote:
> Hi Will,
>
> I tried to run the script with the following options:
>
> perl5.14.2 variant_effect_predictor.pl --output_file out.vep --species homo_sapiens --format vcf --buffer 1000000000 --terms ensembl --canonical --hgnc --regulatory --protein --gene --condel b --polyphen b --sift b --force_overwrite --input_file infile.vcf -cache -skip_db_check
>
> However, it gives the following error:
>
> Can't call method "fetch_by_region" on unblessed reference at ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm line 2144, <GEN0> line 131296.
>
> So I can use the database but not the cache for chr14 and chr15. Strange!
>
> Anyway, as the database is working, I will get by. I will try to create the cache myself and see if I can use that. Any other suggestions?
>
> kind regards
>
>
>
> Hardip R. Patel, PhD
> Post-doctoral Research Fellow
>
> Genome Discovery Unit and RNA Biology Lab
> Genome Biology Department
> The John Curtin School of Medical Research
> College of Medicine, Biology and Environment
> The Australian National University
> Building 131, Garran Road, ANU Campus, Acton - 0200, ACT, Australia
> Email: hardip.patel at anu.edu.au, patelhardip at gmail.com
> Phone Number: (+61) 0449 180 715
>
>
>
>
> On 28/01/2012, at 12:40 AM, Will McLaren wrote:
>
>> Hi Hardip,
>>
>> The cache has everything in it (transcripts, regulatory features and
>> variations).
>>
>> The only reason to use the database is for a few options, described in
>> the docs here:
>>
>> http://www.ensembl.org/info/docs/variation/vep/vep_script.html#limitations
>>
>> but it doesn't look like you're using any of those.
>>
>> Will
>>
>> On 27 January 2012 13:36, Hardip Patel <hardip.patel at anu.edu.au> wrote:
>>> Hi Will
>>>
>>> I am currently trying to run it on a full file for chr14 using just the database. Once it is finished, I will try using the cache only and see how that goes. I will keep you posted.
>>>
>>> However, could you please confirm that using just the cache will still cover all variations? I am guessing that the cache has everything precalculated and is a substitute for the database.
>>>
>>> Kind regards
>>>
>>>
>>> Hardip R. Patel, PhD
>>>
>>>
>>>
>>>
>>> On 28/01/2012, at 12:31 AM, Will McLaren wrote:
>>>
>>>> Hi Hardip,
>>>>
>>>> Of course, it would be useful to know the source of the problem, for both of us!
>>>>
>>>> Have you tried just using the cache (and not the database)?
>>>>
>>>> Will
>>>>
>>>> On 27 January 2012 13:28, Hardip Patel <hardip.patel at anu.edu.au> wrote:
>>>>> Dear Will,
>>>>>
>>>>> Thank you once again for the quick response.
>>>>>
>>>>>> Is there a reason why you are specifying the cache AND a local
>>>>>> database? Using the options below you shouldn't need the database as
>>>>>> well.
>>>>>
>>>>> From reading the manual on VEP usage, I thought that I needed both the local database and the cache, so that variations not listed in the cache would be covered by database searches. Am I wrong in thinking that? I wanted to use the cache for speed wherever possible.
>>>>>
>>>>>> This error is raised because the script can't get a slice adaptor for
>>>>>> your database. This might be for a number of reasons
>>>>>
>>>>> I have tried fetching a chromosome slice using the slice adaptor from the local database, and it works as expected. Is that what you meant?
>>>>>
>>>>>> have you generated the cache yourself, or downloaded it from our website? If
>>>>>> you have downloaded it, I am surprised that you are getting the line:
>>>>>>
>>>>>> 2012-01-27 10:41:59 - Reading cached adaptor data
>>>>>
>>>>>
>>>>> I did download the cache from the Ensembl website (twice). I get that line in the output for all other chromosomes as well. I am not sure what it does exactly, but since the script did not complain for the other chromosomes, I took it as a success.
>>>>>
>>>>>> Both files work fine for me, using either cache or database.
>>>>>
>>>>> I tried running the script again without using the cache and it works fine for me now. So I guess it solves the problem.
>>>>>
>>>>>> However, a quick look on Ensembl shows you're not missing much, at least for your chr15 file:
>>>>>
>>>>>
>>>>> The files that I sent you were only test files. I actually have nearly 130,000 variations to analyze. I am looking at whole genome resequencing data, so I wanted something faster for better runtimes.
>>>>>
>>>>> In short, I will avoid using the cache from now on just to make sure that there are no issues in running the script. However, out of curiosity, it would be nice to know the root of the problem.
>>>>>
>>>>> Your help is greatly appreciated.
>>>>>
>>>>> Kind regards
>>>>>
>>>>>
>>>>> Hardip R. Patel, PhD
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 27/01/2012, at 9:00 PM, Will McLaren wrote:
>>>>>
>>>>>> Hi Hardip,
>>>>>>
>>>>>> Is there a reason why you are specifying the cache AND a local
>>>>>> database? Using the options below you shouldn't need the database as
>>>>>> well.
>>>>>>
>>>>>> This error is raised because the script can't get a slice adaptor for
>>>>>> your database. This might be for a number of reasons - have you
>>>>>> generated the cache yourself, or downloaded it from our website? If
>>>>>> you have downloaded it, I am surprised that you are getting the line:
>>>>>>
>>>>>> 2012-01-27 10:41:59 - Reading cached adaptor data
>>>>>>
>>>>>> in your output.
>>>>>>
>>>>>> You could try deleting (or simply renaming for a backup) the
>>>>>> adaptors.gz file in ~/.vep/homo_sapiens/65/ - this should force the
>>>>>> script to reload the database adaptors on startup and may solve the
>>>>>> problem.
>>>>>>
>>>>>> It is odd that it doesn't work for those two chromosomes! Both files
>>>>>> work fine for me, using either cache or database. However, a quick
>>>>>> look on Ensembl shows you're not missing much, at least for your chr15
>>>>>> file:
>>>>>>
>>>>>> http://www.ensembl.org/Homo_sapiens/Location/View?r=15%3A20000447-20014544
>>>>>>
>>>>>> (this is the range of the chromosome covered by the variants in that file).
>>>>>>
>>>>>> Hope this helps
>>>>>>
>>>>>> Will
>>>>>>
>>>>>> On 27 January 2012 03:50, Hardip Patel <hardip.patel at anu.edu.au> wrote:
>>>>>>> Dear all
>>>>>>>
>>>>>>> I am trying to run the following command for VEP. I am running a local
>>>>>>> version of the Ensembl database (v65) and Perl version 5.14.2.
>>>>>>>
>>>>>>> perl5.14.2 variant_effect_predictor.pl --output_file outfile.vep --species
>>>>>>> homo_sapiens --host host --user user --password password --port 1111
>>>>>>> --db_version 65 --format vcf --buffer 1000000000 --terms ensembl --canonical
>>>>>>> --hgnc --cache --regulatory --protein --gene --condel b --polyphen b --sift
>>>>>>> b --force_overwrite --input_file infile.vcf -skip_db_check
>>>>>>>
>>>>>>> I have VCF files generated separately for each chromosome for 15 samples.
>>>>>>> The script works fine for all chromosomes and all samples except
>>>>>>> chr14 and chr15.
>>>>>>>
>>>>>>> Here is the log report after trying to run the script for chr14 and chr15.
>>>>>>>
>>>>>>> Use of qw(...) as parentheses is deprecated at variant_effect_predictor.pl
>>>>>>> line 848.
>>>>>>> 2012-01-27 10:41:59 - Read existing cache info
>>>>>>> 2012-01-27 10:41:59 - Reading cached adaptor data
>>>>>>> Use of uninitialized value $_[3] in join or string at
>>>>>>> ensembl-api/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm
>>>>>>> line 3497.
>>>>>>> 2012-01-27 10:41:59 - INFO: Defined host ##### is different from cached
>>>>>>> 2012-01-27 10:41:59 - Starting...
>>>>>>> 2012-01-27 10:42:03 - Read 131269 variants into buffer
>>>>>>> 2012-01-27 10:42:04 - Analyzing chromosome 14
>>>>>>> 2012-01-27 10:42:04 - Reading transcript data from cache and/or database
>>>>>>> [>                                              ]    [ 0% ]
>>>>>>> Can't call method "fetch_by_region" on unblessed reference at
>>>>>>> ensembl-api/ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm
>>>>>>> line 2144, <GEN0> line 131296.
>>>>>>>
>>>>>>> I have tried downloading the cache again to see if that helps. However, I
>>>>>>> am still not able to run it on these two chromosomes. Could you please let
>>>>>>> me know how to correct this issue?
>>>>>>>
>>>>>>> I have attached two vcf files as examples to test the code on your end.
>>>>>>>
>>>>>>> NB: only chromosomes 14 and 15 are affected. The rest are all fine and
>>>>>>> give appropriate results.
>>>>>>>
>>>>>>>
>>>>>>> Any help is greatly appreciated.
>>>>>>>
>>>>>>> Kind regards
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hardip R. Patel, PhD
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Dev mailing list    Dev at ensembl.org
>>>>>>> List admin (including subscribe/unsubscribe):
>>>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>>>> Ensembl Blog: http://www.ensembl.info/
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>



