[ensembl-dev] Could not find variation cache for

Will McLaren wm2 at ebi.ac.uk
Mon Jun 8 15:41:18 BST 2015


My apologies, paste failure!
http://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#gtf is the
correct link for the file specs.

--variant_class was added for release 80, so you can exclude that if you
are using 79.
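
A minimal sketch of how the flag list could be chosen per release. The
vep_flags helper is hypothetical (it is not part of the VEP), the paths are
copied from the command quoted below, and the command is only printed here,
not run:

```shell
#!/bin/sh
# Hypothetical helper: print the annotation flags that can be used with a
# user-generated (gtf2vep.pl) cache for a given VEP release number.
vep_flags() {
    release="$1"
    flags="--biotype --numbers"
    # --variant_class only exists from release 80 onwards
    if [ "$release" -ge 80 ]; then
        flags="--variant_class $flags"
    fi
    printf '%s\n' "$flags"
}

# Print the full command for release 79 so it can be checked before running.
# $(vep_flags 79) is left unquoted on purpose so it splits into separate flags.
echo perl variant_effect_predictor.pl --no_progress $(vep_flags 79) \
    --offline --custom ../ref/pao1.gff.gz,pao1-genes,gff,overlap,0 \
    --format vcf -i ./test.vcf -o ./test.txt --species pao1 \
    --dir_cache ./variant_effect_predictor_version79/cache_files
```

Dropping the leading echo would run the command itself; passing 80 instead
of 79 adds --variant_class back in.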

Regards

Will

On 8 June 2015 at 15:35, Schmucki, Roland <roland.schmucki at roche.com> wrote:

> Hi Will
>
> Many thanks for your explanations.
> However, the tool claims that it cannot find the --variant_class option
>
>  perl variant_effect_predictor.pl --no_progress --variant_class
> --biotype --numbers --offline --custom
> ../ref/pao1.gff.gz,pao1-genes,gff,overlap,0 --format vcf -i ./test.vcf -o
> ./test.txt --species pao1 --dir_cache
> ./variant_effect_predictor_version79/cache_files
> Unknown option: variant_class
> ERROR: Failed to parse command-line flags
>
> I am using version 79, is this a version issue?
>
> Also, I could not find the GTF/GFF specifications via the second link
> you gave.
>
> Thanks for help!
>
> Best,
> R.
>
> On Mon, Jun 8, 2015 at 10:38 AM, Will McLaren <wm2 at ebi.ac.uk> wrote:
>
>> Hi Roland,
>>
>> You can ignore that warning message; when you specify --everything, it
>> switches on a few options which tell the VEP to expect to find cache files
>> containing co-located variants. Since you generated your cache yourself,
>> these files don't exist, which is why the code is complaining. You can
>> either continue to ignore the warnings, or substitute --everything for the
>> list of flags specified here:
>>
>>
>> http://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_everything
>>
>> In fact in your case only the following will work with a user-generated
>> cache anyway: --variant_class, --biotype, --numbers
>>
>> Regarding the lack of protein-changing results, there is every chance
>> that the cache has not been generated correctly from the GTF. I notice you
>> converted a GFF; it's worth checking the result, since the requirements on
>> the input GTF are quite strict, see
>> http://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_everything
>>
>> It is on our to-do list to make this script compatible with a wider
>> spectrum of GFF/GTF formatting.
>>
>> Regards
>>
>> Will
>>
>> On 5 June 2015 at 13:52, Schmucki, Roland <roland.schmucki at roche.com>
>> wrote:
>>
>>> Dear Will
>>>
>>> Thank you very much for the quick response.
>>> I would like to post this issue to the public Ensembl mailing list.
>>> Here is a brief description of the problem I encountered:
>>>
>>>
>>> When running VEP with Ensembl annotation files I get errors of the form
>>> "Could not find variation cache for Chromosome..."
>>>
>>> I downloaded a genome (i.e. pao1, $name.fa) and annotation ($name.gff3)
>>> from the Ensembl FTP site and then created the cache files according to
>>> the VEP tutorial:
>>>
>>>
>>> sort -k1,1 -k4,4n $name.gff | bgzip > $name.gff.gz
>>> tabix -p gff $name.gff.gz
>>> ./cufflinks/gffread $name.gff -T -o $name.gtf
>>> perl gtf2vep.pl -i $name.gtf -f $name.fa -d 79 -s $name --dir
>>> variant_effect_predictor_version79/cache_files_
>>> and moved the cache files to the correct location manually.
>>>
>>> This all seems to have worked fine without any error or warning messages.
>>> Then I mapped the reads to the genome, ran Freebayes (variants.vcf with
>>> 2700 variants) and at the very end applied VEP with the following command:
>>>
>>>
>>> perl variant_effect_predictor.pl --everything --offline --custom
>>> $name.gff.gz,$name-genes,gff,overlap,0 --format vcf -i variants.vcf -o
>>> variants.txt --species $name --dir_cache $VEP_DATA
>>>
>>>
>>> The variable VEP_DATA points to the cache directory, which contains
>>> the following files (file size and creation date):
>>> $VEP_DATA/pao1/79/Chromosome/
>>> 292135 Jun  5 09:10 3000001-4000000.gz
>>> 294904 Jun  5 09:10 1000001-2000000.gz
>>> 290186 Jun  5 09:10 1-1000000.gz
>>> 290763 Jun  5 09:10 5000001-6000000.gz
>>> 284789 Jun  5 09:10 2000001-3000000.gz
>>> 292462 Jun  5 09:10 4000001-5000000.gz
>>> 78483 Jun  5 09:10 6000001-7000000.gz
>>>
>>>
>>> When I run VEP I get the following errors and warnings (See attached log
>>> file for all details):
>>> WARNING: Could not find variation cache for Chromosome:1-1000000
>>> WARNING: Could not find variation cache for Chromosome:5000001-6000000
>>> etc.
>>>
>>>
>>> I don't understand why I got these errors/warnings.
>>> Thanks a lot for any advice!
>>>
>>> Best,
>>>
>>> R.
>>>
>>>
>>> PS: there is an output file generated with variant annotations of the
>>> form:
>>>
>>> #Uploaded_variation  Location  Allele  Gene  Feature  Feature_type  Consequence  cDNA_position  CDS_position  Protein_position  Amino_acids  Codons  Existing_variation  Extra
>>> Chromosome_2415_G/T  Chromosome:2415  T  gene:PA0005  transcript:AAG03395  Transcript  downstream_gene_variant  -  -  -  -  -  -  IMPACT=MODIFIER;pao1-genes=gene:PA0002,exon_Chromosome:2056-3159,CDS:AAG03392,transc
>>>
>>> However, no amino acid changes are reported, which seems unlikely.
>>>
>>>
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info:
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/
>>>
>>>
>>
>
>
> --
>
> Roland Schmucki, PhD
> Computational Biologist, Pharmaceutical Sciences
> Roche Pharma Research and Early Development
>
>
> Roche Innovation Center Basel
>
> F. Hoffmann-La Roche Ltd
> Grenzacherstrasse 124
> 4070 Basel
>
> Switzerland
> Phone +41 61 687 13 30
>
>
>
>