You are using the GRCh37 cache. Chromosome 20 on GRCh37 is only 63,025,520bp long:

http://grch37.ensembl.org/Homo_sapiens/Location/Chromosome?r=20

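The region in the warning, 20:63000001-64000000, therefore covers only the final ~25kb of the chromosome. With so little sequence there, it is unsurprising that no variants are reported in it, and hence that the variation cache file is not found.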
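
As a rough illustration of the arithmetic, here is a minimal Python sketch. The 1Mb chunk size is an assumption inferred from the region named in the warning, and cache_chunk is a hypothetical helper, not part of VEP:

```python
# Chromosome 20 length on GRCh37, as given above.
CHR20_GRCH37_LENGTH = 63_025_520

# Assumption: regions are 1 Mb wide, inferred from the
# "20:63000001-64000000" region named in the warning.
CHUNK_SIZE = 1_000_000


def cache_chunk(pos, chunk_size=CHUNK_SIZE):
    """Return the (start, end) of the 1-based chunk containing pos."""
    start = (pos - 1) // chunk_size * chunk_size + 1
    return start, start + chunk_size - 1


# Chunk that contains the last base of chromosome 20.
start, end = cache_chunk(CHR20_GRCH37_LENGTH)

# Number of real bases that chunk actually covers.
covered = CHR20_GRCH37_LENGTH - start + 1

print(f"chunk 20:{start}-{end} covers {covered} bp of sequence")
```

The last chunk nominally spans 1Mb, but only about 25kb of it exists on GRCh37.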

Double check that your data are actually mapped to GRCh37, and not GRCh38, where chromosome 20 is now 64,444,167bp long.
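
One quick sanity check, sketched below in Python, is to look for chromosome 20 positions in your VCF that lie beyond the GRCh37 length. The function names are hypothetical and this is not a VEP feature. It assumes a plain-text, tab-delimited VCF with the chromosome named either "20" or "chr20":

```python
# Chromosome 20 lengths quoted in this thread.
CHR20_LENGTHS = {"GRCh37": 63_025_520, "GRCh38": 64_444_167}


def max_position(vcf_lines, chrom="20"):
    """Return the largest POS seen on chrom, or None if chrom is absent."""
    wanted = {chrom, "chr" + chrom}
    highest = None
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split("\t")
        if fields[0] in wanted:
            pos = int(fields[1])
            if highest is None or pos > highest:
                highest = pos
    return highest


def exceeds_grch37(vcf_lines, chrom="20"):
    """True if any position on chrom lies past the end of GRCh37 chr20."""
    highest = max_position(vcf_lines, chrom)
    return highest is not None and highest > CHR20_LENGTHS["GRCh37"]
```

You could call exceeds_grch37(open("test.na12878.vcf")) on an uncompressed file. A True result means the coordinates cannot be GRCh37. A False result does not prove they are GRCh37, only that nothing falls past the end of the chromosome.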

If you are confident your data are correct and you are using the correct assembly, then you can safely ignore these warnings.


> On 14 May 2015, at 09:48, Svein Tore Koksrud Seljebotn <s.t.seljebotn at medisin.uio.no> wrote:
> Hi,
> I'm using VEP release-79 with cache from ftp://ftp.ensembl.org/pub/release-79/variation/VEP/homo_sapiens_merged_vep_79_GRCh37.tar.gz .
> I tried to annotate some human exome data and I get the following error message in a separate file called {output_name}_warnings.txt:
> "WARNING: Could not find variation cache for 20:63000001-64000000"
> Looking at the files for this range inside the cache, I can see that those files contain almost no data. There are also other regions that are as small, while most are from 1MB and upwards.
> My question: is this something I should worry about? Is my cache corrupted somehow?
> Command line:
> vep --cache --dir_cache=/work/genomic/funcAnnot/VEP/cache/ --fasta=/work/genomic/gatkBundle_2.5/human_g1k_v37_decoy.fasta --offline --sift=b --polyphen=b --ccds --hgvs --numbers --domains --regulatory --canonical --protein --biotype --gmaf --maf_1kg --maf_esp --pubmed --allow_non_variant --fork=4 --vcf --allele_number --no_escape --failed=1 --no_stats --merged --symbol -custom /work/genomic/variantDbs/repeatMasker/repeatMasker_hg19.20150508.reformat.sorted.bed.gz,repeatMasker,bed,overlap,0 -i test.na12878.vcf -o vep_processed.vcf
