[ensembl-dev] VEP installer script fails to download homo_sapiens_refseq_vep_73.tar.gz

Chris Boustred cboustred at gmail.com
Thu Nov 7 16:47:13 GMT 2013


Hi,

I am using the VEP installer script to download and unpack caches to use 
with the VEP script.

I would like to use the human refseq cache, to get NM_ transcript IDs, 
as this is what my colleagues would like reported in their output.

When prompted which cache to download, if I choose '25 : 
homo_sapiens_refseq_vep_73.tar.gz', it is downloaded and put into a tmp 
folder within ~/.vep. However, it appears to fail to unpack, as the 
resulting cache folder (homo_sapiens) is empty.
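
One way to tell a bad download apart from a failed unpack, as a rough 
sketch: test whether tar can read the archive, and whether the cache 
folder holds any files. The function names are made up for illustration, 
and the ~/.vep/tmp location in the usage comments is an assumption based 
on the "tmp folder within ~/.vep" described above.

```shell
# Succeeds only if tar can list the whole gzipped archive, which a
# truncated or corrupt download normally cannot satisfy.
tarball_is_readable() {
    tar -tzf "$1" > /dev/null 2>&1
}

# Succeeds only if the directory exists and contains at least one file.
cache_is_populated() {
    [ -d "$1" ] && [ -n "$(find "$1" -type f -print | head -n 1)" ]
}

# Example usage (paths are assumptions, adjust to the local setup):
# tarball_is_readable "$HOME/.vep/tmp/homo_sapiens_refseq_vep_73.tar.gz" \
#     || echo "download looks incomplete or corrupt"
# cache_is_populated "$HOME/.vep/homo_sapiens" \
#     || echo "cache folder is empty"
```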

If I choose '26 : homo_sapiens_vep_73.tar.gz' the unpacked 
'homo_sapiens' folder contains all the cache information.

I therefore downloaded both cache files directly from 
ftp://ftp.ensembl.org/pub/release-73/variation/VEP/ but, when unpacked, 
both produce a folder named 'homo_sapiens'. I believe in the past the 
refseq cache had a different name, e.g. homo_sapiens_refseq? I am using 
--dir_cache to get around this.
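
As a rough sketch of that workaround, each tarball can be unpacked under 
its own parent directory, and that parent is then what gets passed to 
--dir_cache. The helper name unpack_cache is made up for illustration; 
the destination paths in the usage comments mirror the --dir_cache values 
used in the commands in this message.

```shell
# Unpack a cache tarball into its own parent directory, so that the RefSeq
# and Ensembl caches (which both unpack to "homo_sapiens") do not collide.
unpack_cache() {
    tarball=$1   # e.g. homo_sapiens_refseq_vep_73.tar.gz
    dest=$2      # e.g. "$HOME/.vep/Refseq"
    mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest"
}

# Example usage:
# unpack_cache homo_sapiens_refseq_vep_73.tar.gz "$HOME/.vep/Refseq"
# unpack_cache homo_sapiens_vep_73.tar.gz        "$HOME/.vep"
```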

Finally, when running the VEP script with the refseq cache and using the 
--symbol flag I was getting the error:

Can't call method "display_xref" on an undefined value at 
/home/chris/VEP/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm 
line 1997.

And the process hangs.

If I run with the --refseq flag I no longer get the error, but the 
--symbol output (the HGNC gene symbol) is not populated.

I don't get any errors if I use the ensembl vep cache...

Here are the three commands I am running:

1. Using the refseq cache without the --refseq flag (throws the 
"/VEP/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm line 
1997" error):

perl $VEP/variant_effect_predictor.pl \
-fork 4 \
--buffer_size 10000 \
--cache \
--dir_cache /home/chris/.vep/Refseq \
--dir_plugins /home/chris/.vep/Plugins \
--fasta /home/chris/.vep/EnsemblRef/Homo_sapiens.GRCh37.73.dna.primary_assembly.fa \
--input_file $inputVCF \
--output_file $outputVCF \
--sift b  \
--polyphen b  \
--allele_number \
--numbers \
--domains \
--HGVS \
--protein \
--symbol \
--ccds \
--canonical \
--biotype \
--check_alleles \
--gmaf \
--maf_1kg \
--maf_esp \
--pubmed \
--vcf \
--force_overwrite \
--plugin FATHMM,"python ~/Reference_sequences/Variants/FATHMM/fathmm.py"


2. As above but with the --refseq flag: works without an error, but the 
HGNC symbol (--symbol) is not populated.

perl $VEP/variant_effect_predictor.pl \
-fork 4 \
--buffer_size 10000 \
--cache \
--dir_cache /home/chris/.vep/Refseq \
--dir_plugins /home/chris/.vep/Plugins \
--fasta /home/chris/.vep/EnsemblRef/Homo_sapiens.GRCh37.73.dna.primary_assembly.fa \
--input_file $inputVCF \
--output_file $outputVCF \
--sift b  \
--polyphen b  \
--allele_number \
--numbers \
--domains \
--HGVS \
--protein \
--symbol \
--ccds \
--canonical \
--biotype \
--check_alleles \
--gmaf \
--maf_1kg \
--maf_esp \
--pubmed \
--vcf \
--refseq \
--force_overwrite \
--plugin FATHMM,"python ~/Reference_sequences/Variants/FATHMM/fathmm.py"

3. Using the ensembl cache: works, but no refseq transcript IDs!

perl $VEP/variant_effect_predictor.pl \
-fork 4 \
--buffer_size 10000 \
--cache \
--dir_cache /home/chris/.vep/ \
--dir_plugins /home/chris/.vep/Plugins \
--fasta /home/chris/.vep/EnsemblRef/Homo_sapiens.GRCh37.73.dna.primary_assembly.fa \
--input_file $inputVCF \
--output_file $outputVCF \
--sift b  \
--polyphen b  \
--allele_number \
--numbers \
--domains \
--HGVS \
--protein \
--symbol \
--ccds \
--canonical \
--biotype \
--check_alleles \
--gmaf \
--maf_1kg \
--maf_esp \
--pubmed \
--vcf \
--refseq \
--force_overwrite \
--plugin FATHMM,"python ~/Reference_sequences/Variants/FATHMM/fathmm.py"
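
Since the three commands differ only in the --dir_cache value and in 
whether --refseq is passed, they can be written as one helper to make the 
difference between runs explicit. This is only a sketch: run_vep is a 
hypothetical wrapper name, and it assumes $VEP, $inputVCF and $outputVCF 
are set as in the commands above.

```shell
# Run VEP with the shared options from the commands above.
# $1 is the --dir_cache value; any further arguments (e.g. --refseq)
# are appended unchanged.
run_vep() {
    cache_dir=$1
    shift
    perl "$VEP/variant_effect_predictor.pl" \
        -fork 4 \
        --buffer_size 10000 \
        --cache \
        --dir_cache "$cache_dir" \
        --dir_plugins /home/chris/.vep/Plugins \
        --fasta /home/chris/.vep/EnsemblRef/Homo_sapiens.GRCh37.73.dna.primary_assembly.fa \
        --input_file "$inputVCF" \
        --output_file "$outputVCF" \
        --sift b \
        --polyphen b \
        --allele_number \
        --numbers \
        --domains \
        --HGVS \
        --protein \
        --symbol \
        --ccds \
        --canonical \
        --biotype \
        --check_alleles \
        --gmaf \
        --maf_1kg \
        --maf_esp \
        --pubmed \
        --vcf \
        --force_overwrite \
        --plugin 'FATHMM,python ~/Reference_sequences/Variants/FATHMM/fathmm.py' \
        "$@"
}

# The three runs, with flags exactly as posted:
# run_vep /home/chris/.vep/Refseq            # 1. refseq cache, no --refseq
# run_vep /home/chris/.vep/Refseq --refseq   # 2. refseq cache with --refseq
# run_vep /home/chris/.vep/ --refseq         # 3. ensembl cache
```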

Any help with the above would be much appreciated!

Thanks

Chris



-- 

*Chris Boustred*
Laboratory Bioinformatician
Regional Molecular Genetics
Great Ormond Street for Children NHS Foundation Trust
Level 6, York House
37 Queen Square
London
WC1N 3BH
christopher.boustred at gosh.nhs.uk
cboustred at gmail.com
Phone: 020 7762 6874
Fax: 020 7813 8196
