[ensembl-dev] Fwd: Variant_effect_predictor Refseq and HGNC annotation and exception

NextGenSeb nextgenseb at gmail.com
Wed Feb 6 04:29:27 GMT 2013



Dear all,

I recently came across your Variant Effect Predictor. First of all,
thanks for making it available, it is a great tool!
I do have a few questions, however, about the --refseq and --hgnc flags.

First, the --refseq flag:

After downloading both the homo_sapiens_refseq_vep_70.tar.gz and
homo_sapiens_vep_70.tar.gz caches from your ftp site and putting them
into the /Data/VEP/homo_sapiens_refseq and /Data/VEP/homo_sapiens
folders, respectively, I tried the following two scenarios:

perl
/usr/bioinf/source/variant_effect_predictor/variant_effect_predictor.pl
-i ./test.short.vcf --format vcf -o test.vep --verbose --force_overwrite
--refseq --cache --dir /Data/VEP/ --offline --everything --species
homo_sapiens

perl
/usr/bioinf/source/variant_effect_predictor/variant_effect_predictor.pl
-i ./test.short.vcf --format vcf -o test.vep --verbose --force_overwrite
--refseq --cache --dir /Data/VEP/ --offline --everything --species
homo_sapiens_refseq

Both commands ran without error, but neither incorporated the RefSeq
annotation in the output. Only after deleting both caches and unpacking
homo_sapiens_refseq_vep_70.tar.gz into /Data/VEP/homo_sapiens did the
RefSeq NM_ tags appear, and then they appeared even without --refseq
set. This behavior did not change when I removed the --offline or the
--everything flag.

In principle that would be fine, but the --hgnc flag appears to work
only with the homo_sapiens_vep_70.tar.gz data set and not with the
RefSeq one, again regardless of whether VEP is run in online or offline
mode.
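
For reference, this is roughly how the output file can be checked for
the two kinds of annotation. It is only a minimal sketch: count_matches
is a helper name made up for this email, not part of VEP, and it assumes
the default tab-delimited output in which header lines start with '#'
and the --hgnc symbol shows up as an HGNC= entry.

```shell
# Count the non-header lines of a VEP output file that match a pattern.
# count_matches is a hypothetical helper, not part of VEP.
count_matches() {
    pattern="$1"
    file="$2"
    # Drop header/comment lines (those starting with '#'),
    # then count the remaining lines that contain the pattern.
    grep -v '^#' "$file" | grep -c -- "$pattern"
}

# Example calls against the output file written by the commands above:
#   count_matches 'NM_' test.vep     # lines carrying RefSeq NM_ tags
#   count_matches 'HGNC=' test.vep   # lines carrying an HGNC symbol (assumed key name)
```

A count of 0 for 'NM_' corresponds to the case where the RefSeq
annotation was not incorporated, and a count of 0 for 'HGNC=' to the
case where --hgnc had no effect.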

Now removing all cache files and running

perl
/usr/bioinf/source/variant_effect_predictor/variant_effect_predictor.pl
-i ./test.short.vcf --format vcf -o test.vep --verbose --force_overwrite
--refseq --cache --dir /Data/VEP/ --write_cache --species homo_sapiens

does give the correct output. However, in this case adding the
--everything flag will not only put a huge strain on the server
connection when using datasets bigger than my current test one, but it
also throws an exception (see end of email). Hence I would rather work
off an offline cache. Could you please advise me whether there is a way
to fix this issue?

Thanks in advance for your help,
Cheers
Seb



perl
/usr/bioinf/source/variant_effect_predictor/variant_effect_predictor.pl
-i ./test.short.vcf --format vcf -o test.vep --verbose --force_overwrite
--refseq --cache --dir /Data/VEP/test --write_cache --everything
--species homo_sapiens --host useastdb.ensembl.org

#----------------------------------#
# ENSEMBL VARIANT EFFECT PREDICTOR #
#----------------------------------#

version 2.8

By Will McLaren (wm2 at ebi.ac.uk)

Configuration options:

cache              1
canonical          1
ccds               1
core_type          otherfeatures
dir                /Data/VEP/test
domains            1
everything         1
force_overwrite    1
format             vcf
gmaf               1
hgnc               1
hgvs               1
host               useastdb.ensembl.org
input_file         ./test.short.vcf
numbers            1
output_file        test.vep
polyphen           b
port               5306
protein            1
refseq             1
regulatory         1
sift               b
species            homo_sapiens
toplevel_dir       /Data/VEP/test
verbose            1
write_cache        1

--------------------



Will only load v70 databases
Species 'homo_sapiens' loaded from database 'homo_sapiens_core_70_37'
Species 'homo_sapiens' loaded from database 'homo_sapiens_cdna_70_37'
Species 'homo_sapiens' loaded from database 'homo_sapiens_vega_70_37'
Species 'homo_sapiens' loaded from database
'homo_sapiens_otherfeatures_70_37'
Species 'homo_sapiens' loaded from database 'homo_sapiens_rnaseq_70_37'
homo_sapiens_variation_70_37 loaded
homo_sapiens_funcgen_70_37 loaded
Bio::EnsEMBL::Compara::DBSQL::DBAdaptor not found so the following
compara databases will be ignored: ensembl_compara_70
ensembl_ancestral_70 loaded
ensembl_ontology_70 loaded
ensembl_stable_ids_70 loaded
2013-02-06 11:21:35 - Connected to core version 70 database and
variation version 70 database
2013-02-06 11:21:35 - INFO: Cache directory
/Data/VEP/test/homo_sapiens/70 not found - it will be created
2013-02-06 11:21:40 - INFO: Database will be accessed when using --hgvs
2013-02-06 11:21:40 - INFO: Database will be accessed when using --sift;
consider using the complete cache containing sift data (see
documentation for details)
2013-02-06 11:21:40 - INFO: Database will be accessed when using
--polyphen; consider using the complete cache containing polyphen data
(see documentation for details)
2013-02-06 11:21:40 - INFO: Database will be accessed when using
--regulatory; consider using the complete cache containing regulatory
data (see documentation for details)
2013-02-06 11:21:40 - Starting...
2013-02-06 11:21:42 - Read 62 variants into buffer
2013-02-06 11:21:42 - Reading transcript data from cache and/or database
[======================================================================================================]
[ 100% ]
2013-02-06 12:01:12 - Retrieved 409 transcripts (0 mem, 0 cached, 411
DB, 2 duplicates)
2013-02-06 12:01:12 - Reading regulatory data from cache and/or database
[======================================================================================================]
[ 100% ]
2013-02-06 12:05:08 - Retrieved 1738 regulatory features (0 mem, 0
cached, 1738 DB, 0 duplicates)
2013-02-06 12:05:08 - Checking for existing variations
[======================================================================================================]
[ 100% ]
2013-02-06 12:05:33 - Analyzing chromosome 1
2013-02-06 12:05:33 - Analyzing variants
[======================================================================================================]
[ 100% ]
2013-02-06 12:05:33 - Analyzing RegulatoryFeatures
[======================================================================================================]
[ 100% ]
2013-02-06 12:05:33 - Analyzing MotifFeatures
[======================================================================================================]
[ 100% ]
2013-02-06 12:05:33 - Calculating consequences
[=======> ]    [ 9% ]
-------------------- EXCEPTION --------------------
MSG: Got to have an Exon object, not a
STACK Bio::EnsEMBL::Translation::start_Exon
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Translation.pm:271
STACK Bio::EnsEMBL::Transcript::transfer
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Transcript.pm:2511
STACK Bio::EnsEMBL::Variation::BaseTranscriptVariation::_three_prime_utr
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/BaseTranscriptVariation.pm:638
STACK
Bio::EnsEMBL::Variation::TranscriptVariationAllele::_get_alternate_cds
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/TranscriptVariationAllele.pm:1169
STACK
Bio::EnsEMBL::Variation::TranscriptVariationAllele::_get_fs_peptides
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/TranscriptVariationAllele.pm:1090
STACK
Bio::EnsEMBL::Variation::TranscriptVariationAllele::_get_hgvs_peptides
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/TranscriptVariationAllele.pm:930
STACK Bio::EnsEMBL::Variation::TranscriptVariationAllele::hgvs_protein
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/TranscriptVariationAllele.pm:705
STACK Bio::EnsEMBL::Variation::Utils::VEP::tva_to_line
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm:1639
STACK Bio::EnsEMBL::Variation::Utils::VEP::vf_to_consequences
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm:1472
STACK Bio::EnsEMBL::Variation::Utils::VEP::vf_list_to_cons
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm:1275
STACK Bio::EnsEMBL::Variation::Utils::VEP::get_all_consequences
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm:1056
STACK main::main
/usr/bioinf/source/variant_effect_predictor/variant_effect_predictor.pl:270
STACK toplevel
/usr/bioinf/source/variant_effect_predictor/variant_effect_predictor.pl:116
Date (localtime)    = Wed Feb  6 12:05:33 2013
Ensembl API version = 70
---------------------------------------------------






