[ensembl-dev] Fwd: Variant_effect_predictor Refseq and HGNC annotation and exception
NextGenSeb
nextgenseb at gmail.com
Wed Feb 6 04:29:27 GMT 2013
Dear all,
I recently came across your variant effect predictor, so first of all
thanks for making that available, great tool!
I do have a few questions, however, about the --refseq and --hgnc flags.
First of all, for the --refseq flag:
After downloading both the homo_sapiens_refseq_vep_70.tar.gz and
homo_sapiens_vep_70.tar.gz caches from your FTP site and putting them
into the /Data/VEP/homo_sapiens_refseq and /Data/VEP/homo_sapiens folders,
respectively, I tried the following two scenarios:
perl /usr/bioinf/source/variant_effect_predictor/variant_effect_predictor.pl \
    -i ./test.short.vcf --format vcf -o test.vep --verbose --force_overwrite \
    --refseq --cache --dir /Data/VEP/ --offline --everything \
    --species homo_sapiens
perl /usr/bioinf/source/variant_effect_predictor/variant_effect_predictor.pl \
    -i ./test.short.vcf --format vcf -o test.vep --verbose --force_overwrite \
    --refseq --cache --dir /Data/VEP/ --offline --everything \
    --species homo_sapiens_refseq
Both commands ran without error, but neither incorporated the RefSeq
annotation in the output. Only when I deleted both caches and
unpacked homo_sapiens_refseq_vep_70.tar.gz into /Data/VEP/homo_sapiens
did the RefSeq NM_ identifiers get incorporated, and then even without
--refseq set. This behavior did not change when I removed the --offline
or the --everything flag.
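For reference, a quick way to tell whether the RefSeq annotation made it into the output is to count the result lines that carry an NM_ accession. This is only a minimal sketch: it assumes the output is the default tab-delimited text with header lines starting with '#', and count_refseq_lines is a hypothetical helper name, not part of VEP.

```shell
# count_refseq_lines: hypothetical helper, not part of VEP.
# Prints the number of non-header lines that mention an NM_ accession.
count_refseq_lines() {
    # Drop header lines (starting with '#'), then count lines with NM_<digit>.
    # '|| true' keeps the exit status at 0 when the count is zero.
    grep -v '^#' "$1" | grep -c 'NM_[0-9]' || true
}
```

Running `count_refseq_lines test.vep` after each scenario gives 0 when no RefSeq transcripts were annotated.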
In principle that would be fine; however, the --hgnc flag appears to
work only with the homo_sapiens_vep_70.tar.gz data set and not the RefSeq
one, again regardless of whether I run in online or offline mode.
Now removing all cache files and running
perl /usr/bioinf/source/variant_effect_predictor/variant_effect_predictor.pl \
    -i ./test.short.vcf --format vcf -o test.vep --verbose --force_overwrite \
    --refseq --cache --dir /Data/VEP/ --write_cache --species homo_sapiens
does actually give the correct output. However, in this case adding the
--everything flag not only puts a huge strain on the server
connection when using datasets bigger than my current test one, but also
throws an exception (see end of email). Hence I would rather work off an
offline cache. Could you please advise me whether there is a way to fix
this issue?
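Judging from the log at the end of this email, which looks for /Data/VEP/test/homo_sapiens/70, the cache directory appears to be built as <dir>/<species>/<version>. The sketch below only checks whether that directory exists for a given --dir and --species. The layout is an assumption drawn from that one log line, and vep_cache_path is a hypothetical helper, not part of VEP.

```shell
# vep_cache_path: hypothetical helper, not part of VEP.
# Builds <dir>/<species>/<version>, the layout the log output suggests.
vep_cache_path() {
    dir=${1%/}    # strip one trailing slash from the --dir value
    printf '%s/%s/%s\n' "$dir" "$2" "$3"
}

# Report whether the expected cache directory is present.
p=$(vep_cache_path /Data/VEP/ homo_sapiens 70)
if [ -d "$p" ]; then
    echo "found: $p"
else
    echo "missing: $p"
fi
```

The same check with homo_sapiens_refseq as the second argument shows whether a separately named cache directory is in place.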
Thanks in advance for your help,
Cheers
Seb
perl /usr/bioinf/source/variant_effect_predictor/variant_effect_predictor.pl \
    -i ./test.short.vcf --format vcf -o test.vep --verbose --force_overwrite \
    --refseq --cache --dir /Data/VEP/test --write_cache --everything \
    --species homo_sapiens --host useastdb.ensembl.org
#----------------------------------#
# ENSEMBL VARIANT EFFECT PREDICTOR #
#----------------------------------#
version 2.8
By Will McLaren (wm2 at ebi.ac.uk)
Configuration options:
cache            1
canonical        1
ccds             1
core_type        otherfeatures
dir              /Data/VEP/test
domains          1
everything       1
force_overwrite  1
format           vcf
gmaf             1
hgnc             1
hgvs             1
host             useastdb.ensembl.org
input_file       ./test.short.vcf
numbers          1
output_file      test.vep
polyphen         b
port             5306
protein          1
refseq           1
regulatory       1
sift             b
species          homo_sapiens
toplevel_dir     /Data/VEP/test
verbose          1
write_cache      1
--------------------
Will only load v70 databases
Species 'homo_sapiens' loaded from database 'homo_sapiens_core_70_37'
Species 'homo_sapiens' loaded from database 'homo_sapiens_cdna_70_37'
Species 'homo_sapiens' loaded from database 'homo_sapiens_vega_70_37'
Species 'homo_sapiens' loaded from database
'homo_sapiens_otherfeatures_70_37'
Species 'homo_sapiens' loaded from database 'homo_sapiens_rnaseq_70_37'
homo_sapiens_variation_70_37 loaded
homo_sapiens_funcgen_70_37 loaded
Bio::EnsEMBL::Compara::DBSQL::DBAdaptor not found so the following
compara databases will be ignored: ensembl_compara_70
ensembl_ancestral_70 loaded
ensembl_ontology_70 loaded
ensembl_stable_ids_70 loaded
2013-02-06 11:21:35 - Connected to core version 70 database and
variation version 70 database
2013-02-06 11:21:35 - INFO: Cache directory
/Data/VEP/test/homo_sapiens/70 not found - it will be created
2013-02-06 11:21:40 - INFO: Database will be accessed when using --hgvs
2013-02-06 11:21:40 - INFO: Database will be accessed when using --sift;
consider using the complete cache containing sift data (see
documentation for details)
2013-02-06 11:21:40 - INFO: Database will be accessed when using
--polyphen; consider using the complete cache containing polyphen data
(see documentation for details)
2013-02-06 11:21:40 - INFO: Database will be accessed when using
--regulatory; consider using the complete cache containing regulatory
data (see documentation for details)
2013-02-06 11:21:40 - Starting...
2013-02-06 11:21:42 - Read 62 variants into buffer
2013-02-06 11:21:42 - Reading transcript data from cache and/or database
[======================================================================================================]
[ 100% ]
2013-02-06 12:01:12 - Retrieved 409 transcripts (0 mem, 0 cached, 411
DB, 2 duplicates)
2013-02-06 12:01:12 - Reading regulatory data from cache and/or database
[======================================================================================================]
[ 100% ]
2013-02-06 12:05:08 - Retrieved 1738 regulatory features (0 mem, 0
cached, 1738 DB, 0 duplicates)
2013-02-06 12:05:08 - Checking for existing variations
[======================================================================================================]
[ 100% ]
2013-02-06 12:05:33 - Analyzing chromosome 1
2013-02-06 12:05:33 - Analyzing variants
[======================================================================================================]
[ 100% ]
2013-02-06 12:05:33 - Analyzing RegulatoryFeatures
[======================================================================================================]
[ 100% ]
2013-02-06 12:05:33 - Analyzing MotifFeatures
[======================================================================================================]
[ 100% ]
2013-02-06 12:05:33 - Calculating consequences
[=======> ] [ 9% ]
-------------------- EXCEPTION --------------------
MSG: Got to have an Exon object, not a
STACK Bio::EnsEMBL::Translation::start_Exon
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Translation.pm:271
STACK Bio::EnsEMBL::Transcript::transfer
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Transcript.pm:2511
STACK Bio::EnsEMBL::Variation::BaseTranscriptVariation::_three_prime_utr
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/BaseTranscriptVariation.pm:638
STACK Bio::EnsEMBL::Variation::TranscriptVariationAllele::_get_alternate_cds
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/TranscriptVariationAllele.pm:1169
STACK Bio::EnsEMBL::Variation::TranscriptVariationAllele::_get_fs_peptides
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/TranscriptVariationAllele.pm:1090
STACK Bio::EnsEMBL::Variation::TranscriptVariationAllele::_get_hgvs_peptides
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/TranscriptVariationAllele.pm:930
STACK Bio::EnsEMBL::Variation::TranscriptVariationAllele::hgvs_protein
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/TranscriptVariationAllele.pm:705
STACK Bio::EnsEMBL::Variation::Utils::VEP::tva_to_line
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm:1639
STACK Bio::EnsEMBL::Variation::Utils::VEP::vf_to_consequences
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm:1472
STACK Bio::EnsEMBL::Variation::Utils::VEP::vf_list_to_cons
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm:1275
STACK Bio::EnsEMBL::Variation::Utils::VEP::get_all_consequences
/usr/bioinf/source/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm:1056
STACK main::main
/usr/bioinf/source/variant_effect_predictor/variant_effect_predictor.pl:270
STACK toplevel
/usr/bioinf/source/variant_effect_predictor/variant_effect_predictor.pl:116
Date (localtime) = Wed Feb 6 12:05:33 2013
Ensembl API version = 70
---------------------------------------------------