[ensembl-dev] VEP exception when using a specific rsid

Tjaart de Beer tjaart at ebi.ac.uk
Wed Oct 9 11:52:02 BST 2013


Hi Will,

Thanks for the answers. I'll look at the two options and see which one
suits me best for now. By the way, the link for installing a local mirror
on this page

http://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#local

"...installing a local mirror, see HERE."

which points to

http://www.ensembl.org/info/docs/webcode/install/ensembl-data.html

results in an error.

Thanks again,

Tjaart

> Hi Tjaart,
>
> Firstly, thanks for finding this; there was an odd bug with patched
> chromosome regions that was causing this error. I've fixed it on the CVS
> branch for 73 - if you update your ensembl-variation checkout, or rerun
> the installer if you used that, you should pick up the fixed code.
>
> Secondly, while it is possible to use rsIDs as input, I wouldn't recommend
> doing this on such a large scale. Each rsID has to be looked up in the
> database to find its genomic location and alleles. So even though you are
> using the cache, the VEP will be querying the public Ensembl database for
> each rsID, and then using the cache for its consequence predictions.
>
> If you do need to query in this way, I'd suggest setting up a local mirror
> of the Ensembl variation database, see
> http://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#local
>
> Another alternative would be to extract the corresponding VCF entries for
> your rsIDs from our VCF dumps
> (ftp://ftp.ensembl.org/pub/release-73/variation/vcf/homo_sapiens/), using
> something like vcftools (and the --snps flag). You can either get the
> version of the VCF with consequences, which may save you running the VEP
> at all, or get the version without consequences and run this through the
> VEP if you need more than the basic consequence information.
>
> Hope that helps
>
> Will McLaren
> Ensembl Variation
>
>
> On 8 October 2013 16:55, Tjaart de Beer <tjaart at ebi.ac.uk> wrote:
>
>> Hi all,
>>
>> I just installed VEP to have a look at some human variant data. I have
>> about 550,000 rsids. As far as I understand from the documentation, an
>> rsid on its own should be enough. When I run my rsids I get the following
>> error:
>>
>> -------------------- EXCEPTION --------------------
>> MSG: SEQ_REGION_NAME argument is required
>> STACK Bio::EnsEMBL::Slice::new /home/tjaart/my_genes/variant_effect_predictor/Bio/EnsEMBL/Slice.pm:149
>> STACK Bio::EnsEMBL::Variation::Utils::VEP::get_slice /home/tjaart/my_genes/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm:3306
>> STACK Bio::EnsEMBL::Variation::Utils::VEP::cache_transcripts /home/tjaart/my_genes/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm:3596
>> STACK Bio::EnsEMBL::Variation::Utils::VEP::fetch_transcripts /home/tjaart/my_genes/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm:2837
>> STACK Bio::EnsEMBL::Variation::Utils::VEP::vf_list_to_cons /home/tjaart/my_genes/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm:1180
>> STACK Bio::EnsEMBL::Variation::Utils::VEP::get_all_consequences /home/tjaart/my_genes/variant_effect_predictor/Bio/EnsEMBL/Variation/Utils/VEP.pm:1125
>> STACK main::main variant_effect_predictor.pl:360
>> STACK toplevel variant_effect_predictor.pl:198
>> Date (localtime)    = Tue Oct  8 16:43:28 2013
>> Ensembl API version = 73
>>
>> As far as I could make out, this means that the variant is not in
>> Ensembl. In my test set I have traced it to rs7289804, which does not
>> appear in Ensembl (according to a web search).
>>
>> I was wondering if there is a way around this, such as a flag to ignore
>> rsids that are missing or for which there is not enough data. I couldn't
>> find such a flag in the documentation.
>>
>> My command is:
>>
>> perl variant_effect_predictor.pl -i ../test.dat --cache --coding_only
>> --filter coding_change --force_overwrite
>>
>> My input file contains no extra line breaks or any strange characters.
>>
>> Any help would be appreciated. Thanks!
>>
>>
>> --
>> Dr. Tjaart de Beer
>> Thornton group
>> European Bioinformatics Institute (EMBL-EBI)
>> European Molecular Biology Laboratory
>> Wellcome Trust Genome Campus
>> Hinxton
>> Cambridge CB10 1SD
>> United Kingdom
>>
>>
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>
>


--
Dr. Tjaart de Beer
Thornton group
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom
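
Note on the VCF route suggested in the quoted reply: the idea is to pull the
records for a list of rsIDs out of a VCF dump and, as a side effect, learn
which rsIDs are absent before anything is passed to the VEP. The Python
below is only a rough sketch of that filtering step, not the vcftools
implementation and not part of the VEP. The function names (open_text,
load_rsids, filter_vcf) are made up for illustration. It assumes a standard
tab-separated VCF in which the third column holds the ID, with multiple IDs
separated by semicolons, and an input list with one rsID per line.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def load_rsids(lines):
    """Collect rsIDs from an iterable of lines, ignoring blank lines."""
    return {line.strip() for line in lines if line.strip()}


def filter_vcf(vcf_lines, wanted):
    """Keep header lines and records whose ID column names a wanted rsID.

    Returns (kept_lines, missing), where `missing` is the set of wanted
    rsIDs that never appeared in the ID column.
    """
    kept = []
    found = set()
    for line in vcf_lines:
        if line.startswith("#"):
            # Meta-information and column header lines are always kept.
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t", 3)
        if len(fields) < 3:
            # Not a well-formed record; skip it.
            continue
        matched = set(fields[2].split(";")) & wanted
        if matched:
            kept.append(line)
            found |= matched
    return kept, wanted - found
```

To use it, one would read the rsID list with load_rsids(open_text(...)),
pass the lines of the downloaded dump to filter_vcf, write the kept lines to
a new VCF, and inspect the missing set for identifiers that are not in the
dump. The file names are left out here on purpose, since they depend on
which dump is downloaded.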





