[ensembl-dev] FW: Loftee and VCFCols Difficulties via port 3337

Alex Beesley Alex.Beesley at telethonkids.org.au
Fri Oct 9 03:22:30 BST 2015


Dear Team

I am having significant difficulties with both the LoF.pm and VCFCols.pm plugins for VEP. For reference, I am using a GRCh37 cache downloaded with the installer script and default settings (ensembl-tools release-82).

# Issue 1
I want to use VCFCols.pm to obtain the original REF and ALT alleles from the VCF (to aid interpretation of complex variants). However, it seems that the plugin can only be run in online mode: if I try to run it offline (see first code example below), VEP returns an error relating to "$config->{ga}->fetch_by_transcript_stable_id($transcript_id)". When run online (see second code example), it is extremely slow. This is frustrating because I do not wish to use any of the VAX functionality or its related databases; I simply wish to carry the original REF, ALT and other VCF columns (including the FORMAT and genotype fields) through to my VEP output. Is there another way to get the original VCF columns into the VEP output other than using VCFCols.pm? Or a way to modify the plugin so that it works offline?


perl ${VEP}/variant_effect_predictor.pl -i ${INPUT_VCF} -o ${INPUT_VCF%*.vcf}.vep --cache --assembly GRCh37 --offline \
        --force_overwrite --check_existing --fork 24 \
        --everything --flag_pick \
        --plugin CADD,${CADD_SNV},${CADD_INDEL} \
        --plugin ExAC,${EXAC} \
        --plugin VCFCols \
        --plugin LoF,human_ancestor_fa:/home/san/alex/.vep/Plugins/loftee-master/human_ancestor.fa.gz \
        --fields Uploaded_variation,Location,REF,ALT,INFO,FORMAT,LoF,LoF_filter,LoF_flags,CADD_RAW,CADD_PHRED,ExAC_AF



perl ${VEP}/variant_effect_predictor.pl -i ${INPUT_VCF} -o ${INPUT_VCF%*.vcf}.ONLINE.vep --cache --assembly GRCh37 --port 3337 \
        --force_overwrite --check_existing --fork 24 \
        --everything --flag_pick \
        --plugin CADD,${CADD_SNV},${CADD_INDEL} \
        --plugin ExAC,${EXAC} \
        --plugin VCFCols \
        --plugin LoF,human_ancestor_fa:/home/san/alex/.vep/Plugins/loftee-master/human_ancestor.fa.gz \
        --fields Uploaded_variation,Location,REF,ALT,INFO,FORMAT,LoF,LoF_filter,LoF_flags,CADD_RAW,CADD_PHRED,ExAC_AF
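
As a possible stopgap for Issue 1, the original VCF columns could in principle be joined back onto the tab-delimited VEP output after the run, without any plugin. The following is only a minimal sketch under stated assumptions: it assumes the VCF ID column has been populated with unique identifiers before running VEP (so that Uploaded_variation carries that ID rather than a constructed chr_pos_alleles string, as in my output), and the function names and the VCF_ column prefix are hypothetical.

```python
def index_vcf(vcf_lines):
    """Map each VCF ID to its REF, ALT, FORMAT and sample columns.

    Assumption: every record has a unique, non-"." ID in column 3.
    Returns (extra_header, by_id).
    """
    header, by_id = [], {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        cols = line.split("\t")
        if line.startswith("#CHROM"):
            # REF, ALT, then FORMAT and the sample names (columns 9 onwards)
            header = ["VCF_" + name for name in ["REF", "ALT"] + cols[8:]]
            continue
        if cols[2] == ".":
            raise ValueError("VCF ID column must be populated for this join")
        by_id[cols[2]] = cols[3:5] + cols[8:]
    return header, by_id


def annotate_vep(vep_lines, header, by_id):
    """Yield VEP output lines with the original VCF columns appended."""
    missing = ["-"] * len(header)
    for line in vep_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            yield line  # pass VEP meta lines through unchanged
        elif line.startswith("#"):
            yield "\t".join([line] + header)  # extend the column header
        else:
            variant_id = line.split("\t", 1)[0]  # Uploaded_variation
            yield "\t".join([line] + by_id.get(variant_id, missing))
```

Both functions take any iterable of lines, so open file handles for the input VCF and the .vep output can be passed directly. Because the lookup is keyed on the variant ID, every transcript-level line for the same variant receives the same VCF columns.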



# Issue 2

When running VEP in either of the two modes shown above, I obtain different confidence calls from LoF.pm for frameshift mutations. Specifically, for the example shown below, the plugin calls the variant HC (high confidence) in online mode but LC (low confidence) in offline mode. The flag raised for the LC call (NON_CAN_SPLICE_SURR) relates to non-canonical intron splice sites; however, I have checked this particular variant on UCSC and the splice sites appear to be canonical, so the online VEP output appears to be correct and the offline output incorrect. Since I am using a local cache (and I have also tried using a local FASTA file), I am at a loss to explain why these two approaches would give completely different results for a LoF call. As mentioned above, my cache was downloaded using the installer script with default settings (ensembl-tools release-82).



# Running Offline

#Uploaded_variation                   Consequence         IMPACT  LoF
10_126691951_C/-  -  10:126691951  -  frameshift_variant  HIGH    LC  NON_CAN_SPLICE_SURR
10_126692023_G/-  -  10:126692023  -  frameshift_variant  HIGH    LC  NON_CAN_SPLICE_SURR


# Running Online

#Uploaded_variation                   Consequence         IMPACT  LoF
10_126691951_C/-  -  10:126691951  -  frameshift_variant  HIGH    HC
10_126692023_G/-  -  10:126692023  -  frameshift_variant  HIGH    HC
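
To see how widespread the discrepancy is, the LoF calls from the two runs could be compared variant by variant. This is a rough sketch rather than a definitive tool: it assumes tab-delimited VEP output in which LoF is its own column (as it is when requested via --fields), and the function names are hypothetical.

```python
from collections import defaultdict


def read_calls(vep_lines, field="LoF"):
    """Collect the set of values seen in `field` for each Uploaded_variation.

    Assumption: the column header is the single line starting with one '#'.
    """
    calls = defaultdict(set)
    columns = None
    for line in vep_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and VEP meta lines
        if line.startswith("#"):
            columns = line.lstrip("#").split("\t")
            continue
        if columns is None:
            raise ValueError("no column header line found")
        row = dict(zip(columns, line.split("\t")))
        calls[row["Uploaded_variation"]].add(row.get(field, "-"))
    return calls


def discordant(offline, online):
    """Return variants present in both runs whose call sets differ."""
    shared = sorted(offline.keys() & online.keys())
    return {
        variant: (sorted(offline[variant]), sorted(online[variant]))
        for variant in shared
        if offline[variant] != online[variant]
    }
```

Passing open handles for the offline and online .vep files to read_calls, and the two results to discordant, gives a dictionary of variants whose calls disagree. Values are kept as sets because a variant can have one output line per transcript.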



I appreciate that neither VCFCols.pm nor LoF.pm was developed by your team, but I would be very grateful for any help with these issues, as I have been struggling to get VEP customised for my needs for some time now. With regard to Issue 1, I believe many of your users would benefit from a tool that could carry the original VCF columns through to the VEP output. With regard to Issue 2, there appears to be some incompatibility between the downloaded caches and the online databases, but I am at a loss to explain it.


Many thanks in advance

Alex Beesley

Telethon Kids Institute

Perth, Western Australia


