[ensembl-dev] VEP annotation discrepancy

Will McLaren wm2 at ebi.ac.uk
Thu Apr 17 09:47:29 BST 2014


Hi Andrew,

The VEP deals with sequence variants and structural variants slightly
differently.

The first line is being interpreted as a structural variant since the INFO
column contains the SVLEN and SVTYPE fields; the second is being
interpreted as a sequence variant since these fields are absent and you
have the ref and alt sequence defined.
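
As a rough illustration of that distinction (this is not VEP's actual parsing code, only a sketch of the check described above), the two INFO columns from your example can be compared in Python. The helper names info_keys and has_sv_fields are hypothetical, and treating either key on its own as sufficient is an assumption:

```python
# Illustrative only: mimics the distinction described above, not VEP's own logic.
SV_KEYS = {"SVLEN", "SVTYPE"}  # INFO keys named in this thread


def info_keys(info):
    """Return the set of keys present in a VCF INFO string."""
    if info in ("", "."):
        return set()
    return {entry.split("=", 1)[0] for entry in info.split(";") if entry}


def has_sv_fields(info):
    """True if the INFO string carries SVLEN or SVTYPE (assumption: either one counts)."""
    return bool(SV_KEYS & info_keys(info))


info_1 = "END=28602228;HOMLEN=20;HOMSEQ=AGAGAGAGAGAGAGAGAGAG;SVLEN=-2;SVTYPE=DEL"
info_2 = "ADP=72;WT=0;HET=1;HOM=0;NC=0"

print(has_sv_fields(info_1))  # True
print(has_sv_fields(info_2))  # False
```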

--pick has not yet been extended to work with SVs, which is why you are
seeing multiple annotations for input 1).

VEP's annotation is richest for sequence variants: when a variant is
parsed as a structural variant, only the coordinates are considered, not
the sequence. So in this case I'd recommend removing the SVLEN and SVTYPE
fields whenever the sequence is given in the REF and ALT fields.
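
One possible way to do that before running VEP is a small pre-processing step. The sketch below is an assumption-laden example, not part of VEP: the function name strip_sv_fields is hypothetical, it expects tab-separated VCF data lines, and it only touches records whose REF and ALT are plain nucleotide strings. Other INFO keys such as END or HOMLEN are left in place; whether they affect parsing is not something covered here.

```python
import re

SV_KEYS = {"SVLEN", "SVTYPE"}            # fields to drop, as suggested above
SEQ = re.compile(r"^[ACGTNacgtn]+$")     # explicit nucleotide sequence


def strip_sv_fields(line):
    """Drop SVLEN/SVTYPE from INFO when REF and ALT are explicit sequences."""
    if line.startswith("#"):
        return line                      # header lines pass through untouched
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 8:
        return line                      # not a complete VCF record
    ref, alt, info = cols[3], cols[4], cols[7]
    explicit = SEQ.match(ref) and all(SEQ.match(a) for a in alt.split(","))
    if not explicit:
        return line                      # symbolic alleles such as <DEL> are kept as-is
    kept = [e for e in info.split(";")
            if e and e.split("=", 1)[0] not in SV_KEYS]
    cols[7] = ";".join(kept) or "."      # "." marks an empty INFO column
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(cols) + newline


record = ("13\t28602226\t.\tAAG\tA\t.\tPASS\t"
          "END=28602228;HOMLEN=20;HOMSEQ=AGAGAGAGAGAGAGAGAGAG;SVLEN=-2;SVTYPE=DEL\t"
          "GT:AD\t0/1:49")
print(strip_sv_fields(record))
```

Applied line by line to the input VCF, this leaves true symbolic SV records unchanged and only rewrites records like input 1).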

Hope that helps

Will McLaren
Ensembl Variation




On 16 April 2014 23:06, Andrew Carson <acarson at invivoscribe.com> wrote:

> Hi,
>
> I noticed some strange behavior when annotating variants from different
> variant calling strategies. Basically, I call the same variant by 2
> different strategies, but I get different annotations from VEP. The only
> difference between the two input variants is in the INFO/FORMAT/Genotype
> fields of the vcf file. But, for one of the outputs, the --pick option is
> not working. And there are some clear differences in the output.
>
>
>
> Here is an example:
>
>
>
> Input 1)
>
> 13      28602226        .     AAG     A .       PASS
> END=28602228;HOMLEN=20;HOMSEQ=AGAGAGAGAGAGAGAGAGAG;SVLEN=-2;SVTYPE=DEL
> GT:AD   0/1:49
>
>
>
> Input 2)
>
> 13      28602226        .     AAG     A .       PASS
> ADP=72;WT=0;HET=1;HOM=0;NC=0
> GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR
> 0/1:255:72:72:2:66:91.67%:4.0594E-37:33:32:1:1:40:26
>
>
>
> As you can see, the only differences occur after column 8, but I don’t
> think these should affect the annotation of the deletion.
>
>
>
> When I run these two inputs through VEP using the following command:
>
>
>
> perl /path/to/vep --fork 4 --no_stats --everything --cache -i input.vcf -o
> outpu.VEP.vcf --format vcf --force_overwrite --check_existing
> --check_alleles --vcf --no_progress --pubmed --gmaf --maf_1kg --pick
>
>
>
> I get the following:
>
>
>
> Output 1)
>
> 13      28602226        .     AAG     A .       PASS
> END=28602228;HOMLEN=20;HOMSEQ=AGAGAGAGAGAGAGAGAGAG;SVLEN=-2;SVTYPE=DEL;CSQ=deletion|ENSG00000122025|ENST00000380987|Transcript|intron_variant&NMD_transcript_variant&feature_truncation||||||||||16/24||||||-1|||FLT3|HGNC||||nonsense_mediated_decay|ENSP00000370374|||||||||,deletion|ENSG00000122025|ENST00000241453|Transcript|intron_variant&feature_truncation||||||||||16/23||||||-1||YES|FLT3|HGNC||||protein_coding|ENSP00000241453||CCDS31953.1|||||||,deletion|ENSG00000122025|ENST00000537084|Transcript|intron_variant&feature_truncation||||||||||16/22||||||-1|||FLT3|HGNC||||protein_coding|ENSP00000438139|||||||||,deletion|ENSG00000122025|ENST00000380982|Transcript|intron_variant&feature_truncation||||||||||16/23||||||-1|||FLT3|HGNC||||protein_coding|ENSP00000370369|||||||||
> GT:AD   0/1:49
>
>
>
> Output 2)
>
> 13      28602226        .     AAG     A .       PASS
> ADP=72;WT=0;HET=1;HOM=0;NC=0;CSQ=-|ENSG00000122025|ENST00000241453|Transcript|intron_variant&feature_truncation||||||rs60462219||||16/23||||||-1||YES|FLT3|HGNC||||protein_coding|ENSP00000241453||CCDS31953.1|ENST00000241453.7:c.2053+87_2053+88delCT||||||
> GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR
> 0/1:255:72:72:2:66:91.67%:4.0594E-37:33:32:1:1:40:26
>
>
>
> As you can see, the first output isn’t using --pick as it outputs multiple
> annotations. In addition, the annotations are slightly different from the
> “pick”ed variant in output 2. The consequence changes from “deletion” to
> “-”. And in output 2, I get the CDS:
> “ENST00000241453.7:c.2053+87_2053+88delCT”, which is not provided in the
> annotation of output 1.
>
>
>
> Is there a reason for this discrepancy? Is there something I can do to
> avoid getting these differences? Are my input or commands incorrect in this
> instance?
>
> Any help would be greatly appreciated.
>
> Thank you!
>
>
>
> Andrew R. Carson, Ph.D.