[ensembl-dev] VEP Alleles and ALTs

Nicolas Thierry-Mieg Nicolas.Thierry-Mieg at univ-grenoble-alpes.fr
Thu Jun 1 15:53:26 BST 2017


Hello,

I am trying to systematically match VEP consequences (based on the VEP
"Allele" field) to the correct ALT allele. This is a lot harder than it
sounds, and gets really tricky with indels and/or when the VCF has 
several ALT alleles (on a single line).

Here is an example input VCF:

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO
1	69469	.	ACAATT	A,ACA	.	PASS	
1	69469	.	ACAATT	A,ACA,T	.	PASS	

For the first line, VEP uses "-" and "CA" for "Allele"; but for the 
second line they are "A", "ACA" and "T", although the first two ALT 
alleles are the same as in line 1! This shows that the content of the 
"Allele" field depends on the whole list of ALT alleles in the VCF...


To make a long story short, I ended up with the following rule to
construct the "Allele" field from VEP's CSQ:
if the first nucleotide of the REF allele and of every ALT allele is
the same, then this nucleotide is omitted from VEP's "Allele" field (an
allele left empty is written as "-"); otherwise the ALT alleles are
used unchanged.
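
In Python, the rule I have in mind would look roughly like the sketch
below. The function name vep_alleles is just my own, it is not part of
VEP, and the sketch only reflects what I observed on plain nucleotide
alleles (symbolic ALTs such as <DEL> or "*" are not handled):

```python
def vep_alleles(ref, alts):
    """Guess the VEP "Allele" string for each ALT, in VCF order.

    Hypothetical helper reproducing the rule described above; it is
    not taken from the VEP code base.
    """
    first = ref[0]
    if all(alt.startswith(first) for alt in alts):
        # REF and every ALT share their first base: drop it.
        # An allele that becomes empty is written as "-".
        return [alt[1:] or "-" for alt in alts]
    # Otherwise the ALT alleles are kept as they appear in the VCF.
    return list(alts)


# The two lines of the example VCF above:
print(vep_alleles("ACAATT", ["A", "ACA"]))       # ['-', 'CA']
print(vep_alleles("ACAATT", ["A", "ACA", "T"]))  # ['A', 'ACA', 'T']
```

Zipping the returned list with the ALT column would then give the
Allele-to-ALT mapping for a given VCF line.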

Is this correct?

Note that for reverse-engineering and testing this, I used an older VEP 
release (v81). Perhaps my rule is no longer valid... I wanted to test 
with the latest VEP version, but I'm having issues installing it, as 
discussed in another thread.

Regards,
Nicolas