[ensembl-dev] Clarifying VEP outputs for inframe insertions

John M.C. Ma manchunjohn-ma at uiowa.edu
Thu Sep 29 19:14:10 BST 2016


Hi,

I'm reviewing the output of VEP 84 so that I can write parsing
scripts for it, and I'm confused by inconsistent annotations for
inframe insertions.

In the output from a single VEP run, I have encountered two ways in
which VEP notates protein-level changes. For your convenience, the
field names used here are the same ones used in CSQ output.

1. The first type has only a single number in the
Protein_position field, and the Amino_acids field includes the
residue immediately before the insertion. For example,
Protein_position=114, Amino_acids=A/AILH.

2. The second type has two numbers in the Protein_position field,
and in the Amino_acids field the ref AA is marked as -. For example,
Protein_position=271-272, Amino_acids=-/APTP.

While I have an idea about how to parse these two types, I'm puzzled
about this inconsistency.
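
To make the two forms concrete, here is a minimal Python sketch of one
way both could be parsed into a common record. The function and class
names are hypothetical and not part of VEP. It assumes that
Protein_position is either a single integer or two integers joined by
"-", and that Amino_acids is always "ref/alt" with "-" standing for an
empty allele. It does not try to interpret where the inserted residues
sit relative to the reported positions.

```python
from typing import NamedTuple, Optional


class InframeInsertion(NamedTuple):
    """Hypothetical record type; not a VEP data structure."""
    start: int               # first number in Protein_position
    end: int                 # second number, or start if only one is given
    ref_aa: str              # reference residues, "" when reported as "-"
    alt_aa: str              # alternate residues, "" when reported as "-"
    inserted: Optional[str]  # alt minus a leading ref, None if ref is not a prefix


def parse_inframe_insertion(protein_position: str,
                            amino_acids: str) -> InframeInsertion:
    # Protein_position looks like "114" or "271-272" (assumed integers only).
    start_text, _, end_text = protein_position.partition("-")
    start = int(start_text)
    end = int(end_text) if end_text else start

    # Amino_acids looks like "A/AILH" or "-/APTP".
    ref, _, alt = amino_acids.partition("/")
    ref = "" if ref == "-" else ref
    alt = "" if alt == "-" else alt

    # When alt begins with ref (including an empty ref), the remainder
    # is the run of newly inserted residues.
    inserted = alt[len(ref):] if alt.startswith(ref) else None
    return InframeInsertion(start, end, ref, alt, inserted)


if __name__ == "__main__":
    # The two examples from this message.
    for pos, aas in [("114", "A/AILH"), ("271-272", "-/APTP")]:
        print(parse_inframe_insertion(pos, aas))
```

With this approach both examples end up with a non-empty "inserted"
field, so downstream code does not need to care which notation VEP
used for a given record.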

For your reference, I was running VEP 84 with a local e84 GRCm38
cache and a fasta file, using perl 5.22.1 in a Linux environment.

Thanks for the assistance.

Cheers,

John



