[ensembl-dev] Fwd: Problems in using VEP to convert variants from protein to genomic coordinates

David Tamborero david.tamborero at gmail.com
Wed Dec 5 09:27:16 GMT 2018


Hi there,

I'm very interested in converting protein changes to genomic changes with
the standalone VEP Perl script. Note that I know the gene (but not the
specific transcript) in which each change is annotated (e.g. NRAS:Q61L).

I'm running into some problems that I cannot figure out how to solve. I've
looked for documentation addressing them in the dev list archives and
elsewhere, without success, so I hope this is a good place to ask for help
(thanks in advance!).

I'm using VEP 93.3 with hg19 (I do not think the VEP parameters are
relevant, but note that I'm using VCF as the output format, since I need
the coordinates in chr-pos-ref-alt format rather than HGVS).
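
For reference, a minimal Python sketch of pulling chr-pos-ref-alt tuples
out of VCF-formatted output. It assumes only the standard VCF column order
(CHROM, POS, ID, REF, ALT). The function name is hypothetical, and the
sample line is hand-written from the coordinates quoted further down; it is
not actual VEP output.

```python
from typing import Iterable, List, Tuple


def vcf_to_tuples(lines: Iterable[str]) -> List[Tuple[str, int, str, str]]:
    """Return (chrom, pos, ref, alt) for each data line, one tuple per ALT allele."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and VCF header lines ("##meta" and "#CHROM ...").
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a well-formed VCF data line
        chrom, pos, _id, ref, alt = fields[:5]
        # A single VCF line may carry several comma-separated ALT alleles.
        for allele in alt.split(","):
            out.append((chrom, int(pos), ref, allele))
    return out


# Hand-written illustrative input (assumption), not real VEP output.
sample = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "10\t89717672\t.\tC\tT\t.\t.\t.",
]
print(vcf_to_tuples(sample))
```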

-- insertion of stop codons --

For cases such as CDKN2A:p.Q50* or PTEN:p.R233*, VEP works smoothly;
however, for other cases (e.g. BRCA2:p.S662*) VEP fails and gives no
output.

My guess is that this has to do with the fact that the ones that work are
stop codons caused by a single nucleotide change (e.g. PTEN:p.R233* ->
chr10 89717672 C>T), whereas those that do not work need a multi-nucleotide
change (an indel), which I guess gives a larger universe of possible
nucleotide changes.

However, why not offer a 'most likely' guess? E.g. for BRCA2:p.S662*,
TransVar gives a reasonable attempt: chr13:g.32910477_32910478delCTinsAA.
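
The reasoning above can be sketched in Python with nothing more than the
standard genetic code: count the minimum number of base changes needed to
turn a codon into one of the three stop codons. The codons used here are
assumptions for illustration. CAG (Gln) and CGA (Arg) are merely examples of
codons one substitution away from a stop, and TCT (Ser) is assumed because
it is consistent with the delCTinsAA suggested by TransVar (TCT -> TAA);
none of them were checked against the actual transcripts.

```python
STOP_CODONS = ("TAA", "TAG", "TGA")


def min_changes_to_stop(codon: str) -> int:
    """Minimum number of base substitutions turning `codon` into any stop codon."""
    codon = codon.upper()
    if len(codon) != 3 or set(codon) - set("ACGT"):
        raise ValueError(f"not a DNA codon: {codon!r}")
    # Hamming distance to each stop codon; keep the smallest.
    return min(
        sum(a != b for a, b in zip(codon, stop))
        for stop in STOP_CODONS
    )


# Assumed example codons (not taken from the real transcripts).
for codon in ("CAG", "CGA", "TCT"):
    print(codon, min_changes_to_stop(codon))
```

A result of 1 means a single substitution can create the stop; a result of
2 or more means the protein-level change only maps back to a
multi-nucleotide change, and several such changes may be equally valid.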

-- indels --

Here I'm failing badly. I'm not even sure that I'm using the correct
nomenclature. For instance, for frameshifts, DTX1:p.P11fs*2 gives an
'Unable to parse HGVS notation' error. I queried the variant_recoder REST
API endpoint with this variant in cDNA nomenclature (which I also retrieved
with TransVar), and it is represented as "ENSP00000257600.3:p.Pro11LeufsTer2"
in protein coordinates:

https://rest.ensembl.org/variant_recoder/human/DTX1:c.32delC?content-type=application/json

But when I pass this representation to the VEP command, I get the same
format error, even after replacing ENSP00000257600 with the gene (DTX1) or
transcript (ENST00000257600) name and trying some alternatives to the
p.Pro11LeufsTer2 representation.
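
For anyone wanting to repeat the variant_recoder lookup above from a
script, here is a minimal Python sketch using only the standard library. The
endpoint path and the content-type parameter are copied from the URL quoted
above; the function names are hypothetical, and the structure of the
returned JSON is deliberately not assumed, so the parsed response is
returned as-is.

```python
import json
import urllib.parse
import urllib.request

REST_SERVER = "https://rest.ensembl.org"


def recoder_url(variant: str, species: str = "human") -> str:
    """Build the variant_recoder URL for one variant identifier."""
    # Keep ':' readable as in the URL above; other unsafe characters
    # (for example '>' in a substitution) are percent-encoded.
    encoded = urllib.parse.quote(variant, safe=":")
    return (
        f"{REST_SERVER}/variant_recoder/{species}/{encoded}"
        "?content-type=application/json"
    )


def fetch_recoder(variant: str, species: str = "human"):
    """Perform the GET request and return the parsed JSON (needs network access)."""
    with urllib.request.urlopen(recoder_url(variant, species), timeout=30) as resp:
        return json.load(resp)


print(recoder_url("DTX1:c.32delC"))
```

Calling fetch_recoder("DTX1:c.32delC") would perform the actual request;
it is left out of the example so that the sketch runs without a network
connection.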

Am I missing something?

Many thanks in advance (and congratulations on such a great tool!)
br
d