[ensembl-dev] Fwd: Problems in using VEP to convert variants from protein to genomic coordinates

Andrew Parton aparton at ebi.ac.uk
Fri Dec 7 17:15:53 GMT 2018


Hi David,

Thanks for your query. While VEP supports HGVS input, the complexity of HGVS and the variety of ways in which people use it mean that we require input HGVS to be relative to genomic or transcript coordinates. We have some documentation on the matter (which we are in the process of improving) here: http://www.ensembl.org/info/docs/tools/vep/vep_formats.html#hgvs
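
As a rough illustration of that rule, a pre-check along the following lines could flag protein-level notations before they reach VEP. This is only a sketch: the function names are made up for this example, and the assumption that "genomic or transcript" means the g., c. and n. prefixes is mine rather than anything VEP enforces in this exact way.

```python
import re

# Assumption: an HGVS string looks like "<reference>:<type>.<change>", where
# <type> is a single letter (g, c, n, m, p, r). Anything else is unparseable.
_HGVS_RE = re.compile(r"^(?P<ref>[^:\s]+):(?P<kind>[cgnmpr])\.(?P<change>\S+)$")


def hgvs_kind(hgvs):
    """Return the coordinate-type letter of an HGVS string, or None."""
    match = _HGVS_RE.match(hgvs.strip())
    return match.group("kind") if match else None


def is_coordinate_based(hgvs):
    """True for genomic (g.) or transcript (c./n.) notations.

    Hypothetical helper: the accepted set is an assumption for this sketch.
    """
    return hgvs_kind(hgvs) in {"g", "c", "n"}


if __name__ == "__main__":
    for text in ("BRCA2:p.S662*", "DTX1:c.32delC", "NRAS:Q61L"):
        print(text, hgvs_kind(text), is_coordinate_based(text))
```

Note that a string such as NRAS:Q61L has no type prefix at all, so the sketch treats it as unparseable rather than as protein-level.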

It may be more appropriate for you to use our Variant Recoder tool, also contained within the ensembl-vep repository, rather than VEP itself, as this will give you HGVSg output. However, without a particular transcript in the input, the variant could map to multiple locations. Variant Recoder will often guess at a solution (for the BRCA2 example you gave, it suggests NC_000013.11:g.32336340_32336341delinsAA), but as you have noticed, a result cannot be guaranteed unless the correct genomic/transcript nomenclature is used.
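
If it helps, here is a minimal sketch of querying Variant Recoder through the REST endpoint quoted below and pulling out the HGVSg strings. The URL pattern is the one from your message; the function names are made up for this example, and because the exact layout of the JSON response is not something I would rely on, the sketch simply collects every "hgvsg" entry wherever it appears in the payload.

```python
import json
import urllib.parse
import urllib.request

REST_SERVER = "https://rest.ensembl.org"


def recoder_url(hgvs, species="human"):
    """Build the variant_recoder URL for one HGVS notation."""
    # Percent-encode characters such as '>' or '*'; keep ':' readable.
    path = urllib.parse.quote(hgvs, safe=":")
    return (f"{REST_SERVER}/variant_recoder/{species}/{path}"
            "?content-type=application/json")


def collect_hgvsg(payload):
    """Recursively gather every 'hgvsg' value in a decoded JSON payload."""
    found = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            if key == "hgvsg":
                found.extend(value if isinstance(value, list) else [value])
            else:
                found.extend(collect_hgvsg(value))
    elif isinstance(payload, list):
        for item in payload:
            found.extend(collect_hgvsg(item))
    return found


def recode(hgvs):
    """Query the REST service (needs network access) and return HGVSg strings."""
    with urllib.request.urlopen(recoder_url(hgvs), timeout=30) as response:
        return collect_hgvsg(json.load(response))


if __name__ == "__main__":
    print(recoder_url("DTX1:c.32delC"))
```

Calling recode("DTX1:c.32delC") would then return whatever genomic notations the service reports. More than one entry is possible when the input maps to several locations, so the caller still has to decide which one is intended.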

We are currently improving how we handle HGVS inputs, but this work is still in development. Sorry that I couldn’t give you a more helpful response.

Kind Regards,
Andrew


> On 5 Dec 2018, at 09:27, David Tamborero <david.tamborero at gmail.com> wrote:
> 
> Hi there,
> 
> I'm very interested in converting protein changes to genomic changes via the VEP standalone Perl script. Of note, I have the gene (but not the specific transcript) in which such a change is annotated (e.g. NRAS:Q61L).
> 
> I'm experiencing some problems that I cannot figure out how to solve; I've tried to find documentation addressing them in the dev list archives etc., but I failed. So I hope this is a good way to look for help (thanks in advance!).
> 
> I'm using VEP 93.3 with hg19 (I do not think that the VEP parameters are relevant, but note that I'm using VCF as the output format since I need the coordinates in chr-pos-ref-alt format, not HGVS).
> 
> -- insertion of stop codons--
> 
> So for cases such as CDKN2A:p.Q50* or PTEN:p.R233*, VEP works smoothly; however, for other cases (e.g. BRCA2:p.S662*) VEP does not give any output.
> 
> My guess is that this has to do with the fact that the ones that work are stop codons caused by a specific nucleotide change (e.g. PTEN:p.R233* -> chr10 89717672 C>T), whereas those that do not work are caused by insertions, which I guess gives a larger universe of possible nucleotide changes.
> 
> However, why not offer a 'most likely' guess? E.g. for BRCA2:p.S662*, TransVar gives a fair attempt: chr13:g.32910477_32910478delCTinsAA.
> 
> --indels--
> 
> Here I'm failing badly. I'm not even sure that I'm using the correct nomenclature. For instance, for frameshifts, DTX1:p.P11fs*2 gives an 'Unable to parse HGVS notation' error. I went to the Variant Recoder REST API, and this specific variant in cDNA nomenclature (which I also retrieved via TransVar) is represented as "ENSP00000257600.3:p.Pro11LeufsTer2" in protein coordinates:
> 
> https://rest.ensembl.org/variant_recoder/human/DTX1:c.32delC?content-type=application/json
> 
> But when I input this representation to the VEP command, I still get the format error, even after replacing ENSP00000257600 with the gene (DTX1) or transcript (ENST00000257600) name and trying some alternatives to the p.Pro11LeufsTer2 representation.
> 
> Am I missing something?
> 
> Many thanks in advance (and congratulations on such a great tool!)
> br
> d
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
