[ensembl-dev] inconsistency when mapping the same variant from protein to genomics vs genomics to protein

Sarah Hunt seh at ebi.ac.uk
Wed Jan 2 12:14:42 GMT 2019


Hi David,

Thanks for passing this on. Our March release of VEP will include a 
check that the transcript picked to map the variant from gene+protein 
change to genomic sequence is compatible with the reference allele 
supplied. As Andrew says, we aim to add VCF output to VariantRecoder in 
the future too.
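
As a rough illustration of the kind of check described above (a minimal sketch, not the actual VEP implementation), a transcript can be kept only if its protein has the supplied reference amino acid at the stated position. The function names, the regular expression and the toy sequences are all hypothetical, and only simple one-letter substitutions such as p.E285V are handled.

```python
import re

# Matches simple one-letter substitutions such as "p.E285V".
# Assumption: three-letter codes, indels and frameshifts are not handled.
HGVSP_SUBSTITUTION = re.compile(r"^p\.([A-Z])(\d+)([A-Z])$")


def parse_substitution(hgvsp):
    """Return (ref_aa, position, alt_aa) for a simple protein substitution."""
    match = HGVSP_SUBSTITUTION.match(hgvsp)
    if match is None:
        raise ValueError(f"unsupported protein change: {hgvsp!r}")
    ref_aa, position, alt_aa = match.groups()
    return ref_aa, int(position), alt_aa


def compatible_transcripts(hgvsp, proteins):
    """Keep transcripts whose residue at the position equals the reference."""
    ref_aa, position, _ = parse_substitution(hgvsp)
    return [
        transcript_id
        for transcript_id, sequence in proteins.items()
        if len(sequence) >= position and sequence[position - 1] == ref_aa
    ]


# Toy protein sequences for illustration only, NOT real TP53 translations.
toy_proteins = {
    "toy_transcript_A": "MKEL",  # residue 3 is E
    "toy_transcript_B": "MKYL",  # residue 3 is Y
    "toy_transcript_C": "MK",    # too short to contain residue 3
}

print(compatible_transcripts("p.E3V", toy_proteins))
```

With this toy input only toy_transcript_A is kept, because it is the only sequence with E at position 3.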

Do let us know if you come across any other examples which do not behave 
as you expect. While we may not be able to support some of the stranger 
and more vague HGVS-like descriptions, we are very interested to know 
about and consider common use cases.

Best wishes,

Sarah


On 30/12/2018 02:22, David Tamborero wrote:
> (Sorry for the late response; I currently have little internet access.)
>
> Thanks, Andrew, for your answer. I'm a bit surprised by the general lack
> of tools addressing these issues. Maybe the community does not need them
> that much, although I would say the contrary.
>
> In any case, I will stay tuned to see whether your next releases can
> address some of them.
>
> Thanks again!
> Br
> D
>
>
> On Fri, 21 Dec 2018 at 20:57, Andrew Parton <aparton at ebi.ac.uk
> <mailto:aparton at ebi.ac.uk>> wrote:
>
>     Hi David,
>
>     One improvement that would make this process a little easier
>     would be for variant_recoder to give VCF output; this is
>     something that we will look into. Thanks.
>
>     VEP could definitely do a better job of predicting HGVSg from
>     HGVSp. Officially, we require that input HGVS is relative to
>     genomic or transcript coordinates. VEP and Variant Recoder will
>     successfully convert from HGVSp to HGVSg sometimes, but as you’ve
>     noticed, there are distinct improvements that we can make. The
>     ability of Variant Recoder to convert from HGVSp will improve
>     over time, and I’ve added your comments to our list of things to
>     look at in the future, but I can’t guarantee when or even if
>     it’ll happen.
>
>     Kind Regards,
>     Andrew
>
>
>
>
>>     On 17 Dec 2018, at 15:24, David Tamborero
>>     <david.tamborero at gmail.com <mailto:david.tamborero at gmail.com>> wrote:
>>
>>     Thanks for your answer!
>>
>>     I understand that the protein representation can lead to an
>>     ambiguous genomic mapping, but I'm unsure why VEP tries to
>>     infer the genomic coordinates without considering the reference
>>     amino acid that was passed (if this is what is happening!). Note
>>     that this particular amino acid change (TP53:p.E285V) maps to a
>>     unique genomic missense mutation in all TP53 transcripts.
>>
>>     FYI (you likely know this already), when the mapping is
>>     ambiguous, it is not uncommon for other tools dealing with HGVS
>>     to give a guess (normally the 'most probable' one based on
>>     different metrics) as a first 'hit' and to detail the rest.
>>     This is especially needed when dealing with indels.
>>
>>     Maybe this is too complicated for VEP. However, I'm still not
>>     finding a good way, using your tools, to go from an HGVS
>>     protein representation to genomic coordinates (in 'VCF
>>     format', meaning chr pos ref alt). This is not an uncommon need
>>     in the field. If I may use this forum to ask, are you planning to
>>     support that in e.g. one of your APIs (i.e. like the 'HGVS
>>     converter' but supporting VCF-like output)?
>>
>>     many thanks for your time (and your work!)
>>     best regards from Stockholm
>>     d
>>
>>     On Sun, 16 Dec 2018 at 14:19, Andrew Parton
>>     (<aparton at ebi.ac.uk <mailto:aparton at ebi.ac.uk>>) wrote:
>>
>>         Hi David,
>>
>>         I’ve taken a look at this issue this morning and I think I
>>         can see what’s going on. I can reproduce this issue with the
>>         query:
>>
>>             perl vep -id 'TP53:p.E285V' --database --force_overwrite --hgvs --port 3337
>>
>>         VEP guesses the genomic location based on this HGVS input
>>         (17:7565261), and identifies that overlapping transcript
>>         ENST00000413465 has a protein product. However, the 285th
>>         amino acid of this transcript is not E, but Y. The alternate
>>         allele is guessed by VEP from a collection of options that it
>>         has. For example, with the input HGVS 'TP53:p.M237I', VEP
>>         has 3 potential alternate alleles it can use, obtained by
>>         converting the given ATG to one of ATA, ATC or ATT.
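
The M237I example above can be reproduced with a small sketch that lists every codon one base away from the reference codon that encodes the requested amino acid. This is only an illustration under stated assumptions, not how VEP is implemented: it uses the standard genetic code, works on coding-strand codons, ignores strand orientation and multi-base changes, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_base_alternatives(ref_codon, alt_aa):
    """List codons one base away from ref_codon that encode alt_aa."""
    hits = []
    for index, ref_base in enumerate(ref_codon):
        for base in BASES:
            if base == ref_base:
                continue  # skip the unchanged codon
            candidate = ref_codon[:index] + base + ref_codon[index + 1:]
            if CODON_TABLE[candidate] == alt_aa:
                hits.append(candidate)
    return sorted(hits)


print(single_base_alternatives("ATG", "I"))  # ['ATA', 'ATC', 'ATT']
```

Each candidate codon corresponds to a different alternate allele, which is why a protein-level description alone may not pin down a single genomic change.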
>>
>>         While VEP supports HGVS input, because of the complexity of
>>         HGVS and the variety of ways in which people use it, we
>>         require that input HGVS is relative to genomic or transcript
>>         coordinates. In protein cases, we give a best guess where we
>>         can, but this is not guaranteed.
>>
>>         Sorry that I couldn’t be of more help.
>>
>>         Kind Regards,
>>         Andrew
>>
>>
>>>         On 14 Dec 2018, at 18:00, David Tamborero
>>>         <david.tamborero at gmail.com
>>>         <mailto:david.tamborero at gmail.com>> wrote:
>>>
>>>         Hi there,
>>>
>>>         Regarding the conversion from protein to genomic
>>>         representation supported by VEP, I've found a funny case. If
>>>         I input
>>>
>>>         TP53:p.E285V
>>>
>>>         VEP gives as output (VCF format)
>>>
>>>         17    7565261    TP53:p.E285V    T  A
>>>
>>>         If I then input that VCF entry to VEP, I obtain two
>>>         TP53 protein annotations:
>>>
>>>         downstream_gene_variant for ENST00000359597
>>>         missense_variant for ENST00000413465
>>>
>>>         However, the missense variant is annotated as 285 Y/F (and
>>>         not the E/V that I had at the start!),
>>>
>>>         so it looks like some inconsistency happened here; I'm not
>>>         sure why. Am I missing something?
>>>
>>>         thanks in advance!
>>>         d
>>>
>>>
>>>
>>>         _______________________________________________
>>>         Dev mailing list Dev at ensembl.org <mailto:Dev at ensembl.org>
>>>         Posting guidelines and subscribe/unsubscribe info:
>>>         http://lists.ensembl.org/mailman/listinfo/dev
>>>         Ensembl Blog: http://www.ensembl.info/