[ensembl-dev] protein position column for indels

David Tamborero david.tamborero at gmail.com
Fri Jul 26 11:48:05 BST 2019


Could not be clearer, thanks a lot!

(And +1 for the option of having the 3'-shifted info.)

Have a nice weekend
D

On Fri, 26 Jul 2019 at 12:43, Andrew Parton <aparton at ebi.ac.uk> wrote:

> Hi David,
>
> Thank you for your query, there are a couple of reasons for these
> differences.
>
> 1) Insertions/deletions are always described at their most 3’ position in
> HGVS notation. So if, for example, you insert an A into a run of As, the
> HGVS output will be reported at the most 3’ position, whereas the protein
> position column will report the position as it was given to VEP. We are
> currently looking at shifting all variants 3’ by default, and will include
> this in a future release.
>
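To illustrate point 1, here is a minimal Python sketch of 3' shifting an insertion through a repeat. It is not VEP's implementation: the function names, the 0-based `pos` convention and the toy sequence are all made up for this example, and it assumes a forward-strand transcript so that 3' means rightwards.

```python
def shift_insertion_3prime(ref, pos, ins):
    """Slide an insertion as far 3' (rightwards) as it can go.

    ref: reference sequence (forward strand assumed)
    pos: 0-based index; the inserted bases sit just before ref[pos]
    ins: inserted bases (non-empty)
    """
    while pos < len(ref) and ref[pos] == ins[0]:
        # Moving one base to the right gives the same result as long as
        # the inserted bases are rotated by one.
        ins = ins[1:] + ins[0]
        pos += 1
    return pos, ins


def apply_insertion(ref, pos, ins):
    """Return the sequence that results from the insertion."""
    return ref[:pos] + ins + ref[pos:]


ref = "GCAAAATG"   # toy sequence with a run of four As (hypothetical)
given = (2, "A")   # an A inserted at the start of the run, as given
shifted = shift_insertion_3prime(ref, *given)

print("as given:", given, "3' shifted:", shifted)
print(apply_insertion(ref, *given) == apply_insertion(ref, *shifted))
```

Both placements produce the same inserted sequence, which is why a column that keeps the input position and a column that reports the 3'-most position can disagree for the same variant.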
> 2) The protein position column covers all input locations (including the
> reference), while the HGVS output uses only a minimal allele string. For
> example, in the sequence
>
> ATG CTG
>
> an insertion of a T between positions 3 and 4, given in standard VCF format
> as ‘chr 3 varName G GT’, would give a range of 1-2 for the protein position
> (as it also considers the reference G that was given), while the HGVS would
> recognise the insertion as only being in position 2.
>
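The arithmetic behind the ATG CTG example can be sketched as follows. This is only a simplified model of the explanation in point 2, not VEP code: the helper names are hypothetical, the coding sequence is assumed to start at position 1 on the forward strand, and only a single-base anchor followed by inserted bases is handled.

```python
def codon_number(cds_pos):
    """1-based codon (protein position) containing a 1-based CDS position."""
    return (cds_pos - 1) // 3 + 1


def positions_for_anchored_insertion(pos, ref, alt):
    """Protein positions for a VCF-style insertion such as 'G' -> 'GT'.

    Returns (range covering the input as given, position of the minimal allele).
    """
    if not (len(ref) == 1 and len(alt) > 1 and alt.startswith(ref)):
        raise ValueError("expected a one-base anchor followed by inserted bases")
    left_flank, right_flank = pos, pos + 1
    # As given: the reference anchor base is included in the covered span.
    as_given = (codon_number(left_flank), codon_number(right_flank))
    # Minimal allele: the anchor is dropped, so only the codon that
    # receives the inserted base is counted.
    minimal = codon_number(right_flank)
    return as_given, minimal


# 'chr 3 varName G GT' on the sequence ATG CTG
as_given, minimal = positions_for_anchored_insertion(3, "G", "GT")
print(as_given, minimal)
```

With the anchor G at position 3 (codon 1) and the inserted T landing in front of position 4 (codon 2), the sketch gives 1-2 for the input as given and 2 for the minimal allele, matching the description above.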
> I think these two cases cover all of your examples. If you have any more
> questions, or any particular examples that you’d like us to take a closer
> look at, please let us know.
>
> Kind Regards,
> Andrew
>
> > On 25 Jul 2019, at 16:11, David Tamborero <david.tamborero at gmail.com> wrote:
> >
> > Hi ensembl devs,
> >
> > I'm struggling to fully understand how the 'protein position' column is
> > calculated when I compare it with the variant's HGVSp representation.
> >
> > This happens only for indels; some examples (left = HGVSp entry,
> > right = protein position entry):
> >
> > frameshift:
> > ENSP00000277541.6:p.Gln2444ThrfsTer34   2444
> > ENSP00000256474.2:p.Lys159ArgfsTer14   158-159
> > ENSP00000324856.6:p.Tyr253SerfsTer32   252-254
> >
> > inframe deletions:
> > ENSP00000356379.4:p.Tyr1373del   1373-1374
> > ENSP00000361824.3:p.Glu2207del    2207
> > ENSP00000339004.3:p.His57del      53-54
> > ENSP00000268125.5:p.Phe96_Phe99del    96-99
> > ENSP00000413720.3:p.Ala171_Ala174del    171-175
> > ENSP00000368332.4:p.Ala114_Ala115del    110-112
> >
> > inframe insertions:
> > ENSP00000369497.3:p.Glu238_Ser239insArg   239
> > ENSP00000339867.2:p.Asp687_Gly688insPhe   687-688
> > ENSP00000445920.1:p.Val188_Ala192dup   188-192
> > ENSP00000361824.3:p.Arg2308_Met2309dup   2308-2310
> >
> > I'm guessing that this may be related in part to right/left alignment
> > discrepancies in the reported coordinates between these two columns (e.g.
> > ENSP00000368332.4:p.Ala114_Ala115del --> 110-112 or
> > ENSP00000339004.3:p.His57del --> 53-54)?
> >
> > and that some issue sometimes makes the protein position column report
> > 'n' or 'n+1' positions, where n is the number of affected residues
> > according to the HGVSp (e.g.
> > ENSP00000277541.6:p.Gln2444ThrfsTer34 --> 2444 or
> > ENSP00000445920.1:p.Val188_Ala192dup --> 188-192 report 'n', whereas
> > ENSP00000413720.3:p.Ala171_Ala174del --> 171-175 or
> > ENSP00000368332.4:p.Ala114_Ala115del --> 110-112 report 'n+1')?
> >
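The two kinds of discrepancy described in the question (a shifted start, and one extra position) can be checked mechanically. Below is a small Python sketch that does so on a few of the pairs listed above. The regular expression and function names are my own and only cover the simple `p.Xxx123` and `p.Xxx123_Yyy456` forms shown in this thread; values such as `?` in the protein position column are not handled.

```python
import re

# First residue and optional second residue of a three-letter HGVSp string.
HGVSP_RANGE = re.compile(r":p\.([A-Z][a-z]{2})(\d+)(?:_([A-Z][a-z]{2})(\d+))?")


def hgvsp_span(hgvsp):
    """Return (start, end) residue numbers named in an HGVSp string."""
    m = HGVSP_RANGE.search(hgvsp)
    if m is None:
        raise ValueError(f"unsupported HGVSp string: {hgvsp}")
    start = int(m.group(2))
    end = int(m.group(4)) if m.group(4) else start
    return start, end


def column_span(protein_position):
    """Return (start, end) from a value such as '2444' or '158-159'."""
    parts = protein_position.split("-")
    return int(parts[0]), int(parts[-1])


def compare(hgvsp, protein_position):
    """Start offset and length difference between the two columns."""
    h_start, h_end = hgvsp_span(hgvsp)
    c_start, c_end = column_span(protein_position)
    return {
        "start_shift": h_start - c_start,
        "extra_residues": (c_end - c_start + 1) - (h_end - h_start + 1),
    }


pairs = [
    ("ENSP00000277541.6:p.Gln2444ThrfsTer34", "2444"),
    ("ENSP00000413720.3:p.Ala171_Ala174del", "171-175"),
    ("ENSP00000368332.4:p.Ala114_Ala115del", "110-112"),
    ("ENSP00000339004.3:p.His57del", "53-54"),
]
for hgvsp, column in pairs:
    print(hgvsp, column, compare(hgvsp, column))
```

A non-zero `start_shift` points at an alignment difference between the columns, while `extra_residues` of 1 corresponds to the 'n+1' cases.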
> > Apologies if this is documented somewhere; I haven't been able to find
> > the details of that column.
> >
> > thanks in advance!
> > d
> > _______________________________________________
> > Dev mailing list    Dev at ensembl.org
> > Posting guidelines and subscribe/unsubscribe info:
> https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org
> > Ensembl Blog: http://www.ensembl.info/