[ensembl-dev] Probably incorrect HGVS on GRCh37 RefSeq
Wallace Ko
myko at l3-bioinfo.com
Mon Aug 31 09:19:06 BST 2020
Hi Andrew,
Since VEP 100 (GRCh37), the REFSEQ_MATCH column is populated. Is it
reliable to use this column to determine whether the reported HGVS notation
is likely to be incorrect because of a RefSeq alignment mismatch?
Or should I simply use the BAM_EDIT column for this purpose?
And is my understanding of the BAM_EDIT value below correct (according to
this Github issue
<https://github.com/Ensembl/ensembl-vep/issues/265#issuecomment-415416679>)?
- *-*: no mismatch is found. Annotations and HGVS code are both fine.
- *OK*: mismatch is found and fix is applied. Annotations are fine. HGVS
code is fixed too but could still be incorrect in some cases.
- *FAILED*: mismatch is found and fix could not be applied. Both
annotations and HGVS code could be incorrect.
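
For illustration, the reading above could be encoded as a small filter. This is only a sketch: it assumes the three BAM_EDIT values listed above ("-", "OK", "FAILED") and that interpretation of them. The names HgvsTrust and classify_bam_edit are hypothetical and are not part of VEP.

```python
from enum import Enum


class HgvsTrust(Enum):
    """Hypothetical labels for how far the HGVS string can be trusted."""
    FINE = "no mismatch found"
    CHECK = "mismatch fixed; HGVS may still be off in some cases"
    SUSPECT = "mismatch not fixed; annotations and HGVS may be wrong"
    UNKNOWN = "unrecognised BAM_EDIT value"


def classify_bam_edit(value):
    """Map a BAM_EDIT cell to a trust label (assumed semantics, see above)."""
    # Treat a missing or empty cell like "-", i.e. no edit was reported.
    cleaned = (value or "-").strip().upper()
    if cleaned in ("", "-"):
        return HgvsTrust.FINE
    if cleaned == "OK":
        return HgvsTrust.CHECK
    if cleaned == "FAILED":
        return HgvsTrust.SUSPECT
    return HgvsTrust.UNKNOWN


if __name__ == "__main__":
    for cell in ("-", "OK", "FAILED"):
        print(cell, "->", classify_bam_edit(cell).name)
```

Rows labelled CHECK or SUSPECT would then be the ones to compare against an independent source such as ClinVar.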
Regards,
Wallace Ko
On Tue, Jan 21, 2020 at 7:50 PM Andrew Parton <aparton at ebi.ac.uk> wrote:
> Hi,
>
> Yep, that’s correct.
>
> One thing to be aware of however is that our HGVS code shifts variants
> reported in repeated regions in the 3’ direction by default, while our CDS
> position is not shifted in such a way. This is the most common cause of CDS
> position and HGVSc position mismatch, although it can also be caused by
> these RefSeq alignment mismatches.
>
> Kind Regards,
> Andrew
>
> On 21 Jan 2020, at 11:08, Wallace Ko <myko at l3-bioinfo.com> wrote:
>
> Hi Andrew,
>
> Thanks for the prompt response.
> May I assume that this is only a problem with the HGVS calculation, and
> that the CDS position is already corrected by the RefSeq alignment in such
> cases?
>
> Regards,
> Wallace Ko
>
>
> On Tue, Jan 21, 2020 at 6:30 PM Andrew Parton <aparton at ebi.ac.uk> wrote:
>
>> Hi Wallace,
>>
>> Thanks for this report, it is an issue we are aware of. As you
>> identified, not all RefSeq transcripts completely match the reference
>> genome. In cases where they don't, we are now using alignment files
>> provided by NCBI to create a new reference, matching the transcript, and
>> use this for consequence calling.
>>
>> Our HGVS calculation does not currently use this reference modification,
>> but it is something we are working on and aim to release later this year.
>> VEP can report reference mismatches for GRCh38, but these data are not
>> available for GRCh37.
>>
>> More details on the differences to the reference genome and correcting
>> transcript models using BAM can be found here:
>> https://www.ensembl.org/info/docs/tools/vep/script/vep_other.html#refseq
>>
>> Let us know if there’s anything else we can do to help.
>>
>> Kind Regards,
>> Andrew
>>
>> On 21 Jan 2020, at 09:23, Wallace Ko <myko at l3-bioinfo.com> wrote:
>>
>> Hi Ensembl Developers,
>>
>> The variant NC_000012.11:g.103249104C>A is annotated by online VEP and
>> offline cached VEP (99, RefSeq, GRCh37) as:
>>
>> - HGVSc: NM_000277.1:c.517G>T
>> - HGVSp: NP_000268.1:p.Gln172His
>> - CDS Position: 516
>>
>> On the other hand, ClinVar
>> <https://www.ncbi.nlm.nih.gov/clinvar/variation/664621/> reports the
>> variant as NM_000277.3:c.516G>T (NP_000268.1:p.Gln172His). Besides,
>> a BLAST alignment shows that there is a 1-bp gap between c.303 and c.304
>> when NM_000277.1 is aligned to NC_000012.11. And even VEP itself reports
>> the CDS position as 516.
>>
>> All of this makes me believe that the HGVSc should be reported at c.516
>> instead of c.517.
>>
>> Regards,
>> Wallace Ko
>> _______________________________________________
>> Dev mailing list Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org
>> Ensembl Blog: http://www.ensembl.info/
>>
>>