[ensembl-dev] Incorrect HGVS nomenclature

Vasisht Tadigotla vasisht.tadigotla at courtagen.com
Wed Feb 4 14:53:01 GMT 2015


Hi Will,

Thanks for the clarification. Is there a list of transcripts where the RefSeq sequence doesn’t match the reference?

Regards,
Vasisht 

On February 4, 2015 at 4:38:56 AM, Will McLaren (wm2 at ebi.ac.uk) wrote:

Hi Vasisht,

Our RefSeq transcript set is imported from a file of coordinates provided by NCBI. In some cases the sequence of a RefSeq transcript does not match the reference sequence over which it is mapped, which causes a problem since Ensembl does not import the original RefSeq sequence, only the coordinates to which it maps.

This means that retrieving and manipulating the sequence in these transcripts can lead to misinterpretations, as is the case here. According to our analysis there is a 1bp insertion in the RefSeq sequence that does not appear in the reference; this would explain the off-by-one error here.
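
As a rough illustration of how a single extra base shifts every downstream coordinate, here is a minimal Python sketch on a toy sequence. The sequences, the insertion site and the helper name `shifted_position` are hypothetical and chosen only for illustration; this is not how VEP computes HGVS coordinates.

```python
def shifted_position(pos, insertion_sites):
    """Map a 1-based position in the reference-derived sequence to the
    matching position in a transcript sequence that carries extra bases.

    insertion_sites: 1-based reference positions after which the transcript
    has one extra base (toy model: single-base insertions only).
    """
    return pos + sum(1 for site in insertion_sites if site < pos)


# Toy sequence built from the reference over the mapped coordinates.
ref_derived = "ATGGCCCTGTTT"

# Hypothetical transcript record with one extra base after position 3.
transcript = ref_derived[:3] + "A" + ref_derived[3:]

pos_ref = 7                                # position in the reference-derived sequence
pos_tx = shifted_position(pos_ref, [3])    # same base in the transcript record

# Both coordinates point at the same base, but the numbers differ by one.
assert ref_derived[pos_ref - 1] == transcript[pos_tx - 1]
print(pos_ref, pos_tx)
```

Positions upstream of the insertion agree in both sequences; everything downstream differs by one, which is the pattern to look for when a description is off by a single base.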

We hope to be able to provide information about these mismatches in the output from the next version of VEP. In the meantime we would always recommend using the Ensembl transcript set where possible, as (among other good reasons!) transcript sequences always match the underlying reference.

Regards

Will McLaren
Ensembl Variation

On 3 February 2015 at 23:27, Vasisht Tadigotla <vasisht.tadigotla at courtagen.com> wrote:
Hi,

I’m annotating a variant on GRCh37 with the VEP from the v78 release, and the HGVS annotation of the RefSeq transcripts doesn’t seem to match the sequences for those transcripts.

The variant is in SLC37A4 (chr11:g.118895980CAG>C), the HGVS annotations are NM_001467.5:c.1043_1044delCT and NP_001458.1:p.Pro348ArgfsTer? and the codon change is being annotated as CCT/C. The amino acid change is the same for all RefSeq transcripts in the annotation.

The count seems to be off by one; it’s a CTG/G change. The local sequence context is GCC CTG TTT with the CT being deleted.

The correct HGVS description is NM_001467.5:c.1042_1043delCT and the protein is p.Leu348Valfs*53. This is annotated correctly in the Ensembl transcripts - ENST00000545985.1:c.1042_1043delCT, ENSP00000475241.1:p.Leu348ValfsTer53.
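
The delCT in the description above can be checked by reverse-complementing the genomic alleles. The sketch below is a minimal Python example that assumes the transcript lies on the reverse strand (which the CAG versus CTG difference suggests) and that the genomic alleles are written VCF-style with a shared leading base; the helper names are hypothetical.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def deleted_on_transcript(ref, alt):
    """Bases removed by a deletion, as read on the reverse strand.

    Assumes alt is a prefix of ref (VCF-style anchoring on the forward strand).
    """
    if not ref.startswith(alt):
        raise ValueError("alt must be a prefix of ref for this toy model")
    return revcomp(ref[len(alt):])


ref, alt = "CAG", "C"                    # chr11:g.118895980CAG>C
print(revcomp(ref), revcomp(alt))        # alleles as seen on the transcript strand
print(deleted_on_transcript(ref, alt))   # bases deleted on the transcript strand

# Removing the first two bases of CTG from the context GCC CTG TTT
# leaves GCC GTT T, so the codon after GCC becomes GTT.
context = "GCCCTGTTT"
after = context[:3] + context[5:]
print(after[3:6])
```

On the transcript strand the change reads CTG/G, the deleted bases are CT, and the codon following GCC becomes GTT (Val), which agrees with p.Leu348Valfs.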

The following options were used for the annotation:

--offline --everything --merged

The same issue exists with the web version of VEP:

http://grch37.ensembl.org/Homo_sapiens/Tools/VEP/Results?db=core;tl=Ba08mzoDSO2008gG-584627


Thanks,
Vasisht


_______________________________________________
Dev mailing list    Dev at ensembl.org
Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/

