[ensembl-dev] possible HGVSc reporting issues

Will McLaren wm2 at ebi.ac.uk
Thu Mar 15 09:41:35 GMT 2012


Hi Reece,

Thanks for spotting these.

We're currently revamping our HGVS code, so we hope to deal with as
many of these issues as we can as we go through the code. There do
seem to be some complications with ins/dels around the CDS stop codon,
and as you can understand this is a tricky bit to get right while
conforming exactly to the HGVS specifications.
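
To illustrate the ordering problem in the examples quoted below, here
is a minimal sketch (not the Ensembl/VEP implementation). In HGVS c.
notation, positions 5' of the start codon carry a leading "-" and
positions 3' of the stop codon carry a leading "*", so "*36" lies
after "4267" even though 36 < 4267. The function names
hgvs_c_sort_key and order_hgvs_c_range are hypothetical, and the
sketch assumes both positions are given in transcript orientation.

```python
import re

# Hypothetical helper, not part of Ensembl or VEP.
# Matches one HGVS c. position such as "4267", "*36", "-12" or "4368+36".
_POS = re.compile(r"^(?P<utr>[-*])?(?P<base>\d+)(?P<offset>[+-]\d+)?$")


def hgvs_c_sort_key(pos):
    """Return a tuple that sorts c. positions in transcript order."""
    m = _POS.match(pos)
    if m is None:
        raise ValueError("unrecognised c. position: %r" % pos)
    # Region order: 5' UTR ("-") < CDS (no prefix) < 3' UTR ("*").
    region = {"-": 0, None: 1, "*": 2}[m.group("utr")]
    base = int(m.group("base"))
    if region == 0:
        base = -base  # c.-10 lies upstream of c.-5
    offset = int(m.group("offset") or 0)  # intronic-style offset, e.g. +36
    return (region, base, offset)


def order_hgvs_c_range(start, end):
    """Return (start, end) so that start is not 3' of end."""
    return tuple(sorted((start, end), key=hgvs_c_sort_key))


if __name__ == "__main__":
    # Positions taken from the reported VEP output "c.*36_4267del".
    print("c.%s_%sdel" % order_hgvs_c_range("*36", "4267"))
```

Applied to the positions in the reported "c.*36_4267del" output, this
puts 4267 first, which matches the order used in the curated variants.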

Cheers

Will McLaren
Ensembl Variation

On 15 March 2012 02:18, Reece Hart <reece at harts.net> wrote:
> Hi-
>
> We've compared a number of hand-curated HGVS-formatted variants against
> results from VEP. Four interesting discrepancies arose in 3 transcripts when
> a deletion region spanned the cds stop. In all cases, the start and end
> positions were swapped in the Ensembl-generated HGVS tag. These issues may
> be related to those already raised by Suresh Surampudi last week.
>
> For example, with the VCF input (long del sequence elided):
>
> 16 89804972 CVID1002968 AC...GC A 60 PASS
> cvid-type=3;CVID1002968(89804973)=del138|ref;vcf-var-code=del;id=CVID1002968;cond=FANCAfanconi;mutation-type-id=3;cvid-class=Pvar;build=GRCh37;rsid=;rnaid=NM_000135.2:c.4267_4368+36;risk-model-name=AREC;risk-model-group=qualitative-risk-model;alleles=del138,ref;curated-loc=16:89804973-89805110;full-hgvs=NM_000135.2:c.4267_4368+36del138;rel-to-dna=;hgvs-protein=;hap-region=
> GT ./. ./.
>
> We get (e!65, vep 2.3):
>
> CVID1002968 16:89804973-89805110 - ENSG00000158805 ENST00000446326
> Transcript 500B_downstream_variant - - - - - - HGNC=ZNF276
> CVID1002968 16:89804973-89805110 - ENSG00000158805 ENST00000289816
> Transcript 3_prime_UTR_variant 2251-2388 - - - - -
> RefSeq=NM_152287.3-RefSeq_mRNA;RefSeqHGVSc=NM_152287.3:c.*319_*456delC...C;HGNC=ZNF276;CCDS=CCDS10986.1
> CVID1002968 16:89804973-89805110 - ENSG00000158805 ENST00000443381
> Transcript 3_prime_UTR_variant 2261-2398 - - - - -
> RefSeq=NM_001113525.1-RefSeq_mRNA;RefSeqHGVSc=NM_001113525.1:c.*319_*456delC...C;HGNC=ZNF276;CCDS=CCDS45554.1
> CVID1002968 16:89804973-89805110 - ENSG00000187741 ENST00000389301
> Transcript
> complex_change_in_transcript,3_prime_UTR_variant,coding_sequence_variant
> 4309-4446 - - - - -
> RefSeq=NM_000135.2-RefSeq_mRNA;RefSeqHGVSc=NM_000135.2:c.*36_4267delG...G;HGVSc=ENST00000389301.2:c.*36_4267delG...G;HGNC=FANCA;CCDS=CCDS32515.1
> CVID1002968 16:89804973-89805110 - ENSG00000187741 ENST00000305699
> Transcript 5KB_downstream_variant - - - - - - HGNC=FANCA
>
> Note the "c.*36_4267del" line. The CDS ends at 4368, so this HGVS has
> end < start.
>
>
> Other examples include:
>
> curated: NM_000135.2:c.4267_4368+36del138
> vep: NM_000135.2:c.*36_4267delGCTG...
>
> curated: NM_000135.2:c.4268_4368+37del138
> vep: NM_000135.2:c.*37_4268delCTGA...
>
> curated: NM_004992.3:c.1448_*29del43
> vep: NM_004992.3:c.*29_1448delAGAGAGTTAGCTGACTTTACACGGAGCGGATTGCAAAGCAAAC
>
> curated: NM_130839.1:c.2616_2630del15
> vep: NM_130839.2:c.*11_2616delGTAAAACAAAACAAA
>
>
> Thanks,
> Reece
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe):
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>



