[ensembl-dev] VEP cannot annotate * alternate alleles

Konrad Karczewski konradk at broadinstitute.org
Wed Nov 18 15:15:32 GMT 2015


Hi Will,

Thanks! If it's not too much trouble to push to 81, that'd be great (I haven't tested 82 too much yet). If it's annoying though, don't worry about it.

Thanks!

-Konrad

On Wed, Nov 18, 2015 at 8:18 AM, Will McLaren <wm2 at ebi.ac.uk> wrote:

> Hi Konrad,
> Thanks, this is fixed on 82.
> I can push to 81 too if you need?
> Will
> On 17 November 2015 at 01:42, Konrad Karczewski <konradk at broadinstitute.org>
> wrote:
>> Hi Will,
>>
>> Along these lines, however, VEP doesn't seem to be writing ALLELE_NUM
>> properly on these lines (particularly when the * is not last in the list).
>>
>> For instance, when annotating this (VEP v81 and 82):
>>
>> 1       874817  rs201898716     C       *,T
>>
>> it gives the same output as if the ALT were T,*
>>
>> Hope it's an easy fix!
>> -Konrad
>>
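The numbering Konrad expects can be sketched as below. This is only an illustration, assuming ALLELE_NUM is meant to be the 1-based position of the allele in the ALT column (as VEP's --allele_number option is documented: 1 = first ALT, 2 = second ALT). The helper name `expected_allele_nums` is hypothetical and is not part of VEP.

```python
from typing import Dict


def expected_allele_nums(alt_field: str) -> Dict[str, int]:
    """Map each annotatable ALT allele to its 1-based position in the ALT column.

    Hypothetical helper for illustration only; not VEP code.
    """
    nums = {}
    for index, allele in enumerate(alt_field.split(","), start=1):
        if allele == "*":
            # The * placeholder still occupies a slot in the numbering,
            # but there is nothing to annotate for it.
            continue
        nums[allele] = index
    return nums


# The two orderings from the report should NOT give the same number for T.
print(expected_allele_nums("*,T"))
print(expected_allele_nums("T,*"))
```

Under that assumption, T is allele 2 when the ALT column is `*,T` and allele 1 when it is `T,*`, so identical output for both orderings points to the bug described above.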
>> On Nov 16, 2015, at 12:36 PM, JESSICA X. CHONG <jxchong at uw.edu> wrote:
>>
>> Ahhh ok, it makes sense that the annotation is found on the previous line.
>>
>> Thanks.
>>
>> On Nov 16, 2015, at 3:00 AM, Will McLaren <wm2 at ebi.ac.uk> wrote:
>>
>> Hi Jessica,
>>
>> The * allele is not annotated because there is nothing to annotate; it
>> represents the absence of any allele due to an upstream deletion. The
>> upstream deletion will of course be annotated.
>>
>> Example:
>>
>> REF:  ACGGTAGC
>> S1 :  ACG----C
>> S2 :  ACGGCAGC
>>
>> There are two mutations here: one deletion in S1 and one SNP in S2.
>> Because the sequence containing the SNP is absent in S1, the genotype calls
>> for this site in individuals with the deletion are impossible to annotate
>> unless you use the * notation.
>>
>> So the VCF looks like this:
>>
>> chr1 3 GGTAG G   ... 1/1 0/0
>> chr1 5 T     C,* ... 2/2 1/1
>>
>> VEP will annotate the deletion as normal, but will only annotate the C ALT
>> allele for the SNP; the * represents the absence of an allele, so no
>> annotation is added - the deleted sequence is annotated for the variant on
>> the previous line.
>>
>> Hopefully that makes sense!
>>
>> If you can provide an example of where you think VEP is doing something
>> wrong then of course please do let us know.
>>
>> Regards
>>
>> Will McLaren
>> Ensembl Variation
>>
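The rule Will describes can be sketched in a few lines. This is a rough illustration and not how VEP is implemented: the `Record` type and both helper functions are hypothetical names, and the sketch assumes records on the same chromosome with 1-based positions, as in the two example lines above.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    chrom: str
    pos: int          # 1-based position of the first REF base
    ref: str
    alts: List[str]


def annotatable_alts(record: Record) -> List[str]:
    """ALT alleles that can carry a consequence; * is skipped."""
    return [alt for alt in record.alts if alt != "*"]


def spanning_deletions(records: List[Record], target: Record) -> List[Record]:
    """Upstream records whose deletion covers target.pos.

    These are the records where the annotation for a * allele is found.
    """
    hits = []
    for rec in records:
        if rec is target or rec.chrom != target.chrom or rec.pos >= target.pos:
            continue
        last_ref_base = rec.pos + len(rec.ref) - 1
        for alt in rec.alts:
            # A deletion has an ALT shorter than REF; it spans the target
            # if its REF extends to or past the target position.
            if alt != "*" and len(alt) < len(rec.ref) and last_ref_base >= target.pos:
                hits.append(rec)
                break
    return hits


# The two records from the example above.
deletion = Record("chr1", 3, "GGTAG", ["G"])
snp = Record("chr1", 5, "T", ["C", "*"])

print(annotatable_alts(snp))
print(spanning_deletions([deletion, snp], snp))
```

For the example, only C is annotatable at position 5, and the record at position 3 is reported as the deletion that accounts for the * allele.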
>> On 13 November 2015 at 21:35, JESSICA X. CHONG <jxchong at uw.edu> wrote:
>>
>>>
>>> http://gatkforums.broadinstitute.org/discussion/6240/what-means-is-in-alternative-allele
>>>
>>> Apparently GATK now uses * to represent long “upstream deletions.”
>>>
>>> It looks like as of v82, VEP cannot handle annotation of * alleles. The
>>> lines in the resulting VCF completely lack VEP’s CSQ field.
>>>
>>>
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info:
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/
>>>
>>