[ensembl-dev] upstream/downstream changes based on context in vcf?
Matt Wood
matt.wood at codifiedgenomics.com
Mon Jan 14 22:16:12 GMT 2019
I've managed to put together a small example with two variants that show
the behavior. I've attached two vcfs.
The variant we're looking at is 22:35998183 TTGTGTC>T.
I've found I can reproduce the bug with just the following options:
perl /space1/software/vep/ensembl-vep/vep --dir /vep/cache/ --offline --vcf \
    --regulatory -i has_downstream.vcf -o has_downstream.vcf.vep
If we run it on no_downstream.vcf instead, we get the following output:
22 35998183 . TTGTGTC T 120 PASS
CSQ=-|TF_binding_site_variant|MODIFIER|||MotifFeature|MA0058.2|||||||||||||-1||||Max:MA0058.2|1|N|,-|intergenic_variant|MODIFIER||||||||||||||||||||||||
GT:VR:RR:DP:GQ 0/1:10:10:20:.
But if we run it on has_downstream.vcf, we get additional
downstream_gene_variant consequences for the same variant:
22 35998183 . TTGTGTC T 120 PASS
CSQ=-|downstream_gene_variant|MODIFIER|MB|ENSG00000198125|Transcript|ENST00000359787|protein_coding|||||||||||4622|-1||HGNC|6915||||,-|downstream_gene_variant|MODIFIER|MB|ENSG00000198125|Transcript|ENST00000397326|protein_coding|||||||||||4622|-1||HGNC|6915||||,-|downstream_gene_variant|MODIFIER|MB|ENSG00000198125|Transcript|ENST00000397328|protein_coding|||||||||||4622|-1||HGNC|6915||||,-|downstream_gene_variant|MODIFIER|MB|ENSG00000198125|Transcript|ENST00000401702|protein_coding|||||||||||4622|-1||HGNC|6915||||,-|TF_binding_site_variant|MODIFIER|||MotifFeature|MA0058.2|||||||||||||-1||||Max:MA0058.2|1|N|
GT:VR:RR:DP:GQ 0/1:10:10:20:.
The only difference between the two inputs is the additional variant in
has_downstream.vcf.
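For anyone who wants to check their own files for the same discrepancy, here is a rough sketch of how two VEP-annotated VCFs could be compared. It is not part of VEP; the function names (csq_terms, terms_by_variant, diff_annotations) are made up for illustration, and it assumes the consequence term is the second pipe-separated CSQ field, as in the records above. The authoritative field order is in the ##INFO=<ID=CSQ,...> header line of the output.

```python
import sys


def csq_terms(record):
    """Return the set of consequence terms in one VEP-annotated VCF data line."""
    columns = record.split()   # VCF columns contain no internal whitespace
    info = columns[7]          # INFO is the 8th column
    terms = set()
    for entry in info.split(";"):
        if not entry.startswith("CSQ="):
            continue
        for annotation in entry[len("CSQ="):].split(","):
            fields = annotation.split("|")
            # Assumption: Consequence is the second CSQ field (Allele|Consequence|...).
            if len(fields) > 1:
                # Multiple terms on one annotation are joined with "&".
                terms.update(fields[1].split("&"))
    return terms


def terms_by_variant(path):
    """Map (chrom, pos, ref, alt) to consequence terms for each data line in a file."""
    result = {}
    with open(path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip header and blank lines
            chrom, pos, _id, ref, alt = line.split()[:5]
            result[(chrom, pos, ref, alt)] = csq_terms(line)
    return result


def diff_annotations(path_a, path_b):
    """Yield variants present in both files whose consequence terms differ."""
    a, b = terms_by_variant(path_a), terms_by_variant(path_b)
    for key in sorted(a.keys() & b.keys()):
        if a[key] != b[key]:
            yield key, a[key] - b[key], b[key] - a[key]


if __name__ == "__main__" and len(sys.argv) == 3:
    for key, only_a, only_b in diff_annotations(sys.argv[1], sys.argv[2]):
        print(key, "only in first:", sorted(only_a), "only in second:", sorted(only_b))
```

Saved under a name of your choosing and run with the two .vep output files as arguments, it lists each shared variant whose set of consequence terms differs between the files.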
Our version info:
Versions:
ensembl : 89.df47f96
ensembl-funcgen : 89.678099a
ensembl-io : 89.feefbc2
ensembl-variation : 89.af9aae5
ensembl-vep : 89.7
Thanks for taking a look,
Matt
On Mon, Jan 14, 2019 at 8:15 AM Andrew Parton <aparton at ebi.ac.uk> wrote:
> Hi Matt,
>
> Upstream/Downstream consequences should be reported on the occasions where
> a variant is found to be 5’/3’ of a gene - it shouldn’t depend on
> surrounding variants within the VCF. Could you please send me an example so
> that I can reproduce it and take a closer look?
>
> Thanks,
> Andrew
>
> > On 11 Jan 2019, at 17:12, Matt Wood <matt.wood at codifiedgenomics.com>
> wrote:
> >
> > I was hoping to get some clarification on how the upstream_gene_variant
> and downstream_gene_variant consequences are determined.
> >
> > We've found several cases where a variant will have downstream or
> upstream consequences when that variant appears in one VCF, but will be
> annotated without those consequences when in another VCF.
> >
> > Our best guess is that the context of surrounding variants in the VCF is
> changing whether those consequences are applied.
> >
> > Is that how it is supposed to work? We're running version 89.7.
> >
> > Matt
> > _______________________________________________
> > Dev mailing list Dev at ensembl.org
> > Posting guidelines and subscribe/unsubscribe info:
> http://lists.ensembl.org/mailman/listinfo/dev
> > Ensembl Blog: http://www.ensembl.info/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: has_downstream.vcf
Type: text/vcard
Size: 123 bytes
Desc: not available
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20190114/1ec8f0e8/attachment.vcf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: no_downstream.vcf
Type: text/vcard
Size: 65 bytes
Desc: not available
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20190114/1ec8f0e8/attachment-0001.vcf>