[ensembl-dev] VEP does not annotate all rows in a samtools VCF

Will McLaren wm2 at ebi.ac.uk
Mon Apr 28 10:49:43 BST 2014


I can't think of anything else - perhaps you can send me a snippet of your
input that recreates the problem?

e.g. a file with 1000 lines for which the VEP reports annotating only 995
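
One way to find candidate lines for such a snippet is to classify every line of the input and list those that are neither headers nor ordinary variants. The Python below is a minimal sketch, not the VEP's own filtering logic: the function names and category labels are made up for illustration, it assumes tab-delimited VCF columns with ALT in column 5, and it only catches the simple cases (blank lines, lines with fewer than five columns, and a "." ALT). In the run quoted below the gap to explain is 900857 - 900241 = 616 lines.

```python
from collections import Counter


def classify_vcf_line(line):
    """Return a coarse category for one line of VCF text.

    The categories are illustrative only; they are not the VEP's own rules.
    """
    stripped = line.rstrip("\r\n")
    if not stripped.strip():
        return "blank"
    if stripped.startswith("#"):
        return "header"
    fields = stripped.split("\t")  # assumption: tab-delimited columns
    if len(fields) < 5:
        return "malformed"  # too few columns to contain an ALT field
    if fields[4] == ".":
        return "non_variant"  # ALT is ".", i.e. no alternate allele
    return "variant"


def summarise_vcf(lines):
    """Count the categories and collect lines that are not plain variants.

    Returns (counts, suspects); suspects holds tuples of
    (1-based line number, category, line text).
    """
    counts = Counter()
    suspects = []
    for number, line in enumerate(lines, start=1):
        category = classify_vcf_line(line)
        counts[category] += 1
        if category not in ("variant", "header"):
            suspects.append((number, category, line.rstrip("\r\n")))
    return counts, suspects
```

Passing an open file handle for one of the Chr$i.vcf files to summarise_vcf() gives the per-category counts plus the line numbers worth copying into a small test file. If the "header" count plus the suspects does not account for the difference, the cause lies elsewhere and the snippet is still the quickest way to reproduce it.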

Will


On 28 April 2014 10:01, Marlies Dolezal <marlies.dolezal at gmail.com> wrote:

> hi Will,
> thanks for your quick reply!
> unfortunately, using the --allow_non_variant option gave exactly the same
> output.
>
>
> -offline --allow_non_variant --force_overwrite --species bos_taurus --fork
> 16 - --input_file Chr$i.vcf --o Chr$i.vep
>
> Start time 2014-04-28 10:27:05
> End time 2014-04-28 10:36:57
> Run time 592 seconds
>
>
>
> Lines of input read 900857
> Variants processed 900241
> Variants remaining after filtering 900241
>
>
> any other ideas?
> thanks again!
> regards, marlies
>
>
> 2014-04-28 10:24 GMT+02:00 Will McLaren <wm2 at ebi.ac.uk>:
>
> >
> > Hello,
> >
> > It's possible you have some non-variant lines in your VCF; these will
> > have a "." in the ALT allele column, something like:
> >
> > 21      26960070        .     G       .       .       .       .
> >
> > By default the VEP ignores these. You can force the VEP to allow them
> > through (though they still won't be annotated) using --allow_non_variant.
> >
> > Regards
> >
> > Will McLaren
> > Ensembl Variation
> >
> >
> > On 28 April 2014 08:52, Marlies Dolezal <marlies.dolezal at gmail.com>
> > wrote:
> >>
> >> hi all,
> >>
> >> i am using the latest VEP, version 75 (API version 75), to annotate
> >> samtools VCF files.
> >>
> >> -offline --force_overwrite --species bos_taurus --fork 16 --input_file
> >> Chr$i.vcf --o Chr$i.vep
> >>
> >>
> >> the General statistics section of the VEP_summary.html tells me that
> >> all lines of my vcfs are read in, but only a subset of these are processed.
> >> e.g.:
> >> Lines of input read 900857
> >> Variants processed 900241
> >>
> >> the difference in lines is not explained by header/comment lines
> >> alone.
> >>
> >> where can i find out which variants are not processed, so i can try to
> >> figure out why they are not processed?
> >>
> >> thanks a lot in advance
> >> regards Marlies
> >>
> >>
> >>
> >>
> >>
> >>
> >> --
> >> Dr. Marlies Dolezal
> >> 1030 Vienna
> >> Austria/Europe
> >>
> >> marlies.dolezal(at)gmail.com
> >>
> >> “The great tragedy of science is the slaying of a beautiful hypothesis
> >> by an ugly fact.”
> >> Thomas Henry Huxley
> >> (1825-1895)
> >>
> >>
> >>
> >>
> >> _______________________________________________
> >> Dev mailing list    Dev at ensembl.org
> >> Posting guidelines and subscribe/unsubscribe info:
> >> http://lists.ensembl.org/mailman/listinfo/dev
> >> Ensembl Blog: http://www.ensembl.info/
> >>
> >
> >