[ensembl-dev] VEP - method to verify the result

Will McLaren wm2 at ebi.ac.uk
Wed Mar 6 09:40:22 GMT 2013


A simple way would be to look at the end of the result file using tail on
the command line.

With the default options set, you should see at least one line of output
for each line of input. So, compare the identifier or sequence position on
the last line of the result file with the last line of your input file; if
they match, you can assume that the run has finished.
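
As a rough sketch of that check, the two helper functions below print the
last variant line of an input file and the first two columns of the last
line of a result file, so you can compare them by eye. The function names
and the file names in the usage comment are made up for illustration, and
I am assuming tab-delimited default output with the identifier and location
in the first two columns.

```shell
# Print the last variant line of an input file, skipping '#' header
# lines (e.g. in VCF) and blank lines.
last_input_variant() {
    grep -v '^#' "$1" | grep -v '^[[:space:]]*$' | tail -n 1
}

# Print the first two tab-separated columns of the last line of a
# result file (assumed here to be the identifier and the location).
last_output_variant() {
    tail -n 1 "$1" | cut -f 1,2
}

# Example usage (hypothetical file names):
#   last_input_variant  variants.txt
#   last_output_variant variant_effect_output.txt
```

If the two lines refer to the same variant, the run most likely reached the
end of the input.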

You could also count the number of unique variants in the output file:

grep -v '^#' variant_effect_output.txt | cut -f 1 | uniq | wc -l

And compare that to the number of lines in your input file.
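
With 300+ files you will probably want to script this comparison. Here is a
minimal sketch; the function names are my own invention, and it assumes
that header lines in both files start with '#' and that every remaining
non-blank input line is one variant. Note that uniq only collapses adjacent
duplicates, which is fine as long as all output lines for one variant are
written together.

```shell
# Count unique variant identifiers (first column) in a result file.
count_output_variants() {
    grep -v '^#' "$1" | cut -f 1 | uniq | wc -l | tr -d ' '
}

# Count variant lines in an input file, ignoring '#' headers and
# blank lines.
count_input_variants() {
    grep -v '^#' "$1" | grep -c -v '^[[:space:]]*$'
}

# Compare the two counts; returns non-zero on a mismatch.
# Usage: check_run INPUT_FILE OUTPUT_FILE
check_run() {
    in_n=$(count_input_variants "$1")
    out_n=$(count_output_variants "$2")
    if [ "$in_n" -eq "$out_n" ]; then
        echo "OK $2 ($out_n variants)"
    else
        echo "MISMATCH $2 (input $in_n, output $out_n)"
        return 1
    fi
}

# Example loop, assuming a hypothetical naming scheme where
# sample.txt produces sample.vep.txt:
#   for f in input/*.txt; do
#       check_run "$f" "output/$(basename "$f" .txt).vep.txt"
#   done
```

A mismatch does not prove the run failed (the input might contain repeated
identifiers, for example), but it tells you which files to look at first.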

I would also recommend monitoring the status output from the VEP as it runs
- when it finishes successfully you should see something like:

2013-03-04 14:49:29 - Processed 37 total variants (2 vars/sec, 2 vars/sec
2013-03-04 14:49:29 - Finished!
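
If your wrapper script saves the status output of each run to its own log
file (that is an assumption on my part, as are the function and file names
below), a small check like this sketch can flag the runs that never
reported "Finished!":

```shell
# Succeeds if one of the last few lines of a saved status log
# contains the "Finished!" message.
run_finished() {
    tail -n 5 "$1" | grep -q 'Finished!'
}

# Example loop over hypothetical per-run log files:
#   for log in logs/*.log; do
#       run_finished "$log" || echo "Not finished: $log"
#   done
```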


Will McLaren
Ensembl Variation

On 6 March 2013 06:53, binit treesa <binit.treesa at gmail.com> wrote:

> Hi Will McLaren,
> I have 300+ variant files, each containing 10 lakh (1 million) variants. So I
> automated the VEP run on these files with a small script that launches
> 7 VEP runs in parallel. Yesterday, the entire VEP run finished. Usually I
> get a result file with a size > 1000 MB. But this time some of the files
> (nearly 10) are only 700 to 800 MB. I suspect the VEP run on these
> files did not complete.
> Is there any way to verify these results to ensure the run was
> successful?
> Thanks in advance,
> Treesa Binit