[ensembl-dev] VCF input in VEP
Will McLaren
wm2 at ebi.ac.uk
Mon Jan 13 10:59:47 GMT 2014
Hello,
There's no need to convert your data - the VEP will process the file
without any problem.
The only issue arises if you use the default VEP output format: for lines
that don't have an identifier in the 3rd column of the VCF, you may have
trouble matching the results back to the input data. The VEP will still
run OK though!
You can get around this either by adding a made-up identifier to the 3rd
column of your VCF (something like CHR_POS_ALT works well), or by simply
having the VEP write VCF output, which consists of the input VCF with a
"CSQ" field added to the INFO column - see
http://www.ensembl.org/info/docs/tools/vep/vep_formats.html#vcfout
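In case it is useful, here is a rough Python sketch of both workarounds. It
is not part of the VEP; the function names (add_placeholder_ids, parse_csq)
are made up for illustration. The first function fills in a CHR_POS_ALT
identifier wherever the 3rd column is missing. The second splits a "CSQ"
entry into one record per consequence, assuming entries are comma-separated
and fields pipe-separated, and that you pass in the field names in the order
given in the CSQ header line of your own output file.

```python
def add_placeholder_ids(lines):
    """Yield VCF lines, giving records without an ID a CHR_POS_ALT one."""
    for line in lines:
        # Header lines and blank lines pass through unchanged.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 3 (index 2) is the ID; "." means "no identifier".
        if len(fields) >= 5 and fields[2] in (".", ""):
            chrom, pos, alt = fields[0], fields[1], fields[4]
            fields[2] = f"{chrom}_{pos}_{alt}"
        yield "\t".join(fields) + "\n"


def parse_csq(info, csq_fields):
    """Return one dict per consequence found in an INFO string.

    csq_fields is an assumption supplied by the caller: the field names
    in the order listed in the CSQ header line of the output file.
    """
    for entry in info.split(";"):
        if entry.startswith("CSQ="):
            return [
                dict(zip(csq_fields, block.split("|")))
                for block in entry[len("CSQ="):].split(",")
            ]
    return []


if __name__ == "__main__":
    # Tiny in-memory example; real use would iterate over an open file.
    demo = [
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "1\t881907\t.\tG\tC\t.\t.\t.\n",
    ]
    for out_line in add_placeholder_ids(demo):
        print(out_line, end="")
```

For a real file you would pass the open input file to add_placeholder_ids
and write the yielded lines to a new file. Note that CHR_POS_ALT is only
unique if no two records share the same position and ALT, and that records
with several ALT alleles will get the comma-separated ALT string in the ID.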
Hope that helps
Will McLaren
Ensembl Variation
On 13 January 2014 10:31, Genomeo Dev <genomeodev at gmail.com> wrote:
> Hi,
>
> The following documentation shows that ensembl use a customised VCF format
> at least for VEP which is different to the original VCF used by 1000
> genomes:
>
> http://www.ensembl.org/info/docs/tools/vep/vep_formats.html
>
> This difference seems to affect only balanced variations.
>
> I am working with data straight from 1000 genomes which I want to process
> with VEP. Many of them don't have assigned dbSNP IDs so can't run VEP with
> ID as input. Would it be possible for Ensembl to share their script for
> converting the original VCFs to their customised VCF?
>
> P.S. Using VM 74 and VEP 74.
>
> Thanks,
>
> G.