[ensembl-dev] Variant Effect Predictor and VCF output
cj5 at sanger.ac.uk
Thu Nov 10 14:02:39 GMT 2011
Hi Will,
> Thanks for sharing, that's really useful.
>
> Would it be OK if I implemented a similar thing in the VEP?
Absolutely. Please also be aware of the outstanding sort bug in the VEP:
http://lists.ensembl.org/pipermail/dev/2011-October/001706.html
Until this is fixed, the script needs to be preceded by an awk/sort step to
guarantee that the text file is in chromosome order.
I use:
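# Print the header; to speed things up, assume fewer than 100 header lines.
head -100 "$1" | grep '^#'
# Zero-pad single-digit chromosomes, sort on Location, then strip the padding.
grep -v '^#' "$1" | awk '$2 ~ "^[1-9]:"{$2="0"$2} {print $0}' | sort -k2 |
awk 'BEGIN {OFS="\t"} $2~"^0"{$2 = substr($2,2)} {print $0}'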
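A minimal alternative sketch of the same pre-sort, in case it is useful. It uses a decorate/sort/undecorate approach rather than the padding trick above, and additionally orders records by numeric start position within each chromosome. The function name sort_vep is hypothetical, and the sketch assumes tab-delimited input whose second column is a Location of the form chr:start or chr:start-end.

```shell
#!/bin/sh
# sort_vep (hypothetical helper): read tab-delimited VEP output on stdin and
# print header lines first, then records ordered by chromosome and start.
sort_vep() {
    tab=$(printf '\t')
    awk -F'\t' -v OFS='\t' '
        /^#/ { print 0, "", NR, $0; next }     # headers keep their input order
        {
            split($2, loc, /[:-]/)              # assumed Location: chr:start[-end]
            chrom = loc[1]
            if (chrom ~ /^[0-9]+$/)
                chrom = sprintf("%02d", chrom)  # 1 -> 01, so that 2 sorts before 10
            print 1, chrom, loc[2], $0          # prepend three sort keys
        }' |
    LC_ALL=C sort -t "$tab" -k1,1n -k2,2 -k3,3n |
    cut -f4-                                    # drop the sort keys again
}
```

Called as `sort_vep < vep_output.txt > vep_sorted.txt`. With the C locale, numeric chromosomes come first and named ones (MT, X, Y) follow in byte order.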
> I could change CSQ to something different if you think that would be
> more sensible, or is there precedent for using this in the VCF world?
>
CSQ is fine; there is no consensus on the INFO field name.
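To make the idea of a CSQ key in the INFO column concrete, here is a rough sketch. It is not the vcf2consequences_vep script and not the VEP's own output format. The function name add_csq is hypothetical. It assumes a one-line-per-variant VEP text file in which column 2 is Location and column 7 is Consequence, and it matches on CHROM:POS, so it only handles simple SNVs.

```shell
#!/bin/sh
# add_csq (hypothetical helper): usage: add_csq vep_summary.txt input.vcf
# Appends CSQ=<consequence> to INFO for each VCF record whose CHROM:POS
# matches a Location in the VEP file; other records pass through unchanged.
add_csq() {
    awk -F'\t' -v OFS='\t' '
        NR == FNR {                          # first file: VEP text output
            if ($0 !~ /^#/) csq[$2] = $7     # assumed: $2 = Location, $7 = Consequence
            next
        }
        /^#CHROM/ {                          # declare the new key before the column header
            print "##INFO=<ID=CSQ,Number=.,Type=String,Description=\"Consequence type from VEP\">"
        }
        /^#/ { print; next }                 # VCF header lines are copied as-is
        {
            key = $1 ":" $2                  # CHROM:POS, comparable to SNV locations only
            if (key in csq)
                $8 = ($8 == "." || $8 == "") ? ("CSQ=" csq[key]) : ($8 ";CSQ=" csq[key])
            print
        }' "$1" "$2"
}
```

Called as `add_csq vep_summary.txt input.vcf > annotated.vcf`. The VEP file must not be empty, since the NR == FNR test relies on the first file containing at least one line.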
Thanks
Chris
> On 10 November 2011 13:41, <cj5 at sanger.ac.uk> wrote:
>> Hi,
>> For the UK10K project we are using the following script, which optionally
>> adds GERP and Grantham matrix scores:
>>
>> https://github.com/VertebrateResequencing/vr-codebase/blob/develop/scripts/vcf2consequences_vep
>>
>> regards
>> Chris Joyce
>> Wellcome Trust Sanger Institute
>>
>>
>>> Hi Fedor,
>>>
>>> Currently there is no standard way to describe consequences in VCF;
>>> the main issue to overcome is that our output format provides one line
>>> per variant/allele/transcript, whereas VCF mandates one line per
>>> variant. This means we'd have to squeeze an awful lot of information
>>> into the INFO column of the VCF.
>>>
>>> We should, however, be able to provide at least summary level
>>> information in the INFO field, and this is what we will look into
>>> doing, as we have had several requests for VCF output to be a feature
>>> of the VEP.
>>>
>>> I am not aware of any tools to convert; however, I think that with a
>>> simple Perl script and the --most_severe or --summary options (both of
>>> which output only one line per variant) in the VEP, you should be able
>>> to combine the original VCF with the output.
>>>
>>> Hope this helps
>>>
>>> Will McLaren
>>> Ensembl Variation
>>>
>>> On 9 November 2011 20:03, Fedor Gusev <gusevfe at gmail.com> wrote:
>>>> Hello everyone.
>>>>
>>>> How come it is not possible for VEP to output a vcf file? Are there
>>>> any tools to convert the output to VCF?
>>>>
>>>> --
>>>> Kind regards,
>>>> Fedor Gusev.
>>>>
>>>> _______________________________________________
>>>> Dev mailing list Dev at ensembl.org
>>>> List admin (including subscribe/unsubscribe):
>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>> Ensembl Blog: http://www.ensembl.info/
>>>>
>>>
>>
>>
>> --
>> The Wellcome Trust Sanger Institute is operated by Genome Research
>> Limited, a charity registered in England with number 1021457 and a
>> company registered in England with number 2742969, whose registered
>> office is 215 Euston Road, London, NW1 2BE.
>>
>