[ensembl-dev] Variant Effect Predictor and VCF output

Will McLaren wm2 at ebi.ac.uk
Thu Nov 10 13:46:20 GMT 2011


Hi Chris,

Thanks for sharing, that's really useful.

Would it be OK if I implemented a similar thing in the VEP?

Basically I will just add the CSQ info field, populated by either all
consequence types as a comma-separated list, or a single consequence
if requested by using the option --most_severe.
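
To illustrate the idea, here is a rough Python sketch of how a CSQ entry might be built and appended to the INFO column of a VCF line. It is only a sketch under assumptions: the function names, the `ranking` argument and the consequence terms in the usage example are made up for illustration and do not describe the actual VEP implementation.

```python
def format_csq(consequences, ranking=None):
    """Return the value for a CSQ INFO entry.

    Without a ranking, all distinct consequence types are joined with
    commas. With a ranking (a list ordered most severe first; hypothetical,
    supplied by the caller), only the highest-ranked type is kept.
    """
    unique = list(dict.fromkeys(consequences))  # de-duplicate, keep order
    if ranking is not None:
        # ranking.index raises ValueError for a term missing from the ranking
        unique = [min(unique, key=ranking.index)]
    return ",".join(unique)


def add_csq(vcf_line, consequences, ranking=None):
    """Append CSQ=... to the INFO column (column 8) of one VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 tab-separated VCF columns")
    if not consequences:
        return "\t".join(fields)  # nothing to annotate
    entry = "CSQ=" + format_csq(consequences, ranking)
    info = fields[7]
    # "." marks an empty INFO column in VCF, so replace it rather than append
    fields[7] = entry if info in ("", ".") else info + ";" + entry
    return "\t".join(fields)


if __name__ == "__main__":
    line = "1\t12345\t.\tA\tG\t50\tPASS\tDP=20"
    calls = ["INTRONIC", "NON_SYNONYMOUS_CODING", "INTRONIC"]  # example terms
    print(add_csq(line, calls))
    # Illustrative severity order only, not an official ranking
    order = ["STOP_GAINED", "NON_SYNONYMOUS_CODING", "INTRONIC"]
    print(add_csq(line, calls, ranking=order))
```

Passing a ranking plays the role of --most_severe here; leaving it out gives the full comma-separated list.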

I could change CSQ to something different if you think that would be
more sensible, or is there a precedent for using this in the VCF world?

Thanks

Will

On 10 November 2011 13:41,  <cj5 at sanger.ac.uk> wrote:
> Hi,
> For the UK10K project we are using the following script, which optionally
> adds GERP and Grantham matrix scores:
>
> https://github.com/VertebrateResequencing/vr-codebase/blob/develop/scripts/vcf2consequences_vep
>
> regards
> Chris Joyce
> Wellcome Trust Sanger Institute
>
>
>> Hi Fedor,
>>
>> Currently there is no standard way to describe consequences in VCF;
>> the main issue to overcome is that our output format provides one line
>> per variant/allele/transcript, whereas VCF mandates one line per
>> variant. This means we'd have to squeeze an awful lot of information
>> into the INFO column of the VCF.
>>
>> We should, however, be able to provide at least summary level
>> information in the INFO field, and this is what we will look into
>> doing, as we have had several requests for VCF output to be a feature
>> of the VEP.
>>
>> I am not aware of any tools to convert the output. However, I think that
>> with a simple Perl script and the VEP's --most_severe or --summary
>> option (both of which output only one line per variant) you should be
>> able to combine the original VCF with the output.
>>
>> Hope this helps
>>
>> Will McLaren
>> Ensembl Variation
>>
>> On 9 November 2011 20:03, Fedor Gusev <gusevfe at gmail.com> wrote:
>>> Hello everyone.
>>>
>>> How come it is not possible for the VEP to output a VCF file? Are there
>>> any tools to convert the output to VCF?
>>>
>>> --
>>> Kind regards,
>>> Fedor Gusev.
>>>
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> List admin (including subscribe/unsubscribe):
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/
>>>
>>
>
>
> --
>  The Wellcome Trust Sanger Institute is operated by Genome Research
>  Limited, a charity registered in England with number 1021457 and a
>  company registered in England with number 2742969, whose registered
>  office is 215 Euston Road, London, NW1 2BE.
>



