[ensembl-dev] bug in the VEP annotation of VCFs with multiple individuals

Will McLaren wm2 at ebi.ac.uk
Tue Aug 27 14:45:03 BST 2013


Hi Duarte,

Thanks for raising this. There's an interesting quirk here which seems to
be what you're looking at. However, I _do_ see lines of output for the
other 8 individuals in the file.

What is it you would expect to see for sample9? Would you expect that line
to be excluded from the output?

It is shown because you are using most_severe, which forces the VEP to
report the most severe consequence per variant (an option I would
generally advise against using!), and because with --individual each
individual/variant combination is treated as an independent variant.
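
To make the --individual behaviour concrete, here is a rough Python sketch
(not the VEP's own Perl code) of how one multi-sample VCF line fans out into
one record per individual. The function name split_by_individual is made up
for illustration; the column positions follow the standard VCF layout used
in your input.

```python
def split_by_individual(header_line, data_line):
    """Yield one (sample, genotype) pair per individual in a VCF data line.

    Hypothetical helper for illustration only; it assumes whitespace-separated
    columns and that the FORMAT column (index 8) contains a GT key.
    """
    # Sample names start at the tenth column of the #CHROM header line.
    samples = header_line.lstrip("#").split()[9:]
    fields = data_line.split()
    # Locate the genotype within the colon-separated FORMAT keys.
    gt_index = fields[8].split(":").index("GT")
    for sample, call in zip(samples, fields[9:]):
        # A missing call such as "./." has only the GT part, which is fine
        # as long as GT is the first FORMAT key, as in the input above.
        yield sample, call.split(":")[gt_index]
```

Each pair yielded here is then handled as if it were its own variant, which
is why sample9 gets a line of its own even though it has no call.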

It is intergenic_variant because that is the "default" consequence: since
the locus is non-variant for sample9, it does not go through the
consequence prediction, but because you are forcing it to be printed out
with most_severe, the VEP has to fall back to intergenic_variant.

I could see two solutions - either excluding the line (since it is
non-variant), or having some sort of "no consequence" type - which I am
loath to do as this doesn't fit into our SO schema.
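
As a rough idea of what the first option (excluding the line) amounts to,
here is a minimal Python sketch of the test involved. It is only an
illustration, not what the VEP does internally, and the name carries_alt is
hypothetical. It assumes genotypes are written with "/" or "|" separators
and that "." marks a missing allele.

```python
import re


def carries_alt(genotype):
    """Return True if the genotype string contains at least one ALT allele.

    Reference alleles ("0") and missing alleles (".") do not count, so both
    "0/0" and "./." are reported as non-variant.
    """
    alleles = re.split(r"[/|]", genotype)
    return any(allele not in ("0", ".") for allele in alleles)
```

With a check like this, an individual/variant combination such as sample9
(./.) would simply be skipped instead of being printed with a default
consequence.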

Will


On 27 August 2013 12:16, Duarte Molha <duartemolha at gmail.com> wrote:

> Dear Developers
>
> I believe there is another bug in the VEP when dealing with input VCFs
> with multiple individuals...
> Please take a look at this VCF input and the corresponding output:
>
> INPUT VCF line:
>
> #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT
> sample1 sample2 sample3 sample4 sample5 sample6 sample7 sample8 sample9
>
> 1       876499  .       A       G       2900.87 PASS
> AC=15;AF=0.938;AN=16;BaseQRankSum=1.636;DP=92;Dels=0.00;FS=0.000;HRun=6;HaplotypeScore=0.4159;MQ=59.36;MQ0=0;MQRankSum=1.274;QD=31.53;ReadPosRankSum=-0.482;SB=-1653.87;set=variant2
> GT:AD:DP:GQ:PL  1/1:0,9:9:24.07:303,24,0
> 1/1:0,10:10:27.09:365,27,0      0/1:5,4:9:99:104,0,166
> 1/1:0,7:7:18.04:220,18,0        1/1:0,16:16:39.13:534,39,0
> 1/1:0,12:12:30.10:407,30,0      1/1:0,14:14:39.13:535,39,0
> 1/1:0,15:15:36.12:483,36,0      ./.
>
>
> OUTPUT annotation file:
>
> #Uploaded_variation     Location        Existing_variation      Allele
> ZYG     Gene    Feature Feature_type    Consequence     GMAF    IND
>
> 1_876499_A      1:876499        rs4372192       -       HOM     -
> -       -       intergenic_variant      A:0.0824         sample9
>
> As you can see, the annotation output only contains 1 line and it is for
> the individual that has no genotype call (./.)
>
> Also, the variation name does not contain the ref/alt allele information
> as all the other variations do. I would expect it to be called
> 1_876499_A/G
>
> For reference here are the config options I used:
>
> host                                                [internalserver]
>
> user                                                [user]
>
> password                                            [password]
>
> db_version        72
>
> port                                                       3306
>
> species                                                 homo_sapiens
>
>
>
> #######     runtime options  #############
>
> buffer_size                                         40000
>
> most_severe                     1
>
> check_existing                  1
>
> check_alleles                     1
>
> individual                                             all
>
> fork                                                        6
>
>  verbose                                                               1
>
>  gmaf                                                      1
>
> filter_common                  1
>
> fields
> Uploaded_variation,Location,Existing_variation,Allele,ZYG,Gene,Feature,Feature_type,Consequence,GMAF,IND
>
>
> #######     cache stuff   #############
>
> cache                                                    1
>
> dir_plugins
> /NGS_Test/vep_72_testing/Plugins/
>
> dir_cache
> /ReferenceData/vep_cache
>
> # cache_region_size       1MB
>
> #offline                                                1
>
> # skip_db_check                              1
>
>
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info:
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>
>