[ensembl-dev] variant_effect_predictor.pl: --pick option selecting lower ranking consequence?

Will McLaren wm2 at ebi.ac.uk
Mon Aug 4 09:44:36 BST 2014


Hi Andrew,

You are seeing this because ENST00000296930 is the canonical
transcript for that gene.

The sort order for --pick is:

1) canonical status
2) biotype (priority given to protein coding)
3) consequence rank
4) transcript length
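
As a rough illustration only (this is not the VEP code itself), the sort order above can be sketched as a tuple sort key in Python. The Annotation class, the partial consequence ranking, the placeholder transcript lengths and the "longer transcript first" tie-break are all assumptions made for this example:

```python
from dataclasses import dataclass

# Assumed, partial ranking for illustration: lower number = more severe.
CONSEQUENCE_RANK = {
    "missense_variant": 0,
    "intron_variant": 1,
    "upstream_gene_variant": 2,
    "downstream_gene_variant": 3,
}


@dataclass
class Annotation:  # hypothetical container, not a VEP class
    transcript: str
    canonical: bool
    biotype: str
    consequence: str
    length: int  # placeholder value, not a real transcript length


def pick_key(a):
    """Sort key mirroring the four criteria in the order listed above."""
    return (
        0 if a.canonical else 1,                    # 1) canonical status
        0 if a.biotype == "protein_coding" else 1,  # 2) biotype
        CONSEQUENCE_RANK[a.consequence],            # 3) consequence rank
        -a.length,                                  # 4) length (assumed: longer first)
    )


def pick(annotations):
    """Return the single annotation that sorts first under pick_key."""
    return min(annotations, key=pick_key)


# Three of the transcripts from the example in this thread.
annotations = [
    Annotation("ENST00000519955", False, "retained_intron", "downstream_gene_variant", 1000),
    Annotation("ENST00000393820", False, "protein_coding", "missense_variant", 1000),
    Annotation("ENST00000296930", True, "protein_coding", "intron_variant", 1000),
]

print(pick(annotations).transcript)  # ENST00000296930
```

Because canonical status is compared first, the canonical intron_variant wins over the non-canonical missense_variant; swapping the first and third elements of the tuple would reverse that.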

The reason --pick exists is that a lot of people want "the"
consequence for "the" transcript; canonical status is Ensembl's way of
saying this might be "the" transcript.

However, I can certainly understand that other users want to prioritise
consequence rank above this. If we chose that method, though, we might have
other users complaining that --pick had chosen a missense variant in a
transcript that is biologically irrelevant or even not real.

The solution is to allow users to customise this sort order, and it's
something we'll be working on including in the future, probably via a
plugin module.

For release 76 there is a --flag_pick option that just flags the picked
consequence instead of removing all of the others, to allow you to assess
what Ensembl thinks is the pick vs what you might choose.
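
If you keep all consequences in the output, choosing your own pick afterwards could look roughly like the Python sketch below. It is only a sketch under assumptions: the severity list is a partial, illustrative ordering, the function names are made up for this example, and the consequence is taken to be the fifth pipe-delimited field as in the example in this thread (in real output, check the field order declared in the VCF header).

```python
# Assumed, partial ordering for illustration: most severe first.
SEVERITY = [
    "missense_variant",
    "splice_region_variant",
    "intron_variant",
    "upstream_gene_variant",
    "downstream_gene_variant",
    "regulatory_region_variant",
]
RANK = {term: i for i, term in enumerate(SEVERITY)}


def entry_rank(entry):
    """Rank of the most severe term in one pipe-delimited annotation entry."""
    fields = entry.split("|")
    terms = fields[4].split("&")  # several terms can be joined with '&'
    # Terms missing from the illustrative list sort last.
    return min(RANK.get(term, len(SEVERITY)) for term in terms)


def most_severe(entries):
    """Return the entry whose most severe term ranks highest."""
    return min(entries, key=entry_rank)


# Entries from this thread, truncated after the consequence field.
entries = [
    "T|ENSG00000181163|ENST00000519955|Transcript|downstream_gene_variant",
    "T|ENSG00000181163|ENST00000351986|Transcript|intron_variant",
    "T|ENSG00000181163|ENST00000393820|Transcript|missense_variant&splice_region_variant",
    "T|ENSG00000181163|ENST00000296930|Transcript|intron_variant",
    "T||ENSR00001296571|RegulatoryFeature|regulatory_region_variant",
]

print(most_severe(entries).split("|")[2])  # ENST00000393820
```

This ignores canonical status and biotype entirely, so it can select a transcript that is biologically irrelevant; that trade-off is the one described above.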

You could also edit the code yourself; it would be a case of just moving
some lines around. The code is in
ensembl-variation/modules/Bio/EnsEMBL/Variation/Utils/VEP.pm, around
line 1886.

Hope that helps

Will McLaren
Ensembl Variation





On 1 August 2014 18:33, Andrew Carson <acarson at invivoscribe.com> wrote:

> Hi Will,
>
> I’m again seeing instances where running VEP with the --pick filtering
> option (variant_effect_predictor.pl script version 75) picks a
> lower-ranking consequence.
>
> In this case, the most severe consequence is in a minor splice form, but I
> still want to “pick” this since it is the most severe consequence.
>
> Here is an example in .vcf format:
>
>
>
> 5    170833402    .    C    T    .    PASS
>
>
>
> The unfiltered/unpicked consequence is as follows (separated into the 7
> different consequences):
>
>
> T|ENSG00000181163|ENST00000519955|Transcript|downstream_gene_variant||||||COSM1065794|||||||||666|1|||NPM1|HGNC||||retained_intron||||||||||,
>
>
> T|ENSG00000181163|ENST00000351986|Transcript|intron_variant||||||COSM1065794||||8/9||||||1|||NPM1|HGNC||||protein_coding|ENSP00000341168||CCDS4377.1|ENST00000351986.6:c.684+995C>T||||||,
>
>
> T|ENSG00000181163|ENST00000517671|Transcript|intron_variant||||||COSM1065794||||10/11||||||1|||NPM1|HGNC||||protein_coding|ENSP00000428755||CCDS4376.1|ENST00000517671.1:c.771+995C>T||||||,
>
>
> T|ENSG00000181163|ENST00000393820|Transcript|missense_variant&splice_region_variant|871|773|258|A/V|gCg/gTg|COSM1065794|||10/10|||||||1|||NPM1|HGNC|deleterious(0.01)|benign(0.002)||protein_coding|ENSP00000377408||CCDS43399.1|ENST00000393820.2:c.773C>T|ENSP00000377408.2:p.Ala258Val|||||,
>
>
> T|ENSG00000181163|ENST00000524204|Transcript|upstream_gene_variant||||||COSM1065794|||||||||1095|1|||NPM1|HGNC||||retained_intron||||||||||,
>
>
> T|ENSG00000181163|ENST00000296930|Transcript|intron_variant||||||COSM1065794||||9/10||||||1||YES|NPM1|HGNC||||protein_coding|ENSP00000296930||CCDS4376.1|ENST00000296930.5:c.771+995C>T||||||,
>
>
> T||ENSR00001296571|RegulatoryFeature|regulatory_region_variant||||||COSM1065794||||||||||||||||||||||||||||
>
>
>
> However, when I use the --pick option, I get the following consequence:
>
>
> T|ENSG00000181163|ENST00000296930|Transcript|intron_variant||||||COSM1065794||||9/10||||||1||YES|NPM1|HGNC||||protein_coding|ENSP00000296930||CCDS4376.1|ENST00000296930.5:c.771+995C>T||||||
>
> Note, here is the command I use:
>
> perl variant_effect_predictor.pl --everything --no_stats --cache -i
> file.vcf -o out.vcf --format vcf --dir /variant_effect_predictor/ --vcf
> --no_progress --pubmed --gmaf --maf_1kg --check_existing --check_alleles
> --pick
>
>
>
> The most damaging consequence, however, is the missense_variant (the 4th
> consequence in the unfiltered/unpicked output).
>
>
>
> Note, when I run this same variant through the VEP Web interface, I get
> the same consequence when selecting "Show one selected consequence per
> variant".
>
>
>
> Is there a reason I’m getting the intron_variant using --pick instead of
> the missense_variant?
>
> Any help would be greatly appreciated!
>
>
>
> Andrew R. Carson, Ph.D.
>