[ensembl-dev] variant_effect_predictor.pl: --pick option selecting lower ranking consequence?

Andrew Carson acarson at invivoscribe.com
Wed Mar 19 21:05:32 GMT 2014


Hi,
I'm having an issue with the --pick option of the variant_effect_predictor.pl script (version 75). For the variant shown below, it is unclear why --pick selects the consequence it does over seemingly higher-ranked consequences.

Here is the variant in .vcf format:
19      33793007        .       TCGC    T       .       PASS

The full set of consequences, without --pick, is as follows:
CSQ=-|ENSG00000245848|ENST00000498907|Transcript|inframe_deletion|461-463|311-313|104-105|GD/D|gGCGac/gac|COSM18270&TMP_ESP_19_33793008_33793010||||||||-1||YES|CEBPA|HGNC||||protein_coding|Low_complexity_(Seg):Seg&PROSITE_profiles:PS50315&PIRSF_domain:PIRSF005879|CCDS54243.1|ENST00000498907.2:c.311_313delGCG|ENSP00000427514.1:p.Gly104del|||||,-|ENSG00000267130|ENST00000593041|Transcript|upstream_gene_variant||||||COSM18270&TMP_ESP_19_33793008_33793010|||||||2932|1||YES|CTD-2540B15.9|Clone_based_vega_gene||||lincRNA|||||||||,-|ENSG00000267727|ENST00000587312|Transcript|downstream_gene_variant||||||COSM18270&TMP_ESP_19_33793008_33793010|||||||162|1||YES|CTD-2540B15.7|Clone_based_vega_gene||||antisense|||||||||,-|ENSG00000178863|ENST00000320232|Transcript|upstream_gene_variant||||||COSM18270&TMP_ESP_19_33793008_33793010|||||||966|1||YES|CEBPA-AS1|HGNC||||pseudogene|||||||||,-|ENSG00000267580|ENST00000589932|Transcript|downstream_gene_variant||||||COSM18270&TMP_ESP_19_33793008_33793010|||||||934|1||YES|CTD-2540B15.11|Clone_based_vega_gene||||antisense|||||||||,-|ENSG00000267296|ENST00000592982|Transcript|upstream_gene_variant||||||COSM18270&TMP_ESP_19_33793008_33793010|||||||753|1||YES|CEBPA-AS1|Clone_based_vega_gene||||antisense|||||||||,-|ENSG00000230259|ENST00000425420|Transcript|intron_variant&nc_transcript_variant&feature_truncation||||||COSM18270&TMP_ESP_19_33793008_33793010||||||||-1||YES|AC008738.1|Clone_based_ensembl_gene||||pseudogene|||ENST00000425420.2:n.246+175_246+177delGCG||||||,-||ENSR00000345628|RegulatoryFeature|regulatory_region_variant||||||COSM18270&TMP_ESP_19_33793008_33793010|||||||||||||||||||||||||
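
The CSQ string above is easier to read once it is split into its parts. Below is a minimal Python sketch of that. It assumes, from inspection of the output only (the VCF header that declares the field order is not shown), that entries are comma-separated, fields are pipe-separated, and that the feature ID, feature type, consequence and biotype sit at positions 2, 3, 4 and 26. The function name `summarize_csq` is made up for illustration. The sample holds two of the eight entries, each truncated after the biotype field.

```python
# Field positions are read off the CSQ output by inspection; they are an
# assumption, since the header line declaring the field order is not shown.
FEATURE, FEATURE_TYPE, CONSEQUENCE, BIOTYPE = 2, 3, 4, 26


def summarize_csq(csq):
    """Return (feature, feature type, consequence terms, biotype) per entry."""
    if csq.startswith("CSQ="):
        csq = csq[len("CSQ="):]
    rows = []
    for entry in csq.split(","):        # one entry per affected feature
        fields = entry.split("|")       # pipe-separated fields
        biotype = fields[BIOTYPE] if len(fields) > BIOTYPE else ""
        terms = fields[CONSEQUENCE].split("&")  # combined terms are "&"-joined
        rows.append((fields[FEATURE], fields[FEATURE_TYPE], terms, biotype))
    return rows


# Two of the eight entries from the output, truncated after the biotype field.
sample = (
    "CSQ="
    "-|ENSG00000245848|ENST00000498907|Transcript|inframe_deletion|461-463|311-313|104-105|GD/D|gGCGac/gac|COSM18270&TMP_ESP_19_33793008_33793010||||||||-1||YES|CEBPA|HGNC||||protein_coding"
    ","
    "-|ENSG00000230259|ENST00000425420|Transcript|intron_variant&nc_transcript_variant&feature_truncation||||||COSM18270&TMP_ESP_19_33793008_33793010||||||||-1||YES|AC008738.1|Clone_based_ensembl_gene||||pseudogene"
)

for row in summarize_csq(sample):
    print(row)
```

Printed this way, the protein_coding inframe_deletion and the pseudogene entry with three combined terms are directly comparable.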

Although it is hard to see, the first consequence is an inframe_deletion in a protein_coding transcript (ENST00000498907, CEBPA). This seems to be the consequence that should be ranked highest. However, when I use --pick, I get the following consequence:

CSQ=-|ENSG00000230259|ENST00000425420|Transcript|intron_variant&nc_transcript_variant&feature_truncation||||||COSM18270&TMP_ESP_19_33793008_33793010||||||||-1||YES|AC008738.1|Clone_based_ensembl_gene||||pseudogene|||ENST00000425420.2:n.246+175_246+177delGCG||||||

This is an intron_variant&nc_transcript_variant&feature_truncation in a pseudogene transcript. I believe this should rank lower on both biotype and consequence type.
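
The expectation stated here can be written down as a short Python sketch. It is not how --pick is implemented. The severity order below is an illustrative assumption that covers only the terms appearing in this post, and the tie-break on protein_coding biotype is likewise assumed. The names `SEVERITY`, `entry_rank` and `expected_pick` are hypothetical. The feature IDs, biotypes and consequence terms are taken from the unpicked output.

```python
# Illustrative severity order, most severe first. This is an assumption made
# for this sketch and is NOT VEP's actual ranking table.
SEVERITY = [
    "inframe_deletion",
    "intron_variant",
    "nc_transcript_variant",
    "upstream_gene_variant",
    "downstream_gene_variant",
    "regulatory_region_variant",
    "feature_truncation",
]
RANK = {term: i for i, term in enumerate(SEVERITY)}


def entry_rank(terms):
    """Rank an entry by its most severe term (lower number = more severe)."""
    return min(RANK[t] for t in terms)


def expected_pick(entries):
    """Choose the most severe entry; prefer protein_coding on ties (assumed)."""
    return min(
        entries,
        key=lambda e: (entry_rank(e["terms"]), e["biotype"] != "protein_coding"),
    )


# Feature, biotype and consequence terms of the eight entries in the output.
entries = [
    {"feature": "ENST00000498907", "biotype": "protein_coding", "terms": ["inframe_deletion"]},
    {"feature": "ENST00000593041", "biotype": "lincRNA", "terms": ["upstream_gene_variant"]},
    {"feature": "ENST00000587312", "biotype": "antisense", "terms": ["downstream_gene_variant"]},
    {"feature": "ENST00000320232", "biotype": "pseudogene", "terms": ["upstream_gene_variant"]},
    {"feature": "ENST00000589932", "biotype": "antisense", "terms": ["downstream_gene_variant"]},
    {"feature": "ENST00000592982", "biotype": "antisense", "terms": ["upstream_gene_variant"]},
    {"feature": "ENST00000425420", "biotype": "pseudogene",
     "terms": ["intron_variant", "nc_transcript_variant", "feature_truncation"]},
    {"feature": "ENSR00000345628", "biotype": "", "terms": ["regulatory_region_variant"]},
]

print(expected_pick(entries)["feature"])  # ENST00000498907
```

Under these assumptions the CEBPA inframe_deletion is selected, which is the result I expected from --pick.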

Is there a reason why --pick chooses this consequence over other (seemingly more relevant) consequences? Is it because several consequence terms (intron_variant + nc_transcript_variant + feature_truncation) are combined in a single consequence?

Note that when I run this same variant through the VEP web interface and select "Show one selected consequence per variant", I get the same consequence.

Any help would be greatly appreciated!


Andrew R. Carson, Ph.D.


