[ensembl-dev] Choosing the --per_gene option in VEP

Aravind Sankar as42 at sanger.ac.uk
Thu Oct 20 13:59:27 BST 2016


Hello Will,

Thank you very much, I think that clears it up. It looks like the variant position lies in BAP1-014 and not BAP1-201, but that still isn’t the primary transcript, so it was not reported when using --per_gene.

Thanks again.

Regards,
Aravind

From: <dev-bounces at ensembl.org> on behalf of Will McLaren <wm2 at ebi.ac.uk>
Reply-To: Ensembl developers list <dev at ensembl.org>
Date: Thursday, 20 October 2016 at 13:45
To: Ensembl developers list <dev at ensembl.org>
Subject: Re: [ensembl-dev] Choosing the --per_gene option in VEP

Hi Aravind,

By default, the output is selected primarily on the likely relevance of each affected transcript, not on the consequence ranking.

The default order of selection criteria is shown in the documentation [1], and you may change this order with --pick_order if you wish to, for example, prioritise consequence rank first.

Without having seen your input, it seems likely that your variant of interest falls in an exon not found in the primary transcript of BAP1; if you study the transcript diagrams [2] you can see that several transcripts (e.g. BAP1-201) contain exons that are not found in the primary transcript as determined by CCDS, APPRIS and TSL (BAP1-001). This means the variant may have a missense effect in BAP1-201 but fall in an intron of BAP1-001.
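
To illustrate why an intron variant can be reported ahead of a missense variant, here is a minimal Python sketch of ordered selection criteria. It is a toy model, not VEP's actual code: the two transcripts (TX_PRIMARY, TX_ALT) and all their attributes are hypothetical, the consequence scale covers only two terms, and the default order is assumed from the documentation [1], so please check it against your VEP release.

```python
# Toy model of ordered selection criteria; NOT VEP's implementation.
# All transcript data below is hypothetical and simplified.

# Assumed default order of criteria (check the documentation for your release).
DEFAULT_ORDER = ["canonical", "appris", "tsl", "biotype", "ccds", "rank", "length"]

# Lower value = more severe. Only the two terms needed here.
CONSEQUENCE_RANK = {"missense_variant": 0, "intron_variant": 1}


def score(tx, order):
    """Return a sort key with one value per criterion; lower is preferred."""
    values = {
        "canonical": 0 if tx["canonical"] else 1,
        "appris": 0 if tx["appris_principal"] else 1,
        "tsl": tx["tsl"],
        "biotype": 0 if tx["biotype"] == "protein_coding" else 1,
        "ccds": 0 if tx["ccds"] else 1,
        "rank": CONSEQUENCE_RANK[tx["consequence"]],
        "length": -tx["length"],  # longer preferred
    }
    return tuple(values[criterion] for criterion in order)


def pick(transcripts, order=DEFAULT_ORDER):
    """Pick one transcript by comparing criteria in the given order."""
    return min(transcripts, key=lambda tx: score(tx, order))


transcripts = [
    {"name": "TX_PRIMARY", "canonical": True, "appris_principal": True, "tsl": 1,
     "biotype": "protein_coding", "ccds": True,
     "consequence": "intron_variant", "length": 729},
    {"name": "TX_ALT", "canonical": False, "appris_principal": False, "tsl": 3,
     "biotype": "protein_coding", "ccds": False,
     "consequence": "missense_variant", "length": 300},
]

default_choice = pick(transcripts)
rank_first = pick(transcripts, ["rank"] + [c for c in DEFAULT_ORDER if c != "rank"])

print(default_choice["name"], default_choice["consequence"])  # TX_PRIMARY intron_variant
print(rank_first["name"], rank_first["consequence"])          # TX_ALT missense_variant
```

With the assumed default order the primary transcript wins on the first criterion, so its intron consequence is the one reported; moving "rank" to the front, which is the kind of reordering --pick_order allows, selects the missense annotation instead.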

Hope that helps

Will McLaren
Ensembl Variation

[1] : http://www.ensembl.org/info/docs/tools/vep/script/vep_other.html#pick
[2] : http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000163930;r=3:52401013-52410350

On 20 October 2016 at 13:35, Aravind Sankar <as42 at sanger.ac.uk> wrote:
Hello,

I have been using the --per_gene option in VEP to get only the most severe consequence of each gene in the CSQ field for my files (as given in the documentation here http://www.ensembl.org/info/docs/tools/vep/script/vep_options.html#opt_per_gene). However, I noticed something going awry today. I was checking all the consequences of a particular position in a gene (BAP1). When I give the --per_gene option in VEP, it returns a single consequence for the variant, and that consequence is an intron variant. When I don’t give the --per_gene option, it returns all the possible consequences for BAP1, which include both a missense variant and an intron variant. For some reason, the missense variant is not being reported over the intron variant when using the --per_gene option. I ran this on build 84. The commands I used were as follows:

Using per_gene :

perl /software/vertres/bin-external/variant_effect_predictor_v84.pl --offline --vcf --dir_cache /lustre/scratch116/vr/ref/ensembl/vep_cache --species homo_sapiens --assembly GRCh38 --no_progress --everything --cache --per_gene -i bap1_pos_test.vcf -o bap1_84_per_gene.vcf

Without per_gene :

perl /software/vertres/bin-external/variant_effect_predictor_v84.pl --offline --vcf --dir_cache /lustre/scratch116/vr/ref/ensembl/vep_cache --species homo_sapiens --assembly GRCh38 --no_progress --everything --cache -i bap1_pos_test.vcf -o bap1_84.vcf

The output of the per_gene file doesn’t have the missense variant while the normal one does, for the same gene.
grep -c "missense" bap1_84_per_gene.vcf
0

grep -c "missense" bap1_84.vcf
1
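
Since grep -c counts matching lines rather than individual annotations, a per-annotation tally can make this kind of comparison easier to read. The following is a minimal Python sketch, assuming the CSQ field names are listed after "Format: " in the ##INFO header line of the VEP VCF output; the function name csq_consequences and the toy input lines are hypothetical and are not taken from my files.

```python
from collections import Counter


def csq_consequences(vcf_lines, field="Consequence"):
    """Count consequence terms over every CSQ entry in VEP VCF output."""
    names = None
    counts = Counter()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##INFO=<ID=CSQ"):
            # Field names follow "Format: " inside the header description.
            names = line.split("Format: ")[1].rstrip('">').split("|")
            continue
        if not line or line.startswith("#"):
            continue
        if names is None:
            raise ValueError("no CSQ header line seen before the first record")
        info = line.split("\t")[7]  # INFO is the eighth VCF column
        for item in info.split(";"):
            if not item.startswith("CSQ="):
                continue
            # One comma-separated entry per annotated transcript/feature.
            for entry in item[len("CSQ="):].split(","):
                record = dict(zip(names, entry.split("|")))
                # Multiple terms for one entry are joined with "&".
                for term in record[field].split("&"):
                    counts[term] += 1
    return counts


# Hypothetical, shortened input purely for illustration.
toy_vcf = [
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Toy header. '
    'Format: Allele|Consequence|IMPACT|SYMBOL">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "3\t100\t.\tA\tG\t.\t.\t"
    "CSQ=G|intron_variant|MODIFIER|BAP1,G|missense_variant|MODERATE|BAP1",
]

toy_counts = csq_consequences(toy_vcf)
print(toy_counts["missense_variant"], toy_counts["intron_variant"])
```

For a real file, the lines could be passed in with something like csq_consequences(open("bap1_84.vcf")).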

This is contrary to what is expected from the given documentation. Could someone help clarify what is going on here?

Thanking you,
Aravind




