[ensembl-dev] Allele specific custom annotation
Levine, Adam
a.levine at ucl.ac.uk
Mon Mar 10 15:53:11 GMT 2014
Dear Will,
Thank you for your prompt reply.
Sure, I will write a script to do it.
Kind regards,
Adam
Adam P. Levine
From: dev-bounces at ensembl.org [mailto:dev-bounces at ensembl.org] On Behalf Of Will McLaren
Sent: 10 March 2014 15:43
To: Ensembl developers list
Subject: Re: [ensembl-dev] Allele specific custom annotation
Hi Adam,
Currently there is no way to do allele-specific annotation with the --custom flag alone.
If you are not averse to a bit of coding, you could write a plugin to do that for you. The plugin could either fetch the scores itself (the VEP is simply piping in tabix output), or post-process the scores that --custom has already added, as in your output (this would require writing the least code).
http://www.ensembl.org/info/docs/tools/vep/script/vep_plugins.html
Our dbNSFP plugin does something similar:
https://github.com/ensembl-variation/VEP_plugins/blob/master/dbNSFP.pm
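For illustration only, the post-processing route could also be sketched as a standalone Python script run on the VEP output, rather than as a Perl plugin. This is a minimal sketch under several assumptions that are not confirmed in this thread: the real VEP output is tab-delimited, the Allele column holds the ALT allele as given in the input, the variants are simple SNVs, and the custom key is called test_custom as in the example. The function names load_custom_ids and filter_extra are hypothetical.

```python
def load_custom_ids(vcf_lines):
    """Map (chrom, pos, alt) -> set of record IDs from the custom VCF."""
    lookup = {}
    for line in vcf_lines:
        # Skip VCF meta/header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, rec_id, _ref, alt = line.split()[:5]
        for allele in alt.split(","):
            lookup.setdefault((chrom, int(pos), allele), set()).add(rec_id)
    return lookup


def filter_extra(vep_line, lookup, key="test_custom"):
    """Keep only the custom IDs whose ALT matches the line's Allele column.

    Assumes a tab-delimited VEP output line with Location in column 2,
    Allele in column 3 and Extra in the last column.
    """
    fields = vep_line.rstrip("\n").split("\t")
    location, allele, extra = fields[1], fields[2], fields[-1]
    chrom, pos = location.split(":")
    pos = int(pos.split("-")[0])  # tolerate start-end locations
    wanted = lookup.get((chrom, pos, allele), set())

    kept = []
    for item in extra.split(";"):
        name, _, value = item.partition("=")
        if name == key:
            ids = [i for i in value.split(",") if i in wanted]
            if not ids:
                continue  # no allele-matched ID: drop the key entirely
            item = f"{name}={','.join(ids)}"
        kept.append(item)
    fields[-1] = ";".join(kept) if kept else "-"
    return "\t".join(fields)


if __name__ == "__main__":
    custom = [
        "##fileformat=VCFv4.0",
        "21\t26960070\tGT_scoreX\tG\tT\t.\t.\t.",
        "21\t26960070\tGA_scoreY\tG\tA\t.\t.\t.",
    ]
    vep_line = (
        "rs116645811\t21:26960070\tA\tENSG00000154719\tENST00000307301\t"
        "Transcript\tmissense_variant\t1043\t1001\t334\tT/M\taCg/aTg\t-\t"
        "STRAND=-1;test_custom=GT_scoreX,GA_scoreY"
    )
    print(filter_extra(vep_line, load_custom_ids(custom)))
```

Matching on the ALT allele alone is the simplest choice here because the default VEP output does not carry REF in its own column; a stricter version would also check REF against the input VCF.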
Regards
Will McLaren
Ensembl Variation
On 10 March 2014 15:30, Levine, Adam <a.levine at ucl.ac.uk> wrote:
I have a query regarding performing custom annotation using the VEP. I would like to annotate specific allele changes with a score, i.e. a G to T with score X but G to A at the same position with score Y. It seems, however, that the VEP only annotates on the basis of position and does not consider the allele change. Am I correct? If so, is there a way to set it to use custom annotation tracks in an allele specific manner?
The custom annotations are in VCF format, e.g.:
##fileformat=VCFv4.0
#CHROM POS ID REF ALT QUAL FILTER INFO
21 26960070 GT_scoreX G T . . .
21 26960070 GA_scoreY G A . . .
The input file looks like this:
##fileformat=VCFv4.0
#CHROM POS ID REF ALT QUAL FILTER INFO
21 26960070 rs116645811 G A . . .
My command is:
perl variant_effect_predictor.pl \
--input_file example_single_variant.vcf \
--format vcf \
--custom test_custom.vcf.gz,test_custom,vcf,exact \
--cache
The output looks like this:
## ENSEMBL VARIANT EFFECT PREDICTOR v75
## Output produced at 2014-03-10 14:34:53
## Connected to homo_sapiens_core_75_37 on ensembldb.ensembl.org
## Using cache in /home/Levine/.vep/homo_sapiens/75
## Using API version 75, DB version 75
## Extra column keys:
## DISTANCE : Shortest distance from variant to transcript
## STRAND : Strand of the feature (1/-1)
## test_custom : test_custom.vcf.gz (exact)
#Uploaded_variation Location Allele Gene Feature Feature_type Consequence cDNA_position CDS_position Protein_position Amino_acids Codons Existing_variation Extra
rs116645811 21:26960070 A ENSG00000260583 ENST00000567517 Transcript upstream_gene_variant - - - - - - STRAND=-1;test_custom=GT_scoreX,GA_scoreY;DISTANCE=4432
rs116645811 21:26960070 A ENSG00000154719 ENST00000352957 Transcript intron_variant - - - - - - STRAND=-1;test_custom=G_A,G_T
rs116645811 21:26960070 A ENSG00000154719 ENST00000307301 Transcript missense_variant 1043 1001 334 T/M aCg/aTg - STRAND=-1;test_custom=GT_scoreX,GA_scoreY
You can see that the variant in the input (G>A) is annotated with both the G>A and the G>T entries. I can, of course, pull out the relevant annotation (score X for G>T, score Y for G>A) myself after the fact, but it would be great if the VEP could do it directly.
Thank you,
Adam
Adam P. Levine
_______________________________________________
Dev mailing list Dev at ensembl.org
Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/