[ensembl-dev] Codon called wrong in VEP when using custom build cache

Heidi Viitaniemi hmviit at utu.fi
Mon Mar 18 13:48:00 GMT 2013


Hi,

I'm running VEP version 2.7 on a Unix server. I want to create a custom
cache from my own GTF and FASTA files with gtf2vep.pl. Building the
cache works without problems, and running VEP against it also seems to
go fine. The problem is in the output: cDNA_position, CDS_position and
Protein_position are correct given my input GTF file, but the calls for
Amino_acids and Codons seem completely random. If I run against the
cache retrieved from Ensembl, these are all correct. The version of the
genome had no effect on the output; the GTFs haven't changed.
The GTF and the FASTA that I'm using for the custom cache originate
from the Ensembl reference, so I don't see any reason why the custom
cache shouldn't perform the same way as the cache from Ensembl.
Could there be a bug that somehow messes up the link between the custom
GTF and FASTA in my run? Below are the commands I ran and a snippet of
the output I got.

Thanks,
Heidi Viitaniemi

For the custom cache I'm running (wrong output for Amino_acids and Codons):
perl gtf2vep.pl -i GasAcu1.67_group_xixflip.gtf -f gasAcu_group_withoutbac_inv7.fa -d 67 -s Gasterosteus_aculeatus_XIXflipped_18032013
perl variant_effect_predictor.pl -offline 1 -dir $HOME/.vep -i ens_realigned_AK_F.var.vcf -format vcf -fork 4 -db_version 67 -species Gasterosteus_aculeatus_XIXflipped_18032013 -numbers -per_gene -buffer_size 10000 -o VEP_18032013_exon_pergene_AK_F.var.vcf.txt

groupXIX_2822477_C/T	groupXIX:2822477	T	ENSGACG00000003129	ENSGACT00000004109	Transcript	missense_variant	67	49	17	G/R	Gga/Aga	-	EXON=1/2
groupXIX_2822500_T/C	groupXIX:2822500	C	ENSGACG00000003129	ENSGACT00000004109	Transcript	missense_variant	44	26	9	Y/C	tAt/tGt	-	EXON=1/2
groupXIX_2822523_C/T	groupXIX:2822523	T	ENSGACG00000003129	ENSGACT00000004109	Transcript	synonymous_variant	21	3	1	R	cgG/cgA	-	EXON=1/2
groupXIX_2822541_T/A	groupXIX:2822541	A	ENSGACG00000003129	ENSGACT00000004109	Transcript	5_prime_UTR_variant	3	-	-	-	-	-	EXON=1/2
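
The custom-cache run reports cgG/cgA at protein position 1
(groupXIX:2822523), where a start codon would normally be expected. One
way to see which bases the custom FASTA actually holds at that location
is sketched below. This is only an illustration under stated
assumptions: the transcript is taken to be on the reverse strand (the
C/T variant appears as G/A in the codon), CDS positions 1-3 are taken to
map to genomic positions 2822525-2822523, and the FASTA header is
assumed to be named "groupXIX". The helper names parse_fasta and
reverse_strand_bases are hypothetical and not part of VEP.

```python
# Translation table used to complement DNA bases (case is preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_fasta(lines):
    """Return {name: sequence}; the name is the first word of each header."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            name, chunks = (line[1:].split() or [""])[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def reverse_strand_bases(seqs, name, start, end):
    """Bases start..end (1-based, inclusive) as read on the reverse strand."""
    forward = seqs[name][start - 1:end]
    return forward.translate(_COMPLEMENT)[::-1]


# Hypothetical usage against the custom FASTA (loads the whole file in memory):
# with open("gasAcu_group_withoutbac_inv7.fa") as handle:
#     seqs = parse_fasta(handle)
# print(reverse_strand_bases(seqs, "groupXIX", 2822523, 2822525))
```

If the assumptions hold, the printed triplet is the first codon as the
custom FASTA encodes it, which can be compared with the Codons column of
each run.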



For the Ensembl cache I'm running (correct output for Amino_acids and Codons):
perl variant_effect_predictor.pl -offline -dir $HOME/.vep -i ens_realigned_AK_F.var.vcf -format vcf -fork 4 -db_version 69 -species gasterosteus_aculeatus -numbers -per_gene -buffer_size 10000 -o ensVEP_18032013_exon_pergene_AK_F.var.vcf.txt

groupXIX_2822477_C/T	groupXIX:2822477	T	ENSGACG00000003129	ENSGACT00000004109	Transcript	missense_variant	67	49	17	A/T	Gcg/Acg	-	EXON=1/2
groupXIX_2822500_T/C	groupXIX:2822500	C	ENSGACG00000003129	ENSGACT00000004109	Transcript	missense_variant	44	26	9	D/G	gAc/gGc	-	EXON=1/2
groupXIX_2822523_C/T	groupXIX:2822523	T	ENSGACG00000003129	ENSGACT00000004109	Transcript	initiator_codon_variant	21	3	1	M/I	atG/atA	-	EXON=1/2
groupXIX_2822541_T/A	groupXIX:2822541	A	ENSGACG00000003129	ENSGACT00000004109	Transcript	5_prime_UTR_variant	3	-	-	-	-	-	EXON=1/2
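
The two snippets can be compared mechanically. The sketch below is only
an illustration: it assumes the standard genetic code and uses
hypothetical helper names. It checks whether each reported Amino_acids
value is the translation of its Codons value, and whether the lowercase
(unchanged) bases of the reference codon agree between the two runs. The
codon/amino-acid pairs are copied from the outputs in this post.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_match_amino_acids(codons, amino_acids):
    """True if translating the 'ref/alt' codons gives the reported amino acids."""
    translated = [CODON_TABLE[c.upper()] for c in codons.split("/")]
    reported = amino_acids.split("/")
    if len(reported) == 1:  # synonymous change: a single letter is reported
        reported = reported * len(translated)
    return translated == reported


def flanking_bases(codon):
    """Lowercase (unchanged) bases of a codon, e.g. 'Gga' -> 'ga'."""
    return "".join(base for base in codon if base.islower())


# (Codons, Amino_acids) pairs copied from the two outputs in this post.
custom_cache = [("Gga/Aga", "G/R"), ("tAt/tGt", "Y/C"), ("cgG/cgA", "R")]
ensembl_cache = [("Gcg/Acg", "A/T"), ("gAc/gGc", "D/G"), ("atG/atA", "M/I")]

for label, rows in (("custom", custom_cache), ("ensembl", ensembl_cache)):
    for codons, amino_acids in rows:
        ok = codons_match_amino_acids(codons, amino_acids)
        print(label, codons, amino_acids, "consistent:", ok)

# Compare the reference codon context of each variant between the two runs.
for (custom, _), (ensembl, _) in zip(custom_cache, ensembl_cache):
    ref_custom, ref_ensembl = custom.split("/")[0], ensembl.split("/")[0]
    same = flanking_bases(ref_custom) == flanking_bases(ref_ensembl)
    print(ref_custom, ref_ensembl, "same flanking bases:", same)
```

For the rows shown here, every amino-acid call is a consistent
translation of its codon in both runs, and the capitalised variant base
agrees between runs, while the lowercase flanking bases differ in all
three coding rows. That may point at the reference sequence being read
rather than at the translation step, but this is an interpretation and
not something confirmed in this post.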



-- 
______________________________________________

Heidi Viitaniemi
PhD student
Division of Genetics and Physiology
Department of Biology
Itäinen Pitkäkatu 4A, 7th floor (Pharmacity)
University of Turku
20520 Turku

FINLAND
