[ensembl-dev] Annotating variants against custom transcripts
Shawn Yost
yostshawn at gmail.com
Tue Jan 16 14:06:27 GMT 2018
Hi,
I would like to annotate my VCF file against a custom transcript database.
I've created both a GFF v3 file and a GTF file (see below), but I have been
unable to get VEP to annotate against these transcripts. The examples below
show the files before sort + bgzip + tabix were applied (so that is not
the problem).
I'm currently running VEP v85.
The command I used was:
variant_effect_predictor.pl -i IN.vcf --custom test.gff.gz,,gff -fasta
test.fa --cache -o OUT -dir /path/to/cache --hgvs --cache_version 75
--offline --force_overwrite
In the output file I only see ENSTs and can't find the transcripts I
supplied alongside them. The same thing occurs if I run --custom
test.gtf.gz,,gtf. If I change the command to --custom
test.gtf.gz,,gtf,overlap it tells me whether the variant overlaps a supplied
transcript, but it doesn't annotate against the transcript.
Is there a problem with the command options I am using? Is there a problem
with the input files? How do I get VEP to annotate the variant against
my custom transcript (e.g. 412662 2:73613056 A GENE1
TRANSCRIPT1 Transcript synonymous_variant ....)?
Example output:
241004 2:73613032-73613049 - ENSG00000116127 ENST00000264448 Transcript inframe_deletion 147-164 36-53 12-18 LEEEEEE/L ctGGAGGAGGAGGAGGAGGAg/ctg - IMPACT=MODERATE;STRAND=1;HGVSc=ENST00000264448.6:c.36_53delNNNNNNNNNNNNNNNNNN;HGVSp=ENSP00000264448.6:p.Glu23_Glu28del
412662 2:73613056 A ENSG00000116127 ENST00000377715 Transcript synonymous_variant 171 60 20 E gaG/gaA - IMPACT=LOW;STRAND=1;HGVSc=ENST00000377715.1:c.60N>A;HGVSp=ENST00000377715.1:c.60N>A(p.%3D)
412662 2:73613056 A ENSG00000116127 ENST00000409009 Transcript synonymous_variant 171 60 20 E gaG/gaA - IMPACT=LOW;STRAND=1;HGVSc=ENST00000409009.1:c.60N>A;HGVSp=ENST00000409009.1:c.60N>A(p.%3D)
412662 2:73613056 A ENSG00000116127 ENST00000264448 Transcript synonymous_variant 171 60 20 E gaG/gaA - IMPACT=LOW;STRAND=1;HGVSc=ENST00000264448.6:c.60N>A;HGVSp=ENST00000264448.6:c.60N>A(p.%3D)
402364 2:73613066-73613071 - ENSG00000116127 ENST00000377715 Transcript inframe_deletion 181-186 70-75 24-25 EE/- GAGGAA/- - IMPACT=MODERATE;STRAND=1;HGVSc=ENST00000377715.1:c.70_75delNNNNNN;HGVSp=ENSP00000366944.1:p.Glu27_Glu28del
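One quick sanity check on an excerpt like this is whether the variants shown fall inside any of the custom transcripts at all. The following is a minimal Python sketch of that check, not anything VEP provides; the function name is hypothetical, the variant loci are copied from the output excerpt, and the transcript spans are copied from the listings further down in this message (converted to 1-based inclusive coordinates).

```python
# Variant loci from the output excerpt: (chromosome, start, end), 1-based inclusive.
variants = [
    ("2", 73613032, 73613049),
    ("2", 73613056, 73613056),
    ("2", 73613066, 73613071),
]

# Custom transcript spans from the listings in this message:
# (name, chromosome, start, end), 1-based inclusive.
transcripts = [
    ("TRANSCRIPT1", "15", 74701625, 74726300),
    ("TRANSCRIPT2", "17", 8108049, 8113944),
    ("TRANSCRIPT3", "11", 73711326, 73720282),
    ("TRANSCRIPT4", "11", 123594635, 123612391),
    ("TRANSCRIPT5", "19", 51994481, 52005043),
    ("TRANSCRIPT6", "1", 228194723, 228248972),
]


def find_overlaps(variants, transcripts):
    """Return (variant, transcript_name) pairs whose intervals overlap."""
    hits = []
    for v_chrom, v_start, v_end in variants:
        for name, t_chrom, t_start, t_end in transcripts:
            same_chrom = v_chrom == t_chrom
            if same_chrom and v_start <= t_end and t_start <= v_end:
                hits.append(((v_chrom, v_start, v_end), name))
    return hits


overlaps = find_overlaps(variants, transcripts)
print(overlaps)
```

For the three loci in this excerpt (all on chromosome 2) the list comes back empty, so these particular lines could not carry a custom transcript ID in any case. That says nothing about other variants in the VCF.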
GFF v3 file:
15 . transcript 74701625 74726300 . - . ID=TRANSCRIPT1;Alias=10741;Name=SEMA7A
15 . exon 74726082 74726300 . - . ID=EXON37A10411.1;Parent=TRANSCRIPT1
15 . exon 74711142 74711293 . - . ID=EXON37A10411.2;Parent=TRANSCRIPT1
15 . exon 74710609 74710650 . - . ID=EXON37A10411.3;Parent=TRANSCRIPT1
15 . exon 74710218 74710310 . - . ID=EXON37A10411.4;Parent=TRANSCRIPT1
15 . exon 74709932 74710016 . - . ID=EXON37A10411.5;Parent=TRANSCRIPT1
15 . exon 74709676 74709786 . - . ID=EXON37A10411.6;Parent=TRANSCRIPT1
15 . exon 74708916 74709055 . - . ID=EXON37A10411.7;Parent=TRANSCRIPT1
15 . exon 74708142 74708326 . - . ID=EXON37A10411.8;Parent=TRANSCRIPT1
15 . exon 74707179 74707287 . - . ID=EXON37A10411.9;Parent=TRANSCRIPT1
15 . exon 74706888 74707086 . - . ID=EXON37A10411.10;Parent=TRANSCRIPT1
15 . exon 74704226 74704353 . - . ID=EXON37A10411.11;Parent=TRANSCRIPT1
15 . exon 74703897 74704051 . - . ID=EXON37A10411.12;Parent=TRANSCRIPT1
15 . exon 74703636 74703697 . - . ID=EXON37A10411.13;Parent=TRANSCRIPT1
15 . exon 74701625 74703326 . - . ID=EXON37A10411.14;Parent=TRANSCRIPT1
15 . CDS 74726082 74726259 . - 0 ID=CDS37A10411.1;Parent=TRANSCRIPT1
15 . CDS 74711142 74711293 . - 2 ID=CDS37A10411.2;Parent=TRANSCRIPT1
15 . CDS 74710609 74710650 . - 0 ID=CDS37A10411.3;Parent=TRANSCRIPT1
15 . CDS 74710218 74710310 . - 0 ID=CDS37A10411.4;Parent=TRANSCRIPT1
15 . CDS 74709932 74710016 . - 0 ID=CDS37A10411.5;Parent=TRANSCRIPT1
15 . CDS 74709676 74709786 . - 2 ID=CDS37A10411.6;Parent=TRANSCRIPT1
15 . CDS 74708916 74709055 . - 2 ID=CDS37A10411.7;Parent=TRANSCRIPT1
15 . CDS 74708142 74708326 . - 0 ID=CDS37A10411.8;Parent=TRANSCRIPT1
15 . CDS 74707179 74707287 . - 1 ID=CDS37A10411.9;Parent=TRANSCRIPT1
15 . CDS 74706888 74707086 . - 0 ID=CDS37A10411.10;Parent=TRANSCRIPT1
15 . CDS 74704226 74704353 . - 2 ID=CDS37A10411.11;Parent=TRANSCRIPT1
15 . CDS 74703897 74704051 . - 0 ID=CDS37A10411.12;Parent=TRANSCRIPT1
15 . CDS 74703636 74703697 . - 1 ID=CDS37A10411.13;Parent=TRANSCRIPT1
15 . CDS 74702965 74703326 . - 2 ID=CDS37A10411.14;Parent=TRANSCRIPT1
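Before compressing and indexing a file like the one above, it can help to confirm that it is structurally sound GFF3. The following is a minimal Python sketch, assuming the real file is tab-delimited; the function name `check_gff3` is hypothetical. It checks only three things: nine columns per record, start not greater than end, and every Parent pointing at an ID defined in the same file. Any further requirements that a given VEP release places on custom annotation files are not covered and should be checked against its documentation.

```python
def check_gff3(lines):
    """Return a list of structural problems found in tab-delimited GFF3 lines."""
    problems = []
    ids = set()
    records = []
    for n, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) != 9:
            problems.append(f"line {n}: expected 9 tab-separated columns, got {len(cols)}")
            continue
        if int(cols[3]) > int(cols[4]):
            problems.append(f"line {n}: start {cols[3]} is greater than end {cols[4]}")
        # Column 9 holds key=value pairs separated by semicolons.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if "ID" in attrs:
            ids.add(attrs["ID"])
        records.append((n, attrs))
    # Parents may be defined after their children, so resolve them in a second pass.
    for n, attrs in records:
        for parent in attrs.get("Parent", "").split(","):
            if parent and parent not in ids:
                problems.append(f"line {n}: Parent={parent} has no matching ID")
    return problems


# Three records taken from the listing above, written with real tabs.
example = [
    "15\t.\ttranscript\t74701625\t74726300\t.\t-\t.\tID=TRANSCRIPT1;Alias=10741;Name=SEMA7A",
    "15\t.\texon\t74726082\t74726300\t.\t-\t.\tID=EXON37A10411.1;Parent=TRANSCRIPT1",
    "15\t.\tCDS\t74726082\t74726259\t.\t-\t0\tID=CDS37A10411.1;Parent=TRANSCRIPT1",
]
print(check_gff3(example))
```

An empty list means none of the three checks failed; it does not by itself mean the file will be accepted as a transcript source.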
GTF file:
TRANSCRIPT1 15 - 74701624 74726300 74702964 74726259 14 74701624,74703635,74703896,74704225,74706887,74707178,74708141,74708915,74709675,74709931,74710217,74710608,74711141,74726081, 74703326,74703697,74704051,74704353,74707086,74707287,74708326,74709055,74709786,74710016,74710310,74710650,74711293,74726300, 0 HGNC:10741 cmpl cmpl 1,2,0,1,0,2,0,1,1,0,0,0,1,0,
TRANSCRIPT2 17 - 8108048 8113944 8108188 8113542 9 8108048,8108533,8109808,8110067,8110493,8110888,8111055,8113494,8113847, 8108362,8108708,8109957,8110206,8110685,8110943,8111158,8113567,8113944, 0 HGNC:11390 cmpl cmpl 0,2,0,2,2,1,0,0,-1,
TRANSCRIPT3 11 - 73711325 73720282 73712456 73718087 7 73711325,73714871,73715528,73716774,73717213,73717961,73720022, 73712571,73715052,73715630,73716978,73717424,73718182,73720282, 0 HGNC:12519 cmpl cmpl 2,1,1,1,0,0,-1,
TRANSCRIPT4 11 - 123594634 123612391 123596704 123601596 9 123594634,123598183,123598840,123599833,123600322,123601194,123610824,123611124,123612256, 123597699,123598303,123598970,123599922,123600533,123601693,123610900,123611244,123612391, 0 HGNC:12994 cmpl cmpl 1,1,0,1,0,0,-1,-1,-1,
TRANSCRIPT5 19 - 51994480 52005043 51994894 52004987 8 51994480,52000133,52000602,52001271,52002423,52002691,52003173,52004560, 51995083,52000230,52000699,52001541,52002471,52002970,52003554,52005043, 0 HGNC:15482 cmpl cmpl 0,2,1,1,1,1,1,0,
TRANSCRIPT6 1 + 228194722 228248972 228194829 228247166 4 228194722,228210367,228238356,228246686, 228194900,228210609,228238622,228248972, 0 HGNC:15983 cmpl cmpl 0,2,1,0,
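The rows above appear to follow the UCSC genePred (extended) column layout (name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, score, name2, cdsStartStat, cdsEndStat, exonFrames) rather than the nine-column GTF layout. The following is a minimal Python sketch of how one such row could be expanded into GTF-style transcript, exon and CDS lines. It assumes genePred's 0-based, half-open coordinates and converts each exon frame to a GTF phase as (3 - frame) % 3. The function name, the "custom" source label and the use of name2 as gene_id are illustrative choices, and stop-codon handling is deliberately left out.

```python
def genepred_to_gtf(row, source="custom"):
    """Expand one whitespace-separated genePred-style row into GTF lines."""
    f = row.split()
    name, chrom, strand = f[0], f[1], f[2]
    tx_start, tx_end, cds_start, cds_end = (int(x) for x in f[3:7])
    exon_starts = [int(x) for x in f[8].strip(",").split(",")]
    exon_ends = [int(x) for x in f[9].strip(",").split(",")]
    frames = [int(x) for x in f[14].strip(",").split(",")]
    attrs = f'gene_id "{f[11]}"; transcript_id "{name}";'

    def gtf_line(feature, start0, end, phase="."):
        # genePred starts are 0-based; GTF starts are 1-based, ends are unchanged.
        cols = [chrom, source, feature, start0 + 1, end, ".", strand, phase, attrs]
        return "\t".join(str(c) for c in cols)

    lines = [gtf_line("transcript", tx_start, tx_end)]
    for start, end, frame in zip(exon_starts, exon_ends, frames):
        lines.append(gtf_line("exon", start, end))
        # Clip the exon to the coding region; skip exons that are entirely UTR.
        c_start, c_end = max(start, cds_start), min(end, cds_end)
        if c_start < c_end and frame >= 0:
            lines.append(gtf_line("CDS", c_start, c_end, (3 - frame) % 3))
    return lines


row = ("TRANSCRIPT6 1 + 228194722 228248972 228194829 228247166 4 "
       "228194722,228210367,228238356,228246686, "
       "228194900,228210609,228238622,228248972, "
       "0 HGNC:15983 cmpl cmpl 0,2,1,0,")
gtf_lines = genepred_to_gtf(row)
for line in gtf_lines:
    print(line)
```

As a rough cross-check of the frame-to-phase rule, applying it to the TRANSCRIPT1 frames reproduces the phases listed in the GFF v3 file above.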
Thank you for your help,
Shawn