[ensembl-dev] VEP 84 - question about output from flag_pick/pick_order

Black-Ziegelbein, Elizabeth A elizabeth-black at uiowa.edu
Fri Apr 1 16:41:00 BST 2016


Good morning,

I am using a local install of VEP 84.  We are using the --flag_pick_allele and --pick_order options.

This is an example of how we are running VEP:


perl variant_effect_predictor.pl --offline --flag_pick_allele --pick_order canonical,rank --merged --dir_cache variant_effect_predictor/cache-dir -i CDH23.1kg.phase3.v5a.EUR.NO-GT.SPLIT-LFT_ALGN.vcf.gz --plugin CADD,whole_genome_SNVs.tsv.gz,InDels.tsv.gz --vcf -o CDH23.1kg.phase3.v5a.EUR.NO-GT.SPLIT-LFT_ALGN.VEP-CADD.vcf --stats_file CDH23.1kg.phase3.v5a.EUR.NO-GT.SPLIT-LFT_ALGN.VEP-CADD.html --force_overwrite



I noticed that for some variants, VEP does not seem to select the transcript according to my pick order, as I would expect.  I am assuming that the canonical transcript is defined by: http://www.ensembl.org/Help/Glossary?id=346

Example Variants:


10 73558128 rs41281334 G A . PASS AC=34;AN=1006

10 73558886 rs4747194 G A . PASS AC=280;AN=1006



The annotation provided for 10:73558128 (rs41281334) is as follows.  The picked transcript is NM_022124.5, which is what I expected, since it is the canonical transcript according to the UCSC table query and has a high rank.


A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_022124.5|protein_coding|50/70||||7237|6847|2283|V/I|Gtc/Atc|||1||1||||4.949|0.225802

A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_001171934.1|protein_coding|3/22||||444|127|43|V/I|Gtc/Atc|||1||||||4.949|0.225802

A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_001171933.1|protein_coding|3/23||||444|127|43|V/I|Gtc/Atc|||1||||||4.949|0.225802

A|missense_variant|MODERATE|CDH23|ENSG00000107736|Transcript|ENST00000224721|protein_coding|49/69||||6867|6862|2288|V/I|Gtc/Atc|||1|||HGNC|13733||4.949|0.225802

A|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|CDH23|ENSG00000107736|Transcript|ENST00000475158|processed_transcript|2/21||||383|||||||1|||HGNC|13733||4.949|0.225802

A|missense_variant|MODERATE|CDH23|ENSG00000107736|Transcript|ENST00000398788|protein_coding|3/23||||444|127|43|V/I|Gtc/Atc|||1|||HGNC|13733||4.949|0.225802

The annotation provided for 10:73558886 (rs4747194) is as follows.  The picked transcript is ENST00000398788.  QUESTION: Why was the canonical transcript NM_022124.5, which has the same rank, not picked instead?


A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_001171934.1|protein_coding|4/22||||670|353|118|R/Q|cGg/cAg|||1||||||21.7|2.866040

A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_001171933.1|protein_coding|4/23||||670|353|118|R/Q|cGg/cAg|||1||||||21.7|2.866040

A|missense_variant|MODERATE|CDH23|ENSG00000107736|Transcript|ENST00000224721|protein_coding|50/69||||7093|7088|2363|R/Q|cGg/cAg|||1|||HGNC|13733||21.7|2.866040

A|missense_variant|MODERATE|CDH23|ENSG00000107736|Transcript|ENST00000398788|protein_coding|4/23||||670|353|118|R/Q|cGg/cAg|||1||1|HGNC|13733||21.7|2.866040

A|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|CDH23|ENSG00000107736|Transcript|ENST00000475158|processed_transcript|3/21||||609|||||||1|||HGNC|13733||21.7|2.866040

A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_022124.5|protein_coding|51/70||||7463|7073|2358|R/Q|cGg/cAg|||1||||||21.7|2.866040
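
For reference, here is a minimal Python sketch of how the picked transcript can be read out of records like the ones above. It rests on two assumptions inferred only from these example records, not from the VCF header: the transcript ID is the 7th pipe-separated field (index 6) and the pick flag is the 22nd (index 21). The authoritative field order is whatever the CSQ description in the output VCF header lists, so the two index constants may need adjusting. The function name picked_features is hypothetical.

```python
# Field positions inferred from the example records in this message (assumption);
# check them against the CSQ "Format:" string in the VCF header before relying on them.
FEATURE_IDX = 6   # transcript ID, e.g. NM_022124.5 or ENST00000398788
PICK_IDX = 21     # "1" on the record that VEP flagged as picked, empty otherwise


def picked_features(csq_value):
    """Return the transcript IDs whose pick flag is set.

    csq_value is the content of the CSQ INFO key: one pipe-delimited
    record per transcript, with records separated by commas.
    """
    picked = []
    for record in csq_value.split(","):
        fields = record.split("|")
        if len(fields) <= PICK_IDX:
            raise ValueError("record has fewer fields than expected: " + record)
        if fields[PICK_IDX] == "1":
            picked.append(fields[FEATURE_IDX])
    return picked


# Two of the six records reported for 10:73558886, copied from above.
csq = ",".join([
    "A|missense_variant|MODERATE|CDH23|ENSG00000107736|Transcript|ENST00000398788"
    "|protein_coding|4/23||||670|353|118|R/Q|cGg/cAg|||1||1|HGNC|13733||21.7|2.866040",
    "A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_022124.5"
    "|protein_coding|51/70||||7463|7073|2358|R/Q|cGg/cAg|||1||||||21.7|2.866040",
])

print(picked_features(csq))
```

With these two records, the sketch reports ENST00000398788 as the flagged transcript rather than NM_022124.5, which is the behaviour I am asking about.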



Thanks so much for your help.  Please let me know if I need to post to an alternate forum.


Ann



Ann Black-Ziegelbein
Senior Application Developer
Molecular Otolaryngology and Renal Research Laboratories
University of Iowa