[ensembl-dev] VEP 84 - question about output from flag_pick/pick_order

Will McLaren wm2 at ebi.ac.uk
Wed Apr 6 09:19:00 BST 2016


Hi Ann,

Yes, we can add this in the next VEP release.

Will

On 5 April 2016 at 20:00, Black-Ziegelbein, Elizabeth A <
elizabeth-black at uiowa.edu> wrote:

> One additional question :) Could I make a feature request? Would it be
> possible to add the gene model source to the pick order list, so that in a
> scenario such as the one below we could give preference to either RefSeq
> or Ensembl?
>
> Thanks for your help and consideration,
>
> Ann
>
> Ann Black-Ziegelbein
> Senior Application Developer
> Molecular Otolaryngology and Renal Research Laboratories
> University of Iowa
>
> From: Ann Black-Ziegelbein <elizabeth-black at uiowa.edu>
> Date: Monday, April 4, 2016 at 9:56 AM
> To: Will McLaren <wm2 at ebi.ac.uk>
>
> Cc: "dev at ensembl.org" <dev at ensembl.org>
> Subject: Re: VEP 84 - question about output from flag_pick/pick_order
>
> Thanks so much Will!  That helps explain it & I will try your suggestions.
>
> Take care,
>
> Ann
>
> From: <wmclaren at gmail.com> on behalf of Will McLaren <wm2 at ebi.ac.uk>
> Date: Monday, April 4, 2016 at 6:02 AM
> To: Ann Black-Ziegelbein <elizabeth-black at uiowa.edu>
> Cc: "dev at ensembl.org" <dev at ensembl.org>
> Subject: Re: VEP 84 - question about output from flag_pick/pick_order
>
> Hi Ann,
>
> I notice you are using the merged cache. This contains a merge of two gene
> sets, the one from Ensembl and the one from RefSeq. Both of these sets
> have, per gene, a canonical transcript assigned.
>
> The VEP has no way to determine which of these you would prefer to see
> annotated, so the canonical transcripts from the two sets are considered
> equal. This means the next comparator (rank in your case) is used to split
> them, and since the ranks will likely be equal too, a random one is
> chosen.
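The tie described above can be illustrated with a minimal Python sketch. This is not VEP's actual code: the record fields, the transcript names (`NM_example`, `ENST_example`, `ENST_other`) and the numeric rank are hypothetical, and only the two criteria from the command in this thread (`canonical`, `rank`) are modelled. Each criterion keeps only the best-scoring candidates; whatever survives every criterion is still tied.

```python
from typing import Callable, Dict, List

# Hypothetical, simplified transcript records (not VEP's internal structures).
# "consequence_rank" is an illustrative number: lower means more severe.
transcripts: List[dict] = [
    {"id": "NM_example", "source": "RefSeq", "canonical": True, "consequence_rank": 12},
    {"id": "ENST_example", "source": "Ensembl", "canonical": True, "consequence_rank": 12},
    {"id": "ENST_other", "source": "Ensembl", "canonical": False, "consequence_rank": 12},
]

# Scoring functions for the modelled criteria; a lower score is preferred.
CRITERIA: Dict[str, Callable[[dict], int]] = {
    "canonical": lambda t: 0 if t["canonical"] else 1,
    "rank": lambda t: t["consequence_rank"],
}


def pick(candidates: List[dict], pick_order: List[str]) -> List[dict]:
    """Apply each criterion in order and return the transcripts still tied."""
    remaining = list(candidates)
    for name in pick_order:
        score = CRITERIA[name]
        best = min(score(t) for t in remaining)
        remaining = [t for t in remaining if score(t) == best]
        if len(remaining) == 1:
            break  # a single winner; later criteria are not needed
    return remaining


tied = pick(transcripts, ["canonical", "rank"])
print([t["id"] for t in tied])  # ['NM_example', 'ENST_example']
```

Both canonical transcripts survive both criteria, so nothing in this pick order distinguishes the RefSeq one from the Ensembl one.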
>
> I'd suggest either using a non-merged cache (choose either Ensembl or
> RefSeq, or perhaps run both independently?), or adding some other
> comparators to your --pick_order flag to help distinguish.
>
> Hope that helps.
>
> Will McLaren
> Ensembl Variation
>
> On 1 April 2016 at 16:41, Black-Ziegelbein, Elizabeth A <
> elizabeth-black at uiowa.edu> wrote:
>
>> Good morning,
>>
>> I am using a local install of VEP 84.  We are leveraging the
>> --flag_pick_allele and --pick_order options.
>>
>> This is an example of how we are running VEP:
>>
>> perl variant_effect_predictor.pl --offline --flag_pick_allele --pick_order
>> canonical,rank --merged --dir_cache variant_effect_predictor/cache-dir
>> -i CDH23.1kg.phase3.v5a.EUR.NO-GT.SPLIT-LFT_ALGN.vcf.gz --plugin
>> CADD,whole_genome_SNVs.tsv.gz,InDels.tsv.gz --vcf -o
>> CDH23.1kg.phase3.v5a.EUR.NO-GT.SPLIT-LFT_ALGN.VEP-CADD.vcf --stats_file
>> CDH23.1kg.phase3.v5a.EUR.NO-GT.SPLIT-LFT_ALGN.VEP-CADD.html --force_overwrite
>>
>>
>>
>> I noticed that in annotating some of the variants, it does not seem to
>> select the transcript using my pick order as I would expect.  I am
>> assuming that the canonical transcript is defined by:
>> http://www.ensembl.org/Help/Glossary?id=346
>>
>> Example Variants:
>>
>> 10 73558128 rs41281334 G A . PASS AC=34;AN=1006
>>
>> 10 73558886 rs4747194 G A . PASS AC=280;AN=1006
>>
>>
>> The annotation provided for 10:73558128 (rs41281334) is as follows.  The
>> picked transcript is NM_022124.5 (which is what I expected since it is
>> the canonical transcript according to the UCSC table query, and had high
>> rank)
>>
>>
>>
>> A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_022124.5|protein_coding|50/70||||7237|6847|2283|V/I|Gtc/Atc|||1||1||||4.949|0.225802
>>
>>
>> A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_001171934.1|protein_coding|3/22||||444|127|43|V/I|Gtc/Atc|||1||||||4.949|0.225802
>>
>>
>> A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_001171933.1|protein_coding|3/23||||444|127|43|V/I|Gtc/Atc|||1||||||4.949|0.225802
>>
>>
>> A|missense_variant|MODERATE|CDH23|ENSG00000107736|Transcript|ENST00000224721|protein_coding|49/69||||6867|6862|2288|V/I|Gtc/Atc|||1|||HGNC|13733||4.949|0.225802
>>
>>
>> A|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|CDH23|ENSG00000107736|Transcript|ENST00000475158|processed_transcript|2/21||||383|||||||1|||HGNC|13733||4.949|0.225802
>>
>>
>> A|missense_variant|MODERATE|CDH23|ENSG00000107736|Transcript|ENST00000398788|protein_coding|3/23||||444|127|43|V/I|Gtc/Atc|||1|||HGNC|13733||4.949|0.225802
>>
>> The annotation provided for 10:73558886 (rs4747194) is as follows.  The
>> picked transcript is ENST00000398788.  QUESTION: Why was the canonical
>> transcript NM_022124.5, which has the same rank, not picked?
>>
>>
>> A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_001171934.1|protein_coding|4/22||||670|353|118|R/Q|cGg/cAg|||1||||||21.7|2.866040
>>
>>
>> A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_001171933.1|protein_coding|4/23||||670|353|118|R/Q|cGg/cAg|||1||||||21.7|2.866040
>>
>>
>> A|missense_variant|MODERATE|CDH23|ENSG00000107736|Transcript|ENST00000224721|protein_coding|50/69||||7093|7088|2363|R/Q|cGg/cAg|||1|||HGNC|13733||21.7|2.866040
>>
>>
>> A|missense_variant|MODERATE|CDH23|ENSG00000107736|Transcript|ENST00000398788|protein_coding|4/23||||670|353|118|R/Q|cGg/cAg|||1||1|HGNC|13733||21.7|2.866040
>>
>>
>> A|non_coding_transcript_exon_variant&non_coding_transcript_variant|MODIFIER|CDH23|ENSG00000107736|Transcript|ENST00000475158|processed_transcript|3/21||||609|||||||1|||HGNC|13733||21.7|2.866040
>>
>>
>> A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_022124.5|protein_coding|51/70||||7463|7073|2358|R/Q|cGg/cAg|||1||||||21.7|2.866040
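To see which transcript carries the pick flag without reading the pipe-delimited entries by eye, something like the following Python sketch could be used. The field order is an assumption inferred from the entries quoted in this thread (the two `UNKNOWN_*` names are placeholders for columns that cannot be identified from the thread). In a real VEP VCF the authoritative order is given in the `Format:` text of the CSQ INFO header line.

```python
from typing import List

# ASSUMED field order, inferred from the quoted entries; not read from a header.
ASSUMED_FIELDS: List[str] = [
    "Allele", "Consequence", "IMPACT", "SYMBOL", "Gene", "Feature_type",
    "Feature", "BIOTYPE", "EXON", "INTRON", "HGVSc", "HGVSp",
    "cDNA_position", "CDS_position", "Protein_position", "Amino_acids",
    "Codons", "Existing_variation", "DISTANCE", "STRAND", "UNKNOWN_1",
    "PICK", "SYMBOL_SOURCE", "HGNC_ID", "UNKNOWN_2", "CADD_PHRED", "CADD_RAW",
]


def picked_features(csq_entries: List[str], fields: List[str] = ASSUMED_FIELDS) -> List[str]:
    """Return the Feature (transcript ID) of every entry whose PICK field is '1'."""
    picked = []
    for entry in csq_entries:
        record = dict(zip(fields, entry.split("|")))
        if record.get("PICK") == "1":
            picked.append(record["Feature"])
    return picked


# Two of the entries quoted above for 10:73558886 (rs4747194).
entries = [
    "A|missense_variant|MODERATE|CDH23|ENSG00000107736|Transcript|ENST00000398788|protein_coding|4/23||||670|353|118|R/Q|cGg/cAg|||1||1|HGNC|13733||21.7|2.866040",
    "A|missense_variant|MODERATE|CDH23|64072|Transcript|NM_022124.5|protein_coding|51/70||||7463|7073|2358|R/Q|cGg/cAg|||1||||||21.7|2.866040",
]

print(picked_features(entries))  # ['ENST00000398788']
```

With `--flag_pick_allele`, one entry per variant allele is expected to be flagged, so a list of length one per allele is the normal result.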
>>
>>
>> Thanks so much for your help.  Please let me know if I need to post to an
>> alternate forum.
>>
>>
>> Ann
>>
>>
>>
>>
>
>