[ensembl-dev] Sift scores from VEP

Black-Ziegelbein, Elizabeth A elizabeth-black at uiowa.edu
Thu Feb 2 16:28:55 GMT 2017


Just a quick update. A query across our variants in about 120 genes found over 12,000 HIGH/MODERATE SNPs that have no SIFT score from VEP even though a score is present in dbNSFP. This count does not include cases where there are multiple transcripts and the SIFT score is present on only a subset of those transcripts (but present for all transcripts in dbNSFP).
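For anyone wanting to reproduce this kind of count, the following is a minimal Python sketch, not the query actually used here. It assumes the VEP results have already been reduced to (variant, transcript, SIFT field) rows, that a populated SIFT field looks like "deleterious(0)" (the prediction-plus-score form VEP writes when both are requested), and that a missing value is empty or "-". The function names and the example rows are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Matches fields such as "deleterious(0)" or "tolerated_low_confidence(0.31)".
SIFT_RE = re.compile(r"^(?P<pred>[A-Za-z_]+)\((?P<score>[0-9.]+)\)$")


def parse_sift(field):
    """Return (prediction, score) for a populated SIFT field, else None."""
    if not field or field == "-":
        return None
    match = SIFT_RE.match(field)
    if match is None:
        return None
    return match.group("pred"), float(match.group("score"))


def sift_coverage(rows):
    """Label each variant 'complete', 'partial' or 'missing'.

    rows: iterable of (variant, transcript, sift_field) tuples, one per
    variant/transcript pair. Filtering to missense or HIGH/MODERATE
    consequences is left to the caller.
    """
    per_variant = defaultdict(list)
    for variant, _transcript, sift_field in rows:
        per_variant[variant].append(parse_sift(sift_field) is not None)

    summary = {}
    for variant, has_score in per_variant.items():
        if all(has_score):
            summary[variant] = "complete"
        elif any(has_score):
            summary[variant] = "partial"   # scored on a subset of transcripts
        else:
            summary[variant] = "missing"   # no transcript has a SIFT score
    return summary


# Made-up rows, for illustration only.
rows = [
    ("varA", "tx1", "deleterious(0)"),
    ("varA", "tx2", "-"),
    ("varB", "tx3", ""),
    ("varC", "tx4", "tolerated(0.29)"),
]
summary = sift_coverage(rows)
print(Counter(summary.values()))
```

Keeping "partial" separate from "missing" mirrors the distinction drawn above between variants with no SIFT score at all and variants scored on only some transcripts.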

We really appreciate your help and advice. It would be really helpful to get an update on when this might be fixed for GRCh37. We are not ready to move on to GRCh38 in our pipelines yet.

Thanks so much,

Ann

From: Ann Black-Ziegelbein <elizabeth-black at uiowa.edu>
Date: Thursday, February 2, 2017 at 7:18 AM
To: Ensembl developers list <dev at ensembl.org>
Subject: Re: [ensembl-dev] Sift scores from VEP

Thanks Sarah -

I found several other instances as well where only one transcript was annotated with SIFT and the others were missing it. In these cases too, both dbNSFP and the SIFT database have the scores that are missing from VEP. Let me know if I should send a list. I should have a count of how prevalent this is later today for a targeted set of genes.

When do you foresee the GRCh37 update taking place?

Thanks again for all your help,

Ann

From: <dev-bounces at ensembl.org> on behalf of Sarah Hunt <seh at ebi.ac.uk>
Reply-To: Ensembl developers list <dev at ensembl.org>
Date: Thursday, February 2, 2017 at 5:16 AM
To: Ensembl developers list <dev at ensembl.org>
Subject: Re: [ensembl-dev] Sift scores from VEP



Hi Ann,

Thanks for reporting this. Checking this variant in GRCh38, SIFT scores are available for the protein-coding transcripts, which suggests the lack of scores for GRCh37 may be due to a technical issue. We will look into this when we next update our GRCh37 database.

Best wishes,

Sarah

On 01/02/2017 19:02, Black-Ziegelbein, Elizabeth A wrote:
Good afternoon -

I recently ran across a variant annotation from VEP that did not provide any SIFT annotation.

1:103343580 C>A (GRCh37 coordinates)

Both dbNSFP and a direct query of the most recent download of the SIFT scores from SIFT (http://sift.bii.a-star.edu.sg/sift4g/public/Homo_sapiens/GRCh37.74.zip) show this SNP as having a SIFT score of 0 (damaging). It is a missense variant. I found the documentation about how SIFT and PolyPhen scores are computed by VEP, but just wanted to check whether this is expected or some type of error. I have found more than one instance of this.

Thanks for your thoughts -

Ann


From: <dev-bounces at ensembl.org> on behalf of Ann Black-Ziegelbein <elizabeth-black at uiowa.edu>
Reply-To: Ensembl developers list <dev at ensembl.org>
Date: Tuesday, October 11, 2016 at 11:43 AM
To: Ensembl developers list <dev at ensembl.org>
Subject: Re: [ensembl-dev] Sift/Polyphen scores from VEP cache vs. dbNSFP

Thanks so much :)

Ann

From: <dev-bounces at ensembl.org> on behalf of Anja Thormann <anja at ebi.ac.uk>
Reply-To: Ensembl developers list <dev at ensembl.org>
Date: Tuesday, October 11, 2016 at 11:35 AM
To: Ensembl developers list <dev at ensembl.org>
Subject: Re: [ensembl-dev] Sift/Polyphen scores from VEP cache vs. dbNSFP

Dear Ann,

you can find version information in the following file of your cache file directory:

~/.vep/homo_sapiens_merged/84_GRCh37/info.txt

Assuming you use ~/.vep to store your cache files.

source_polyphen 2.2.2
source_sift sift5.2.2
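As a convenience, the source_* entries can also be pulled out of info.txt programmatically. The Python sketch below is only illustrative: it assumes each entry is a key followed by whitespace and a value, as in the two lines quoted above, and the function name read_cache_sources is hypothetical.

```python
def read_cache_sources(info_text):
    """Return {key: value} for the source_* lines of a cache info.txt.

    Assumes one "key<whitespace>value" pair per line; blank lines and
    lines starting with "#" are skipped.
    """
    sources = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # key, then the remainder of the line
        if len(parts) == 2 and parts[0].startswith("source_"):
            sources[parts[0]] = parts[1].strip()
    return sources


# The two entries quoted in this message.
example = "source_polyphen\t2.2.2\nsource_sift\tsift5.2.2\n"
print(read_cache_sources(example))
```

To run it against a real cache, read the file first, for example with open(os.path.expanduser("~/.vep/homo_sapiens_merged/84_GRCh37/info.txt")).read(), adjusting the path if the cache lives elsewhere.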

We provide more information on how we run SIFT and PolyPhen here:
http://www.ensembl.org/info/genome/variation/predicted_data.html#sift

HTH,
Anja


On 11 Oct 2016, at 17:24, Black-Ziegelbein, Elizabeth A <elizabeth-black at uiowa.edu> wrote:

Hello -

We are running a local installation of VEP (v84) with the --offline and --cache parameters. Caches were installed with the VEP script (we did not use custom caches; we are using homo_sapiens_merged/84_GRCh37). I have noticed that in some cases the SIFT and PolyPhen scores for transcripts differ from what is reported in dbNSFP3 for a variant. I was wondering which versions of the SIFT and PolyPhen scores are included in the pre-built caches, or if there are thoughts on why they would be different. Additionally, is it the PolyPhen v2 score or the original PolyPhen score?

For example:

Missense variant 6:33143391:G>A  has no SIFT scores available for any transcript from the VEP annotation.  From dbNSFP 3.1, the same variant has the following SIFT scores: 0.285;0.29;0.307  for the following transcripts:  ENST00000374708;ENST00000341947;ENST00000361917.
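To make the comparison concrete, here is a small Python sketch that pairs the semicolon-separated dbNSFP values above with their transcript IDs and lists the transcripts for which VEP has no score. It assumes the scores and transcript IDs are given in the same order and that "." marks a missing dbNSFP value; the function names and the empty VEP dictionary are illustrative, not part of either tool.

```python
def pair_dbnsfp_scores(transcripts, scores):
    """Map each transcript ID to its SIFT score (None where dbNSFP has '.')."""
    ids = transcripts.split(";")
    vals = scores.split(";")
    if len(ids) != len(vals):
        raise ValueError("transcript and score lists differ in length")
    return {t: (None if s == "." else float(s)) for t, s in zip(ids, vals)}


def missing_in_vep(dbnsfp_scores, vep_scores):
    """Transcripts scored in dbNSFP but without a SIFT score from VEP."""
    return sorted(
        t for t, s in dbnsfp_scores.items()
        if s is not None and vep_scores.get(t) is None
    )


# Values for 6:33143391:G>A as quoted above.
dbnsfp = pair_dbnsfp_scores(
    "ENST00000374708;ENST00000341947;ENST00000361917",
    "0.285;0.29;0.307",
)
vep = {}  # VEP reported no SIFT score for any transcript of this variant
print(missing_in_vep(dbnsfp, vep))
```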

Thanks so much,

Ann
_______________________________________________
Dev mailing list    Dev at ensembl.org
Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/



