[ensembl-dev] Sift scores from VEP

Sarah Hunt seh at ebi.ac.uk
Thu Feb 2 18:27:06 GMT 2017


Hi Ann,

Thanks for the additional information. We are looking into why some 
protein sequences have no data and can tell you more about timelines 
when we understand the issue. As you are using the VEP script, we would 
be able to supply updated data files outside our release cycle. If you 
send the list we can use it to check the updated data.

Best wishes,

Sarah
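
A minimal sketch of how such a list might be assembled from VEP's default tab-delimited output. It assumes VEP was run with SIFT predictions enabled, so that the Extra column carries a SIFT=... entry wherever a prediction exists. The function name missing_sift and the restriction to missense_variant rows are illustrative choices, not part of VEP:

```python
def missing_sift(lines):
    """Yield (variant, transcript) for missense rows with no SIFT entry.

    Assumes VEP's default output: '##' metadata lines, one header line
    starting with '#Uploaded_variation', then tab-separated rows whose
    last column (Extra) holds ';'-separated key=value pairs.
    """
    header = None
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and metadata
        fields = line.split("\t")
        if header is None:
            # First non-metadata line is the column header.
            header = [name.lstrip("#") for name in fields]
            continue
        record = dict(zip(header, fields))
        consequences = record.get("Consequence", "").split(",")
        if "missense_variant" not in consequences:
            continue  # SIFT predictions only apply to amino acid substitutions
        extra = dict(
            pair.split("=", 1)
            for pair in record.get("Extra", "").split(";")
            if "=" in pair
        )
        if "SIFT" not in extra:
            yield record["Uploaded_variation"], record["Feature"]


# Example use (the file name is hypothetical):
# with open("vep_output.txt") as handle:
#     for variant, transcript in missing_sift(handle):
#         print(variant, transcript, sep="\t")
```

Reporting the transcript alongside the variant keeps the cases where only a subset of transcripts lacks a score visible in the list.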

On 02/02/2017 16:28, Black-Ziegelbein, Elizabeth A wrote:
> Just a quick update.  In a quick query across our variants in about 
> 120 genes, there are over 12,000 HIGH/MODERATE SNPs which have a 
> missing SIFT score (which is present in dbNSFP).  This does not 
> include cases where there are multiple transcripts, and the SIFT score 
> is present on a subset of those transcripts (but present for all 
> transcripts in dbNSFP).
>
> We really appreciate your help & advice. It would be really helpful to 
> get an update on when this might be fixed for GRCh37.  We are not 
> ready to move onto GRCh38 in our pipelines yet.
>
> Thanks so much,
>
> Ann
>
> From: Ann Black-Ziegelbein <elizabeth-black at uiowa.edu>
> Date: Thursday, February 2, 2017 at 7:18 AM
> To: Ensembl developers list <dev at ensembl.org>
> Subject: Re: [ensembl-dev] Sift scores from VEP
>
> Thanks Sarah -
>
> I found several other instances as well where only one transcript was 
> annotated with SIFT and the others were missing it. In these cases 
> too, both dbNSFP and the SIFT database have the scores that VEP is 
> missing. Should I send on a list? I should have a count of how 
> prevalent this is later today for a targeted set of genes.
>
> When do you foresee the GRCh37 update taking place?
>
> Thanks again for all your help,
>
> Ann
>
> From: <dev-bounces at ensembl.org> on behalf of Sarah Hunt <seh at ebi.ac.uk>
> Reply-To: Ensembl developers list <dev at ensembl.org>
> Date: Thursday, February 2, 2017 at 5:16 AM
> To: Ensembl developers list <dev at ensembl.org>
> Subject: Re: [ensembl-dev] Sift scores from VEP
>
>
> Hi Ann,
>
> Thanks for reporting this. Checking this variant in GRCh38, SIFT 
> scores are available for the protein-coding transcripts, which suggests 
> the lack of scores for GRCh37 may be due to a technical issue. We will 
> look into this when we next update our GRCh37 database.
>
> Best wishes,
>
> Sarah
>
>
> On 01/02/2017 19:02, Black-Ziegelbein, Elizabeth A wrote:
>> Good afternoon -
>>
>> I recently ran across a variant annotation from VEP that did not 
>> provide any SIFT annotation.
>>
>> 1: 103343580 C > A  (GRCh37 coordinates)
>>
>> Both dbNSFP and the most recent download of the SIFT scores from SIFT 
>> (http://sift.bii.a-star.edu.sg/sift4g/public/Homo_sapiens/GRCh37.74.zip) 
>> show this SNP as having a SIFT score of 0 (damaging).  It is a 
>> missense variant.  I found the documentation about how SIFT and 
>> PolyPhen scores are computed by VEP, but wanted to verify whether 
>> this is expected or some type of error. I have found 
>> more than one instance of this.
>>
>> Thanks for your thoughts -
>>
>> Ann
>>
>>
>> From: <dev-bounces at ensembl.org> on behalf of Ann Black-Ziegelbein 
>> <elizabeth-black at uiowa.edu>
>> Reply-To: Ensembl developers list <dev at ensembl.org>
>> Date: Tuesday, October 11, 2016 at 11:43 AM
>> To: Ensembl developers list <dev at ensembl.org>
>> Subject: Re: [ensembl-dev] Sift/Polyphen scores from VEP cache vs. dbNSFP
>>
>> Thanks so much :)
>>
>> Ann
>>
>> From: <dev-bounces at ensembl.org> on behalf of Anja Thormann 
>> <anja at ebi.ac.uk>
>> Reply-To: Ensembl developers list <dev at ensembl.org>
>> Date: Tuesday, October 11, 2016 at 11:35 AM
>> To: Ensembl developers list <dev at ensembl.org>
>> Subject: Re: [ensembl-dev] Sift/Polyphen scores from VEP cache vs. dbNSFP
>>
>> Dear Ann,
>>
>> You can find version information in the following file in your cache 
>> directory (assuming you use ~/.vep to store your cache files):
>>
>> ~/.vep/homo_sapiens_merged/84_GRCh37/info.txt
>>
>> It contains the following lines:
>>
>> source_polyphen    2.2.2
>> source_sift        sift5.2.2
>>
>> We provide more information on how we run SIFT and PolyPhen here:
>> http://www.ensembl.org/info/genome/variation/predicted_data.html#sift
>>
>> HTH,
>> Anja
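
The lookup described above can be sketched as a short script. This is a minimal example under stated assumptions: that info.txt holds one whitespace-separated key/value pair per line, with '#' marking comment lines, and that the cache lives at the path quoted above. The helper names read_cache_info and tool_versions are hypothetical:

```python
from pathlib import Path


def read_cache_info(text):
    """Parse 'key value' lines from a VEP cache info.txt into a dict.

    Assumes each data line is a key followed by whitespace and a value,
    and that lines starting with '#' are comments.
    """
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            info[parts[0]] = parts[1]
    return info


def tool_versions(info):
    """Return the 'source_*' entries, keyed by tool name without the prefix."""
    prefix = "source_"
    return {
        key[len(prefix):]: value
        for key, value in info.items()
        if key.startswith(prefix)
    }


if __name__ == "__main__":
    # Path taken from the message above; adjust if the cache is elsewhere.
    path = Path.home() / ".vep" / "homo_sapiens_merged" / "84_GRCh37" / "info.txt"
    if path.exists():
        for tool, version in sorted(tool_versions(read_cache_info(path.read_text())).items()):
            print(tool, version, sep="\t")
```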
>>
>>
>>> On 11 Oct 2016, at 17:24, Black-Ziegelbein, Elizabeth A 
>>> <elizabeth-black at uiowa.edu> wrote:
>>>
>>> Hello -
>>>
>>> We are running a local installation of VEP (v84) with the --offline 
>>> and --cache parameters. Caches were installed with the VEP script 
>>> (we did not use custom caches; we are using 
>>> homo_sapiens_merged/84_GRCh37).  I have noticed that in some cases the 
>>> SIFT and PolyPhen scores for transcripts differ from what is 
>>> reported in dbNSFP3 for a variant.  I was wondering which versions of 
>>> SIFT and PolyPhen are included in the pre-built caches, or whether 
>>> there are thoughts on why the scores would differ.  Additionally, is 
>>> it the PolyPhen v2 score or the original PolyPhen?
>>>
>>> For example:
>>>
>>> Missense variant 6:33143391:G>A  has no SIFT scores available for 
>>> any transcript from the VEP annotation. From dbNSFP 3.1, the same 
>>> variant has the following SIFT scores: 0.285;0.29;0.307  for the 
>>> following transcripts: ENST00000374708;ENST00000341947;ENST00000361917.
>>>
>>> Thanks so much,
>>>
>>> Ann
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/
