[ensembl-dev] Discrepancies between VEP and Condel, PPH2 and SIFT

A. P. Levine a.levine at ucl.ac.uk
Thu Feb 2 14:34:04 GMT 2012


Dear Graham,

Thank you for your clear and helpful reply. I will get back to you if I find more cases where the qualitative predictions differ.

Kind regards,

Adam

Adam P. Levine

-----Original Message-----
From: Graham Ritchie [mailto:grsr at ebi.ac.uk] 
Sent: 02 February 2012 12:55
To: A. P. Levine
Subject: Re: [ensembl-dev] Discrepancies between VEP and Condel, PPH2 and SIFT

Hi Adam,

We are aware that the predictions produced by Ensembl frequently differ from those of the web versions of these tools. You can find out how we run each tool on the variation documentation page here:

http://www.ensembl.org/info/docs/variation/index.html#nsSNP

Here is a copy-and-paste from an earlier message to this list explaining some of the reasons why these differences may occur:

When pre-computing the SIFT and PolyPhen scores, we download the software and run the tools locally using our own copies of the UniProt, Pfam, PDB, DSSP (etc.) databases, which we update when we start the pipeline. It is therefore unlikely that we run the tools on exactly the same source data as the web applications (though we follow the authors' instructions as closely as possible to try to ensure that results are reproducible). We also always supply the Ensembl translation for each transcript as the reference protein; we don't try to find the closest UniProt protein, and the web versions may sometimes use a different reference transcript. The web applications may also use newer versions of the software than we ran.

All of these factors mean we are unlikely to produce exactly the same scores as the web applications, but if you are finding significant differences then this is something we should look into. In particular, if you find lots of cases where the qualitative predictions ('benign', 'tolerated', 'probably damaging' etc.) differ between Ensembl and the web applications, we will certainly investigate.
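
As a minimal sketch of how such qualitative disagreements might be flagged, the Python below uses the values from the original report quoted further down (chr7:140501302 T/C). The two-class grouping of labels and all function and variable names are assumptions made for illustration; they are not part of any Ensembl, SIFT, PolyPhen or Condel tooling.

```python
# Illustrative grouping of qualitative labels into two classes (an assumption,
# not an official mapping from any of the tools).
BENIGN_LIKE = {"benign", "tolerated", "neutral"}
DAMAGING_LIKE = {"damaging", "deleterious", "possibly damaging", "probably damaging"}


def simplify(label):
    """Collapse a tool-specific label into 'benign-like' or 'damaging-like'."""
    key = label.strip().lower()
    if key in BENIGN_LIKE:
        return "benign-like"
    if key in DAMAGING_LIKE:
        return "damaging-like"
    raise ValueError(f"unrecognised label: {label!r}")


def qualitative_differences(local, remote):
    """Return (tool, local_label, remote_label) for tools whose classes differ."""
    diffs = []
    for tool in sorted(set(local) & set(remote)):
        local_label, _ = local[tool]
        remote_label, _ = remote[tool]
        if simplify(local_label) != simplify(remote_label):
            diffs.append((tool, local_label, remote_label))
    return diffs


# Values copied from the report: tool -> (label, score).
vep = {
    "PolyPhen": ("benign", 0.382),
    "Condel": ("neutral", 0.418),
    "SIFT": ("tolerated", 0.09),
}
web = {
    "PolyPhen": ("possibly damaging", 0.791),  # HumVar; HumDiv gave the same label at 0.942
    "Condel": ("deleterious", 0.673),
    "SIFT": ("DAMAGING", 0.04),
}

for tool, local_label, remote_label in qualitative_differences(vep, web):
    print(f"{tool}: VEP says {local_label!r}, web tool says {remote_label!r}")
```

For this variant all three tools are flagged, since each VEP label falls in the benign-like group and each web label falls in the damaging-like group.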

Hope that makes things clearer.

Cheers,

Graham

Ensembl Variation


On 2 Feb 2012, at 12:14, A. P. Levine wrote:

> I have noticed some discrepancies between the PolyPhen, SIFT and Condel scores as reported by the VEP compared with the scores reported by the three programs independently.
> 
> The variant I have been looking at is "chr7: 140501302 T/C" (build 37).
> 
> Using the VEP (either online or with the Perl script):
> PolyPhen	benign	0.382
> Condel	neutral	0.418
> SIFT	tolerated	0.09
> 
> Using the Condel server (http://bg.upf.edu/condel/analysis):
> PPH2	0.6
> Condel	0.673 	deleterious
> SIFT	0.09
> 
> Using SIFT (http://sift.jcvi.org/www/SIFT_chr_coords_submit.html):
> SNP Type	Nonsynonymous
> Prediction	DAMAGING
> SIFT Score	0.04
> Median Information Content	2.93
> Gene ID
> 
> And finally, using PolyPhen-2 with both HumDiv and HumVar classifier models (http://genetics.bwh.harvard.edu/pph2/bgi.shtml):
> HumVar	possibly damaging	pph2_prob	0.791
> HumDiv	possibly damaging	pph2_prob	0.942
> 
> What do you think might be happening here? Which versions of the various programs are being used by the VEP?
> 
> Thank you,
> 
> Adam
> 
> Adam P. Levine
> 
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/






