[ensembl-dev] new snp effect predictor script

Graham Ritchie grsr at ebi.ac.uk
Sat Apr 23 11:09:30 BST 2011


Hi Andrea,

We (tried to) compute SIFT and PolyPhen predictions for all possible single amino acid substitutions in human. That is, we took each human translation, systematically substituted each of the 19 possible alternative amino acids at each position, and ran both tools on every substitution. We then store these predictions in the variation database. This means you should be able to fetch predictions for novel mutations that cause a single amino acid substitution when you're using the VEP or the API.
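As a rough illustration of that enumeration, here is a small Python sketch. It is not the actual pipeline code: the function name and the three-residue example peptide are made up, and it assumes the sequence contains only the 20 standard amino acids.

```python
# The 20 standard amino acids, as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def all_single_substitutions(protein):
    """Yield (position, reference_aa, alternative_aa) for every possible
    single amino acid substitution. Positions are 1-based.

    Hypothetical helper for illustration only; assumes every residue in
    `protein` is one of the 20 standard amino acids.
    """
    for position, reference_aa in enumerate(protein, start=1):
        for alternative_aa in AMINO_ACIDS:
            if alternative_aa != reference_aa:
                yield position, reference_aa, alternative_aa


# Made-up three-residue peptide: 3 positions x 19 alternatives each.
subs = list(all_single_substitutions("MKT"))
print(len(subs))
```

Each tuple produced this way would correspond to one substitution handed to SIFT and PolyPhen, which is why the number of predictions grows as 19 times the protein length.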

We managed to get a prediction from at least one tool for over 95% of the proteins in Ensembl, but there are some cases where we couldn't compute a prediction, generally where there weren't enough sequences in the multiple alignment for the method to make a call, or where the protein sequence was really long and the tool took too long or needed too much memory to run. In these cases the various methods in the API will return undef, as they do for any variation that does not cause a single amino acid substitution.
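To make the "missing prediction" behaviour concrete, here is a minimal Python sketch of a lookup against precomputed scores. It only mirrors the idea: the dictionary, the `fetch_prediction` function, the identifier and the score are all hypothetical stand-ins, not the variation database schema or the Perl API, and `None` plays the role of undef.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key: (translation id, 1-based position, alternative amino acid).
Key = Tuple[str, int, str]


def fetch_prediction(store: Dict[Key, float], translation_id: str,
                     position: int, alt_aa: str) -> Optional[float]:
    """Return the stored score, or None when no prediction was computed
    (for example, too few aligned sequences or a very long protein)."""
    return store.get((translation_id, position, alt_aa))


# Made-up identifier and score, for illustration only.
store = {("ENSP_EXAMPLE", 12, "R"): 0.03}

for alt in ("R", "K"):
    score = fetch_prediction(store, "ENSP_EXAMPLE", 12, alt)
    if score is None:
        print(alt, "no prediction available")
    else:
        print(alt, score)
```

The practical point is the same as with the real API: callers should check for a missing value before using a score, rather than assume every substitution has one.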

There's some documentation on what we do available here:

http://www.ensembl.org/info/docs/variation/index.html#nsSNP

Hope that helps.

Cheers,

Graham


On 23 Apr 2011, at 09:55, Ewan Birney wrote:

> 
> Andrea - I think the Polyphen/sift scores are provided for all coding variations.
> 
> A definitive answer will come from Will/Fiona - be aware that it's the
> Easter weekend here in the UK and there is a strange alignment of bank holidays
> over these weeks, so the response might take some time.
> 
> 
> 
> On 23 Apr 2011, at 03:09, Andrea Edwards wrote:
> 
>> Hello
>> 
>> Looking at the new version of the snp effect predictor script and TranscriptVariationAdaptor module, is it correct to say you only return polyphen and sift scores for existing variations? The readme file for the script does not say you cannot perform polyphen and sift on novel variations, but the code appears to return undef if a score does not exist in the database.
>> 
>> thanks
>> 
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
> 
> 




