[ensembl-dev] How fetch a prediction matrix using stable ID

Sarah Hunt seh at ebi.ac.uk
Tue Mar 10 15:00:58 GMT 2015


Hi Oriol,

PolyPhen (and SIFT) results are heavily dependent on sequence 
conservation estimates derived from protein sequence alignments, so 
using different protein databases can result in a substantial number of 
different calls.

We used the latest protein databases available in November 2013, while 
the PolyPhen website uses databases from December 2011/January 2012, so 
we would expect to see some differences.

As you are no doubt aware, PolyPhen provides two different scores, 
HumVar and HumDiv. We supply HumVar by default, while the PolyPhen 
website defaults to HumDiv. If you would like to extract HumDiv results, 
you can amend your script:

my $matrix_prediction = $matrix_adaptor->fetch_polyphen_predictions_by_translation_md5($translation_md5, 'humdiv');
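
To see how much the choice of model matters for a given substitution, it may help to pull both calls side by side. The following is only a minimal sketch: the helper names peptide_md5 and compare_polyphen_models are made up for illustration and are not part of the API, and it assumes the adaptor and the MD5 lookup behave as in your script, with get_prediction returning the prediction followed by its score.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: the lookup key is the MD5 hex digest of the
# peptide sequence, computed the same way as in the quoted script.
sub peptide_md5 {
    my ($peptide) = @_;
    return md5_hex($peptide);
}

# Hypothetical helper: fetch the PolyPhen call for one substitution under
# the default model (HumVar) and under HumDiv.
# $matrix_adaptor is assumed to be the ProteinFunctionPredictionMatrixAdaptor
# obtained from the registry.
sub compare_polyphen_models {
    my ($matrix_adaptor, $translation_md5, $position, $amino_acid) = @_;

    # No model argument: the default, which is HumVar.
    my $humvar_matrix = $matrix_adaptor
        ->fetch_polyphen_predictions_by_translation_md5($translation_md5);

    # Explicit 'humdiv' argument: the PolyPhen website's default model.
    my $humdiv_matrix = $matrix_adaptor
        ->fetch_polyphen_predictions_by_translation_md5($translation_md5, 'humdiv');

    my %calls;
    for my $entry ([humvar => $humvar_matrix], [humdiv => $humdiv_matrix]) {
        my ($model, $matrix) = @$entry;
        next unless defined $matrix;    # skip a model with no stored matrix

        # Assumption: get_prediction returns (prediction, score) in list context.
        my ($prediction, $score) = $matrix->get_prediction($position, $amino_acid);
        $calls{$model} = { prediction => $prediction, score => $score };
    }
    return \%calls;
}
```

With the $matrix_adaptor and $transcript from your script, you could call compare_polyphen_models($matrix_adaptor, peptide_md5($transcript->translate->seq()), 1, 'V') and print the prediction and score stored under the humvar and humdiv keys of the returned hash reference.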

Best wishes,

Sarah

On 10/03/2015 14:04, ori wrote:
> Hi,
> I am trying to fetch the prediction matrix for some proteins, and then 
> the PolyPhen predictions for particular amino acid changes. The 
> Ensembl help documentation on predicted data says: "Prediction 
> matrices can be fetched and manipulated in a user-friendly manner 
> using the variation API, specifically using the 
> /ProteinFunctionPredictionMatrixAdaptor/ which allows you to fetch a 
> prediction matrix using either a transcript or a translation stable 
> ID. This adaptor returns a /ProteinFunctionPredictionMatrix/ object 
> and you can use the /get_prediction/ method to retrieve a prediction 
> for a given position and amino acid".
>
> However, I am not able to get my matrix using any type of ID. I have 
> been trying fetch_polyphen_predictions_by_translation_md5 (using Perl 
> Digest::MD5 to get md5_hex), and it returns a matrix, but the 
> predictions don't match the ones that PolyPhen itself gives for 
> this change (I checked on their website).
> I would appreciate any help or tips to make things easier or to point 
> out where my mistake is. Here is what I have written so far:
>
> use warnings;
> use Bio::EnsEMBL::Registry;
> use Digest::MD5 qw(md5 md5_hex md5_base64);
> use Bio::EnsEMBL::Utils::Exception qw(throw warning);
> use Bio::EnsEMBL::Variation::ProteinFunctionPredictionMatrix;
> print "Loading Registry...\n";
> my $registry = "Bio::EnsEMBL::Registry";
> $registry->load_registry_from_db( -host => 'ensembldb.ensembl.org', 
> -user => 'anonymous');
> print "Correctly Loaded Registry\n";
> my $transcript_adaptor = $registry -> get_adaptor ('Human', 'Core', 
> 'Transcript');
> my $matrix_adaptor= $registry-> get_adaptor ('Human', 'variation', 
> 'ProteinFunctionPredictionMatrixAdaptor');
> my $transcript= $transcript_adaptor -> 
> fetch_by_stable_id('ENST00000367429');
> my $translation_md5 = md5_hex($transcript->translate->seq());
> my $matrix_prediction= $matrix_adaptor-> 
> fetch_polyphen_predictions_by_translation_md5($translation_md5);
> my ($prediction) = $matrix_prediction->get_prediction(1, 'V');
> print "$prediction\n";
>
> You will see that the outcome is "benign", and it should be "probably 
> damaging".
>
> Thanks for the help,
>
> Oriol
