[ensembl-dev] MySQL query to retrieve protein annotation

Charles Joseph Murphy chm2059 at med.cornell.edu
Wed Feb 22 17:58:13 GMT 2017


Thanks mag, your code seems to be what I’ am looking for!

Charlie


On Feb 22, 2017, at 11:50, mag <mr6 at ebi.ac.uk<mailto:mr6 at ebi.ac.uk>> wrote:

Hi Charlie,

You need to add a join between the protein_feature and translation tables, something like protein_feature.translation_id = translation.translation_id

Additionally, it is not recommended to use internal ids for querying, as these are not guaranteed to remain the same from on release to another.
You can use the stable id instead, which will refer to the same Ensembl feature across releases.

For example
SELECT protein_feature.hit_name, protein_feature.hit_start, protein_feature.hit_end, protein_feature.seq_start, protein_feature.seq_end, protein_feature.evalue, protein_feature.perc_ident, protein_feature.hit_description, protein_feature.translation_id FROM protein_feature, translation WHERE protein_feature.translation_id = translation.translation_id and stable_id = 'ENSMUSP00000081003' ;

You can find a more detailed description of the database schema here: https://urldefense.proofpoint.com/v2/url?u=http-3A__www.ensembl.org_info_docs_api_core_core-5Fschema.html&d=DwIGaQ&c=lb62iw4YL4RFalcE2hQUQealT9-RXrryqt9KZX2qu2s&r=O3yXKBF_L8Fov58BXORGXKqPP85pYddrOwCg4PV2BCY&m=6il2B1CnEdca_qIh4G7BpJ42odqxluAeluMwonOBS10&s=J0zCw_hxO950VL1MHad0q2i5z-aFtvtKW2fgYhogh2g&e=

Hope that helps,
mag

On 22/02/2017 16:43, Charles Joseph Murphy wrote:
Hi,

I’ am trying to use the Ensembl public MySQL query to retrieve protein annotation for all of a genome’s genes. While trying to figure out the right MySQL queries to make, I’ am testing it out with just one transcript’s translation (ENSMUST00000120187):

mysql> use mus_musculus_core_75_38;
mysql> SELECT protein_feature.hit_name, protein_feature.hit_start, protein_feature.hit_end, protein_feature.seq_start, protein_feature.seq_end, protein_feature.evalue, protein_feature.perc_ident, protein_feature.hit_description, protein_feature.translation_id FROM protein_feature, translation WHERE protein_feature.translation_id = '291089’;

But this returns around 1.4 million entries. What might be the best way to get all the protein annotation for ENSMUST00000120187? I’ am aware of BioMart and the Perl API, but would rather query via MySQL. I’ am continuing to read into the InterProScan pipeline and Ensembl documentation, but am not quite sure yet how to go about this.

Thanks for any help!
Charlie
_______________________________________________
Dev mailing list    Dev at ensembl.org<mailto:Dev at ensembl.org>
Posting guidelines and subscribe/unsubscribe info: https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ensembl.org_mailman_listinfo_dev&d=DwIGaQ&c=lb62iw4YL4RFalcE2hQUQealT9-RXrryqt9KZX2qu2s&r=O3yXKBF_L8Fov58BXORGXKqPP85pYddrOwCg4PV2BCY&m=6il2B1CnEdca_qIh4G7BpJ42odqxluAeluMwonOBS10&s=u0mLvgk0Et1Dw_2o84WYbifr0J_Bh1aXxkcNRn3AOqc&e= Ensembl Blog: https://urldefense.proofpoint.com/v2/url?u=http-3A__www.ensembl.info_&d=DwIGaQ&c=lb62iw4YL4RFalcE2hQUQealT9-RXrryqt9KZX2qu2s&r=O3yXKBF_L8Fov58BXORGXKqPP85pYddrOwCg4PV2BCY&m=6il2B1CnEdca_qIh4G7BpJ42odqxluAeluMwonOBS10&s=B4i7rpGuR3UBWoz3UE59Sav5oybpdYjTGY6WENr0pXM&e=


_______________________________________________
Dev mailing list    Dev at ensembl.org<mailto:Dev at ensembl.org>
Posting guidelines and subscribe/unsubscribe info: https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ensembl.org_mailman_listinfo_dev&d=DwIGaQ&c=lb62iw4YL4RFalcE2hQUQealT9-RXrryqt9KZX2qu2s&r=O3yXKBF_L8Fov58BXORGXKqPP85pYddrOwCg4PV2BCY&m=6il2B1CnEdca_qIh4G7BpJ42odqxluAeluMwonOBS10&s=u0mLvgk0Et1Dw_2o84WYbifr0J_Bh1aXxkcNRn3AOqc&e= Ensembl Blog: https://urldefense.proofpoint.com/v2/url?u=http-3A__www.ensembl.info_&d=DwIGaQ&c=lb62iw4YL4RFalcE2hQUQealT9-RXrryqt9KZX2qu2s&r=O3yXKBF_L8Fov58BXORGXKqPP85pYddrOwCg4PV2BCY&m=6il2B1CnEdca_qIh4G7BpJ42odqxluAeluMwonOBS10&s=B4i7rpGuR3UBWoz3UE59Sav5oybpdYjTGY6WENr0pXM&e=

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20170222/c3924837/attachment.html>


More information about the Dev mailing list