[ensembl-dev] MySQL query to retrieve protein annotation

Wed Feb 22 16:43:14 GMT 2017

Hi,

I am trying to use the Ensembl public MySQL server to retrieve protein annotation for all of a genome's genes. While working out the right MySQL queries, I am testing with just one transcript's translation (ENSMUST00000120187):

mysql> use mus_musculus_core_75_38;
mysql> SELECT protein_feature.hit_name, protein_feature.hit_start, protein_feature.hit_end, protein_feature.seq_start, protein_feature.seq_end, protein_feature.evalue, protein_feature.perc_ident, protein_feature.hit_description, protein_feature.translation_id FROM protein_feature, translation WHERE protein_feature.translation_id = '291089';

But this returns around 1.4 million rows. What would be the best way to get all the protein annotation for ENSMUST00000120187? I am aware of BioMart and the Perl API, but would rather query via MySQL. I am still reading up on the InterProScan pipeline and the Ensembl documentation, but am not yet sure how to go about this.
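
A possible cause, offered as a sketch rather than a confirmed fix: the query above lists both protein_feature and translation in FROM but gives no condition linking them, so each matching protein_feature row is paired with every row of translation. The version below joins the tables explicitly and looks the translation up from the transcript accession instead of hard-coding the internal ID. It assumes the standard Ensembl core schema (transcript.stable_id holding the ENSMUST accession, translation.transcript_id pointing at transcript) and has not been run against release 75.

```sql
-- Sketch only: assumes the standard Ensembl core schema, where
-- transcript.stable_id holds the ENSMUST accession and
-- translation.transcript_id links a translation to its transcript.
SELECT pf.hit_name, pf.hit_start, pf.hit_end,
       pf.seq_start, pf.seq_end, pf.evalue, pf.perc_ident,
       pf.hit_description, pf.translation_id
FROM transcript t
JOIN translation tl ON tl.transcript_id = t.transcript_id
JOIN protein_feature pf ON pf.translation_id = tl.translation_id
WHERE t.stable_id = 'ENSMUST00000120187';
```

Dropping the WHERE clause and adding t.stable_id to the SELECT list should, under the same assumptions, extend this to every transcript in the genome.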

Thanks for any help!
Charlie