[ensembl-dev] Search HGNC symbol by Ensemble gene IDs through Mysql direct query

Andy Yates ayates at ebi.ac.uk
Mon Sep 10 15:32:23 BST 2012


Hi Tommaso,

HGNC symbols attached to genes by the object_xref hierarchy as you've already started to use but also via the display_xref_id on a gene entry. One possible solution would be to use the following query (note the lack of gene_stable_id which was retired a few releases ago):

select *
from gene g join xref x on (g.display_xref_id = x.xref_id)
join external_db ed using (external_db_id)
where ed.db_name like 'HGNC%'
and g.stable_id in ("ENSG00000197021", "ENSG00000204379");

Should you wish to continue using the current SQL query try removing the join to external_synonym:

select distinct
   G.stable_id,
   X.`display_label`
from
  gene_stable_id as G,
  object_xref as OX,
  xref as X ,
  external_db as D
where
  D.external_db_id=X.external_db_id and
  OX.xref_id=X.xref_id and
  OX.ensembl_object_type="Gene" and
  G.gene_id=OX.ensembl_id and
  G.stable_id in ("ENSG00000197021", "ENSG00000204379") and
  D.db_name like 'HGNC%';

This resulted in the following hits for me:

stable_id	display_label
ENSG00000197021	CXorf40B
ENSG00000204379	XAGE1E


Also is there any reason why you are accessing homo_sapiens_core_47_36i? Also release 47 DBs are hosted on port 3306 not 5306.

Regards,

Andy

Andrew Yates                   Ensembl Core Software Project Leader
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensembl.org/

On 7 Sep 2012, at 00:03, Tommaso Mazza wrote:

> Hi all,
> as in the object, I'm trying to get gene names from a list of Ensembl IDs.
> I would like to do this programmatically, by directly connecting and querying the Mysql db by the following query:
> 
> mysql -h ensembldb.ensembl.org --port 5306  -u anonymous -D homo_sapiens_core_47_36i -
> A
> 
> 
> > select
>  distinct
>    G
> .stable_id,
> 
>    S
> .
> synonym
> 
> from
> 
>   gene_stable_id 
> as G,
> 
>   object_xref 
> as OX,
> 
>   external_synonym 
> as S,
> 
>   xref 
> as X ,
> 
>   external_db 
> as
>  D
> 
> where
> 
>   D
> .external_db_id=X.external_db_id and
> 
>   X
> .xref_id=S.xref_id and
> 
>   OX
> .xref_id=X.xref_id and
> 
>   OX
> .ensembl_object_type="Gene" and
> 
>   G
> .gene_id=OX.ensembl_id and
> 
>   G
> .stable_id in ("ENSG00000197021", "ENSG00000204379");
> This code generally works, but I do not figure out why I do not get any result for some ENS ids.
> 
> For example, this ID: ENSG00000130208 is associated to the APOC1 gene.
> If I use this ID in the query, I did not get any result. in fact, the external_synonymous table does contain the name "APOC1".
> 
> However and obviously, by the Ensembl web interface I can get this information.
> 
> Where am I going wrong? Should I use other and more proper tables?
> 
> Thanks
> Tommaso
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/





More information about the Dev mailing list