[ensembl-dev] Search HGNC symbol by Ensembl gene IDs through Mysql direct query

Tommaso Mazza t.mazza at css-mendel.it
Wed Sep 12 09:40:44 BST 2012


Hi Andy,
many thanks.
I'm now using version 68_37 (the latest?) that seems to not be available 
through port 3306.

I've successfully used your query. Many thanks.
However, if I do the opposite, namely if I look for the EnsId starting 
from the symbol (e.g.) UGP1 with your script (slightly modified)

select g.stable_id
from gene g join xref x on (g.display_xref_id = x.xref_id)
join external_db ed using (external_db_id)
where ed.db_name like 'HGNC%'
and x.display_label = "UGP1";


I do not get any result, even if I can get it by the web site.
The same is for gene HMP19.


It seems that there is not link between table GENE and XREF for them.

How can I get EnsId also for these gene?

Many thanks
Tommaso




On 10/09/2012 17:20, dev-request at ensembl.org wrote:
>
>
> ------------------------------
>
> Message: 3
> Date: Mon, 10 Sep 2012 15:32:23 +0100
> From: Andy Yates <ayates at ebi.ac.uk>
> Subject: Re: [ensembl-dev] Search HGNC symbol by Ensemble gene IDs
> 	through	Mysql direct query
> To: Ensembl developers list <dev at ensembl.org>
> Message-ID: <48048544-6FFD-4DBF-A8B7-DD3E45E3DA54 at ebi.ac.uk>
> Content-Type: text/plain; charset=us-ascii
>
> Hi Tommaso,
>
> HGNC symbols attached to genes by the object_xref hierarchy as you've already started to use but also via the display_xref_id on a gene entry. One possible solution would be to use the following query (note the lack of gene_stable_id which was retired a few releases ago):
>
> select *
> from gene g join xref x on (g.display_xref_id = x.xref_id)
> join external_db ed using (external_db_id)
> where ed.db_name like 'HGNC%'
> and g.stable_id in ("ENSG00000197021", "ENSG00000204379");
>
> Should you wish to continue using the current SQL query try removing the join to external_synonym:
>
> select distinct
>     G.stable_id,
>     X.`display_label`
> from
>    gene_stable_id as G,
>    object_xref as OX,
>    xref as X ,
>    external_db as D
> where
>    D.external_db_id=X.external_db_id and
>    OX.xref_id=X.xref_id and
>    OX.ensembl_object_type="Gene" and
>    G.gene_id=OX.ensembl_id and
>    G.stable_id in ("ENSG00000197021", "ENSG00000204379") and
>    D.db_name like 'HGNC%';
>
> This resulted in the following hits for me:
>
> stable_id	display_label
> ENSG00000197021	CXorf40B
> ENSG00000204379	XAGE1E
>
>
> Also is there any reason why you are accessing homo_sapiens_core_47_36i? Also release 47 DBs are hosted on port 3306 not 5306.
>
> Regards,
>
> Andy
>
> Andrew Yates                   Ensembl Core Software Project Leader
> EMBL-EBI                       Tel: +44-(0)1223-492538
> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
> Cambridge CB10 1SD, UK         http://www.ensembl.org/
>
> On 7 Sep 2012, at 00:03, Tommaso Mazza wrote:
>
>> Hi all,
>> as in the object, I'm trying to get gene names from a list of Ensembl IDs.
>> I would like to do this programmatically, by directly connecting and querying the Mysql db by the following query:
>>
>> mysql -h ensembldb.ensembl.org --port 5306  -u anonymous -D homo_sapiens_core_47_36i -
>> A
>>
>>
>>> select
>>   distinct
>>     G
>> .stable_id,
>>
>>     S
>> .
>> synonym
>>
>> from
>>
>>    gene_stable_id
>> as G,
>>
>>    object_xref
>> as OX,
>>
>>    external_synonym
>> as S,
>>
>>    xref
>> as X ,
>>
>>    external_db
>> as
>>   D
>>
>> where
>>
>>    D
>> .external_db_id=X.external_db_id and
>>
>>    X
>> .xref_id=S.xref_id and
>>
>>    OX
>> .xref_id=X.xref_id and
>>
>>    OX
>> .ensembl_object_type="Gene" and
>>
>>    G
>> .gene_id=OX.ensembl_id and
>>
>>    G
>> .stable_id in ("ENSG00000197021", "ENSG00000204379");
>> This code generally works, but I do not figure out why I do not get any result for some ENS ids.
>>
>> For example, this ID: ENSG00000130208 is associated to the APOC1 gene.
>> If I use this ID in the query, I did not get any result. in fact, the external_synonymous table does contain the name "APOC1".
>>
>> However and obviously, by the Ensembl web interface I can get this information.
>>
>> Where am I going wrong? Should I use other and more proper tables?
>>
>> Thanks
>> Tommaso
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>
>

-- 
Tommaso Mazza, Ph.D.
Bioinformatics unit
Casa Sollievo della Sofferenza - Mendel
Viale Regina Margherita 261
00198 Roma IT

Tel: +39 06 44160526
Fax: +39 06 44160538
E-mail: t.mazza at css-mendel.it
Web page: http://www.css-mendel.it/
Personal web: http://www.tommasomazza.eu





More information about the Dev mailing list