[ensembl-dev] Search HGNC symbol by Ensembl gene IDs through Mysqldirect query

Eduardo Andrés León eandres at cnio.es
Wed Sep 12 09:43:55 BST 2012


since ensembl 48, they are using the port : 5306 

http://www.ensembl.org/info/data/mysql.html


On 12 Sep 2012, at 10:40, "Tommaso Mazza" <t.mazza at css-mendel.it> wrote:

> Hi Andy,
> many thanks.
> I'm now using version 68_37 (the latest?) that seems to not be available through port 3306.
> 
> I've successfully used your query. Many thanks.
> However, if I do the opposite, namely if I look for the EnsId starting from the symbol (e.g.) UGP1 with your script (slightly modified)
> 
> select g.stable_id
> from gene g join xref x on (g.display_xref_id = x.xref_id)
> join external_db ed using (external_db_id)
> where ed.db_name like 'HGNC%'
> and x.display_label = "UGP1";
> 
> 
> I do not get any result, even if I can get it by the web site.
> The same is for gene HMP19.
> 
> 
> It seems that there is not link between table GENE and XREF for them.
> 
> How can I get EnsId also for these gene?
> 
> Many thanks
> Tommaso
> 
> 
> 
> 
> On 10/09/2012 17:20, dev-request at ensembl.org wrote:
>> 
>> 
>> ------------------------------
>> 
>> Message: 3
>> Date: Mon, 10 Sep 2012 15:32:23 +0100
>> From: Andy Yates <ayates at ebi.ac.uk>
>> Subject: Re: [ensembl-dev] Search HGNC symbol by Ensemble gene IDs
>> 	through	Mysql direct query
>> To: Ensembl developers list <dev at ensembl.org>
>> Message-ID: <48048544-6FFD-4DBF-A8B7-DD3E45E3DA54 at ebi.ac.uk>
>> Content-Type: text/plain; charset=us-ascii
>> 
>> Hi Tommaso,
>> 
>> HGNC symbols attached to genes by the object_xref hierarchy as you've already started to use but also via the display_xref_id on a gene entry. One possible solution would be to use the following query (note the lack of gene_stable_id which was retired a few releases ago):
>> 
>> select *
>> from gene g join xref x on (g.display_xref_id = x.xref_id)
>> join external_db ed using (external_db_id)
>> where ed.db_name like 'HGNC%'
>> and g.stable_id in ("ENSG00000197021", "ENSG00000204379");
>> 
>> Should you wish to continue using the current SQL query try removing the join to external_synonym:
>> 
>> select distinct
>>    G.stable_id,
>>    X.`display_label`
>> from
>>   gene_stable_id as G,
>>   object_xref as OX,
>>   xref as X ,
>>   external_db as D
>> where
>>   D.external_db_id=X.external_db_id and
>>   OX.xref_id=X.xref_id and
>>   OX.ensembl_object_type="Gene" and
>>   G.gene_id=OX.ensembl_id and
>>   G.stable_id in ("ENSG00000197021", "ENSG00000204379") and
>>   D.db_name like 'HGNC%';
>> 
>> This resulted in the following hits for me:
>> 
>> stable_id	display_label
>> ENSG00000197021	CXorf40B
>> ENSG00000204379	XAGE1E
>> 
>> 
>> Also is there any reason why you are accessing homo_sapiens_core_47_36i? Also release 47 DBs are hosted on port 3306 not 5306.
>> 
>> Regards,
>> 
>> Andy
>> 
>> Andrew Yates                   Ensembl Core Software Project Leader
>> EMBL-EBI                       Tel: +44-(0)1223-492538
>> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
>> Cambridge CB10 1SD, UK         http://www.ensembl.org/
>> 
>> On 7 Sep 2012, at 00:03, Tommaso Mazza wrote:
>> 
>>> Hi all,
>>> as in the object, I'm trying to get gene names from a list of Ensembl IDs.
>>> I would like to do this programmatically, by directly connecting and querying the Mysql db by the following query:
>>> 
>>> mysql -h ensembldb.ensembl.org --port 5306  -u anonymous -D homo_sapiens_core_47_36i -
>>> A
>>> 
>>> 
>>>> select
>>>  distinct
>>>    G
>>> .stable_id,
>>> 
>>>    S
>>> .
>>> synonym
>>> 
>>> from
>>> 
>>>   gene_stable_id
>>> as G,
>>> 
>>>   object_xref
>>> as OX,
>>> 
>>>   external_synonym
>>> as S,
>>> 
>>>   xref
>>> as X ,
>>> 
>>>   external_db
>>> as
>>>  D
>>> 
>>> where
>>> 
>>>   D
>>> .external_db_id=X.external_db_id and
>>> 
>>>   X
>>> .xref_id=S.xref_id and
>>> 
>>>   OX
>>> .xref_id=X.xref_id and
>>> 
>>>   OX
>>> .ensembl_object_type="Gene" and
>>> 
>>>   G
>>> .gene_id=OX.ensembl_id and
>>> 
>>>   G
>>> .stable_id in ("ENSG00000197021", "ENSG00000204379");
>>> This code generally works, but I do not figure out why I do not get any result for some ENS ids.
>>> 
>>> For example, this ID: ENSG00000130208 is associated to the APOC1 gene.
>>> If I use this ID in the query, I did not get any result. in fact, the external_synonymous table does contain the name "APOC1".
>>> 
>>> However and obviously, by the Ensembl web interface I can get this information.
>>> 
>>> Where am I going wrong? Should I use other and more proper tables?
>>> 
>>> Thanks
>>> Tommaso
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/
>> 
>> 
> 
> -- 
> Tommaso Mazza, Ph.D.
> Bioinformatics unit
> Casa Sollievo della Sofferenza - Mendel
> Viale Regina Margherita 261
> 00198 Roma IT
> 
> Tel: +39 06 44160526
> Fax: +39 06 44160538
> E-mail: t.mazza at css-mendel.it
> Web page: http://www.css-mendel.it/
> Personal web: http://www.tommasomazza.eu
> 
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/





More information about the Dev mailing list