[ensembl-dev] Search HGNC symbol by Ensembl gene IDs through Mysqldirect query
Eduardo Andrés León
eandres at cnio.es
Wed Sep 12 09:43:55 BST 2012
since ensembl 48, they are using the port : 5306
http://www.ensembl.org/info/data/mysql.html
On 12 Sep 2012, at 10:40, "Tommaso Mazza" <t.mazza at css-mendel.it> wrote:
> Hi Andy,
> many thanks.
> I'm now using version 68_37 (the latest?) that seems to not be available through port 3306.
>
> I've successfully used your query. Many thanks.
> However, if I do the opposite, namely if I look for the EnsId starting from the symbol (e.g.) UGP1 with your script (slightly modified)
>
> select g.stable_id
> from gene g join xref x on (g.display_xref_id = x.xref_id)
> join external_db ed using (external_db_id)
> where ed.db_name like 'HGNC%'
> and x.display_label = "UGP1";
>
>
> I do not get any result, even if I can get it by the web site.
> The same is for gene HMP19.
>
>
> It seems that there is not link between table GENE and XREF for them.
>
> How can I get EnsId also for these gene?
>
> Many thanks
> Tommaso
>
>
>
>
> On 10/09/2012 17:20, dev-request at ensembl.org wrote:
>>
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Mon, 10 Sep 2012 15:32:23 +0100
>> From: Andy Yates <ayates at ebi.ac.uk>
>> Subject: Re: [ensembl-dev] Search HGNC symbol by Ensemble gene IDs
>> through Mysql direct query
>> To: Ensembl developers list <dev at ensembl.org>
>> Message-ID: <48048544-6FFD-4DBF-A8B7-DD3E45E3DA54 at ebi.ac.uk>
>> Content-Type: text/plain; charset=us-ascii
>>
>> Hi Tommaso,
>>
>> HGNC symbols attached to genes by the object_xref hierarchy as you've already started to use but also via the display_xref_id on a gene entry. One possible solution would be to use the following query (note the lack of gene_stable_id which was retired a few releases ago):
>>
>> select *
>> from gene g join xref x on (g.display_xref_id = x.xref_id)
>> join external_db ed using (external_db_id)
>> where ed.db_name like 'HGNC%'
>> and g.stable_id in ("ENSG00000197021", "ENSG00000204379");
>>
>> Should you wish to continue using the current SQL query try removing the join to external_synonym:
>>
>> select distinct
>> G.stable_id,
>> X.`display_label`
>> from
>> gene_stable_id as G,
>> object_xref as OX,
>> xref as X ,
>> external_db as D
>> where
>> D.external_db_id=X.external_db_id and
>> OX.xref_id=X.xref_id and
>> OX.ensembl_object_type="Gene" and
>> G.gene_id=OX.ensembl_id and
>> G.stable_id in ("ENSG00000197021", "ENSG00000204379") and
>> D.db_name like 'HGNC%';
>>
>> This resulted in the following hits for me:
>>
>> stable_id display_label
>> ENSG00000197021 CXorf40B
>> ENSG00000204379 XAGE1E
>>
>>
>> Also is there any reason why you are accessing homo_sapiens_core_47_36i? Also release 47 DBs are hosted on port 3306 not 5306.
>>
>> Regards,
>>
>> Andy
>>
>> Andrew Yates Ensembl Core Software Project Leader
>> EMBL-EBI Tel: +44-(0)1223-492538
>> Wellcome Trust Genome Campus Fax: +44-(0)1223-494468
>> Cambridge CB10 1SD, UK http://www.ensembl.org/
>>
>> On 7 Sep 2012, at 00:03, Tommaso Mazza wrote:
>>
>>> Hi all,
>>> as in the object, I'm trying to get gene names from a list of Ensembl IDs.
>>> I would like to do this programmatically, by directly connecting and querying the Mysql db by the following query:
>>>
>>> mysql -h ensembldb.ensembl.org --port 5306 -u anonymous -D homo_sapiens_core_47_36i -
>>> A
>>>
>>>
>>>> select
>>> distinct
>>> G
>>> .stable_id,
>>>
>>> S
>>> .
>>> synonym
>>>
>>> from
>>>
>>> gene_stable_id
>>> as G,
>>>
>>> object_xref
>>> as OX,
>>>
>>> external_synonym
>>> as S,
>>>
>>> xref
>>> as X ,
>>>
>>> external_db
>>> as
>>> D
>>>
>>> where
>>>
>>> D
>>> .external_db_id=X.external_db_id and
>>>
>>> X
>>> .xref_id=S.xref_id and
>>>
>>> OX
>>> .xref_id=X.xref_id and
>>>
>>> OX
>>> .ensembl_object_type="Gene" and
>>>
>>> G
>>> .gene_id=OX.ensembl_id and
>>>
>>> G
>>> .stable_id in ("ENSG00000197021", "ENSG00000204379");
>>> This code generally works, but I do not figure out why I do not get any result for some ENS ids.
>>>
>>> For example, this ID: ENSG00000130208 is associated to the APOC1 gene.
>>> If I use this ID in the query, I did not get any result. in fact, the external_synonymous table does contain the name "APOC1".
>>>
>>> However and obviously, by the Ensembl web interface I can get this information.
>>>
>>> Where am I going wrong? Should I use other and more proper tables?
>>>
>>> Thanks
>>> Tommaso
>>> _______________________________________________
>>> Dev mailing list Dev at ensembl.org
>>> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/
>>
>>
>
> --
> Tommaso Mazza, Ph.D.
> Bioinformatics unit
> Casa Sollievo della Sofferenza - Mendel
> Viale Regina Margherita 261
> 00198 Roma IT
>
> Tel: +39 06 44160526
> Fax: +39 06 44160538
> E-mail: t.mazza at css-mendel.it
> Web page: http://www.css-mendel.it/
> Personal web: http://www.tommasomazza.eu
>
>
> _______________________________________________
> Dev mailing list Dev at ensembl.org
> List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
More information about the Dev
mailing list