[ensembl-dev] mysql

Andreas Kahari ak at ebi.ac.uk
Thu Dec 2 10:22:25 GMT 2010


Hi Koray,

It would be helpful to us if you mentioned what database you're looking
at.  I can not see any external_db entry with external_db_id = 3102 in
the current Core databases.

The following is valid for at least the Core databases:

The xref table contains all external entries that were considered by the
xref mapping pipeline.  Some of these will not be mapped to anything.
Others will be mapped to an Ensembl object via the object_xref table.

Also notice that an xref may be linked to a gene, a transcript, or to a
translation.

For example, to find all transcripts that are linked to EntrezGene
entries, we first have to find out what type of Ensembl object
EntrezGene is linked to.

mysql> SELECT DISTINCT ox.ensembl_object_type
    -> FROM xref
    ->   JOIN object_xref ox USING (xref_id)
    ->   JOIN external_db edb USING (external_db_id)
    -> WHERE edb.db_name = 'EntrezGene';
+---------------------+
| ensembl_object_type |
+---------------------+
| Translation         |
+---------------------+
1 row in set (6.04 sec)

So, we now know we need to join to the translation table.  Here's how we
pull out some of the translation stable IDs:

mysql> SELECT pid.stable_id
    -> FROM xref
    ->   JOIN object_xref ox USING (xref_id)
    ->   JOIN external_db edb USING (external_db_id)
    ->   JOIN translation_stable_id pid ON (ox.ensembl_id = pid.translation_id)
    -> WHERE edb.db_name = 'EntrezGene'
    -> LIMIT 10;
+-----------------+
| stable_id       |
+-----------------+
| ENSP00000280684 |
| ENSP00000408321 |
| ENSP00000221444 |
| ENSP00000252321 |
| ENSP00000328511 |
| ENSP00000386541 |
| ENSP00000242776 |
| ENSP00000388769 |
| ENSP00000415238 |
| ENSP00000344674 |
+-----------------+
10 rows in set (0.07 sec)

To find the corresponding transcripts, we need to join the translation
table to the transcript table like this:

mysql> SELECT tid.stable_id
    -> FROM xref
    ->   JOIN external_db edb USING (external_db_id)
    ->   JOIN object_xref ox USING (xref_id)
    ->   JOIN translation p ON (ox.ensembl_id = p.translation_id)
    ->   JOIN transcript_stable_id tid USING (transcript_id)
    -> WHERE edb.db_name = 'EntrezGene'
    -> LIMIT 10;
+-----------------+
| stable_id       |
+-----------------+
| ENST00000280684 |
| ENST00000433855 |
| ENST00000221444 |
| ENST00000252321 |
| ENST00000328224 |
| ENST00000409682 |
| ENST00000242776 |
| ENST00000451994 |
| ENST00000456214 |
| ENST00000344777 |
+-----------------+
10 rows in set (0.13 sec)


Cheers,
Andreas

On Thu, Dec 02, 2010 at 11:46:10AM +0200, Koray Do?an Kaya wrote:
> 
> Hi folk,
> 
> I tried mysql
> mysql> select * from  xref  where external_db_id=3102 limit 5;
> +---------+----------------+---------------+---------------+---------+-------------+-----------+--------------------------+
> | xref_id | external_db_id | dbprimary_acc | display_label | version |
> description | info_type | info_text                |
> +---------+----------------+---------------+---------------+---------+-------------+-----------+--------------------------+
> | 2304380 |           3102 | 216412_x_at   | 216412_x_at   | 1       |
> NULL        | MISC      | probeset_size 11 hits 6  |
> | 2304439 |           3102 | 206171_at     | 206171_at     | 1       |
> NULL        | MISC      | probeset_size 11 hits 11 |
> | 2304559 |           3102 | 214226_at     | 214226_at     | 1       |
> NULL        | MISC      | probeset_size 11 hits 10 |
> | 2304368 |           3102 | 212997_s_at   | 212997_s_at   | 1       |
> NULL        | MISC      | probeset_size 11 hits 6  |
> | 2304361 |           3102 | 208488_s_at   | 208488_s_at   | 1       |
> NULL        | MISC      | probeset_size 11 hits 8  |
> +---------+----------------+---------------+---------------+---------+-------------+-----------+--------------------------+
> 
> How can i combine xref_id to transcript_d?
> This did not work for example:
> 
> mysql> select gene_id from transcript where display_xref_id=2304439;
> Empty set (0.08 sec)
> 
> 
> 
> 
> -- 
> M.Sc. Koray Dogan Kaya
> PhD Candidate
> Bilkent University,
> Department of Molecular Biology and Genetics
> 06800 / Ankara
> Turkey
> Mobile: (+90) 533 526 09 51
> Office: (+90)  312 290 24 03
> e-mail: kkaya at bilkent.edu.tr
> 
> 
> _______________________________________________
> Dev mailing list
> Dev at ensembl.org
> http://lists.ensembl.org/mailman/listinfo/dev
> 

-- 
Andreas Kähäri, Ensembl Software Developer
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus
Hinxton, Cambridge CB10 1SD, United Kingdom




More information about the Dev mailing list