[ensembl-dev] Affymetrix Probesets
Nathan Johnson
njohnson at ebi.ac.uk
Tue Jul 12 09:14:24 BST 2011
Hi Alexander
Great stuff! All of our tables are routinely optimized and analyzed throughout production cycle. In fact it's part of our standard practise whenever we copy a database.
I can only assume that the problem arises when loading the data from flat files, as the indexes are generated afresh (i.e. a major change). I would say it would be a good idea to run analyze and optimize over all of the tables, just to be sure.
Glad to be of help.
Nath
On 4 Jul 2011, at 23:09, Alexander Pico wrote:
> Remarkable...
>
> After running both OPTIMIZE statements, I now get the exact same "explain"
> output as you guys (see below). And the formerly problematic lines now
> breeze by in seconds (rather than minutes):
>
> $probe_feature_adaptor->fetch_all_by_linked_transcript_Gene($gene);
> -or-
> $probe_feature_adaptor->fetch_all_by_external_name('ENSMUST00000102049');
>
> This is fantastic. Should I be running OPTIMIZE on these two tables for
> every species and every release? Is it common practice to OPTIMIZE all
> tables over a certain size? A quick Google search seem to suggest that this
> is only needed after making major changes (e.g., deletions), but these
> tables are fresh from your ftp.
>
> - Alex
>
>
> +----+-------------+-------+-------+-------------------+---------+---------+
> -----------------------------------------+------+---------------------------
> --+
> | id | select_type | table | type | possible_keys | key | key_len |
> ref | rows | Extra
> |
> +----+-------------+-------+-------+-------------------+---------+---------+
> -----------------------------------------+------+---------------------------
> --+
> | 1 | SIMPLE | pf | range | PRIMARY,probe_idx | PRIMARY | 4 |
> NULL | 2048 | Using where; Using filesort
> |
> | 1 | SIMPLE | p | ref | PRIMARY | PRIMARY | 4 |
> mus_musculus_funcgen_62_37o.pf.probe_id | 1 |
> |
> +----+-------------+-------+-------+-------------------+---------+---------+
> -----------------------------------------+------+---------------------------
> --+
>
>
>
> On 7/4/11 1:46 PM, "Ian Sealy" <is1 at sanger.ac.uk> wrote:
>
>> Dear Alex,
>>
>>>> Also, have you tried optimising the tables?
>>>
>>> You mentioned optimizing the tables? I simply installed databases from
>>> Ensembl, the corresponding API and then started running my scripts.
>>> Is "optimizing" a command I run on certain tables?
>>
>> Yep, I think Nathan is suggesting you try:
>>
>> OPTIMIZE TABLE probe_feature;
>> OPTIMIZE TABLE probe;
>>
>> See http://dev.mysql.com/doc/refman/5.1/en/optimize-table.html for what
>> it does.
>>
>> Running ANALYZE TABLE might be enough though.
>>
>> Cheers,
>> Ian
>>
>
Nathan Johnson
Senior Scientific Programmer
Ensembl Regulation
European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
http://www.ensembl.info/
http://twitter.com/#!/ensembl
More information about the Dev
mailing list