[ensembl-dev] VEP scaling / MySQL load balancing, anyone?

Andy Yates ayates at ebi.ac.uk
Wed Aug 1 17:54:59 BST 2012


Hey Jan,

One thing I've noticed with MySQL is that it's very hard to make it go faster than it normally does. If anything, throwing hardware at the problem just seems to let you maintain a level of service for longer. I'd like to suggest a few things though:

1). Take a look at the filesystem you're using. I did a quick Google search and the articles I found seem to suggest XFS/JFS are good file systems for MySQL (and even better for InnoDB servers).

2). Using MyISAM means you are more reliant on disk & OS caches to speed up access rather than pushing more into MySQL's memory. Switching to InnoDB lets you give MySQL lots of memory to cache with (the InnoDB buffer pool), which _could_ improve performance ...

3). Try increasing the table cache (the table_open_cache setting, called table_cache in older MySQL versions). It might seem silly, but if your processes are hitting a lot of different tables then the table cache could be ejecting tables too early.

4). A simple way to load balance is to use MySQL Proxy (http://dev.mysql.com/downloads/mysql-proxy/), which has improved in stability over the past few years. It does some automatic round-robin, but making it scale automatically with auto-starting VMs could be quite difficult (it can be scripted with Lua, so maybe there is a way). A basic invocation with two backends looks like this:

mysql-proxy \
--proxy-backend-addresses=narcissus:3306 \
--proxy-backend-addresses=nostromo:3306
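
If you go down the route of picking a server in the client code instead (your option B), the selection logic itself is small. Here is a rough Python sketch of round-robin and random selection over a list of backends. It is only an illustration: it is not how MySQL Proxy is implemented and it is not Ensembl Registry code (which is Perl). BACKENDS, round_robin and pick_random are made-up names, and the hosts are just the two from the command above.

```python
import itertools
import random

# Hypothetical backend list; the host names are the two used in the
# mysql-proxy command above.
BACKENDS = [("narcissus", 3306), ("nostromo", 3306)]


def round_robin(backends):
    """Return an endless iterator that cycles through the backends in order."""
    return itertools.cycle(backends)


def pick_random(backends, rng=random):
    """Pick one backend uniformly at random (the 'randomly pick a server' idea)."""
    return rng.choice(backends)


if __name__ == "__main__":
    rr = round_robin(BACKENDS)
    for _ in range(4):
        host, port = next(rr)
        print(f"round-robin -> {host}:{port}")

    host, port = pick_random(BACKENDS)
    print(f"random      -> {host}:{port}")
```

Round-robin spreads new connections evenly, while random choice needs no shared state between parallel jobs, which may matter more when each job is a separate process.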

I've not tried any of these suggestions, but hopefully they'll help.

Andy

P.S. One question: why are you having to go back to the DB if the caches are available?

On 1 Aug 2012, at 05:46, Jan Vogel wrote:

> 
> Hello Ensembl, 
> 
> I have a question about MySQL load balancing and the scaling of the Variant Effect Predictor (VEP) script. I'm running an in-house mirror of the Ensembl MySQL server and I'm using the VEP script.
> 
> Currently, the MySQL server gets pretty high load when I'm running a larger number of jobs in parallel. Experiments with tcmalloc improved the performance slightly; solid-state disks did not improve it as much as I hoped.
> 
> I currently see 2 main options to improve the performance/scalability:
> 
> A) The nicest setup is dynamic scaling of the MySQL servers (maybe some virtual machines) plus some monitoring software, which automatically brings more servers up if needed; of course all servers listen to the same virtual IP and port. The downside is that this needs quite a bit of configuration etc.
> 
> B) The simplest solution, I think, is to change the code in the Ensembl Registry so I can add more servers for the same cores and have the code randomly pick a server.
> 
> Is EnsEMBL using any database load balancing / automatic scaling for its MySQL databases for read-only data other than the mirrors? Are there any suggestions on what solutions to focus on, or have you tried anything which did not work?
> 
> Are other people out there running into the same problems (and have solutions ready)?
> 
> Cheers, 
> 
>   Jan Vogel 