pmysql (multi-server mysql client)

To have more visibility of our systems I needed a method to quickly query large sets of MySQL servers. Though there are few methods like ‘xargs -P’ and dsh, they are overkill for the SQL-only job, and simply not fast enough.

So, I hacked pmysql (274 lines of C) – a tool that is able to run queries on multiple servers in parallel. It can run SQL query on 1000 servers in under 0.1s, and scales well with larger sets of servers.

Use cases can be various – from configuration/schema checks (I really enjoyed using information_schema), to ‘global processlist’ analysis.
Ability to crunch hundreds of megabytes of processlist info in few seconds – priceless, so is schema verification of millions of tables.

It is fast enough to use it as backend for some of real-time health check CGI scripts :-)

To get the source:
bzr co lp:mysqlatfacebook/tools/pmysql

P.S. Oh, I know that nearly nobody needs it, but thats not a reason for keeping it closed :)

21 thoughts on “pmysql (multi-server mysql client)”

  1. Domas,
    This is great!
    I want to argue with your concluding words (“…nearly nobody needs it…”)
    I can think right away of a use many may need this for:
    A web page executes a long query. It’s taking a looong time. But – where is it issued? Queries are load-balanced between master and 6 slaves.
    Your tool can first find the slave on which this query performs on.

    1. if you have a query that is taking a long time and you don’t use some sort of slow query log analysis and you have more than 1 slave you’re doing your reads from, you’re doing it wrong :)

      1. Not really, I prefer global processlist snapshots to slow query logs on thousands of machines (or even aggregated) at any time :)

        It has that awesome feeling of ‘now’

  2. Shlomi, for that one can use simple ‘for i in …’ loop ;-) I mean, the performance needs to scale to thousands of boxes isn’t that necessary :)

  3. Domas,

    What platform are you building this on, and how?

    I tried (on CentOS 5.5):

    gcc pmysql.cc

    But get a raft of errors:

    pmysql.cc:18:19: error: mysql.h: No such file or directory
    pmysql.cc:21:18: error: glib.h: No such file or directory
    pmysql.cc:40: error: ‘gboolean’ does not name a type
    pmysql.cc:41: error: ‘gboolean’ does not name a type
    pmysql.cc:43: error: ‘GOptionEntry’ does not name a type
    pmysql.cc:59: error: ‘GString’ has not been declared
    pmysql.cc: In function ‘void write_g_string(int, int*)’:
    pmysql.cc:61: error: expected initializer before ‘*’ token
    pmysql.cc:64: error: ‘mutex’ was not declared in this scope
    pmysql.cc:65: error: ‘g_mutex_new’ was not declared in this scope
    pmysql.cc:67: error: ‘mutex’ was not declared in this scope
    pmysql.cc:67: error: ‘g_mutex_lock’ was not declared in this scope
    pmysql.cc:68: error: request for member ‘len’ in ‘* data’, which is of non-class type ‘int’
    pmysql.cc:69: error: request for member ‘str’ in ‘* data’, which is of non-class type ‘int’
    pmysql.cc:69: error: request for member ‘len’ in ‘* data’, which is of non-class type ‘int’
    pmysql.cc:71: error: ‘g_critical’ was not declared in this scope
    pmysql.cc:76: error: ‘g_mutex_unlock’ was not declared in this scope
    pmysql.cc: At global scope:
    pmysql.cc:80: error: ‘gulong’ does not name a type
    pmysql.cc:110: error: variable or field ‘run_query’ declared void
    pmysql.cc:110: error: ‘gpointer’ was not declared in this scope
    pmysql.cc:110: error: ‘gpointer’ was not declared in this scope
    pmysql.cc:110: error: initializer expression list treated as compound expression
    pmysql.cc:111: error: expected ‘,’ or ‘;’ before ‘{‘ token

    I have the mysql-devel package installed (which provides mysql.h), but I’m more of a sysadmin than coder so am not sure what the problem is.

    R.

    1. Well, quite different, did you look at OmniSQL?

      Perl (with unpackaged deps) vs C, single-threaded vs multi-threaded, XML configuration-file vs dynamic server lists.

      pmysql does not have concept of shards, though.

  4. I deal with gobs of servers and real estate data / trends.

    Thank you for this.

    ~ Joel

  5. Domas,

    Not a bad idea … I guess that you can run N connections on 1 server too. Can you ???

    I had a version of mysqlgui that could have a GUI interface to N servers.

    I will work on it as well as on my XEmacs mode for MySQL, that can connect to N servers, once when I retire …

    1. Sinisa, if you mention same server twice, it will get the queries twice. I haven’t really optimized for this – it builds the server list in memory, so better not to flood with millions of entries (though currently jobs get passed to threads just as server name strings).

  6. Domas,

    You are very cool for having written this! I’m trying like hell to debug a stored proc that *may* have a race condition. No matter what I’ve used with test programs – forks, threads, timing tricks, I can’t get the clients calling it to execute the proc without there being some sort of order to when they run. Maybe your tool will help!

    –Patirck

  7. Hello Domas, I don’t use bzr client.

    Do you have any binary for pmysql? which i can download.

    Please let me know the exact url to download it. I need it for 32 and 64 bit both.

    Appreciate it.

      1. Is there any other way to download the source besides using bazzar? thanks

Comments are closed.

%d bloggers like this: