I need a new keyboard

Thanks to Jayant, I had to clean coffee off my screen, but the keyboard is a bit trickier. This is Postgres advocacy at its best:

The base timings are almost 5 times more in mysql as compared to pgsql.

Then in comments, Harrison points out:

From the numbers, it is hard to believe that PostgreSQL is really fsync’ing properly there. I doubt a laptop HDD can do 1000 writes per second. Even assuming we have perfect group commit going on, with two threads it is still 500 per second for the first test.

InnoDB’s numbers are about right on for writes per second for a cheap commodity hdd.
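Harrison’s skepticism is easy to back with arithmetic – assuming a 5400 rpm laptop spindle and, at best, one durable write per platter revolution (both assumptions mine, not from the thread):

```python
# Back-of-the-envelope bound on durable writes/sec for a laptop drive.
# 5400 rpm is an assumed spindle speed; write-back caches are ignored.
RPM = 5400
revs_per_sec = RPM / 60             # 90 revolutions per second
max_fsyncs_per_sec = revs_per_sec   # at best one durable commit per revolution
```

90 fsyncs per second is an order of magnitude below what the benchmark numbers imply.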

Jayant’s answer is:

You may be right. But postgres is a more advanced and reliable database as compared to mysql.

MySQL processlist phrase book

For your convenience, a short phrase book, starting with explanation of process states where MySQL is mostly working to look up data from tables:

  • “Sending data” – reading data from tables (or looking it up)
  • “Copying to tmp table” – reading data from tables (or looking it up)
  • “Copying to tmp table on disk” – query needs a rewrite
  • “statistics” – looking up data from tables
  • “Sorting result” – reading data from tables (or looking it up)

Locking is fun:

  • “update” – There is a big chance that it is waiting for a lock; on busy systems it can be actually updating as well
  • “Locked” – Even bigger chance that it is a MyISAM table waiting to be converted to InnoDB

Table opening is even funnier:

  • “Opening tables” – some other idiot thread is holding the LOCK_open or similar mutex
  • “Closing tables” – some other idiot thread is holding the LOCK_open or similar mutex
  • “Opening table” – the idiot thread (or a not-that-smart DBA who thinks table_cache should stay at its current value)
  • “Flushing tables” – someone decided that flushing tables is a good idea, but forgot to kill long-running queries
  • “Waiting for tables” – same as “Flushing tables”

If you have replication:

  • “Connecting to server” – could not connect to server, waiting to retry
  • “Reconnecting after a failed master event read” – master and slave have same @server_id
  • “Registering slave on master” – master and slave have same @server_id
  • “Waiting to reconnect after a failed master event read” – master and slave have same @server_id
  • “Sending binlog event to slave” – actually, not sending binlog event – it is waiting for binlog mutex.

Few more tricky ones:

  • “Sleep” – it probably was not sleeping a millisecond ago; due to how process scheduling and kernel-level buffering work, it likely finished right before the kernel switched to the ‘SHOW PROCESSLIST’ thread.
  • “Writing to net” – usually not writing to net (the client has blocked) – the actual write to the net is a nearly non-blocking operation, as long as the client doesn’t sleep.
  • “Killed” – actually, someone tried to kill it, but it doesn’t care, as it doesn’t check the thd->killed flag

And in the user column:

  • “unauthenticated user” – are you sure your DNS is working right?
  • “system user” – it isn’t a user; these are (hopefully plural) replication threads that don’t have any “user” credentials
  • “root” – either that is you running the PROCESSLIST, a session you forgot you left running, or you forgot to create a separate user for your application

I probably left out quite a few important ones, but some of them are self-evident (such as “init” and “end”), and others will probably never show up :)
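When firefighting, counting threads per state is usually more telling than reading the list row by row – a minimal sketch over made-up rows (the sample data and field layout here are purely illustrative):

```python
from collections import Counter

# Hypothetical (id, user, state) tuples, as SHOW PROCESSLIST would report them
processlist = [
    (1, "app", "Sending data"),
    (2, "app", "Locked"),
    (3, "app", "Locked"),
    (4, "system user", "Waiting for master to send event"),
]

# Aggregate by state - a pile-up in "Locked" or "Opening tables" is the signal
states = Counter(state for _id, _user, state in processlist)
```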

On throttling

Seems like nowadays InnoDB crash recovery is much funnier. Please allow me to show you something:

while (buf_pool->n_pend_reads >=
        recv_n_pool_free_frames / 2) {
  os_aio_simulated_wake_handler_threads();
  os_thread_sleep(500000);
}
Translation – if there are 128 or more outstanding read requests, sleep for half a second. Hehehe. Optimized for floppies!

RESET SLAVE, bash edition

Nearly every job advertisement for MySQL DBA positions asks for ‘shell scripting’, so I decided to investigate what it is. I remembered some performance training, where I was told how forking is bad, and that one should use shell features as much as possible (e.g., avoid paths to external binaries when a builtin will do – don’t use /usr/bin/[, just the pure [ instead).

I tried to automate one MySQL DBA task (reinitializing a slave after relay log corruption, or after copying in a cloned dataset from another server) using just bash – and it kind of worked. From now on I can proudly put ‘Shell scripting’ on my resume :-)

Next step – learn JCL (some people think this is funny :)

What was your most complicated task solved with shell scripts? :)

#!/bin/bash

HOST=$1
RPASS=$(<passwordfile)

IFS="
"

mysql -h "$HOST" -e "STOP SLAVE"

for line in $(mysql -e "SHOW SLAVE STATUS" -E -h "$HOST")
 do
	key=${line%:*}  # Split off what goes before :
	key=${key// }   # And trim spaces
	data=${line#*:} # Split off what goes after :
	data=${data// } # And trim spaces yet again!!!

	case $key in
		Exec_master_log_pos) LOGPOS=$data;;
		Relay_Master_Log_File) LOGFILE=$data;;
		Master_Host) RHOST=$data;;
		Master_User) RUSER=$data;;
	esac

 done

if [ -z "$LOGPOS" ] || [ -z "$LOGFILE" ]; then
        echo "OMG FAIL NO POSITIONS KNOWN"; exit 1
fi

mysql -f -h "$HOST" -e "
        RESET SLAVE;
        CHANGE MASTER TO
                MASTER_HOST='$RHOST',
                MASTER_USER='$RUSER',
                MASTER_PASSWORD='$RPASS',
                MASTER_LOG_FILE='$LOGFILE',
                MASTER_LOG_POS=$LOGPOS ;
        START SLAVE
        "

Update: rewrote the last ‘mysql’ command to avoid multiple unnecessary forks! (thanks, Aidai :)

MySQL password security

Simple password authentication schemes usually guard against one of two evils – either leaked password tables or sniffed network traffic. In 4.1 MySQL introduced a challenge-response scheme that guards against both – just not both at the same time. How does one obtain the token required to log into the server? There are a few methods:

  • Use gdb, dtrace or any other deep-inspection method to grab ‘buf’ in check_scramble()
  • Grab mysql.user table, sniff network traffic, calculate the hash_stage1 value out of public_seed (initial server packet), client’s reply and actual password hash
  • Intercept the password client-side at libmysqlclient level (again, gdb, dtrace, etc ;-)
  • Mix ethyl alcohol with the carbohydrate-based bipedal DBA, until it becomes quadrupedal and tells the password (might not be able to tell anything else at that moment).
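To illustrate the second method, here is the arithmetic of the 4.1 scramble – a sketch of the standard mysql_native_password scheme, with made-up password and seed:

```python
import hashlib

def sha1(data):
    return hashlib.sha1(data).digest()

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

password = b"s3cret"                   # made up, obviously
public_seed = b"12345678901234567890"  # the 20-byte server challenge

hash_stage1 = sha1(password)           # SHA1(password)
hash_stage2 = sha1(hash_stage1)        # what mysql.user stores
# What the client puts on the wire:
client_reply = xor(hash_stage1, sha1(public_seed + hash_stage2))

# Attacker with the leaked table (hash_stage2) plus sniffed traffic
# (public_seed, client_reply) recovers hash_stage1 with one XOR:
recovered = xor(client_reply, sha1(public_seed + hash_stage2))
```

Knowing hash_stage1 is enough to answer any future challenge, so either evil alone is survivable – the combination is fatal.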


MySQL Conference & Expo 2009

P.S. I was asked by the MySQL Conference organizers to do some shameless plugs, so… yeah, I’m going to talk about the first three methods in my talk on MySQL security, and do live trials of the last method during the conference evening program.

Rasmus vs me

Rasmus (of PHP fame) and I exchanged these nice words on Freenode’s #php (when discussing some PHP execution efficiency issues):

 
<Rasmus_> if that is your bottleneck, you are the world's best
          PHP developer
<Rasmus_> domas: then you are writing some very odd code.
          I think you will find you spend most of your time in
          syscalls and db stuff

<domas> Rasmus_: I can tell you're the best database developer, if
        you spend most of your time in db stuff :)
 

You can immediately see different application engineering perspectives :)

Many needles in a haystack

This is probably a rather useless experiment I’ve been doing, but it all started when someone in #mysql@freenode posed a very simple task:

  • Table A has two million rows
  • Table B has two thousand rows
  • Find all rows in A which have any B row as a substring in some field.

So, my colleague Scott was there, and gave the answer which satisfied the guy:

SELECT * FROM a JOIN b ON a.field LIKE CONCAT('%',b.field,'%');

I started telling Scott that this would result in too many nested loops, and that a better combined pattern match should be used. Well, my actual words were something like “use RLIKE”. So, folks replaced LIKE in the above query with RLIKE, didn’t see much improvement, and made some fun of me.

So, I thought I should provide a decent response to the mocking :-) I downloaded ‘most common American last names of the 19th century’ from some website out there and took ~2000 of them, then built a list of all Wikipedia article titles that have a space in them (just to reduce the dataset a bit and make more humans show up in it).

My initial poking showed around a 2× speedup from a combined RLIKE pattern, and PCRE UDFs provided another 2× over RLIKE. I have no idea what I did wrong back then (or am doing wrong now), but a simple LIKE with nested-loop lookups is faster in my current test. Still, there’s something else I wanted to show :)

GNU grep has ‘-F’ functionality, which interprets the pattern as a list of fixed strings, separated by newlines, any of which is to be matched. It is actually quite well optimized, using a nice algorithm located in the file kwset.c. This is what a comment in that file says:

The algorithm implemented by these routines bears a startling resemblence to one discovered by Beate Commentz-Walter, although it is not identical.

See “A String Matching Algorithm Fast on the Average,” Technical Report, IBM-Germany, Scientific Center Heidelberg, Tiergartenstrasse 15, D-6900 Heidelberg, Germany. See also Aho, A.V., and M. Corasick, “Efficient String Matching: An Aid to Bibliographic Search,” CACM June 1975, Vol. 18, No. 6, which describes the failure function used below.

So, let’s try standing on the shoulders of giants. Actually, I’m not even smart enough to have found this myself – it was Tim who researched it and implemented it as a PHP extension to make some of Wikipedia’s code faster.

So I shamelessly took a few files from grep and wrote some crappy MySQL UDF glue (it is just for demonstration and would need proper engineering to be usable for general purposes).
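For a feel of what that failure function does, here is a toy Aho–Corasick matcher (a simplification of mine – kwset.c actually implements the Commentz-Walter variant, which additionally skips ahead Boyer–Moore-style):

```python
from collections import deque

def build(patterns):
    # Trie as parallel arrays: goto edges, matched patterns, failure links
    goto, out, fail = [{}], [set()], [0]
    for pat in patterns:
        node = 0
        for ch in pat:
            if ch not in goto[node]:
                goto.append({}); out.append(set()); fail.append(0)
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].add(pat)
    # BFS to fill failure links (longest proper suffix that is also a prefix)
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, nxt in goto[node].items():
            queue.append(nxt)
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]   # inherit matches ending here
    return goto, out, fail

def search(text, goto, out, fail):
    # One pass over the text, no matter how many needles
    node, hits = 0, []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for pat in out[node]:
            hits.append((i - len(pat) + 1, pat))
    return hits
```

One pass over each row regardless of the needle count – which is exactly where the speedup below comes from.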

So, what kind of performance would a custom-tailored algorithm for the task give…

Simple LIKE matching:

select straight_join * from names join lasts on
binary name like concat("%",last,"%") limit 1000;
1000 rows in set (3.80 sec)

Combined RLIKE matching (I don’t know why it is slower – it was much better in some previous test):

SELECT CONCAT("(",GROUP_CONCAT(last SEPARATOR "|"),")")
from lasts into @re;

SELECT * FROM names WHERE BINARY name RLIKE @re;
1000 rows in set (25.99 sec)

Combined PECL UDF matching:

SELECT CONCAT("/(",GROUP_CONCAT(last SEPARATOR "|"),")/")
from lasts into @re;

select * from names where preg_rlike(@re,name) limit 1000;
1000 rows in set (8.10 sec)

Algorithm theft:

SELECT fss_prepare(last) FROM lasts;
SELECT fss_search(name) FROM names LIMIT 1000;
1000 rows in set (0.02 sec)

Damn it, too fast, this must be some mistake, let’s try more rows:

10000 rows in set (0.07 sec)
100000 rows in set (0.62 sec)
551971 rows in set (3.50 sec)

So, by using a customized algorithm we got a 600× speedup. What does Scott say about this?

domas: my answer took about 10 minutes.
       yours has taken what, week and a half?

.. Bastard… Oh well, priorities priorities :)

MySQL support fun, multiplication

There was a question about how to do an aggregate multiplication in MySQL. MySQL does not provide such functionality, so we were looking at various workarounds.

We discussed the UDF interface that allows constructing custom aggregates, also looked at the @a:=@a*field hack, and at how different initializers get their results wrapped differently.

Then Scott killed our discussion with this simple query:

select exp(sum(log(c)))
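The identity behind it is exp(Σ log cᵢ) = Π cᵢ; a quick numeric check (my own example values – note that MySQL’s LOG() returns NULL for zero or negative arguments, so the trick needs strictly positive values and lives in floating point):

```python
import math

values = [2.0, 3.0, 7.0]   # stand-in for column c; all strictly positive
product = math.exp(sum(math.log(v) for v in values))
# 2 * 3 * 7 = 42, up to floating-point error
```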

Of course, that’s a nice mathematical approach to solving the multiplication issue with just SUM() at hand, but while we were still in awe, Scott explained it with this wit:

The Great Flood is over, and as the animals are departing 3×3, Noah is blessing each, saying “Go forth, be fruitful and multiply.” Two snakes come down the ramp and say to Noah, “We can’t. We’re adders.”

Noah groans and says, “That’s the worst pun I’ve heard in 40 days and nights. Go sit in that pile of sticks until I can deal with you!”

After mucking out the ark, Noah returns to the sticks and lifts them up, to find baby snakes everywhere. “What happened?” he asks. “I thought you were having problems!”

The snakes reply, “Even adders can multiply with logs.”

:-)