I need a new keyboard

Thanks to Jayant, I had to clean coffee off my screen, but the keyboard is a bit trickier. This is Postgres advocacy at its best:

The base timings are almost 5 times more in mysql as compared to pgsql.

Then in comments, Harrison points out:

From the numbers, it is hard to believe that PostgreSQL is really fsync’ing properly there. I doubt a laptop HDD can do 1000 writes per second. Even assuming we have perfect group commit going on, with two threads it is still 500 per second for the first test.

InnoDB’s numbers are about right on for writes per second for a cheap commodity hdd.

Jayant’s answer is:

You may be right. But postgres is a more advanced and reliable database as compared to mysql.

MySQL processlist phrase book

For your convenience, a short phrase book, starting with an explanation of the process states where MySQL is mostly working to look up data from tables:

  • “Sending data” – reading data from tables (or looking it up)
  • “Copying to tmp table” – reading data from tables (or looking it up)
  • “Copying to tmp table on disk” – query needs a rewrite
  • “statistics” – looking up data from tables
  • “Sorting result” – reading data from tables (or looking it up)

Locking is fun:

  • “update” – There is a big chance that it is waiting for a lock; on busy systems it can be updating as well
  • “Locked” – Even bigger chance that it is a MyISAM table waiting to be converted to InnoDB

Table opening is even funnier:

  • “Opening tables” – some other idiot thread is holding the LOCK_open or similar mutex
  • “Closing tables” – some other idiot thread is holding the LOCK_open or similar mutex
  • “Opening table” – the idiot thread (or a not-that-smart DBA who thinks that table_cache should stay at its current value)
  • “Flushing tables” – someone decided that flushing tables is a good idea, but forgot to kill long-running queries
  • “Waiting for tables” – same as “Flushing tables”

If you have replication:

  • “Connecting to server” – could not connect to server, waiting to retry
  • “Reconnecting after a failed master event read” – master and slave have same @server_id
  • “Registering slave on master” – master and slave have same @server_id
  • “Waiting to reconnect after a failed master event read” – master and slave have same @server_id
  • “Sending binlog event to slave” – actually, not sending binlog event – it is waiting for binlog mutex.

A few more tricky ones:

  • “Sleep” – a millisecond ago it was probably not sleeping; due to how process scheduling and kernel-level buffering work, it probably finished right before the kernel switched to the ‘SHOW PROCESSLIST’ thread.
  • “Writing to net” – usually, not writing to net (client has blocked) – actual writing to net is usually a nearly non-blocking operation, as long as the client doesn’t sleep.
  • “Killed” – actually, someone tried to kill it, but it doesn’t care, as it doesn’t check for the thd->killed flag

And in the user column:

  • “unauthenticated user” – are you sure your DNS is working right?
  • “system user” – it isn’t a user, these are (hopefully plural) replication threads that don’t have any “user” credentials
  • “root” – either that is you running the PROCESSLIST, another session you forgot you are running, or you forgot to create a separate user for your application

I probably didn’t write quite a few important ones, but some of them are self-evident (such as “init” and “end”), and others probably will never show up :)
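
To see which of these phrases dominate on a busy server, it helps to count them. This is a minimal sketch, not a monitoring tool: count_states is a hypothetical helper, it assumes bash 4 or later (associative arrays) and tab-separated rows as printed by mysql in batch mode, with State as the seventh column.

```shell
#!/bin/bash
# count_states (hypothetical helper): reads tab-separated SHOW PROCESSLIST
# rows on stdin and prints "count<TAB>state", most frequent first.
# Assumes the column order Id, User, Host, db, Command, Time, State, Info.
count_states() {
    local -A seen=()
    local line state key i
    while IFS= read -r line; do
        # Drop the first six columns one tab at a time, so that empty
        # fields are not collapsed the way IFS-based splitting would.
        for ((i = 0; i < 6; i++)); do
            line=${line#*$'\t'}
        done
        state=${line%%$'\t'*}           # keep the seventh column only
        key=$state
        [[ -n $key ]] || key="(empty)"  # sleeping threads have no state
        seen["$key"]=$(( ${seen["$key"]:-0} + 1 ))
    done
    for key in "${!seen[@]}"; do
        printf '%d\t%s\n' "${seen[$key]}" "$key"
    done | sort -rn
}
```

Usage, with the host being whatever you normally connect to: mysql -B -N -e "SHOW PROCESSLIST" | count_states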

On throttling

Seems like nowadays InnoDB crash recovery is much funnier. Please allow me to show you something:

while (buf_pool->n_pend_reads >=
        recv_n_pool_free_frames / 2) {
  os_aio_simulated_wake_handler_threads();
  os_thread_sleep(500000); /* microseconds */
}

Translation – if there are 128 or more outstanding I/O requests, sleep for half a second. Hehehe. Optimized for floppies!
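
For the record, here is the arithmetic behind that translation as a small sketch. It assumes recv_n_pool_free_frames is 256 (a value not shown in the excerpt above, so treat it as an assumption) and that os_thread_sleep() takes microseconds.

```shell
#!/bin/bash
# Assumed value: the excerpt does not show what recv_n_pool_free_frames is.
recv_n_pool_free_frames=256
# Argument passed to os_thread_sleep(), taken to be microseconds.
sleep_us=500000

threshold=$(( recv_n_pool_free_frames / 2 ))
sleep_ms=$(( sleep_us / 1000 ))

echo "pending reads >= ${threshold}: sleep ${sleep_ms} ms"
```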

RESET SLAVE, bash edition

Nearly every job advertisement for MySQL DBA positions asks for ‘shell scripting’, so I decided to investigate what it is. I remembered some performance training where I was told that forking is bad, and that one should use shell features as much as possible (for example, avoid paths to anything that is available as a builtin: don’t use /usr/bin/[, just the plain [ instead).

I tried to automate one MySQL DBA task (reinitializing a slave after relay log corruption, or after copying in a cloned dataset from another server) using just bash – and it kind of worked. From now on I can put ‘Shell scripting’ proudly on my resume :-)

Next step – learn JCL (some people think this is funny :)

What was your most complicated task solved with shell scripts? :)

#!/bin/bash

HOST=$1
RPASS=$(<passwordfile)

IFS="
"

mysql -h "$HOST" -e "STOP SLAVE"

for line in $(mysql -e "SHOW SLAVE STATUS" -E -h "$HOST")
 do
	key=${line%:*}  # Split off what goes before :
	key=${key// }   # And trim spaces
	data=${line#*:} # Split off what goes after :
	data=${data// } # And trim spaces yet again!!!

	case $key in
		Exec_master_log_pos|Exec_Master_Log_Pos) LOGPOS=$data;; # capitalization differs between server versions
		Relay_Master_Log_File) LOGFILE=$data;;
		Master_Host) RHOST=$data;;
		Master_User) RUSER=$data;;
	esac

 done

if [ -z "$LOGPOS" ] || [ -z "$LOGFILE" ]; then
        echo "OMG FAIL NO POSITIONS KNOWN" >&2; exit 1
fi

mysql -f -h "$HOST" -e "
        RESET SLAVE;
        CHANGE MASTER TO
                MASTER_HOST='$RHOST',
                MASTER_USER='$RUSER',
                MASTER_PASSWORD='$RPASS',
                MASTER_LOG_FILE='$LOGFILE',
                MASTER_LOG_POS=$LOGPOS ;
        START SLAVE
        "

Update: rewrote the last ‘mysql’ command to avoid multiple unnecessary forks! (thanks, Aidai :)
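
The parameter expansions in the loop are easier to follow in isolation. This is a minimal sketch with a hypothetical helper, parse_status_line, and a made-up status line. Unlike the script above it splits at the first colon (%%:*), so that values which themselves contain colons stay intact, and it strips only the single separator space from the value.

```shell
#!/bin/bash
# parse_status_line (hypothetical helper): splits one "name: value" line
# of vertical (-E) mysql output into the globals KEY and DATA, using only
# bash builtins, so no forks.
parse_status_line() {
    local line=$1
    KEY=${line%%:*}   # everything before the first colon
    KEY=${KEY// }     # drop the padding spaces in front of the name
    DATA=${line#*:}   # everything after the first colon
    DATA=${DATA# }    # drop the single space that follows the colon
}

# Made-up sample line, padded the way vertical output pads field names.
parse_status_line "        Relay_Master_Log_File: mysql-bin.000042"
echo "$KEY=$DATA"
```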

MySQL password security

Simple password authentication schemes usually guard against one of two evils – either leaked password tables, or sniffed network traffic. In 4.1 MySQL introduced a challenge-response scheme that guards against both, just not both at the same time. How does one obtain the token required to log into the server? There are a few methods:

  • Use gdb, dtrace or any other deep-inspection method to grab ‘buf’ in check_scramble()
  • Grab mysql.user table, sniff network traffic, calculate the hash_stage1 value out of public_seed (initial server packet), client’s reply and actual password hash
  • Intercept the password client-side at libmysqlclient level (again, gdb, dtrace, etc ;-)
  • Mix ethyl alcohol with the carbohydrate-based bipedal DBA, until it becomes quadrupedal and tells the password (might not be able to tell anything else at that moment).
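
The second method is just arithmetic. Here is a minimal sketch of it, assuming the 4.1 scheme works as I understand it: mysql.user stores SHA1(SHA1(password)), and the client replies with SHA1(password) XOR SHA1(public_seed followed by the stored hash). The seed and password are made up, the helpers hex2bin, sha1hex and xorhex are hypothetical, and sha1sum from coreutils is assumed to be installed.

```shell
#!/bin/bash
# Hypothetical helpers, not part of any MySQL tooling.
hex2bin() {   # "6162" -> the raw bytes "ab"
    local hex=$1 fmt="" i
    for ((i = 0; i < ${#hex}; i += 2)); do fmt+="\\x${hex:i:2}"; done
    printf "$fmt"
}
sha1hex() { sha1sum | cut -d' ' -f1; }   # stdin -> 40 hex characters
xorhex() {    # byte-wise XOR of two equally long hex strings
    local a=$1 b=$2 out="" i
    for ((i = 0; i < ${#a}; i += 2)); do
        out+=$(printf '%02x' $(( 0x${a:i:2} ^ 0x${b:i:2} )))
    done
    echo "$out"
}

seed="abcdefghij0123456789"   # made-up 20-byte public_seed
password="secret"             # made-up password

# What an honest client and server compute (assumed scheme, see above):
stage1=$(printf '%s' "$password" | sha1hex)    # hash_stage1
stored=$(hex2bin "$stage1" | sha1hex)          # value kept in mysql.user
mask=$( { printf '%s' "$seed"; hex2bin "$stored"; } | sha1hex )
reply=$(xorhex "$stage1" "$mask")              # what goes over the wire

# What an eavesdropper holding seed, reply and the stored hash computes:
recovered=$(xorhex "$reply" "$mask")
check=$(hex2bin "$recovered" | sha1hex)
[ "$check" = "$stored" ] && echo "recovered hash_stage1 matches the stored hash"
```

Note that the cleartext password never appears on the eavesdropper's side. If the scheme is as assumed, hash_stage1 is all a client needs to answer any future challenge.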


MySQL Conference & Expo 2009

P.S. I was asked by MySQL Conference organizers to do some shameless plugs, so… yeah, I’m going to talk about the first three methods in my talk on MySQL security, and do live trials of the last method during the conference evening program.

Rasmus vs me

Rasmus (of PHP fame) and I exchanged these nice words on Freenode’s #php (when discussing some PHP execution efficiency issues):

 
<Rasmus_> if that is your bottleneck, you are the world's best
          PHP developer
<Rasmus_> domas: then you are writing some very odd code.
          I think you will find you spend most of your time in
          syscalls and db stuff

<domas> Rasmus_: I can tell you're the best database developer, if
        you spend most of your time in db stuff :)
 

You can immediately see different application engineering perspectives :)