domas mituzas

MySQL password security

Simple password authentication schemes are usually guarding against one of two evils – either leaked password tables, or sniffed network traffic. In 4.1 MySQL introduced challenge-response scheme, that is guarding against both, just not both at the same time. How does one obtain the token required to log into the server? There are few methods:

Use gdb, dtrace or any other deep-inspection method to grab ‘buf’ in check_scramble()
Grab mysql.user table, sniff network traffic, calculate the hash_stage1 value out of public_seed (initial server packet), client’s reply and actual password hash
Intercept the password client-side at libmysqlclient level (again, gdb, dtrace, etc ;-)
Mix ethyl alcohol with the carbohydrate-based bipedal DBA, until it becomes quadrupedal and tells the password (might not be able to tell anything else at that moment).

P.S. I was asked by MySQL Conference organizers to do some shameless plugs, so… yeah, I’m going to talk about first three methods in my talk on MySQL security, and do live trials of last method during conference evening program.

Poor man’s contention profiling

I wrote already about poor man’s query profiling techniques last summer. Today I’m going to share poor man’s contention profiling tool, for all these poor souls which do not have Solaris with dtrace or pstack, don’t want to run tcmalloc contention profiler, and simply need some easy hack to see ‘what the heck is going on in my server’. Here it is:

gdb \
    -ex "set pagination 0" \
    -ex "thread apply all bt" \
    --batch -p $(pidof mysqld)

Run few times, and you will have enough samples to start judging. Do note, this may stop the process execution for a second, so do not spam it in too tight loop.
Once you have results it is just a matter of 20-liner script to extract any useful calculations :)

P.S. I’d love to see efficient pstack implementation for 64-bit Linux :)

update: this now lives at http://poormansprofiler.org

Eyecandy mutexes!

In my quest of making MySQL usable, I managed to hit contention that wasn’t spotted by performance masters before. Meet most useless mutex ever (this is actual contention event, not just a hold):

Count     nsec Lock
 1451   511364 mysqld`ut_list_mutex

      nsec ---- Distribution --- count    Stack
      2048 |@@                 |   132    libc.so.1`mutex_lock_impl
      4096 |@@@                |   186    libc.so.1`mutex_lock
      8192 |                   |     3    ut_malloc_low
     16384 |@@                 |   137    mem_heap_create_block
     32768 |@                  |   105    row_sel_store_mysql_rec
     65536 |@                  |    89    row_search_for_mysql
    131072 |@                  |    99    ha_innobase::general_fetch
    262144 |@@                 |   131    ha_innobase::rnd_next
    524288 |@@@                |   195    rr_sequential
   1048576 |@@@                |   223
   2097152 |@@                 |   137
   4194304 |                   |    14
----------------------------------------------------------------

ut_list_mutex guards a memory structure which has all memory blocks allocated by InnoDB (via ut_malloc/ut_free) in it.
It has two uses:

Printing “Total memory allocated” in SHOW INNODB STATUS (though this can still be implemented lock-free)
Deallocating all memory on shutdown (though, all modern operating systems do that anyway, so this is purely just to shut up valgrind)

If you have any BLOB/TEXT data in your tables, you’re definitely hitting this contention spot (it is #1 contention in such cases).

Fix? Kill the eyecandy, replace ut_malloc and ut_free with direct calls to malloc() and free(), oh and of course, use scalable allocators like tcmalloc or Hoard.

Rasmus vs me

Rasmus (of PHP fame) and me exchanged these nice words on Freenode’s #php (when discussing some PHP execution efficiency issues):

 
<Rasmus_> if that is your bottleneck, you are the world's best
          PHP developer
<Rasmus_> domas: then you are writing some very odd code.
          I think you will find you spend most of your time in
          syscalls and db stuff

<domas> Rasmus_: I can tell you're the best database developer, if
        you spend most of your time in db stuff :)

You can immediately see different application engineering perspectives :)

Many needles in a haystack

This is probably quite useless experiment I’ve been doing, but it all started when someone in #mysql@freenode gave a very simple task:

Table A has two million rows
Table B has two thousand rows
Find all rows in A, which have any B row as substring in some field.

So, my colleague Scott was there, and gave the answer which satisfied the guy:

SELECT * FROM a JOIN b ON a.field LIKE CONCAT('%',b.field,'%');

I started telling Scott, that this will result in too many nested loops, and that better combined pattern matching should be used. Well, my actual words were something like “use RLIKE”. So, folks replaced LIKE in above query with RLIKE, didn’t see too much of improvement and made some fun of me.

So, I thought I should provide some decent response to mocking :-) I downloaded ‘most common American last names of 19th century’ from some website out there, took ~2000 of them, also built a list of all Wikipedia article titles, that have space in them (just to reduce dataset a bit, and make more humans show up there).

My initial poking showed around double speed increase when using combined pattern of RLIKE, and using PCRE UDFs provided double speed over RLIKE. I have no idea what I did wrong back then (or doing wrong now), but simple LIKE with nested row lookup is faster on my current test. Still, there’s something else I wanted to show :)

GNU grep has ‘-F’ functionality, which Interprets PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched. Actually, it is quite well optimized, and uses nice algorithm, located in file kwset.c. This is what some comment in that file tells:

The algorithm implemented by these routines bears a startling resemblence to one discovered by Beate Commentz-Walter, although it is not identical.

See “A String Matching Algorithm Fast on the Average,” Technical Report, IBM-Germany, Scientific Center Heidelberg, Tiergartenstrasse 15, D-6900 Heidelberg, Germany. See also Aho, A.V., and M. Corasick, “Efficient String Matching: An Aid to Bibliographic Search,” CACM June 1975, Vol. 18, No. 6, which describes the failure function used below.

So, let’s try standing on shoulders of giants. Actually, I’m not even that smart to find this, it was actually Tim who researched and implemented this as PHP extension to make some of Wikipedia’s code faster.

So I shamelessly took few files from grep, and wrote some crappy MySQL UDF glue (it is just for demonstration, would need proper engineering to make it usable for general purposes).

So, what kind of performance would a custom-tailored algorithm for the task give…

Simple LIKE matching:

select straight_join * from names join lasts on
binary name like concat("%",last,"%") limit 1000;
1000 rows in set (3.80 sec)

Combined RLIKE matching (I don’t know why it is slower – it was much better on some previous test):

SELECT CONCAT("(",GROUP_CONCAT(last SEPARATOR "|"),")")
from lasts into @re;

SELECT * FROM names WHERE BINARY name RLIKE @re;
1000 rows in set (25.99 sec)

Combined PECL UDF matching:

SELECT CONCAT("/(",GROUP_CONCAT(last SEPARATOR "|"),")/")
from lasts into @re;

select * from names where preg_rlike(@re,name) limit 1000;
1000 rows in set (8.10 sec)

Algorithm theft:

SELECT fss_prepare(last) FROM lasts;
SELECT fss_search(name) FROM names LIMIT 1000;
1000 rows in set (0.02 sec)

Damn it, too fast, this must be some mistake, let’s try more rows:

10000 rows in set (0.07 sec)
100000 rows in set (0.62 sec)
551971 rows in set (3.50 sec)

So, by using customized algorithm we got 600x performance. What does Scott say about this?

domas: my answer took about 10 minutes.
       yours has taken what, week and a half?

.. Bastard… Oh well, priorities priorities :)

plockstat fail!

Solaris has this beautiful tool ‘plockstat’ that can report application lock contention and hold events (hold events turn into contention events at parallelism.. ;-) It is just a frontend to a set of dtrace rules, that monitor mutexes and rwlocks.

Today I was testing an edge case (what happens, when multiple threads are scanning lots of same data) – and plockstat/dtrace indicated that there were zero (0!!!) lock waits. I tried using ‘prstat’ with microstate accounting, and it indeed pointed out that there’s lots of LCK% activity going on (half of CPU usage…). The dtrace profiling oneliner (dtrace -n 'profile-997 {@a[ustack()]=count()}') immediately revealed the culprit:

              libc.so.1`clear_lockbyte+0x10
              libc.so.1`mutex_unlock+0x16a
              mysqld`mutex_exit+0x1d
              mysqld`buf_page_optimistic_get_func+0xa0

So, plenty of CPU time was spent when trying to unlock mutex (what seemed strange), but didn’t seem that strange once I noticed the code:

do {
	old = *lockword64;
	new = old & ~LOCKMASK64;
} while (atomic_cas_64(lockword64, old, new) != old);

So, there’s unaccounted busy loop (it is just part of hold event in dtrace). What is odd, is that nobody expects this place to loop too much, what happens here – it gives away mutex to other thread, which wants it. So, instead of having the new owner spin-lock (where it accounts properly), it has old owner spin-locking. I’m not convinced this kind of behavior is one that should scale on large machines, but I’m not much of a locking expert.

Without proper instrumentation plockstat failed to provide information about locks that were consuming half of CPU time. I hope that really was just an edge case – more real testing will follow soon, will see if plockstat will fail as much. Oh well, will find the information I need anyway :) Lesson learned – treat pretty much everything with grain of salt, especially when OS tells you mysql has no lock contention, haha.

Charsets mutex meets InnoDB

When InnoDB compares data (like, when looking up in indexes), it actually asks help from MySQL – character sets may provide different rules for evaluation. MySQL then looks up character set information. This is where the fun begins – if character set used for comparison is default server character set, or latin1 – there are shortcuts. If not (say, some smart developer decided to use Unicode without forcing DBA to set default server charset, as it isn’t needed) – a very nice internal routine is called, which at the very beginning does exactly this:

 /*
    To make things thread safe we are not
    allowing other threads to interfere
    while we may changing the cs_info_table
  */
  pthread_mutex_lock(&THR_LOCK_charset);

Apparently, ha_innodb.cc has such comment too:

/* Use the charset number to pick the right charset struct for
the comparison. Since the MySQL function get_charset may be
slow before Bar removes the mutex operation there, we first
look at 2 common charsets directly. */

if (charset_number == default_charset_info->number) {
    charset = default_charset_info;
} else if (charset_number == my_charset_latin1.number) {
    charset = &my_charset_latin1;
} else {
    charset = get_charset(charset_number, MYF(MY_WME));
    [...]

I’ll avoid going into discussions why such global lock at every row operation is harmful, but in case anyone is hitting lots of mutex contention there – just set default server character set to what your databases are in (or use binary or latin1, or add a clause for utf8 up here, or remove mutex, nobody is changing your character sets anyway ;-)

Update: Some of the discussion is at bug#42649

Percona performance conference

Heee, Baron announced “Percona Performance Conference”.

How do I feel when somebody schedules that on top of MySQL Conference? Bad. Seriously, this was totally uncool.

I sure understand that Percona folks have to give same talk over and over again (of course, there’re few new things every year), and need venue for that, but… it is incredible work and preparation to come up with new topics too, and that involves lots of work and research. I may sound harsh, but I really don’t feel well, when people we should work together, instead end up blackmailing.

Update: apparently I was seriously misguided back then, Percona seems to have been shunned out of MySQL Conference by organizers and this was their way to get back into the community.

mydumper

Last weekend I ended up working on small pet project – and today I’m kind of releasing it.

So, I had that idea that there’s no good tool to do logical dump of MySQL data for large sites – mysqldump doesn’t provide too much of I/O pressure, mk-parallel-dump is closer, but it doesn’t do consistent snapshots, uses same mysqldump, as well as is written in Perl (haha!), and… I just wanted something new to hack, as proof of concept. For a while to use it one had to edit constants in code, but my colleague Mark contributed options support and it doesn’t need recompiles anymore to run it :)

So, let me introduce mydumper. It doesn’t dump table definitions, all it does is extracting data and writing it to files, fast. Continue reading “mydumper”

best form of backup

Apparently best form of backup is entirely unconfigured slaves (especially distribution packages). After some destructive cron job kicked in, and executed large large data deletion on master, once slave started executing it, ran out of buffer pool for locks and such, and started crashing. Result: it had all data just up to the destructive action.

Fire it up with –skip-slave-start, crash recovery on default transaction logs is immediate, all data is there – what better backup would one expect?*

* proper snapshots with proper logs

	markcallaghan (@mark… on MySQL does not need SQL
	markcallaghan (@mark… on MySQL does not need SQL
	Domas Mituzas on MySQL does not need SQL
	Marc on MySQL does not need SQL
	Nils Meyer on linux memory management for…