Five minutes of MediaWiki performance tuning

MediaWiki is quite complex package, and some even trivial features are not the ones that should be enabled on sites having more load. Though it is quite modular, still lots of code has to be executed, and some of it requires additional steps to be done. Continue reading “Five minutes of MediaWiki performance tuning”

db abstraction: zend framework vs mediawiki

So now as Zend have released their framework I couldn’t resist to look at it’s database classes (there’s much more other stuff in it) and compare with our own bits in mediawiki.

Thumbs up for (what is a must in abstraction layer) plain query building functions. Those are easy to understand and seem to work :)

One of issues with SQL programming is that people queries like text instead of ‘structured queries’. This is where Zend’s $db->quoteInto() is overused – even where there’s no need to people are forced to embed SQL bits, instead of thinking just about data flow and relations.

In order to add additional criteria in MW, you’d do:


Zend Framework’s query builder offers slightly different approach. There’s Select object, which still eats text as parameters:

$select->where('field = ?',$value);

What raised my eyebrows – $select->from('foo', '*');. First of all, SELECT * is evil. On the other hand, it is interesting design decision to put it together with ‘FROM’. How would one issue ‘SELECT 1’ query this way? :)

The $db->insert() calls appeared to be identical in both libraries, you just pass an array of key=>value pairs and a table to put all that into. The $db->update though had same arguments, but different evaluations of those. For Zend you have to pass just a plaintext query in form of $db->quoteInto(‘field=’,$value). MW though allows to pass arrays of conditions:

$where = array ( 'group' => $groupid, 'parent' => array ( 1,2,3,4));

Of course you may pass a plain string to MW’s DB abstraction, type detection helps a lot there :)

Same differences applies to delete method. People used to tell, that you don’t need any query building once you’ve got placeholder replacement in SQL queries. Usually what is missed is that abstractions are not just for portability or security, but also for easier way to do things. If plain text is to be replaced into array passing, doing it completely would help more (I’m not advocating to split strings into arrays of chars though ;-) It still looks better when one appends an item to an array than string concatenation.

‘SELECT’ builder has much more differences. We packed handling of ORDER BY, GROUP BY, HAVING, LIMIT, etc into $options, while Zend provides separate methods for that. I immediately missed ‘SELECT … FOR UPDATE’ in documentation – this is one of most important options in proper web applications.

The OFFSET/LIMIT support (and even recommendation for paging) requires serious attitude readjustment. Scanning result sets _is_ expensive and numeric offsets shouldn’t be shown as a ‘nice way out’.

Lack of outer JOINs (as well as lack of workarounds in documentation) was surprising too.

Of course, providing table as an object was a nice thing to do, especially with possibility to add additional logic to a table object:

class RoundTable extends Zend_Db_Table {
    public function insert($data) {
        if (empty($data['created_on'])) { $data['created_on'] = time(); }
        return parent::insert($data); }

It is nice to see such library pushed – as it is in it’s 0.1.1 state, there’s lots of room for improvement in future :)

There’re other libraries out there, but those are undocumented, unpackaged and not that well maintained. And of course, there’s Ruby On Rails, which was poorly mimicked in Zend’s Active Record implementation. I don’t like ORM myself, but with hype around it, Zend could try to make their ORM better (or rather, existing, as there’re no relations in their objects). Even Django looks much better in this field. So now I’m just waiting for 1.0 ;-)

mediawiki graphic profile

So I used amazing tool named KCacheGrind to parse XDebug output of MediaWiki. Of course, I immediately started refactoring some places in code. It is a random web hit, I don’t know even what it is, but it looks nasty and waiting for a gentle touch. Kill remote interwiki caches! :)

Here is what I got (triple click to zoom):

Here is what I used:


%d bloggers like this: