Just spotted an amazing article on how Citizendium built better infrastructure than Wikipedia’s. There are lots of fascinating details there, like…
They went with PostgreSQL for a number of reasons, including better scalability. PostgreSQL is an MVCC database. Unlike Wikipedia, Citizendium never has to lock the database for reads and writes. MySQL can do a lot of things quick and replicate them to slave servers, but PostgreSQL excels at complex functions and full features like JOINs and can do complicated categories and full text searches faster than Wikipedia.
If PG can function without locks, it must definitely be more scalable. InnoDB uses mutexes, spinlocks, etc., and that internal locking can be a bottleneck in many cases. Additionally, if a row is updated, a lock on the record is acquired. It is still a question how PG maintains ACID without any locks; I’ve got to research that more.
I’m aware that MySQL isn’t the best full-text search engine out there, but Wikipedia uses Lucene for full-text search, so it is somewhat strange to hear that the Citizendium platform is faster in that regard. And… I’m not sure where JOIN performance is really faster there, especially when we do lots of covering-index-based joins. Probably the key word there is ‘complex’, though I’m not sure what that means :-)
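For anyone wondering what a covering-index join looks like, here is a minimal sketch. It is an illustration only: it uses SQLite from Python's standard library rather than MySQL so that it runs anywhere, and the `page` and `revision` tables, their columns, and the index names are made up for the example (they are not MediaWiki's real schema). The idea is that both sides of the join can be answered from index entries alone, without touching the table rows.

```python
import sqlite3

# Hypothetical, heavily simplified schema; NOT MediaWiki's real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, title TEXT, body TEXT);
    CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, page_id INTEGER, comment TEXT);
    -- SQLite secondary indexes also carry the rowid (here the integer primary key),
    -- so these two indexes contain every column the query below touches.
    CREATE INDEX page_title ON page (title);
    CREATE INDEX revision_page ON revision (page_id);
""")

query = """
    SELECT p.page_id, r.rev_id
    FROM page AS p
    JOIN revision AS r ON r.page_id = p.page_id
    WHERE p.title = ?
"""

# The last column of each EXPLAIN QUERY PLAN row is the human-readable step.
plan_details = [
    row[-1]
    for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("Main_Page",))
]
for detail in plan_details:
    print(detail)
```

Plan steps that mention a covering index are the ones served from the index without reading the underlying rows. InnoDB secondary indexes likewise carry the primary key, so the same trick applies there, where `EXPLAIN` reports it as `Using index`.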
The first reason not to use MySQL was:
First, to be different from Wikipedia.
Indeed, I always support critical thinking! Though this one:
Finally, we felt from reading various mailing lists over mediawiki development that mediawiki was hitting the ceiling of the features MySQL can provide as a backend.
IIRC that came from a single post on a single mailing list from someone who is not running the Wikipedia backend. Mhm.
Of course, their monthly traffic is equal to a single minute of ours, so some views might differ…