on wikipedia and mariadb

There’s some media coverage about Wikipedia switching to MariaDB. I just wanted to point out that the performance figures cited are somewhat incorrect and don’t attribute the gains to the correct authors.

A proper performance evaluation should include not just MariaDB 5.5 but Oracle’s MySQL 5.5 too, because that’s where most of the performance development happened (multiple buffer pools, rollback segments, change buffering et al.).

5.5 is faster for some workloads, while 5.1-fb can outperform 5.5 in others (ones with lots of IO). It is good to know that there’s a beneficial impact from upgrading (though I’d wait for 5.6), but it is important to state that it is an effort from Oracle as well, not just from MariaDB developers.

P.S. As far as I understand, the decision to switch is political, and with the 5.6 momentum right now it may not be the best one. 5.6 is going to rock :-)

20 Responses to on wikipedia and mariadb

  1. mdcallag says:

    It is too soon for quotes like this one, taken from one of the articles you linked. It is very likely that the MariaDB Foundation will be awesome. But all we have now is a few press releases and not many details.

    “More so, I think it’s in WMF’s and the open source communities interest to coalesce around the MariaDB Foundation as the best route to ensuring a truly open and well supported future for mysql derived database technology.”

    • Henrik Ingo says:

      Personally I think Wikipedia is doing the right thing (given their political motives). Those in the MySQL community that would like to see a strong and community driven MariaDB Foundation should just take the ball and run with it. If the optimism later turns out to be unfounded, you can always go home and migrate back to another fork later.

      (Domas’s point about attribution and MySQL 5.5 and 5.6 is still valid though.)

      • mdcallag says:

        I predict problems when technology decisions are made for political reasons. Suggesting that MariaDB was selected for political reasons either lets them get away without providing sufficient value or undermines their effort. Because if MariaDB is that good, then claim that it was chosen for being better.

        • Henrik Ingo says:

          You are of course right, but then again, historically the adoption of open source software has to a large degree been driven by such, shall we call them, “political” decisions. For example, for a long time in the 90s and early 2000s you could credibly claim that Oracle was the better database or Windows a better desktop; still, people chose open source alternatives because they believed it was a better long-term strategic decision, with better long-term potential. (Or they just chose it because it was free of cost, what do I know…)

          Now that everything is open source anyway, I see that this traditional thinking lives on in the desire to choose the “more open” alternative. For example preferring certain licenses over others, or preferring non-profit controlled community projects over single-vendor projects.

          Still, just because you move your domain name to a non-profit foundation doesn’t mean you will automatically become technically superior (even long term), nor does it even automatically guarantee that a project is truly community governed. So the MariaDB Foundation certainly has a lot of expectations to live up to here.

  2. Owen says:

    Owen at Wikia here… we also run MediaWiki on top of MySQL 5.1+percona, and I am recommending that we hold off until 5.6 goes GA, but we’re looking forward to it!

  3. Twirrim says:

    Don’t forget, there is also a heap of changes in MariaDB itself that may be contributing to the performance gains (e.g. the completely overhauled query optimizer). Anyone taking anything from this and directly attributing all the performance improvements to one or the other is naive and possibly crazy. Apples to Pears is a pointless comparison :D

    Ultimately, though, the final paragraph is most important.
    “The main goal of migrating to MariaDB is not performance driven. More so, I think it’s in WMF’s and the open source communities interest to coalesce around the MariaDB Foundation as the best route to ensuring a truly open and well supported future for mysql derived database technology. Performance gains along the way are icing on the cake.”

    • I used to know the query patterns well enough to say that the optimizer doesn’t mean much in these cases, but I haven’t looked at it lately. Still, my educated guess is that the optimizer doesn’t mean much there.

  4. James Day says:

    It’ll be really interesting to see whether the MariaDB Foundation continues the current MariaDB practice of not contributing changes to the upstream core distro. As most will know, MySQL has used a dual license approach from the days at MySQL AB. But MariaDB doesn’t routinely provide any dual license compatible licensing that lets the upstream use, or even safely look at, their changes.

    It’s amusing in a sad way to see them making the claims they are making given their history of minimal upstream contributions when all but a few specific items done to the Server by Oracle are passed on to all in the regular MySQL Community releases, where they are picked up and used by MariaDB and others.

    The Wikimedia Foundation change pretty much has to be solely political, because the technical features listed are in 5.6, which is just about to become GA, and the comparison is of a 5.5-based version of MariaDB with a 5.1 version. 5.6 has something that’s very useful for Wikimedia-type servers: multi-threaded replication, which helps slaves keep up when different wikis on the same server use different databases.

    As someone who used to have a heavy involvement with the Wikimedia Foundation database servers, I’d hope that the Foundation will be comparing with the Oracle 5.6 version and also insisting on upstream contributions for changes to the core server, so the whole community can benefit from changes made, regardless of their license preferences and needs, not just one part of the community.

    James Day, as an individual with heavy past Wikimedia Foundation database server involvement.

    • Henrik Ingo says:

      James,

      I was kind of done commenting on the foundation thing, but I simply have to address the first part of your comment here:

      “current MariaDB practice of not contributing changes to the upstream core distro. As most will know, MySQL has used a dual license approach from the days at MySQL AB. But MariaDB doesn’t routinely provide any dual license compatible licensing that lets the upstream use, or even safely look at, their changes. It’s amusing in a sad way to see them making the claims they are making given their history of minimal upstream contributions when all but a few specific items done to the Server by Oracle are passed on to all in the regular MySQL Community releases, where they are picked up and used by MariaDB and others.”

      What you say here seems to be a quite common opinion that I hear from time to time expressed by Oracle engineers, so I want to comment on it here, not just as a reply to you specifically.

      Fact is: Oracle releases MySQL Community Server as GPL, and the MariaDB team uses (most of) that code and then releases their own work also as GPL. Why is it ok for Oracle to use GPL but unfair when MariaDB does it? (And FWIW, Percona does the same too.)

      Ok, so Oracle cannot use GPL code from others because they want to own everything to do a dual licensing business and also to develop closed source features. This is Oracle’s choice, and I don’t see why you would try to blame MariaDB for this. Better yet: I know for a fact that Monty was open to, and did offer, both Sun and Oracle the chance to acquire the MariaDB code under some commercial arrangement. Neither Sun nor Oracle wanted to do that either, which again is a decision they are free to make.

      MariaDB represents tens of man-years of work by some very good engineers. Financially it is maybe 4-5 million Euros of investment (I don’t have facts on this, just multiplying by headcount). The suggestion that Monty is under some obligation to just donate all of that IPR to Oracle is just naive. What’s worse, when you and others make such comments, it tends to be rather upsetting for some of the MariaDB engineers who are doing some great work.

      Finally, I think you are exaggerating when you say Oracle engineers can’t even look at MariaDB code. Maybe that’s what a lawyer will tell you, but we all know the MySQL multi-threaded replication is based on Kristian Nielsen’s brilliant design, which turned out to be superior to the original Oracle design. It would be nice if Oracle employees could give credit to others when it is deserved, just like we praise MySQL when it is deserved.

      • Henrik, where does ‘we all know’ come from? [citation needed]
        What was original design and how it changed? :)

        • Henrik Ingo says:

          Sorry, I’m confused here. Multi-threaded replication is Oracle’s own design. The group commit fix is based on Kristian’s design. Now, maybe you then agree that for *group commit* this sequence of events is well known, but let me just copy the links here just in case:

          Mats Kindahl’s own design was described here:
          http://mysqlmusings.blogspot.fi/2010/04/binary-log-group-commit-implementation.html
          http://mysqlmusings.blogspot.fi/2011/07/binlog-group-commit-experiments.html
          This is basically based on the approach that each transaction reserves some memory from the binlog, to which it can then freely write on its own time, and when all are done, all of the data is committed / synced to disk.

          Kristian’s design was explained in this series of posts:
          http://kristiannielsen.livejournal.com/12810.html
          The approach here is basically to make sure that the commit order into binlog is the same as into InnoDB, which inherently makes group commit work and also has other benefits.

          The implementation that went into MySQL 5.6 is described here:
          http://mysqlmusings.blogspot.fi/2012/06/binary-log-group-commit-in-mysql-56.html
          It is based on maintaining commit order and the implementation from 2011 has been completely discarded. While it is written to sound different (using terms like “stage leader”) it also copies from Kristian the idea that a single thread calls the commit phase for all threads in the group, rather than waking up each thread one by one. This was a major performance improvement.
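
          To make the shared idea concrete, here is a minimal Python sketch of leader-based group commit. It is not code from MySQL 5.6, MariaDB or the Facebook patch; the class and attribute names are hypothetical, plain lists stand in for the binlog and the storage engine, and a counter stands in for fsync. It only illustrates the two points above: the commit order is the same in both places, and one thread does the commit work for the whole group.

```python
import threading


class GroupCommitter:
    """Toy leader-based group commit (illustrative only, names are hypothetical)."""

    def __init__(self):
        self._queue_lock = threading.Lock()   # protects _queue
        self._flush_lock = threading.Lock()   # only one group is flushed at a time
        self._queue = []                      # transactions waiting for a leader
        self.binlog = []                      # stand-in for the binary log
        self.engine = []                      # stand-in for the engine's commit order
        self.sync_count = 0                   # stand-in for the number of fsync calls

    def commit(self, txn):
        done = threading.Event()
        with self._queue_lock:
            # The thread that finds the queue empty becomes the group leader.
            is_leader = not self._queue
            self._queue.append((txn, done))

        if not is_leader:
            # Followers sleep until the leader has committed on their behalf.
            done.wait()
            return

        with self._flush_lock:
            # Take everything queued so far; later arrivals form the next group.
            with self._queue_lock:
                group, self._queue = self._queue, []
            # Write the whole group to the "binlog" in queue order.
            for queued_txn, _ in group:
                self.binlog.append(queued_txn)
            # One simulated sync covers every transaction in the group.
            self.sync_count += 1
            # The leader commits each transaction to the "engine" in the same order.
            for queued_txn, _ in group:
                self.engine.append(queued_txn)

        # Wake the followers once the whole group is durable and committed.
        for _, event in group:
            event.set()
```

          Calling `commit()` from many threads should leave `binlog` and `engine` in identical order, with `sync_count` at most equal to the number of transactions; how much lower it gets depends on how many threads happen to queue up behind a leader.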

          Note that I never much familiarized myself with Mark Callaghan’s solution to the same problem, which predates both of these, but I understand all three are different.

          • mdcallag says:

            Facebook did it first — both the implementation and running it in production.

            MariaDB (Kristian Nielsen) did it best. Whether or not MySQL was aware of his design, it is similar, so perhaps MySQL did it best too, only after Kristian.

            I think I played a big part in getting the community to focus on this problem. I am happy about my role. And I will be even happier when we can remove the group commit code from the FB patch and use the upstream versions.

      • Henrik Ingo says:

        Btw, I just realized I forgot to mention that MariaDB does contribute security fixes to Oracle (and Percona) as a matter of policy. An example was the recent “login as any user without password” vulnerability, which they also found themselves and disclosed with a “responsible disclosure” approach.

  5. Any similarity to Kristian’s design is purely coincidental. Not being an expert in Kristian’s design, I am sure there are similarities, as two solutions to the same problem usually have, but I am also sure there are quite a few differences, as the chosen design has a few unorthodox solutions.

    Anyone interested in the real truth behind how the code was developed can read my blog about it:
    http://mikaelronstrom.blogspot.se/2012/10/scalability-improvements-in-mysql-56.html

    I particularly liked our idea of splitting it in phases such that it would perform well independent of where the bottleneck resides (could reside in fsync part or in commit part or in write part).

    Mikael, as an individual who took an active part in developing the MySQL 5.6 group commit solution.

    • Thanks, Mikael, for the explanation. I prefer this kind of story to “we all know” stories :-) Sorry, Henrik!

    • Henrik Ingo says:

      Mikael,

      I will admit that the explanation in your blog lists some unique problems with unique solutions. Mats’s blog reads like a copy of Kristian’s design, and people who have reviewed the code in MySQL 5.6 also say that it is similar to Kristian’s design, one of them in this comment thread. (Thanks for pointing out your part in the group commit work; I had not previously read your blog as it was published too close to my traveling to MySQL Connect.)

      But most of all, it is simply not possible to claim that any similarities are coincidental. Mats is obviously aware of Kristian’s design, as he links to it and has compared his own with it. Kristian even points out flaws in Mats’s original design in the comments to that blog post. (Such dialogue is of course only positive. I’m just saying, you are not ignorant of Kristian’s work.) Humans are not very good at erasing their memory, so if your design resembles some previous work that you are fully aware of, then it is not a coincidence, even if the implementation is original code.

  6. Henrik,
    You obviously know more about me than myself, so who am I to argue :)

    • Henrik Ingo says:

      Mikael,

      I was speaking about the collaboration between Mats and Kristian. I tried to reread what is written here, and I’m quite puzzled about what you are referring to, but if I have in any way offended you then I’d like to apologize and wish you a peaceful Holiday (Christmas?).

  7. No offence taken, hence the :) Merry Christmas and a Happy New Year. I would also like to thank Mark and Domas for driving many new requirements on the MySQL Server, and I hope this continues into the next year.

    • Henrik Ingo says:

      +1 !!

      The FB team has been amazing in driving MySQL development, and your team in particular has been amazing in delivering it. I’m also glad to see in recent years the beginnings of a few other “end-user employed” MySQL engineering teams, such as Twitter and Taobao. I wish all of these will grow and do more amazing things in 2013.
