<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>domas mituzas &#187; inline</title>
	<atom:link href="http://dom.as/tag/inline/feed/" rel="self" type="application/rss+xml" />
	<link>http://dom.as</link>
	<description></description>
	<lastBuildDate>Thu, 02 Feb 2012 21:29:05 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='dom.as' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/6e344c6e0cd7462eb056f8b98eb2cbcd?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>domas mituzas &#187; inline</title>
		<link>http://dom.as</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://dom.as/osd.xml" title="domas mituzas" />
	<atom:link rel='hub' href='http://dom.as/?pushpress=hub'/>
		<item>
		<title>Crashes, complicated edition</title>
		<link>http://dom.as/2008/08/05/complicated-crashes/</link>
		<comments>http://dom.as/2008/08/05/complicated-crashes/#comments</comments>
		<pubDate>Tue, 05 Aug 2008 11:22:00 +0000</pubDate>
		<dc:creator>Domas Mituzas</dc:creator>
				<category><![CDATA[mysql]]></category>
		<category><![CDATA[wikitech]]></category>
		<category><![CDATA[crash]]></category>
		<category><![CDATA[gcc]]></category>
		<category><![CDATA[inline]]></category>
		<category><![CDATA[innodb]]></category>
		<category><![CDATA[opteron]]></category>

		<guid isPermaLink="false">http://dammit.lt/?p=177</guid>
		<description><![CDATA[Usually our 4.0.40 (aka &#8216;four oh forever&#8217;) build doesn&#8217;t crash, and when it does, it is always a hardware problem, a kernel or filesystem bug, or something else entirely. So we have a very calm life, until crashes start to happen&#8230; As we used &#8230; <a href="http://dom.as/2008/08/05/complicated-crashes/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Usually our <a href='http://svn.wikimedia.org/viewvc/mysql/trunk/server/'>4.0.40</a> (aka &#8216;four oh forever&#8217;) build doesn&#8217;t crash, and when it does, it is always a hardware problem, a kernel or filesystem bug, or something else entirely. So we have a very calm life, until crashes start to happen&#8230;</p>
<p>As we used to run RAID0, a disk failure usually means a system wipe and reinstall once fixed &#8211; so our machines all run relatively new kernels and OS releases (except some boxes which just refuse to die ;-), and we&#8217;re usually well ahead of the whole bunch of conservative RHEL users.</p>
<p>We had one machine which was reporting CPU/northbridge/RAM problems, and every MySQL crash was accompanied by <a href='http://en.wikipedia.org/wiki/Machine_Check_Exception'>MCEs</a>, so after replacing the RAM, the CPU and the motherboard itself, we just sent the machine back for servicing and asked them to do whatever it takes to fix it.</p>
<p>So this machine, with the proud name of &#8216;db1&#8217;, comes back and, after entering service, starts crashing every day. I reduced the InnoDB log file size to make recovery faster, and ran it under &#8216;gdb&#8217;. The stack trace on crash pointed to a bunch of checksumming (aka folding) functions, so the initial assumption was &#8216;here we get memory errors again&#8217;. So for a while I thought that &#8216;db1&#8217; needed some more hardware work, and just left it as is, as we were waiting for a new batch of database hardware to deploy and there was a bit more work around.</p>
<p>We started deploying the new database hardware, and it started crashing every few hours instead of every few days. Here again, a reduced InnoDB transaction log size and an attached gdb allowed me to <a href='http://p.defau.lt/?pMUxpWwiGwwDOA1daO3Tiw'>trap the segfault</a>, and it pointed again to the very same adaptive hash key calculation (folding!).</p>
<p>Unfortunately, it was a non-trivial chain of inlined functions (InnoDB is full of these), so I made a &#8216;-g -fno-inline&#8217; build and keenly waited for a crash to happen, so I could investigate what gets corrupted and where. It did not. Then I looked at our <a href='http://p.defau.lt/?A6y0ZFUttppM_5_rNlmpmQ'>zoo</a>, just to find out we have lots of different builds. On one hand it was a bit messy; on the other hand, it allowed a few conclusions:</p>
<ul>
<li>Only Opterons crashed (though there&#8217;s something like a three-year gap between revisions)</li>
<li>Only Ubuntu 8.04 crashed</li>
<li>Only GCC-4.2 build crashed</li>
</ul>
<p>After thinking a bit, I noted that:</p>
<ul>
<li>We have Opterons that don&#8217;t crash (older gcc builds)</li>
<li>Xeons didn&#8217;t crash.</li>
<li>We have Ubuntu 8.04 machines that don&#8217;t crash (they either are Xeons or run older gcc-4.1 builds)</li>
<li>We have GCC-4.2 builds that run nicely (all on Xeons, all on Ubuntu 8.04).</li>
</ul>
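<p>The elimination above can be sketched in a few lines of Python. This is only an illustration: the three rows are the combinations explicitly mentioned in this post, while the dictionary keys, the value spellings and the helper names (<code>explains</code>, <code>smallest_explanations</code>) are made up for the example and are not part of any real tooling.</p>

```python
from itertools import combinations

# Each row: (configuration, crashed?). Only combinations the post explicitly
# mentions are listed; the key names and value spellings are illustrative.
OBSERVATIONS = [
    ({"cpu": "opteron", "os": "ubuntu-8.04", "gcc": "4.2"}, True),
    ({"cpu": "opteron", "os": "ubuntu-8.04", "gcc": "4.1"}, False),
    ({"cpu": "xeon", "os": "ubuntu-8.04", "gcc": "4.2"}, False),
]


def explains(condition, observations):
    """True if a machine crashed exactly when it matched every factor in condition."""
    for config, crashed in observations:
        matches = all(config[key] == value for key, value in condition)
        if matches != crashed:
            return False
    return True


def smallest_explanations(observations):
    """Return the smallest factor combinations that separate crashing rows from healthy ones."""
    crashing = [config for config, crashed in observations if crashed]
    if not crashing:
        return []
    # Candidate factors are taken from the first crashing configuration.
    candidates = sorted(crashing[0].items())
    for size in range(1, len(candidates) + 1):
        found = [
            combo
            for combo in combinations(candidates, size)
            if explains(combo, observations)
        ]
        if found:
            return found
    return []


if __name__ == "__main__":
    for combo in smallest_explanations(OBSERVATIONS):
        print(" + ".join(f"{key}={value}" for key, value in combo))
```

<p>With these rows no single factor separates crashing from healthy machines, and the smallest combination that does is Opteron together with gcc-4.2. Every row here runs Ubuntu 8.04, so this particular data set can neither blame nor clear the OS.</p>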
<p>The next test was taking gcc-4.1 builds and running them on our new machines. No crash for the next two days.<br />
One new machine did have a gcc-4.2 build and didn&#8217;t crash for a few days of replication-only load, but once it got some parallel load, it crashed within the next few hours.</p>
<p>I tried to chat about it on Freenode&#8217;s #gcc, and all I got was:</p>
<pre>
noshadow&gt;	domas: almost everything that fails when
		optimized (as inlining opens many new
		optimisation possibilities)
noshadow&gt;	i.e: const misuse, relying on undefined
		behaviour, breaking aliasing rules, ...
domas&gt;		interesting though, I hit it just with
		gcc 4.2.3 and opterons only
noshadow&gt;	domas: that makes it more likely that
		it is caused by optimisation unveiling
		programming bugs
</pre>
<p>In the end I know that there&#8217;s a programming bug in ancient code using inlined functions, which causes memory corruption under multithreaded load if compiled with gcc-4.2 and run on an Opteron. As for now it is our fork, so pretty much everyone will point at each other and won&#8217;t try to fix it :)</p>
<p>And me? I can always do:</p>
<pre>env CC=gcc-4.1 CXX=g++-4.1 ./configure ... </pre>
<p>I&#8217;m too lazy to learn how to disassemble and compare the compiled code, especially when every test takes a few hours. I already destroyed my weekend with this :-) I&#8217;m just waiting for people to hit this with stock MySQL &#8211; it would be one of those things we love debugging ;-)</p>
]]></content:encoded>
			<wfw:commentRss>http://dom.as/2008/08/05/complicated-crashes/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c660a6eb3a4005232acb111303bef12c?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif&#38;r=G" medium="image">
			<media:title type="html">domasmituzas</media:title>
		</media:content>
	</item>
	</channel>
</rss>
