I wonder whether the severity of our crashes is compounded by the fact that we have oodles of memory right now and very little activity. Perhaps we are being bitten by either this (from Performance Tuning Guidelines for Large Deployments - Zimbra :: Wiki):
"Innodb writes out pages in its cache after a certain percent of pages are dirty. The default is 90%. This default setting will minimize the total number of writes, but it would cause a major bottleneck in system performance when 90% is reached and database becomes unresponsive because the disk system is writing out all those changes in one shot. We recommend you set the dirty flush ratio to 10%, which does cause a lot more net total IO, but will avoids spiky write load. "
or this:
"MySQL is configured to store its data in files, and the Linux kernel buffers file IO. The buffering provided by the kernel is not useful to innodb at all because innodb is making its own paging decisions - the kernel gets in the way. Bypass the kernel with"
As I am not eager to keep mangling my system, I am hesitant to test these hypotheses! Can anyone with more experience furnish some guidance?
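For reference, here is what I think the first quote's change would look like. This is an untested sketch on my part: I am assuming the "dirty flush ratio" it mentions is MySQL's innodb_max_dirty_pages_pct variable and that it belongs in the [mysqld] section of my.cnf. The value 10 comes straight from the quote's recommendation. The second quote is cut off before it names its setting, so I have not tried to guess that one.

```ini
[mysqld]
# Assumption: this is the variable behind the quote's "dirty flush ratio".
# The quote gives 90 as the default and recommends 10 to avoid spiky writes.
innodb_max_dirty_pages_pct = 10
```

As far as I understand it, the trade-off is the one the quote describes: more total disk IO in exchange for smaller, steadier flushes.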
I am new to this forum, so I hope it is not bad etiquette to keep a running post like this. I am hoping it might help some other poor slob who faces the same issues in the future.
Thanks - John