We have solved some, but not all, of the issues. The database issue was not vserver-specific per se; it had more to do with our fairly strict security profile. The bottom line was that there was not enough /tmp space. Here are the details.
As a security practice, we build skeleton systems and install only the necessary services. We missed several Zimbra dependencies, including perl-DBD-MySQL. Either they are not properly documented (the documentation assuming a fat installation) or we were sloppy. As a result, a huge backlog of unprocessed data built up.
As a security and performance practice, we create /tmp in RAM and, because it is a separate partition, flag it noexec,nosuid,nodev so that /tmp cannot be used as an attack vector for executing malicious code. Because it is a RAM partition, it was limited to 512 MB in this case. The large backlog needed more space than that to process its data.
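For anyone wanting to check the same thing on their own system, here is a minimal sketch. The fstab line in the comment is an assumed, typical form of a RAM-backed /tmp with these flags, not a copy of our configuration, and tmp_avail_kb is a hypothetical helper name introduced only for this example.

```shell
# fstab entry for a RAM-backed /tmp (config, shown as a comment; assumed form):
#   none  /tmp  tmpfs  size=512m,mode=1777,noexec,nosuid,nodev  0 0

# tmp_avail_kb is a hypothetical helper: prints the free space, in KB,
# of the filesystem holding the given path (default /tmp).
tmp_avail_kb() {
    # -P forces one POSIX-format line per filesystem; -k reports 1024-byte blocks.
    df -Pk "${1:-/tmp}" | awk 'NR == 2 { print $4 }'
}

echo "free in /tmp: $(tmp_avail_kb /tmp) KB"
```

If the number reported is small compared with the size of the backlog, the processing jobs are likely to run out of room the way ours did.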
We moved /tmp temporarily to disk and processed the backlog using zmlogprocess and zmgengraphs. Now we have stats. With the backlog cleared, we are back on the RAM-based /tmp and, as far as I can tell, running smoothly.
We did have one vserver-specific issue. zmstat-vm runs vmstat, which must have access to /proc/vmstat. This is not enabled by default in vserver. We thus had to create an /etc/vservers/.defaults/apps/vprocunhide/files file with a line "/proc/vmstat" in it. Unfortunately, this is an all-or-nothing approach, so all the vservers on the same host now have access to /proc/vmstat (there are ways around this). Moreover, the statistics are not virtualized, i.e., they reflect the data for the entire host system, including all guest processes. But at least it works.
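The change above amounts to something like the following on the host. This is only a sketch assuming a standard util-vserver layout: the CONF variable and the add_unhide_entry function are names made up for this example, and how the new entry gets applied afterwards (re-running vprocunhide or restarting its init script) may differ by distribution.

```shell
# Meant to be run as root on the vserver host, not inside a guest.
# CONF is a hypothetical variable; its default is the path of the vprocunhide list.
CONF="${CONF:-/etc/vservers/.defaults/apps/vprocunhide/files}"

# add_unhide_entry (hypothetical helper): append a /proc path to the list
# unless an identical line is already there.
add_unhide_entry() {
    mkdir -p "$(dirname "$CONF")"
    touch "$CONF"
    # -x matches whole lines only; -F treats the path as a literal string.
    grep -qxF "$1" "$CONF" || printf '%s\n' "$1" >> "$CONF"
}
```

Calling `add_unhide_entry /proc/vmstat` then adds the line once, and repeating the call leaves the file unchanged.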
We still do not know why we are getting the zmstat-proc and zmstat-io errors. Any thoughts? Thanks - John