Yesterday, my Zimbra 5.x NE installation (running on a pair of CentOS 4.x servers with RHCS) failed over. The error in /var/log/messages is the usual one, reporting that the zmcluctl status check returned 1:
Code:
May 23 04:02:16 wsl-mx1 clurgmgrd: [5374]: <err> script:zimbra: status of /opt/zimbra-cluster/bin/zmcluctl failed (returned 1)
May 23 04:02:16 wsl-mx1 clurgmgrd[5374]: <notice> status on script "zimbra" returned 1 (generic error)
May 23 04:02:16 wsl-mx1 clurgmgrd[5374]: <notice> Stopping service mx.mydomain.com
May 23 04:03:02 wsl-mx1 clurgmgrd[5374]: <notice> Service mx.mydomain.com is recovering
May 23 04:03:02 wsl-mx1 clurgmgrd[5374]: <notice> Recovering failed service mx.mydomain.com
When I check zimbra.log, it looks like this may have happened because zmmtaconfig and zmmtaconfigctl were reported as not running:
Code:
May 23 04:02:14 wsl-mx1 zmmailboxdmgr[3765]: status requested
May 23 04:02:14 wsl-mx1 zmmailboxdmgr[3765]: status OK
May 23 04:02:14 wsl-mx1 zimbramon[3802]: 3802:info: zmmtaconfig: zmmtaconfig started on mx.mydomain.com with loglevel=3 pid=3802
May 23 04:02:16 wsl-mx1 zimbra-cluster[3255]: status - rc=1 from zmcontrol: output=[Host mx.mydomain.com <EOL>, antispam Running <EOL>, antivirus Running <EOL>, imapproxy Running <EOL>, ldap Running <EOL>, logger Running <EOL>, mailbox Stopped <EOL>, zmmtaconfig is not running. <EOL>, zmmtaconfigctl is not running <EOL>, mailboxd is running. <EOL>, mta Running <EOL>, snmp Running <EOL>, spell Running <EOL>, stats Running ]
May 23 04:02:17 wsl-mx1 zimbra-cluster[3969]: stop - Zimbra stop initiated via zmcluctl
May 23 04:02:17 wsl-mx1 zimbramon[4003]: 4003:info: Stopping services initiated by zmcontrol
First off, I don't understand why zmmtaconfig and zmmtaconfigctl were not running, or which other logs I could check to find out.
Second, what's with the "zmmtaconfig: zmmtaconfig started" message that is logged 2 seconds before the status check reports zmmtaconfig as not running? What is starting it, and why? Maybe this is some kind of automatic restart, and the zmcluctl status check ran at exactly the wrong moment and concluded that zmmtaconfig was not running. If so, what would have triggered the restart? Could it be a log rotation or something similar?
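To check whether this timing coincidence has happened before, a rough Python sketch like the one below could scan zimbra.log for failed cluster status checks that were logged shortly after a "zmmtaconfig started" message. This is not a Zimbra tool: the function names, the regular expressions (modelled only on the log lines quoted above), and the 5-second window are my own assumptions.

```python
import re
import sys
from datetime import datetime, timedelta

# Patterns are modelled on the log excerpts in this post (an assumption,
# not an official description of the zimbra.log format).
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+(.*)$")
STARTED_RE = re.compile(r"zmmtaconfig: zmmtaconfig started")
FAILED_RE = re.compile(r"zimbra-cluster\[\d+\]: status - rc=[1-9]\d*")


def parse_time(stamp):
    """Parse a syslog timestamp; syslog omits the year, so a fixed one is assumed."""
    normalized = " ".join(stamp.split())  # collapse the padding in e.g. "May  3"
    return datetime.strptime("2000 " + normalized, "%Y %b %d %H:%M:%S")


def find_races(lines, window_seconds=5):
    """Return (start_time, failure_time) pairs where a non-zero cluster status
    check was logged within window_seconds after a zmmtaconfig start."""
    starts, failures = [], []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        when = parse_time(match.group(1))
        rest = match.group(2)
        if STARTED_RE.search(rest):
            starts.append(when)
        elif FAILED_RE.search(rest):
            failures.append(when)
    window = timedelta(seconds=window_seconds)
    return [
        (start, failure)
        for failure in failures
        for start in starts
        if timedelta(0) <= failure - start <= window
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_races.py <path to zimbra.log>
    with open(sys.argv[1], errors="replace") as log_file:
        for start, failure in find_races(log_file):
            gap = int((failure - start).total_seconds())
            print(f"{failure:%b %d %H:%M:%S}: status failed {gap}s after zmmtaconfig start")
```

If every past failover lines up with a zmmtaconfig start, that would support the race theory; if not, the cause is probably elsewhere.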
Thanks a lot!