
Thread: [SOLVED] Zimbra failover due to zmmtaconfig not running?

  1. #1
    kilbasar is offline Member
    Join Date
    May 2010
    Posts
    12
    Rep Power
    4

    Default [SOLVED] Zimbra failover due to zmmtaconfig not running?

    Yesterday, my Zimbra 5.x NE (running on a pair of CentOS 4.x servers with RHCS) failed over. The error in /var/log/messages is the usual one, saying that zmcluctl returned 1:

    Code:
    May 23 04:02:16 wsl-mx1 clurgmgrd: [5374]: <err> script:zimbra: status of /opt/zimbra-cluster/bin/zmcluctl failed (returned 1) 
    May 23 04:02:16 wsl-mx1 clurgmgrd[5374]: <notice> status on script "zimbra" returned 1 (generic error) 
    May 23 04:02:16 wsl-mx1 clurgmgrd[5374]: <notice> Stopping service mx.mydomain.com 
    May 23 04:03:02 wsl-mx1 clurgmgrd[5374]: <notice> Service mx.mydomain.com is recovering 
    May 23 04:03:02 wsl-mx1 clurgmgrd[5374]: <notice> Recovering failed service mx.mydomain.com
    When I check out zimbra.log, I see that this was possibly due to zmmtaconfig and zmmtaconfigctl not running:

    Code:
    May 23 04:02:14 wsl-mx1 zmmailboxdmgr[3765]: status requested
    May 23 04:02:14 wsl-mx1 zmmailboxdmgr[3765]: status OK
    May 23 04:02:14 wsl-mx1 zimbramon[3802]: 3802:info: zmmtaconfig: zmmtaconfig started on mx.mydomain.com with loglevel=3 pid=3802 
    May 23 04:02:16 wsl-mx1 zimbra-cluster[3255]: status - rc=1 from zmcontrol: output=[Host mx.mydomain.com <EOL>, 	antispam                Running <EOL>, 	antivirus               Running <EOL>, 	imapproxy               Running <EOL>, 	ldap                    Running <EOL>, 	logger                  Running <EOL>, 	mailbox                 Stopped <EOL>, 		zmmtaconfig is not running. <EOL>, 	zmmtaconfigctl is not running <EOL>, 		mailboxd is running. <EOL>, 	mta                     Running <EOL>, 	snmp                    Running <EOL>, 	spell                   Running <EOL>, 	stats                   Running ] 
    May 23 04:02:17 wsl-mx1 zimbra-cluster[3969]: stop -  Zimbra stop initiated via zmcluctl 
    May 23 04:02:17 wsl-mx1 zimbramon[4003]: 4003:info: Stopping services initiated by zmcontrol
    First off, I don't understand why zmmtaconfig and zmmtaconfigctl were not running, or what other logs I could check to figure it out.

    Second, what's with the "zmmtaconfig: zmmtaconfig started" message that comes in 2 seconds before it shows zmmtaconfig as not running? What is starting this, and why? Maybe this is some kind of automatic restart, and the zmcluctl script ran at exactly the wrong time, so it thought zmmtaconfig was not running? If so, what would have triggered this restart? Could it be a log rotation or something?

    Thanks a lot!

  2. #2
    brian is offline Project Contributor
    Join Date
    Jul 2006
    Posts
    623
    Rep Power
    9

    Default

    Correct, this is log rotation. Which version of zcs are you running?

  3. #3
    veronica is offline Outstanding Member
    Join Date
    Jun 2008
    Posts
    594
    Rep Power
    7

    Default

    Seems you are hitting Bug 36042 (Log rotation causes cluster failover). You need to upgrade.

  4. #4
    kilbasar is offline Member
    Join Date
    May 2010
    Posts
    12
    Rep Power
    4

    Default

    Thanks guys! As per my signature, I'm running 5.0.20_GA_3128.RHEL4_20091102090733. From the sounds of that bug report, this issue has NOT yet been fixed in the latest version (according to Mike Cathey), so upgrading might not help.

    Any idea how long zmmtaconfig is down during these log rotations? If it's not long, I think the easiest solution might be to just make a wrapper script, something like:

    Code:
    #!/bin/bash
    /opt/zimbra-cluster/bin/zmcluctl
    [ "$?" -eq "0" ] && exit 0
    echo "Failed, trying again in 30 seconds..."
    sleep 30
    /opt/zimbra-cluster/bin/zmcluctl
    exit $?
    This would run zmcluctl, and if it exits with a 1, it will sleep 30 seconds and then try it a second time, only returning 1 if both attempts fail. I feel like this would cut down on a lot of false positives. The only downside would be waiting an extra 30 seconds before failing over, but I can deal with that.

  5. #5
    veronica is offline Outstanding Member
    Join Date
    Jun 2008
    Posts
    594
    Rep Power
    7

    Default

    This probably won't work as written, since /opt/zimbra-cluster/bin/zmcluctl requires an argument: start, stop, or status. You might want to tweak its status subroutine instead.

  6. #6
    kilbasar is offline Member
    Join Date
    May 2010
    Posts
    12
    Rep Power
    4

    Default

    Thanks, yeah, I was basically just coding out loud there. A functional version would be:

    Code:
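    #!/bin/bash
    # Retry wrapper: report failure only if two consecutive status checks fail.
    /opt/zimbra-cluster/bin/zmcluctl status
    [ "$?" -eq "0" ] && exit 0
    # First check failed; wait out a possible log-rotation restart, then retry.
    echo "Failed, trying again in 30 seconds..."
    sleep 30
    /opt/zimbra-cluster/bin/zmcluctl status
    exit $?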
    I prefer making a wrapper script over modifying the Zimbra script, as the Zimbra script will get overwritten if I upgrade to a newer version. It also allows me to run the original script unedited if I so desire.
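If the wrapper itself is what gets configured as the cluster's script resource, it would also be called with start and stop, not just status. Here is a rough sketch of that variant, assuming RHCS passes the action as the first argument. The ZMCLUCTL and RETRY_DELAY variables and both function names are made up for this example; they are not Zimbra or RHCS settings.

```bash
#!/bin/bash
# Hypothetical retry wrapper around zmcluctl (a sketch, not a Zimbra-supplied script).
# ZMCLUCTL and RETRY_DELAY are illustrative overrides, handy for testing.
ZMCLUCTL="${ZMCLUCTL:-/opt/zimbra-cluster/bin/zmcluctl}"
RETRY_DELAY="${RETRY_DELAY:-30}"

# Run the status check; on failure, wait and check once more.
# Returns non-zero only if both attempts fail.
check_status_with_retry() {
    "$ZMCLUCTL" status && return 0
    echo "status failed, trying again in ${RETRY_DELAY} seconds..." >&2
    sleep "$RETRY_DELAY"
    "$ZMCLUCTL" status
}

# Retry only for "status"; pass every other action through unchanged.
zmcluctl_wrapper() {
    case "$1" in
        status) check_status_with_retry ;;
        *)      "$ZMCLUCTL" "$@" ;;
    esac
}

# Dispatch only when an action was given on the command line.
if [ "$#" -gt 0 ]; then
    zmcluctl_wrapper "$@"
    exit $?
fi
```

Only the status path is retried, so start and stop behave exactly as the original script does, and a real failure still triggers a failover after one extra delay.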

    Moral of the story: this is a bug in 5.x that apparently hasn't been fixed yet. Marking this thread as solved. Thanks everyone!
