Thread: Zimbra Cluster stopping after 5sec - "Clearing stale status file"

  1. #1
    Smiler is offline New Member
    Join Date
    Apr 2009
    Posts
    4
    Rep Power
    6

    Hi,

    I'm trying to get Zimbra to work on a two-node cluster (I followed the instructions on zimbra.com). It starts without problems when started on a single node with "service zimbra start", but it doesn't work with Red Hat Cluster Manager.

    I don't think this is a problem with my cluster.conf (it worked fine with Apache and other services).

    When Zimbra is started by the cluster manager, everything seems to work fine. clustat -l shows Zimbra as "started" for about 5 seconds, and then zimbra.log says:

    Code:
    Apr 24 12:30:50 o1n1 postfix/master[21183]: daemon started -- version 2.4.7, configuration /opt/zimbra/postfix-2.4.7.5z/conf
    Apr 24 12:30:51 o1n1 zimbramon[17285]: 17285:info: Starting stats via zmcontrol
    Apr 24 12:30:51 o1n1 saslauthd[21198]: detach_tty      : master pid is: 21198
    Apr 24 12:30:51 o1n1 saslauthd[21198]: ipc_init        : listening on socket: /opt/zimbra/cyrus-sasl-2.1.22.3z/state/mux
    Apr 24 12:30:54 o1n1 zimbra-cluster[17283]: start - rc=0 from zmcontrol: output=[Host zimbra.mycompany.de ,      Starting ldap...Done. ,    Starting logger...Done. ,      Starting mailbox...Done. ,         Starting antispam...Done. ,        Starting antivirus...Done. ,       Starting mta...Done. ,         Starting stats...Done. ]
    Apr 24 12:31:05 o1n1 zimbra-cluster[21765]: Clearing stale status file.  Old content = zimbra
    Apr 24 12:31:05 o1n1 zimbra-cluster[21765]: status - No Zimbra service is running.
    Why does it say "Clearing stale status file"?

    I'm running zcs-NETWORK-5.0.15_GA_2851.RHEL5_64.20090310170650 on CentOS 5.3 (we have a valid license)

    DNS Records are correct.

  2. #2
    phoenix is offline Zimbra Consultant & Moderator
    Join Date
    Sep 2005
    Location
    Vannes, France
    Posts
    23,470
    Rep Power
    56

    Quote: Originally posted by Smiler
    I'm running zcs-NETWORK-5.0.15_GA_2851.RHEL5_64.20090310170650 on CentOS 5.3 (we have a valid license)
    You may have a valid licence but I have to state the obvious: CentOS is not a supported platform and the only current certified Cluster support for Zimbra is on RHEL AS/ES 4, Update 5.

    Quote: Originally posted by Smiler
    DNS Records are correct.
    Some diagnostic information to show that's correct might help.

    Is this an upgrade or a new installation? Have you read the ZCS Cluster Guides on this Documentation page?
    Regards


    Bill



  3. #3
    Smiler


    Ok, I found out what the problem seems to be.

    I tried the following:

    - manually brought up my service ip
    - manually mounted my SAN
    - started zimbra manually by
    /opt/zimbra-cluster/bin/zmcluctl start zimbra

    Everything starts fine... zimbra worked.
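
    The manual steps above can be sketched as a small dry-run script. This is only an illustration: the address, interface, device and mount point (SERVICE_IP, NIC, SAN_DEV, MOUNT_POINT) are placeholders that are not from this thread and need to be replaced with the values from your own cluster.conf. Only the zmcluctl call is taken from the post. By default it just prints the commands.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Every value below is a placeholder (assumption), not from the thread.
DRY_RUN=${DRY_RUN:-1}
SERVICE_IP=${SERVICE_IP:-192.0.2.10/24}
NIC=${NIC:-eth0}
SAN_DEV=${SAN_DEV:-/dev/mapper/zimbra_vol}
MOUNT_POINT=${MOUNT_POINT:-/mnt/zimbra-san}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

manual_start() {
    run ip addr add "$SERVICE_IP" dev "$NIC"            # bring up the service IP
    run mount "$SAN_DEV" "$MOUNT_POINT"                 # mount the SAN volume
    run /opt/zimbra-cluster/bin/zmcluctl start zimbra   # start Zimbra as in the post
}

manual_start
```

    Running it with DRY_RUN=0 as root would actually execute the three commands, so check the placeholders first.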

    The problem is the status file /opt/zimbra-cluster/status/status.dat
    Zimbra writes it when started (the content of this file is "zimbra"), but does not seem to update it. If I run /opt/zimbra-cluster/bin/zmcluctl status zimbra, it says:

    Code:
    [root@o1n1 ~]# /opt/zimbra-cluster/bin/zmcluctl status zimbra
    Clearing stale status file.  Old content = zimbra
    status - No Zimbra service is running.
    So the script returns an error, which would normally cause Red Hat Cluster Manager to conclude that the zimbra service has failed.

    If I manually touch the file (updating its access time; the content must be "zimbra"), /opt/zimbra-cluster/bin/zmcluctl status zimbra says:

    Code:
    [root@o1n1 status]# /opt/zimbra-cluster/bin/zmcluctl status zimbra
    Host zimbra.mycompany.de
            antispam                Running
            antivirus               Running
            imapproxy               Running
            ldap                    Running
            logger                  Running
            mailbox                 Running
            mta                     Running
            snmp                    Running
            spell                   Running
            stats                   Running
    I found out that the zmcluctl script itself writes the status.dat file. Could increasing the age threshold after which status.dat is considered "outdated" (stale) solve the problem?

    I think it could also be caused by a slow start. My machine takes a long time to start all services.
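
    To make the "stale" idea concrete, here is a minimal shell sketch of an age-based check. It is an assumption about how such a check could work, not zmcluctl's actual logic: the helper is_stale, the 60-second threshold and the use of the modification time are all hypothetical. It relies on GNU stat and GNU touch, and works on a scratch file rather than the real status.dat.

```shell
#!/bin/sh
# Hypothetical helper (not from zmcluctl): returns 0 if FILE was last
# modified more than MAX_AGE seconds ago, 1 if not, 2 if stat fails.
is_stale() {
    file=$1
    max_age=$2
    now=$(date +%s)
    mtime=$(stat -c %Y "$file") || return 2   # GNU stat: mtime in epoch seconds
    [ $((now - mtime)) -gt "$max_age" ]
}

# Demo on a scratch copy, not /opt/zimbra-cluster/status/status.dat.
demo_dir=$(mktemp -d)
status_file="$demo_dir/status.dat"
printf 'zimbra' > "$status_file"

# Backdate the file by ten minutes so it looks stale.
touch -d '10 minutes ago' "$status_file"
if is_stale "$status_file" 60; then echo "stale"; else echo "fresh"; fi

# touch resets the timestamps to the current time; the content is unchanged.
touch "$status_file"
if is_stale "$status_file" 60; then echo "stale"; else echo "fresh"; fi
```

    With GNU coreutils the first check should report the file as stale and the second as fresh, which matches the behaviour seen after touching status.dat by hand.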

    Quote: Originally posted by phoenix
    Is this an upgrade or a new installation? Have you read the ZCS Cluster Guides on this Documentation page?
    It is a new installation. And I followed the install instructions ("Multi Node Cluster Installation Guide").
    Last edited by Smiler; 04-27-2009 at 07:44 AM.

  4. #4
    Smiler


    I found a solution!!

    The idea is to have zmcluctl write the status.dat file right after starting Zimbra, if "su - zimbra -c 'zmcontrol start'" returns zero. This solution works fine in my case. Red Hat Cluster Manager now checks the status of Zimbra correctly.


    Code snippet (/opt/zimbra-cluster/bin/zmcluctl Line 326):

    Code:
    if ($action eq 'start') {
        if (!setCurrentService($svcname)) {
            # Another Zimbra service is running.
            logError("start - Another Zimbra service is running.");
            exit(1);
        }
        createServiceDataSymlinks($svcname);
        restoreCrontab();
        my @output = `su - zimbra -c 'zmcontrol start'`;
        my $rc = $?;
        $rc >>= 8;
        if ($rc != 0) {
            clearCurrentService($svcname);
        }
        else {
            setCurrentService($svcname);
        }
        logError("start - rc=$rc from zmcontrol: output=[" . join( ", ", @output ) . "]");
        exit($rc);
    } elsif .......
    Notice the else condition I added.
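
    The same write-on-success logic can be sketched in shell for anyone who wants to try the idea outside of zmcluctl. This is only a sketch under assumptions: STATUS_FILE and start_service are illustrative names, the status file lives in a temporary directory, and removing or writing the file merely stands in for what clearCurrentService and setCurrentService appear to do. Stand-in commands (true and false) replace zmcontrol.

```shell
#!/bin/sh
# Illustrative only: STATUS_FILE and start_service are not part of zmcluctl.
STATUS_FILE=${STATUS_FILE:-$(mktemp -d)/status.dat}

# Usage: start_service SVCNAME COMMAND [ARGS...]
# Runs COMMAND and records SVCNAME in the status file only on exit status 0.
start_service() {
    svcname=$1
    shift
    "$@"                # in the real script: su - zimbra -c 'zmcontrol start'
    rc=$?
    if [ "$rc" -ne 0 ]; then
        rm -f "$STATUS_FILE"                      # stands in for clearCurrentService
    else
        printf '%s' "$svcname" > "$STATUS_FILE"   # stands in for setCurrentService
    fi
    return "$rc"
}

# Demo with stand-in commands instead of zmcontrol.
start_service zimbra true  && echo "started, status file: $(cat "$STATUS_FILE")"
start_service zimbra false || echo "start failed, status file removed"
```

    Writing the file after the start command has finished means its timestamp reflects the end of the startup rather than the beginning, which is why a slow start no longer makes it look stale.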

    I've created a patch for it:

    zmcluctl-patch-stale-status.diff

    Code:
    --- zmcluctl	2009-03-11 01:46:38.000000000 +0100
    +++ zmcluctl.new	2009-04-27 16:39:02.000000000 +0200
    @@ -337,6 +337,9 @@
         if ($rc != 0) {
             clearCurrentService($svcname);
         }
    +    else {
    +	setCurrentService($svcname);
    +    }
         logError("start - rc=$rc from zmcontrol: output=[" . join( ", ", @output ) . "]");
         exit($rc);
     } elsif ($action eq 'stop') {
    Hope this helps other users who have this problem :-)
