
Thread: [SOLVED] kill HUP crashes logger

  1. #1
    schulze (Intermediate Member, joined Feb 2009, 19 posts)

    Hi,

    We have ZCS 5.0.13 Network Edition running on a Mac mini with Mac OS X Tiger.

    The daily Zimbra cron job /etc/periodic/daily/600.zimbra contains the following:
    Code:
    if [ -f /opt/zimbra/log/logswatch.pid ]; 
      then echo "Sending sighup to zmlogswatch"; 
      kill -HUP $(cat /opt/zimbra/log/logswatch.pid | head -1); 
    fi
    This crashes the logswatch daemon: instead of reloading its configuration, it stops. Any idea why this happens, or how to debug it?

    We tried setting up another cron job that runs zmlogswatchctl start after the crash. Run manually, the script works fine, but as part of a cron job it fails with Perl library path errors:

    Code:
    Can't locate Swatch/Actions.pm in @INC (@INC contains: /System/Library/Perl/5.8.6/darwin-thread-multi-2level /System/Library/Perl/5.8.6 /Library/Perl/5.8.6/darwin-thread-multi-2level /Library/Perl/5.8.6 /Library/Perl /Network/Library/Perl/5.8.6/darwin-thread-multi-2level /Network/Library/Perl/5.8.6 /Network/Library/Perl /System/Library/Perl/Extras/5.8.6/darwin-thread-multi-2level /System/Library/Perl/Extras/5.8.6 /Library/Perl/5.8.1 .) at /tmp/.swatch_script.2875 line 29.
    BEGIN failed--compilation aborted at /tmp/.swatch_script.2875 line 29.
    Any help is appreciated.
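
The @INC list in the error contains only the system Perl directories, which suggests the cron job runs without the Perl library settings that an interactive zimbra shell has. The sketch below is not Zimbra-specific: it uses env -i to imitate cron's sparse environment and a made-up PERL5LIB value to show that an exported variable does not reach a job started that way. The helper name run_like_cron and the library path are assumptions for illustration only.

```shell
#!/bin/sh
# Run a command with an almost empty environment, roughly what cron provides.
# run_like_cron is a hypothetical helper, not part of Zimbra or cron.
run_like_cron() {
  env -i PATH=/usr/bin:/bin HOME="$HOME" SHELL=/bin/sh /bin/sh -c "$1"
}

# Made-up value, standing in for whatever the login shell normally exports.
PERL5LIB=/path/to/extra/perl/lib
export PERL5LIB

# A child of this shell inherits the variable ...
interactive_value=$(sh -c 'echo "${PERL5LIB:-unset}"')
# ... but a job started from a clean environment does not.
cron_value=$(run_like_cron 'echo "${PERL5LIB:-unset}"')

echo "interactive: $interactive_value"
echo "cron-like:   $cron_value"
```

If this is the cause, starting the daemon through a login shell from root's crontab, for example su - zimbra -c "zmlogswatchctl start", would be one thing to try; that is untested here.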

  2. #2
    schulze (Intermediate Member, joined Feb 2009, 19 posts)

    I'll just jot down the current status:

    The temporary watcher script /tmp/.swatch_script.xxxxx contains:

    Code:
    $SIG{'TERM'} = $SIG{'HUP'} = 'goodbye';
    where goodbye is a Perl function that terminates the process.
    So it seems the generated swatch script treats a SIGHUP as a request to exit, not as a request to reload.
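
The effect of such a handler can be reproduced without swatch. The following is a shell analogue, not the Perl code from the generated script: a stand-in watcher traps HUP and TERM with the same action (exit), so sending it a HUP ends it instead of reloading anything. The function name watcher is made up for the illustration.

```shell
#!/bin/sh
# Shell analogue of $SIG{'TERM'} = $SIG{'HUP'} = 'goodbye':
# both signals run the same handler, and the handler exits.
watcher() {
  trap 'exit 0' HUP TERM
  while :; do
    sleep 1          # the trap runs once the current sleep has finished
  done
}

watcher &
watcher_pid=$!
sleep 1              # give the background shell time to install its trap

kill -HUP "$watcher_pid"
wait "$watcher_pid"  # returns once the watcher has exited
status=$?

if kill -0 "$watcher_pid" 2>/dev/null; then
  watcher_alive=yes
else
  watcher_alive=no
fi
echo "watcher alive after HUP: $watcher_alive (exit status $status)"
```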

    My untested solution is:
    - Comment out the HUP line in /etc/periodic/daily/600.zimbra, so the signal is no longer sent
    - Add a restart time to the logswatch call in zmlogswatchctl:

    Code:
    ${zimbra_home}/libexec/logswatch --config-file=${configfile} \
          --use-cpan-file-tail --pid-file=${pidfile}\
          --restart-time=03:20\
          --script-dir=/tmp -t /var/log/zimbra.log > $logfile 2>&1 &
    I'll report back if that works.

  3. #3
    skenkin (Active Member, joined May 2008, 45 posts)

    I think that we have been experiencing this issue also.

    The logger mysteriously dies every so often, with no usable output to trace the problem.

    Have you found a resolution?

  4. #4
    schulze (Intermediate Member, joined Feb 2009, 19 posts)

    Hi skenkin,

    I successfully used the workaround that I explained above.

    However, you have to make sure that your logger crashes for the same reason (i.e. always at the same time, because of the cron job mentioned above). I have read in this forum that there might be other reasons for the logger to crash.
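
One way to check whether the crash coincides with the daily job is to record the daemon's state shortly before and after it runs. The sketch below is only an illustration: pidfile_status is a made-up helper, and the demonstration uses temporary pid files so that it runs anywhere.

```shell
#!/bin/sh
# Report whether the process recorded in a pid file is still alive.
# kill -0 sends no signal; it only tests whether the pid can be signalled,
# so run this as the process owner or as root.
pidfile_status() {
  if [ ! -f "$1" ]; then
    echo "no pid file"
    return
  fi
  pid=$(head -1 "$1")
  if kill -0 "$pid" 2>/dev/null; then
    echo "running (pid $pid)"
  else
    echo "stopped (stale pid $pid)"
  fi
}

# Self-contained demonstration with temporary pid files.
tmpdir=$(mktemp -d)
echo "$$" > "$tmpdir/live.pid"     # this shell is certainly alive
true &
dead_pid=$!
wait "$dead_pid"                   # reap it, so the pid no longer exists
echo "$dead_pid" > "$tmpdir/dead.pid"

live_status=$(pidfile_status "$tmpdir/live.pid")
dead_status=$(pidfile_status "$tmpdir/dead.pid")
missing_status=$(pidfile_status "$tmpdir/missing.pid")

echo "$(date '+%Y-%m-%d %H:%M:%S') live:    $live_status"
echo "$(date '+%Y-%m-%d %H:%M:%S') dead:    $dead_status"
echo "$(date '+%Y-%m-%d %H:%M:%S') missing: $missing_status"
rm -rf "$tmpdir"
```

On the server, the same function could be pointed at /opt/zimbra/log/logswatch.pid and run from cron a few minutes before and after the daily job, appending its output to a file.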

  5. #5
    rs232c (Intermediate Member, joined Jul 2008, 17 posts)

    This may be why it's failing. The following code from /etc/periodic/daily/600.zimbra sends a HUP to the zmlogswatch and zmswatch processes:

    Code:
    if [ -f /opt/zimbra/log/logswatch.pid ]; 
      then echo "Sending sighup to zmlogswatch"; 
      kill -HUP $(cat /opt/zimbra/log/logswatch.pid | head -1); 
    fi
    if [ -f /opt/zimbra/log/swatch.pid ]; 
      then echo "Sending sighup to zmswatch"; 
      kill -HUP $(cat /opt/zimbra/log/swatch.pid | head -1); 
    fi
    But observe the following:

    Code:
    odmxserve:log zimbra$ zmswatchctl start
    Starting swatch...done.
    odmxserve:log zimbra$ ps ax | grep swatch
      694 s000  S      0:00.12 /usr/bin/perl /opt/zimbra/libexec/swatch --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/tmp -t /var/log/zimbra.log
      698 s000  S      0:00.14 /usr/bin/perl /tmp/.swatch_script.694
      701 s000  R+     0:00.00 grep swatch
    odmxserve:log zimbra$ cat /opt/zimbra/log/swatch.pid
    694
    odmxserve:log zimbra$ zmlogswatchctl start
    Starting logswatch...done.
    odmxserve:log zimbra$ ps ax | grep swatch
      694 s000  S      0:00.12 /usr/bin/perl /opt/zimbra/libexec/swatch --config-file=/opt/zimbra/conf/swatchrc --use-cpan-file-tail --script-dir=/tmp -t /var/log/zimbra.log
      698 s000  S      0:00.15 /usr/bin/perl /tmp/.swatch_script.694
      763 s000  S      0:00.10 /usr/bin/perl /opt/zimbra/libexec/logswatch --config-file=/opt/zimbra/conf/logswatchrc --use-cpan-file-tail --pid-file=/opt/zimbra/log/logswatch.pid --script-dir=/tmp -t /var/log/zimbra.log
      765 s000  S      0:00.13 /usr/bin/perl /tmp/.swatch_script.763
      825 s000  R+     0:00.00 grep swatch
    odmxserve:log zimbra$ cat /opt/zimbra/log/logswatch.pid
    765
    Note that swatch.pid(694) is the pid of the parent process; the child process(698) contains the parent pid(694) as part of the script name.

    However, logswatch.pid(765) is the pid of the CHILD process; the child process(765) contains the parent pid(763) as part of the script name.

    Why does logswatch.pid have the pid of the child, not the parent?

    /opt/zimbra/libexec/swatch and /opt/zimbra/libexec/logswatch are identical; the only difference is that "--pid-file=/opt/zimbra/log/logswatch.pid" is passed to logswatch.

    Is /etc/periodic/daily/600.zimbra really meant to signal the child, or should it be sending the HUP to the parent? zmswatchctl and zmlogswatchctl are coded to behave this way, but swatch successfully restarts and logswatch doesn't.
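
The parent/child relationship read off the ps listing can also be checked directly from a pid file. The sketch below is an illustration only: parent_pid_of is a made-up helper, and a throwaway sh process with a background sleep stands in for logswatch and its watcher script.

```shell
#!/bin/sh
# Print the parent pid of a process. Uses /proc where it exists (Linux),
# otherwise falls back to ps (Mac OS X and other BSD-style systems).
parent_pid_of() {
  if [ -r "/proc/$1/status" ]; then
    awk '/^PPid:/ { print $2 }' "/proc/$1/status"
  else
    ps -o ppid= -p "$1" | tr -d ' '
  fi
}

# Stand-in control process: it starts a child and records the CHILD's pid,
# which is what the listing shows for logswatch.pid.
tmpdir=$(mktemp -d)
sh -c 'sleep 30 >/dev/null 2>&1 & echo $! > "$1/watch.pid"; wait' control "$tmpdir" &
control_pid=$!
sleep 1              # let the stand-in write its pid file

recorded_pid=$(head -1 "$tmpdir/watch.pid")
recorded_parent=$(parent_pid_of "$recorded_pid")

if [ "$recorded_parent" = "$control_pid" ]; then
  verdict="pid file holds a child of control process $control_pid"
else
  verdict="pid file does not hold a child of control process $control_pid"
fi
echo "$verdict"

# Clean up the stand-in processes.
kill "$recorded_pid" "$control_pid" 2>/dev/null
rm -rf "$tmpdir"
```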

  6. #6
    gosborne (Member, joined Jan 2009, 11 posts)

    We are having this same issue, is there any update?

  7. #7
    nufan (Senior Member, joined Jun 2008, 59 posts)

    Originally posted by gosborne:
        We are having this same issue, is there any update?

    Having the same issue here.

  8. #8
    rs232c (Intermediate Member, joined Jul 2008, 17 posts)

    The parent/child pid issue for zmlogswatchctl is not the problem. I still don't see why zmswatch and zmlogswatch are coded slightly differently, but I'll keep digging.

  9. #9
    nufan (Senior Member, joined Jun 2008, 59 posts)

    FYI the technician assigned to my trouble ticket was able to reproduce and filed bug 36545

    Bug 36545 – logswatch not running after nightly log rotation

  10. #10
    rs232c (Intermediate Member, joined Jul 2008, 17 posts)

    After much analysis, the parent/child pid IS at the root of the problem after all. Here's some background.

    There are two similar functions that process /var/log/zimbra.log:
    zmswatch sends SNMP traps based on certain lines in zimbra.log;
    zmlogswatch writes lines from zimbra.log to a pipe read by zmlogger that keeps statistics.
    Each function is comprised of multiple processes.

    zmswatch
      Process: user interface - /opt/zimbra/bin/zmswatchctl
        zmswatchctl controls and reports on the status of the control process.
        Writes pid of control process to /opt/zimbra/log/swatch.pid
        Commands:
          Start - start the control process
          Stop - stop the control process by sending it a TERM signal
          Restart - stop and start the control process
          Reload - send the control process a HUP signal to cause it to restart the child process
          Status - report whether the control process is running or stopped
      Process: control (parent) - /opt/zimbra/libexec/swatch
        swatch creates, controls and monitors the status of the following process.
        Writes minimal logging to /opt/zimbra/log/zmswatch.out
        Signals:
          INT, QUIT, TERM - send TERM to child to make it stop
          ALRM, HUP - send TERM to child to make it stop, then start child
      Process: watch (child) - /tmp/.swatch_script.${ppid}
        watch tails /var/log/zimbra.log and processes selected lines
        Signals:
          HUP, TERM - terminate

    zmlogswatch
      Process: user interface - /opt/zimbra/bin/zmlogswatchctl
        zmlogswatchctl controls and reports on the status of the control process.
        Commands:
          Start - start the control process
          Stop - stop the control process by sending it a TERM signal
          Restart & Reload - stop and start the control process
          Status - report whether the control process is running or stopped
      Process: control (parent) - /opt/zimbra/libexec/logswatch
        logswatch creates, controls and monitors the status of the following process.
        Writes minimal logging to /opt/zimbra/log/zmlogswatch.out
        Writes pid of watch (child) process to /opt/zimbra/log/logswatch.pid
        Signals:
          INT, QUIT, TERM - send TERM to child to make it stop
          ALRM, HUP - send TERM to child to make it stop, then start child
      Process: watch (child) - /tmp/.swatch_script.${ppid}
        watch tails /var/log/zimbra.log and writes lines to a pipe read by zmlogger
        Signals:
          HUP, TERM - terminate

    Note the differences: for zmswatch, the pid file contains the pid of the parent; for zmlogswatch, the pid file contains the pid of the child.
    For zmswatch, Restart and Reload do different things; for zmlogswatch, they do the same thing.

    When /etc/periodic/daily/600.zimbra is executed, it sends a HUP to the processes identified by swatch.pid and logswatch.pid. For swatch this does exactly what we want: the parent gets the HUP, stops the current child, and starts a new one that tails the new zimbra.log. For logswatch this fails: the HUP goes to the child, which terminates, and then the parent terminates as well. Bingo, no zmlogswatchctl function.
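
The two outcomes can be reproduced with a toy model. This is a sketch of the behaviour described above, not the swatch code: the shell function control stands in for libexec/swatch or libexec/logswatch, a background sleep stands in for the watcher script, and all names and file paths are made up for the demonstration.

```shell
#!/bin/sh
# Toy model only. "control" plays the parent, the background sleep the child.
tmpdir=$(mktemp -d)

control() {
  start_child() {
    sleep 300 >/dev/null 2>&1 &
    child=$!
    echo "$child" > "$tmpdir/child.pid"
  }
  # HUP to the control process: replace the child and keep running.
  trap 'reload=1; kill "$child" 2>/dev/null; start_child' HUP
  # TERM to the control process: stop the child, then exit.
  trap 'kill "$child" 2>/dev/null; exit 0' TERM
  start_child
  while :; do
    reload=0
    wait "$child"
    # wait returns either because a trapped signal arrived (reload=1)
    # or because the child itself went away (reload=0).
    [ "$reload" = 1 ] || break
  done
  echo "child died, control exiting" > "$tmpdir/control.exited"
}

control &
control_pid=$!
sleep 1
first_child=$(cat "$tmpdir/child.pid")

# Case 1: pid file names the parent (swatch.pid). HUP restarts the child.
kill -HUP "$control_pid"
sleep 1
second_child=$(cat "$tmpdir/child.pid")
[ -e "$tmpdir/control.exited" ] && after_parent_hup=exited || after_parent_hup=running

# Case 2: pid file names the child (logswatch.pid). HUP takes everything down.
kill -HUP "$second_child"
sleep 1
[ -e "$tmpdir/control.exited" ] && after_child_hup=exited || after_child_hup=running

echo "HUP to parent: control $after_parent_hup, child $first_child -> $second_child"
echo "HUP to child:  control $after_child_hup"
rm -rf "$tmpdir"
```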

    Here is a patch file that will make zmlogswatch behave like zmswatch; copy it to /opt/zimbra/bin and execute: patch -b < zmlogswatchctl.patch.

    Code:
    --- zmlogswatchctl.orig 2009-03-10 16:58:38.000000000 -0700
    +++ zmlogswatchctl      2009-03-26 09:28:51.000000000 -0700
    @@ -69,8 +69,12 @@
         fi
    
         ${zimbra_home}/libexec/logswatch --config-file=${configfile} \
    -      --use-cpan-file-tail --pid-file=${pidfile}\
    +      --use-cpan-file-tail\
           --script-dir=/tmp -t /var/log/zimbra.log > $logfile 2>&1 &
    +    pid=$!
    +    if [ "x$pid" != "x" ]; then
    +      echo $pid > $pidfile
    +    fi
         for ((i=0; i < 30; i++)); do
           checkrunning
           if [ $running = 1 ]; then
    @@ -115,7 +119,7 @@
                 fi
               done
             else
    -          kill -9 $pid
    +          kill $pid
             fi
             sleep 1
           done
    @@ -128,10 +132,18 @@
         fi
         exit 0
       ;;
    -  restart|reload)
    +  restart)
         $0 stop
         $0 start
       ;;
    +  reload)
    +    checkrunning
    +    if [ $running = 1 -a "x$pid" != "x" ]; then
    +      echo -n "Reloading logswatch..."
    +      kill -HUP $pid
    +      echo "done."
    +    fi
    +  ;;
       status)
         echo -n "zmlogswatch is "
         checkrunning
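
The key change in the first hunk is that the wrapper records $! itself, which is the pid of the process it just put in the background, that is, the control process. A minimal sketch of that idiom, with a sleep standing in for logswatch and a temporary file standing in for the pid file (both are assumptions for illustration):

```shell
#!/bin/sh
# "sleep" is a stand-in for the logswatch control process.
pidfile=$(mktemp)

sleep 30 >/dev/null 2>&1 &
pid=$!                        # pid of the process just started in the background
if [ "x$pid" != "x" ]; then
  echo "$pid" > "$pidfile"
fi

# Anything that later reads the pid file now finds the launched process itself.
recorded=$(head -1 "$pidfile")
if kill -0 "$recorded" 2>/dev/null; then
  state="running"
else
  state="stopped"
fi
echo "pid file says $recorded, launcher saw $pid, process is $state"

kill "$recorded"              # plain TERM, which a process can trap; -9 cannot be trapped
rm -f "$pidfile"
```

The switch from kill -9 to a plain kill in the second hunk presumably serves the same goal: SIGKILL cannot be caught, so a control process stopped that way never gets the chance to send TERM to its child.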
