I got tired of having problems on our servers related to crashed tables in the Zimbra Logger. This last time, the server was spiking to a load of 7 or so, and we realized it was happening every time the logger tried to auto-repair a corrupted table.

So... maybe I'm duplicating efforts, but I went and wrote a small Ruby script that monitors the /opt/zimbra/logger/db/data/SERVER.HOST.NAME.err log file. I have cron start the script once an hour, and if it finds a crashed-table message logged in the last 60 minutes, it sends an email to a few of the server administrators to let them know that a manual repair may be necessary.

If anyone has any thoughts or ways to make it better... feel free to share.

Matt

Code:
#!/usr/local/bin/ruby

require 'net/smtp'
require 'time'

def send_email(line, message)
  to = EMAIL_ADDRESSES
  from = 'help@MYDOMAIN.COM'
  from_alias = 'Zimbra Logger Watcher'
  subject = 'Zimbra Logger Marked As Crashed'
  msg = <<END_OF_MESSAGE
From: #{from_alias} <#{from}>
To: #{to.join(',')}
Subject: #{subject}

#{message}

Log Entry: #{line}

Warning generated by: /zimbra/Local/watch_zimbra_logger.rb

END_OF_MESSAGE
  Net::SMTP.start('smtp.MYDOMAIN.COM') do |smtp|
    smtp.send_message msg, from, to
  end
end


#MAIN

SERVER_LOG = "/opt/zimbra/logger/db/data/SERVER.HOST.NAME.err"
EMAIL_ADDRESSES = ['admin1@mydomain.com','admin2@mydomain.com','admin3@mydomain.com']

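# Read roughly the last 2000 bytes of the log, newest line first.
# Clamp the offset so a log smaller than 2000 bytes does not make seek fail.
log_arr = File.open(SERVER_LOG) do |f|
  f.seek(-[2000, File.size(SERVER_LOG)].min, IO::SEEK_END)
  f.readlines.reverse
end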

# Walk the lines newest-first and stop at the most recent 'crashed' entry
log_arr.each do |line|
  if line.include?('crashed')
    log_line = line.split(' ')
    log_time = Time.parse("#{log_line[0]} #{log_line[1]}").to_i
    time_diff = (Time.now.to_i - log_time) / 60
    if time_diff <= 60
      # Only alert when the entry falls inside the hourly cron window
      message = "Logger marked as crashed #{time_diff} minutes ago!"
      send_email(line, message)
    end
    exit
  end
end
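
One possible refinement, offered as a rough sketch rather than a tested drop-in: pull the "is this a recent crash?" check into its own method, so it can be tried against sample lines without touching the real log or sending mail. The name recent_crashes and its parameters are hypothetical. Like the script above, it assumes the first two whitespace-separated fields of a log line form a timestamp that Time.parse understands, and it skips any line where that is not the case (for example a first line cut in half by the seek).

```ruby
require 'time'

# Hypothetical helper: returns [line, minutes_ago] pairs for every line that
# mentions 'crashed' and was logged within the last window_minutes.
# Assumption: fields 0 and 1 of each line are a date and a time that
# Time.parse can read, as in the script above.
def recent_crashes(lines, now = Time.now, window_minutes = 60)
  hits = []
  lines.each do |line|
    next unless line.include?('crashed')
    fields = line.split(' ')
    begin
      stamp = Time.parse("#{fields[0]} #{fields[1]}")
    rescue ArgumentError
      next # no parsable timestamp on this line, so skip it
    end
    minutes_ago = ((now - stamp) / 60).to_i
    hits << [line, minutes_ago] if minutes_ago.between?(0, window_minutes)
  end
  hits
end
```

In the script, the main loop could then shrink to something like recent_crashes(log_arr).each { |line, mins| send_email(line, "Logger marked as crashed #{mins} minutes ago!") }, which would send one email per matching entry instead of stopping at the first one.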