Good evening,
Well, I've been running zimbra for years and have finally found a problem that's got me stumped. I've read every forum post I could find and I'm still coming up blank.
Last night in the middle of the night, there was a hardware problem with the server. I ran downtown and found a bunch of I/O errors on the console and a load avg of about 850(!). It had 437 days of uptime on the clock.
It would not respond to any commands so I gritted my teeth and did a power cycle. The server came back up just fine, and the mirror re-sync'd without an issue.
Everything, that is... except postfix.
I double-checked the name resolution settings, ran a permissions fix (extended), generated new certificates, triple- and quadruple- checked everything. Still no dice. Since I was on 6.0.7, I decided to go ahead and upgrade to 6.0.10 to see if that would knock everything back in line, but sadly I have the same symptoms.
The symptoms are:
MTA says "running" but there is nothing listening on port 25
There are no indications in the log (that I can find) that show any error at all except the very generic "fatal: Queue report unavailable - mail system is down"
Here are the logs of a startup:
Code:
Jan 29 17:25:14 janus zimbramon[11314]: 11314:info: Starting services initiated by zmcontrol
Jan 29 17:25:15 janus slapd[11363]: @(#) $OpenLDAP: slapd 2.4.23 (Jun 30 2010 11:06:45) $ build@build11.lab.zimbra.com:/home/build/p4/GNR/ThirdParty/openldap/openldap-2.4.23.2z/servers/slapd
Jan 29 17:25:17 janus slapd[11364]: slapd starting
Jan 29 17:25:27 janus zimbramon[11314]: 11314:info: Rewriting configs antispam amavis antivirus amavis webxml mailbox amavis antispam antivirus mta sasl
Jan 29 17:26:00 janus zimbramon[11314]: 11314:info: Starting logger via zmcontrol
Jan 29 17:26:07 janus zimbramon[11314]: 11314:info: Starting mailbox via zmcontrol
Jan 29 17:26:08 janus zimbramon[13241]: 13241:info: zmmtaconfig: zmmtaconfig started on janus.astatic.net with loglevel=3 pid=13241
Jan 29 17:26:08 janus zmmailboxdmgr[13340]: status requested
Jan 29 17:26:08 janus zmmailboxdmgr[13340]: file /opt/zimbra/log/zmmailboxd_manager.pid does not exist
Jan 29 17:26:08 janus zmmailboxdmgr[13340]: assuming no other instance is running
Jan 29 17:26:08 janus zmmailboxdmgr[13340]: file /opt/zimbra/log/zmmailboxd.pid does not exist
Jan 29 17:26:08 janus zmmailboxdmgr[13340]: assuming no other instance is running
Jan 29 17:26:08 janus zmmailboxdmgr[13340]: no manager process is running
Jan 29 17:26:55 janus zmmailboxdmgr[15760]: status requested
Jan 29 17:26:55 janus zmmailboxdmgr[15760]: file /opt/zimbra/log/zmmailboxd_manager.pid does not exist
Jan 29 17:26:55 janus zmmailboxdmgr[15760]: assuming no other instance is running
Jan 29 17:26:55 janus zmmailboxdmgr[15760]: file /opt/zimbra/log/zmmailboxd.pid does not exist
Jan 29 17:26:55 janus zmmailboxdmgr[15760]: assuming no other instance is running
Jan 29 17:26:55 janus zmmailboxdmgr[15760]: no manager process is running
Jan 29 17:26:55 janus zmmailboxdmgr[15767]: start requested
Jan 29 17:26:55 janus zmmailboxdmgr[15767]: checking if another instance of manager is already running
Jan 29 17:26:55 janus zmmailboxdmgr[15767]: file /opt/zimbra/log/zmmailboxd_manager.pid does not exist
Jan 29 17:26:55 janus zmmailboxdmgr[15767]: assuming no other instance is running
Jan 29 17:26:55 janus zmmailboxdmgr[15767]: file /opt/zimbra/log/zmmailboxd.pid does not exist
Jan 29 17:26:55 janus zmmailboxdmgr[15767]: assuming no other instance is running
Jan 29 17:26:55 janus zmmailboxdmgr[15768]: wrote manager pid 15768 to /opt/zimbra/log/zmmailboxd_manager.pid
Jan 29 17:26:55 janus zmmailboxdmgr[15768]: manager started mailboxd/JVM with pid 15769
Jan 29 17:26:55 janus zimbramon[11314]: 11314:info: Starting antispam via zmcontrol
Jan 29 17:26:55 janus zmmailboxdmgr[15769]: wrote java pid 15769 to /opt/zimbra/log/zmmailboxd_java.pid
Jan 29 17:27:00 janus amavis[15833]: starting. /opt/zimbra/amavisd/sbin/amavisd at janus.astatic.net amavisd-new-2.6.4 (20090625), Unicode aware, LC_ALL="C", LANG="en_US"
Jan 29 17:27:00 janus amavis[15833]: user=501, EUID: 501 (501); group=, EGID: 501 502 501 5 4 (501 502 501 5 4)
Jan 29 17:27:00 janus amavis[15833]: Perl version 5.008008
Jan 29 17:27:03 janus amavis[15833]: SpamControl: init_pre_chroot on SpamAssassin done
Jan 29 17:27:03 janus amavis[15866]: Net::Server: Process Backgrounded
Jan 29 17:27:03 janus amavis[15866]: Net::Server: 2011/01/29-17:27:03 Amavis (type Net::Server::PreForkSimple) starting! pid(15866)
Jan 29 17:27:03 janus amavis[15866]: Net::Server: Binding to UNIX socket file /opt/zimbra/data/amavisd/amavisd.sock using SOCK_STREAM
Jan 29 17:27:03 janus amavis[15866]: Net::Server: Binding to TCP port 10024 on host 127.0.0.1
Jan 29 17:27:03 janus amavis[15866]: Net::Server: Group Not Defined. Defaulting to EGID '501 502 501 5 4'
Jan 29 17:27:03 janus amavis[15866]: Net::Server: User Not Defined. Defaulting to EUID '501'
Jan 29 17:27:03 janus amavis[15866]: Module Amavis::Conf 2.207
Jan 29 17:27:03 janus amavis[15866]: Module Archive::Zip 1.30
Jan 29 17:27:03 janus amavis[15866]: Module Compress::Zlib 1.42
Jan 29 17:27:03 janus amavis[15866]: Module Convert::TNEF 0.17
Jan 29 17:27:03 janus amavis[15866]: Module Convert::UUlib 1.33
Jan 29 17:27:03 janus amavis[15866]: Module Crypt::OpenSSL::RSA 0.26
Jan 29 17:27:03 janus amavis[15866]: Module DBD::mysql 4.013
Jan 29 17:27:03 janus amavis[15866]: Module DBI 1.609
Jan 29 17:27:03 janus amavis[15866]: Module DB_File 1.82
Jan 29 17:27:03 janus amavis[15866]: Module Digest::MD5 2.36
Jan 29 17:27:03 janus amavis[15866]: Module Digest::SHA 5.48
Jan 29 17:27:03 janus amavis[15866]: Module IO::Socket::INET6 2.59
Jan 29 17:27:03 janus amavis[15866]: Module MIME::Entity 5.427
Jan 29 17:27:03 janus amavis[15866]: Module MIME::Parser 5.427
Jan 29 17:27:03 janus amavis[15866]: Module MIME::Tools 5.427
Jan 29 17:27:03 janus amavis[15866]: Module Mail::DKIM::Signer 0.37
Jan 29 17:27:03 janus amavis[15866]: Module Mail::DKIM::Verifier 0.37
Jan 29 17:27:03 janus amavis[15866]: Module Mail::Header 2.06
Jan 29 17:27:03 janus amavis[15866]: Module Mail::Internet 2.06
Jan 29 17:27:03 janus amavis[15866]: Module Mail::SPF v2.007
Jan 29 17:27:03 janus amavis[15866]: Module Mail::SpamAssassin 3.003001
Jan 29 17:27:03 janus amavis[15866]: Module Net::DNS 0.66
Jan 29 17:27:03 janus amavis[15866]: Module Net::LDAP 0.40
Jan 29 17:27:03 janus amavis[15866]: Module Net::Server 0.97
Jan 29 17:27:03 janus amavis[15866]: Module NetAddr::IP 4.027
Jan 29 17:27:03 janus amavis[15866]: Module Socket6 0.23
Jan 29 17:27:03 janus amavis[15866]: Module Time::HiRes 1.9721
Jan 29 17:27:03 janus amavis[15866]: Module URI 1.53
Jan 29 17:27:03 janus amavis[15866]: Module Unix::Syslog 1.1
Jan 29 17:27:03 janus amavis[15866]: Amavis::DB code NOT loaded
Jan 29 17:27:03 janus amavis[15866]: Amavis::Cache code NOT loaded
Jan 29 17:27:03 janus amavis[15866]: SQL base code NOT loaded
Jan 29 17:27:03 janus amavis[15866]: SQL::Log code NOT loaded
Jan 29 17:27:03 janus amavis[15866]: SQL::Quarantine NOT loaded
Jan 29 17:27:03 janus amavis[15866]: Lookup::SQL code NOT loaded
Jan 29 17:27:03 janus amavis[15866]: Lookup::LDAP code loaded
Jan 29 17:27:03 janus amavis[15866]: AM.PDP-in proto code loaded
Jan 29 17:27:03 janus amavis[15866]: SMTP-in proto code loaded
Jan 29 17:27:03 janus amavis[15866]: Courier proto code NOT loaded
Jan 29 17:27:03 janus amavis[15866]: SMTP-out proto code loaded
Jan 29 17:27:03 janus amavis[15866]: Pipe-out proto code NOT loaded
Jan 29 17:27:03 janus amavis[15866]: BSMTP-out proto code NOT loaded
Jan 29 17:27:03 janus amavis[15866]: Local-out proto code loaded
Jan 29 17:27:03 janus amavis[15866]: OS_Fingerprint code NOT loaded
Jan 29 17:27:03 janus amavis[15866]: ANTI-VIRUS code loaded
Jan 29 17:27:03 janus amavis[15866]: ANTI-SPAM code loaded
Jan 29 17:27:03 janus amavis[15866]: ANTI-SPAM-EXT code NOT loaded
Jan 29 17:27:03 janus amavis[15866]: ANTI-SPAM-C code NOT loaded
Jan 29 17:27:03 janus amavis[15866]: ANTI-SPAM-SA code loaded
Jan 29 17:27:03 janus amavis[15866]: Unpackers code loaded
Jan 29 17:27:03 janus amavis[15866]: DKIM code loaded
Jan 29 17:27:03 janus amavis[15866]: Tools code NOT loaded
Jan 29 17:27:03 janus amavis[15866]: Found $file at /usr/bin/file
Jan 29 17:27:03 janus amavis[15866]: No $altermime, not using it
Jan 29 17:27:03 janus amavis[15866]: Internal decoder for .mail
Jan 29 17:27:03 janus amavis[15866]: Internal decoder for .asc
Jan 29 17:27:03 janus amavis[15866]: Internal decoder for .uue
Jan 29 17:27:03 janus amavis[15866]: Internal decoder for .hqx
Jan 29 17:27:03 janus amavis[15866]: Internal decoder for .ync
Jan 29 17:27:03 janus amavis[15866]: No decoder for .F tried: unfreeze, freeze -d, melt, fcat
Jan 29 17:27:03 janus amavis[15866]: Found decoder for .Z at /usr/bin/gzip -d
Jan 29 17:27:03 janus amavis[15866]: Found decoder for .gz at /usr/bin/gzip -d
Jan 29 17:27:03 janus amavis[15866]: Found decoder for .bz2 at /usr/bin/bzip2 -d
Jan 29 17:27:03 janus amavis[15866]: No decoder for .lzo tried: lzop -d
Jan 29 17:27:03 janus amavis[15866]: Found decoder for .rpm at /usr/bin/rpm2cpio
Jan 29 17:27:03 janus amavis[15866]: Found decoder for .cpio at /usr/bin/pax
Jan 29 17:27:03 janus amavis[15866]: Found decoder for .tar at /usr/bin/pax
Jan 29 17:27:03 janus amavis[15866]: Found decoder for .deb at /usr/bin/ar
Jan 29 17:27:03 janus amavis[15866]: Internal decoder for .zip
Jan 29 17:27:03 janus amavis[15866]: No decoder for .7z tried: 7zr, 7za, 7z
Jan 29 17:27:03 janus amavis[15866]: No decoder for .rar tried: rar, unrar
Jan 29 17:27:03 janus amavis[15866]: No decoder for .arj tried: arj, unarj
Jan 29 17:27:03 janus amavis[15866]: No decoder for .arc tried: nomarch, arc
Jan 29 17:27:03 janus amavis[15866]: No decoder for .zoo tried: zoo, unzoo
Jan 29 17:27:03 janus amavis[15866]: Found decoder for .lha at /usr/bin/lha
Jan 29 17:27:03 janus amavis[15866]: No decoder for .cab tried: cabextract
Jan 29 17:27:03 janus amavis[15866]: No decoder for .tnef tried: tnef
Jan 29 17:27:03 janus amavis[15866]: Internal decoder for .tnef
Jan 29 17:27:03 janus amavis[15866]: Found decoder for .exe at /usr/bin/lha
Jan 29 17:27:03 janus amavis[15866]: Using primary internal av scanner code for ClamAV-clamd
Jan 29 17:27:03 janus amavis[15866]: initializing Mail::SpamAssassin
Jan 29 17:27:03 janus zimbramon[11314]: 11314:info: Starting antivirus via zmcontrol
Jan 29 17:27:05 janus clamd[15940]: clamd daemon 0.96.5-broken-compiler (OS: linux-gnu, ARCH: x86_64, CPU: x86_64)
Jan 29 17:27:05 janus clamd[15940]: Log file size limited to 20971520 bytes.
Jan 29 17:27:05 janus clamd[15940]: Reading databases from /opt/zimbra/data/clamav/db
Jan 29 17:27:05 janus clamd[15940]: Not loading PUA signatures.
Jan 29 17:27:06 janus amavis[15866]: SpamControl: init_pre_fork on SpamAssassin done
Jan 29 17:27:06 janus amavis[15866]: extra modules loaded after daemonizing/chrooting: Mail/SpamAssassin/Plugin/FreeMail.pm
Jan 29 17:27:17 janus clamd[15940]: Loaded 878301 signatures.
Jan 29 17:27:18 janus clamd[15940]: TCP: Bound to address 127.0.0.1 on port 3310
Jan 29 17:27:18 janus clamd[15940]: TCP: Setting connection queue length to 15
Jan 29 17:27:18 janus clamd[15984]: Limits: Global size limit set to 51937280 bytes.
Jan 29 17:27:18 janus clamd[15984]: Limits: File size limit set to 51937280 bytes.
Jan 29 17:27:18 janus clamd[15984]: Limits: Recursion level limit set to 16.
Jan 29 17:27:18 janus clamd[15984]: Limits: Files limit set to 10000.
Jan 29 17:27:18 janus clamd[15984]: Archive support enabled.
Jan 29 17:27:18 janus clamd[15984]: Archive: Blocking encrypted archives.
Jan 29 17:27:18 janus clamd[15984]: Algorithmic detection enabled.
Jan 29 17:27:18 janus clamd[15984]: Portable Executable support enabled.
Jan 29 17:27:18 janus clamd[15984]: ELF support enabled.
Jan 29 17:27:18 janus clamd[15984]: Mail files support enabled.
Jan 29 17:27:18 janus clamd[15984]: OLE2 support enabled.
Jan 29 17:27:18 janus clamd[15984]: PDF support enabled.
Jan 29 17:27:18 janus clamd[15984]: HTML support enabled.
Jan 29 17:27:18 janus clamd[15984]: Self checking every 600 seconds.
Jan 29 17:27:19 janus zimbramon[11314]: 11314:info: Starting snmp via zmcontrol
Jan 29 17:27:19 janus zimbramon[11314]: 11314:info: Starting spell via zmcontrol
Jan 29 17:27:19 janus zimbramon[11314]: 11314:info: Starting mta via zmcontrol
Jan 29 17:27:20 janus saslauthd[16091]: detach_tty : master pid is: 16091
Jan 29 17:27:20 janus saslauthd[16091]: ipc_init : listening on socket: /opt/zimbra/cyrus-sasl-2.1.23.3z/state/mux
Jan 29 17:27:20 janus zimbramon[11314]: 11314:info: Starting stats via zmcontrol
Jan 30 01:27:30 janus postfix/postqueue[16866]: fatal: Queue report unavailable - mail system is down
Jan 30 01:28:00 janus postfix/postqueue[17643]: fatal: Queue report unavailable - mail system is down
Jan 29 17:28:09 janus zmmailboxdmgr[18297]: status requested
Jan 29 17:28:09 janus zmmailboxdmgr[18297]: status OK
Jan 29 17:28:14 janus zmmailboxdmgr[18545]: status requested
Jan 29 17:28:14 janus zmmailboxdmgr[18545]: status OK
Jan 29 17:28:14 janus zmmailboxdmgr[18621]: status requested
Jan 29 17:28:14 janus zmmailboxdmgr[18621]: status OK
Jan 30 01:28:30 janus postfix/postqueue[18670]: fatal: Queue report unavailable - mail system is down
Jan 30 01:29:00 janus postfix/postqueue[18695]: fatal: Queue report unavailable - mail system is down
Jan 30 01:29:30 janus postfix/postqueue[19239]: fatal: Queue report unavailable - mail system is down
Jan 29 17:29:45 janus zmmailboxdmgr[19784]: status requested
Jan 29 17:29:45 janus zmmailboxdmgr[19784]: status OK
Jan 29 17:29:46 janus zmmailboxdmgr[19859]: status requested
Jan 29 17:29:46 janus zmmailboxdmgr[19859]: status OK
Here is the output of the typical diagnostic commands:
Code:
[root@janus conf]# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
10.0.0.2 janus.astatic.net janus
[root@janus conf]# cat /etc/resolv.conf
search astatic.net
nameserver 10.0.0.2
[root@janus conf]# dig astatic.net mx
; <<>> DiG 9.3.4-P1 <<>> astatic.net mx
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40519
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 3
;; QUESTION SECTION:
;astatic.net. IN MX
;; ANSWER SECTION:
astatic.net. 600 IN MX 10 janus.astatic.net.
;; AUTHORITY SECTION:
astatic.net. 600 IN NS ns2.astatic.net.
astatic.net. 600 IN NS ns1.astatic.net.
;; ADDITIONAL SECTION:
janus.astatic.net. 600 IN A 10.0.0.2
ns1.astatic.net. 600 IN A 10.0.0.2
ns2.astatic.net. 600 IN A 10.0.0.2
;; Query time: 0 msec
;; SERVER: 10.0.0.2#53(10.0.0.2)
;; WHEN: Sat Jan 29 17:43:47 2011
;; MSG SIZE rcvd: 135
[root@janus conf]# dig astatic.net any
; <<>> DiG 9.3.4-P1 <<>> astatic.net any
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 51914
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 3
;; QUESTION SECTION:
;astatic.net. IN ANY
;; ANSWER SECTION:
astatic.net. 600 IN A 10.0.0.12
astatic.net. 600 IN SOA ns1.astatic.net. hostmaster.astatic.net. 39 28800 7200 604800 600
astatic.net. 600 IN TXT "v=spf1 a mx"
astatic.net. 600 IN NS ns1.astatic.net.
astatic.net. 600 IN NS ns2.astatic.net.
astatic.net. 600 IN MX 10 janus.astatic.net.
;; ADDITIONAL SECTION:
ns1.astatic.net. 600 IN A 10.0.0.2
ns2.astatic.net. 600 IN A 10.0.0.2
janus.astatic.net. 600 IN A 10.0.0.2
;; Query time: 0 msec
;; SERVER: 10.0.0.2#53(10.0.0.2)
;; WHEN: Sat Jan 29 17:43:47 2011
;; MSG SIZE rcvd: 222
[root@janus conf]# host `hostname`
janus.astatic.net has address 10.0.0.2
janus.astatic.net mail is handled by 10 janus.astatic.net.
[root@janus conf]# su - zimbra
[zimbra@janus ~]$ postfix check
postfix/postfix-script: warning: not owned by root: /opt/zimbra/data/postfix/spool
postfix/postfix-script: warning: not owned by root: /opt/zimbra/postfix-2.6.7.2z/conf/main.cf
postfix/postfix-script: warning: not owned by root: /opt/zimbra/postfix-2.6.7.2z/conf/master.cf
postfix/postfix-script: warning: not owned by root: /opt/zimbra/postfix-2.6.7.2z/conf/master.cf.in
Any insight would be greatly appreciated.