  #1 (permalink)  
Old 05-18-2010, 02:36 AM
Intermediate Member
 
Posts: 21
Question [SOLVED] Zimbra doesn't start/stop, stats still running

Hello,

After Zimbra had worked fine for nearly 8 months, I ran into some serious problems for the first time.
As I'm not very experienced with Zimbra and Linux servers, I hope that you can help me.

I tried to open the web client, which didn't work. That happens every once in a while, but normally it is easily fixed with a server restart. This time the server restart didn't help, so I logged onto the server and ran
Code:
zimbra@srvXXXXx:~$ zmcontrol stop
Host srvXXXX.domain.tld
	Stopping stats...Done.
	Stopping mta...Done.
	Stopping spell...Done.
	Stopping snmp...Done.
	Stopping archiving...Done.
	Stopping antivirus...Done.
	Stopping antispam...Done.
	Stopping imapproxy...Done.
	Stopping memcached...Done.
	Stopping mailbox...Done.
	Stopping logger...Done.
	Stopping ldap...Done.
then zmcontrol start
Code:
zimbra@srvXXXX:~$ zmcontrol start
Host srvXXXX.domain.tld
	Starting ldap...Done.
Failed.
for zmcontrol status the output is this
Code:
zimbra@srvXXXX:~$ zmcontrol status
Host srvXXXX.domain.tld
	imapproxy               Stopped
	ldap                    Stopped
	mailbox                 Stopped
	memcached               Stopped
	mta                     Stopped
	snmp                    Stopped
	spell                   Stopped
	stats                   Running
Then I tried to manually kill all the processes Zimbra was running
Code:
zimbra@srvXXXX:~$ ps aux | grep zimbra
zimbra    1733  0.0  2.2 223588 45752 ?        Ssl  May17   0:00 /opt/zimbra/openldap/sbin/slapd -l LOCAL0 -4 -u zimbra -h ldap://srvXXXX.domain.tld:389 ldapi:/// -F /opt/zimbra/data/ldap/config
zimbra    3528  0.0  0.0  33316  1244 pts/1    S    09:04   0:00 su - zimbra
zimbra    3529  0.0  0.1  18056  2060 pts/1    S    09:04   0:00 -su
zimbra    4129  0.0  0.0  14780  1000 pts/1    R+   09:12   0:00 ps aux
zimbra    4130  0.0  0.0   3940   612 pts/1    S+   09:12   0:00 grep zimbra
after the command "kill 1733":
Code:
zimbra@srvXXXX:~$ ps aux | grep zimbra
zimbra    4144  0.0  0.0  33316  1248 pts/0    S    09:15   0:00 su - zimbra
zimbra    4145  0.0  0.0  18056  2048 pts/0    S    09:15   0:00 -su
zimbra    4159  0.0  0.0  14780  1004 pts/0    R+   09:15   0:00 ps aux
zimbra    4160  0.0  0.0   3940   612 pts/0    S+   09:15   0:00 grep zimbra
trying to start the server again:
Code:
zimbra@srvXXXX:~$ zmcontrol start
Host srvXXXX.domain.tld
	Starting ldap...Done.
Failed.
When I looked at /var/log/zimbra.log I realized that the last entry was from last night and that there are no entries for my attempts today. The last entries are:
Code:
May 17 21:07:10 srvXXXX zimbramon[9045]: 9045:info: Stopping services initiated by zmcontrol
May 17 21:07:10 srvXXXX zimbramon[9045]: 9045:info: Stopping stats via zmcontrol
May 17 21:07:11 srvXXXX zimbramon[9045]: 9045:info: Stopping mta via zmcontrol
May 17 21:07:14 srvXXXX postfix/postfix-script[9239]: fatal: the Postfix mail system is not running
May 17 21:07:14 srvXXXX zimbramon[9045]: 9045:info: Stopping spell via zmcontrol
May 17 21:07:14 srvXXXX zimbramon[9045]: 9045:info: Stopping snmp via zmcontrol
May 17 21:07:14 srvXXXX zimbramon[9045]: 9045:info: Stopping archiving via zmcontrol
May 17 21:07:14 srvXXXX zimbramon[9045]: 9045:info: Stopping antivirus via zmcontrol
May 17 21:07:15 srvXXXX zimbramon[9045]: 9045:info: Stopping antispam via zmcontrol
May 17 21:07:15 srvXXXX zimbramon[9045]: 9045:info: Stopping imapproxy via zmcontrol
May 17 21:07:15 srvXXXX zimbramon[9045]: 9045:info: Stopping memcached via zmcontrol
May 17 21:07:16 srvXXXX zimbramon[9045]: 9045:info: Stopping mailbox via zmcontrol
May 17 21:07:16 srvXXXX zmmailboxdmgr[9356]: status requested
May 17 21:07:16 srvXXXX zmmailboxdmgr[9356]: stale pid 6833 found in /opt/zimbra/log/zmmailboxd_manager.pid: No such process
May 17 21:07:16 srvXXXX zmmailboxdmgr[9356]: assuming no other instance is running
May 17 21:07:16 srvXXXX zmmailboxdmgr[9356]: file /opt/zimbra/log/zmmailboxd.pid does not exist
May 17 21:07:16 srvXXXX zmmailboxdmgr[9356]: assuming no other instance is running
May 17 21:07:16 srvXXXX zmmailboxdmgr[9356]: no manager process is running
May 17 21:07:19 srvXXXX zimbramon[9045]: 9045:info: Stopping convertd via zmcontrol
May 17 21:07:19 srvXXXX zimbramon[9045]: 9045:info: Stopping logger via zmcontrol
May 17 21:07:19 srvXXXX zimbramon[9045]: 9045:info: Stopping ldap via zmcontrol
May 17 21:07:28 srvXXXX zimbramon[1694]: 1694:info: Starting snmp via zmcontrol
May 17 21:07:28 srvXXXX zimbramon[1694]: 1694:info: Starting spell via zmcontrol
May 17 21:07:30 srvXXXX zimbramon[1694]: 1694:info: Starting mta via zmcontrol
May 17 21:07:33 srvXXXX zimbramon[9647]: 9647:info: zmmtaconfig: zmmtaconfig started on srvXXXX.domain.tld with loglevel=3 pid=9647
May 17 21:07:45 srvXXXX zimbramon[9647]: 9647:info: zmmtaconfig: Skipping Global system configuration update.
May 17 21:07:45 srvXXXX zimbramon[9647]: 9647:info: zmmtaconfig: gacf ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException srvXXXX.domain.tld:389)
May
I really don't know what I could do next. In this thread:
Starting ldap...Done. FAILED
I read that I should post the output of
Code:
cat /etc/hosts
cat /etc/resolv.conf
dig yourdomain.com mx
dig yourdomain.com any
host `hostname`  <-- use that exact command with backticks not single quotes
which I will do just in case you need it:

cat /etc/hosts
Code:
zimbra@srvXXXX:~$ cat /etc/hosts
127.0.0.1	localhost.localdomain	localhost
95.xxx.xx.xx	srvXXXX.domain.tld	srvXXXX
cat /etc/resolv.conf
Code:
zimbra@srvXXXX:~$ cat /etc/resolv.conf
nameserver 95.xxx.xx.xx
dig domain.tld mx
Code:
zimbra@srvXXXX:~$ dig domain.tld mx

; <<>> DiG 9.4.2-P2 <<>> domain.tld mx
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 32195
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;domain.tld.			IN	MX

;; Query time: 0 msec
;; SERVER: 95.xxx.xx.xx#53(95.xxx.xx.xx)
;; WHEN: Tue May 18 09:30:13 2010
;; MSG SIZE  rcvd: 28
dig domain.tld any
Code:
zimbra@srvXXXX:~$ dig domain.tld any

; <<>> DiG 9.4.2-P2 <<>> domain.tld any
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 65489
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;domain.tld.			IN	ANY

;; Query time: 0 msec
;; SERVER: 95.xxx.xx.xx#53(95.xxx.xx.xx)
;; WHEN: Tue May 18 09:32:07 2010
;; MSG SIZE  rcvd: 28
host `hostname`
Code:
zimbra@srvXXXX:~$ host `hostname`
srvXXXX.domain.tld has address 95.xxx.xx.xx
where all the domains and IP addresses are correct.

I hope that somebody has an idea of what to do next, thanks a lot in advance for any ideas.

Best regards

Last edited by Chodid; 05-18-2010 at 05:34 AM..
  #2 (permalink)  
Old 05-18-2010, 02:44 AM
Junior Member
 
Posts: 10
Default

I am no expert in Zimbra, but before an expert comes in you can try two more things:

1. Check if there is any other MTA running (e.g. exim, sendmail, etc.).
Stop them if there are any (presuming you use Postfix in Zimbra).

2. Run /opt/zimbra/libexec/zmfixperms to fix any permission problems.
Try restarting Zimbra after that.
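
The first check can be sketched in shell. This is only a minimal sketch, not a Zimbra tool: the function name `find_other_mtas` and the list of process names it looks for (sendmail, exim, exim4) are assumptions chosen for illustration, so adjust them to whatever MTAs your distribution might have installed.

```shell
# Hypothetical helper: read one process name per line on stdin and print
# the names that look like a non-Zimbra MTA (sendmail, exim or exim4).
# The name list is an assumption; extend it as needed.
find_other_mtas() {
    grep -E '^(sendmail|exim4?)$' | sort -u
}

# Typical use on a live system (prints nothing when no such MTA is running):
#   ps -eo comm= | find_other_mtas
```

An empty result suggests that none of the listed MTAs is running, which leaves the permissions fix in step 2 as the next thing to try.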

Hope it helps.
  #3 (permalink)  
Old 05-18-2010, 03:22 AM
Intermediate Member
 
Posts: 21
Default

Thanks for your quick reply.

I checked for other MTAs, but neither sendmail nor exim is running. That was actually no surprise, because I didn't set them up and Zimbra worked fine before.

Then I also ran the fixperms command (as root, because as the zimbra user it gave me permission errors) and tried restarting Zimbra. I still get the same error.
  #4 (permalink)  
Old 05-18-2010, 03:44 AM
Junior Member
 
Posts: 10
Default

The reason I suspect another MTA was running is this line in your first post:
May 17 21:07:14 srv2050 postfix/postfix-script[9239]: fatal: the Postfix mail system is not running

What are the processes running after you stop all zimbra stuff?
I suspect it may be due to some ports being used, preventing zimbra services from starting.
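
One way to test the port theory is to look at the listening TCP sockets while Zimbra is stopped. The sketch below assumes the listing comes from `ss -ltn` or `netstat -ltn`, both of which put the local address in the fourth column. The function name `port_is_listening` is hypothetical. Port 389 is the LDAP port from the slapd command line in the first post, and 25 is the standard SMTP port.

```shell
# Hypothetical helper: succeed (exit 0) if the given TCP port appears as a
# local listening address in "ss -ltn" / "netstat -ltn" style output on stdin.
port_is_listening() {
    awk -v p="$1" '
        # Field 4 is the local address (for example 0.0.0.0:389) in both tools.
        $4 ~ (":" p "$") { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Typical use while Zimbra is stopped:
#   ss -ltn | port_is_listening 389 && echo "port 389 is already in use"
#   ss -ltn | port_is_listening 25  && echo "port 25 is already in use"
```

If neither port shows up while Zimbra is stopped, a port conflict is unlikely to be what keeps the services from starting.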
  #5 (permalink)  
Old 05-18-2010, 03:59 AM
Zimbra Consultant & Moderator
 
Posts: 20,313
Default

As per the post above, check if there's another MTA running, and you should also check the log files for errors when you start Zimbra.

There is, however, another problem that's of more concern. In the dig results that you've posted above, there are no DNS A or MX records returned for your domain; you need to check that as well.
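
The status can be read straight from the header line of the dig output. The following is a small sketch, with the hypothetical function name `dig_status`, that pulls the status field out of whatever dig prints. A healthy lookup normally reports NOERROR together with a non-zero ANSWER count, whereas the output in the first post shows SERVFAIL and ANSWER: 0.

```shell
# Hypothetical helper: print the response status (NOERROR, SERVFAIL,
# NXDOMAIN, ...) found in the header of dig output read from stdin.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Typical use:
#   dig domain.tld mx | dig_status
# To see only the answer records (empty output means none were returned):
#   dig +short domain.tld mx
```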
__________________
Regards


Bill

Last edited by phoenix; 05-18-2010 at 07:40 AM..
  #6 (permalink)  
Old 05-18-2010, 05:30 AM
Intermediate Member
 
Posts: 21
Default

Here's a list of all the processes. I checked, and "flush-202:1" seems to belong to Postfix, but when I kill it and recheck the processes it's always still running.
Code:
root@srvXXXX:~# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   4016   880 ?        Ss   May17   0:00 /sbin/init
root         2  0.0  0.0      0     0 ?        S    May17   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S    May17   0:00 [migration/0]
root         4  0.0  0.0      0     0 ?        S    May17   0:00 [ksoftirqd/0]
root         5  0.0  0.0      0     0 ?        S    May17   0:00 [events/0]
root         6  0.0  0.0      0     0 ?        S    May17   0:00 [cpuset]
root         7  0.0  0.0      0     0 ?        S    May17   0:00 [khelper]
root        10  0.0  0.0      0     0 ?        S    May17   0:00 [async/mgr]
root        16  0.0  0.0      0     0 ?        S    May17   0:00 [xenwatch]
root        17  0.0  0.0      0     0 ?        S    May17   0:00 [xenbus]
root       237  0.0  0.0      0     0 ?        S    May17   0:00 [sync_supers]
root       239  0.0  0.0      0     0 ?        S    May17   0:00 [bdi-default]
root       241  0.0  0.0      0     0 ?        S    May17   0:00 [kblockd/0]
root       250  0.0  0.0      0     0 ?        S    May17   0:00 [ata/0]
root       251  0.0  0.0      0     0 ?        S    May17   0:00 [ata_aux]
root       259  0.0  0.0      0     0 ?        S    May17   0:00 [khubd]
root       262  0.0  0.0      0     0 ?        S    May17   0:00 [kseriod]
root       284  0.0  0.0      0     0 ?        S    May17   0:00 [rpciod/0]
root       306  0.0  0.0      0     0 ?        S    May17   0:00 [kswapd0]
root       357  0.0  0.0      0     0 ?        S    May17   0:00 [aio/0]
root       375  0.0  0.0      0     0 ?        S    May17   0:00 [nfsiod]
root       381  0.0  0.0      0     0 ?        S<   May17   0:00 [kslowd000]
root       382  0.0  0.0      0     0 ?        S<   May17   0:00 [kslowd001]
root       395  0.0  0.0      0     0 ?        S    May17   0:00 [xfs_mru_cache]
root       396  0.0  0.0      0     0 ?        S    May17   0:00 [xfslogd/0]
root       397  0.0  0.0      0     0 ?        S    May17   0:00 [xfsdatad/0]
root       398  0.0  0.0      0     0 ?        S    May17   0:00 [xfsconvertd/0]
root       400  0.0  0.0      0     0 ?        S    May17   0:00 [crypto/0]
root       491  0.0  0.0      0     0 ?        S    May17   0:00 [khvcd]
root       560  0.0  0.0      0     0 ?        S    May17   0:00 [scsi_tgtd/0]
root       569  0.0  0.0      0     0 ?        S    May17   0:00 [iscsi_eh]
root       584  0.0  0.0      0     0 ?        S    May17   0:00 [bond0]
root       616  0.0  0.0      0     0 ?        S    May17   0:00 [kpsmoused]
root       624  0.0  0.0      0     0 ?        S    May17   0:00 [kstriped]
root       655  0.0  0.0      0     0 ?        S    May17   0:00 [usbhid_resumer]
root       684  0.0  0.0      0     0 ?        S    May17   0:00 [kjournald]
root       815  0.0  0.0  16848   960 ?        S<s  May17   0:00 /sbin/udevd --daemon
root      1283  0.0  0.0      0     0 ?        S    May17   0:00 [flush-202:1]
root      1514  0.0  0.0   3860   592 tty4     Ss+  May17   0:00 /sbin/getty 38400 tty4
root      1515  0.0  0.0   3860   596 tty5     Ss+  May17   0:00 /sbin/getty 38400 tty5
root      1517  0.0  0.0   3860   592 tty2     Ss+  May17   0:00 /sbin/getty 38400 tty2
root      1518  0.0  0.0   3860   596 tty3     Ss+  May17   0:00 /sbin/getty 38400 tty3
root      1519  0.0  0.0   3860   596 tty6     Ss+  May17   0:00 /sbin/getty 38400 tty6
syslog    1557  0.0  0.0  12292   732 ?        Ss   May17   0:00 /sbin/syslogd -u syslog
root      1579  0.0  0.0   8128   592 ?        S    May17   0:00 /bin/dd bs 1 if /proc/kmsg of /var/run/klogd/kmsg
klog      1580  0.0  0.1   6536  3260 ?        Ss   May17   0:00 /sbin/klogd -P /var/run/klogd/kmsg
bind      1605  0.0  0.5  70844 12092 ?        Ssl  May17   0:00 /usr/sbin/named -u bind
root      1629  0.0  0.0  50912  1160 ?        Ss   May17   0:00 /usr/sbin/sshd
root      1656  0.0  0.0  22476  1296 ?        S    May17   0:00 /usr/sbin/vsftpd
root      1766  0.0  0.0   3860   592 tty1     Ss+  May17   0:00 /sbin/getty 38400 tty1
root      2966  0.0  0.0  22476   720 ?        Ss   08:31   0:00 /usr/sbin/vsftpd
nobody    2967  0.0  0.0  28760  1048 ?        S    08:31   0:00 /usr/sbin/vsftpd
root      5822  0.0  0.1  67968  2916 ?        Ss   12:08   0:00 sshd: root@pts/0 
root      5824  0.0  0.0  17600  1860 pts/0    Ss   12:08   0:00 -bash
root      5840  0.0  0.0  14780  1024 pts/0    R+   12:10   0:00 ps aux
The last output of zimbra.log is in my first post. In the other logs there isn't anything standing out except for mailbox.log, where the last entries are from last night as well, with no signs of my attempts today:
Code:
2010-05-17 20:17:11,936 WARN  [main] [] redolog - There were 33 bytes of junk data at the end of /opt/zimbra/redolog/redo.log.  File will be truncated to 32735 bytes.
2010-05-17 20:17:12,009 INFO  [main] [] redolog - Redoing 1 uncommitted transactions
2010-05-17 20:17:12,009 INFO  [main] [] redolog - REDOING: txn 1274095591.50 [PurgeOldMessages] ver=1.28, tstamp=1274099440901, change=82005, mailbox=5
2010-05-17 20:17:15,037 INFO  [main] [] index - Initialized Index for mailbox 5 directory: LuceneIndex at com.zimbra.cs.index.Z23FSDirectory@/opt/zimbra/index/0/5/index/0 Analyzer=com.zimbra.cs.index.Zimb$
2010-05-17 20:17:15,038 INFO  [main] [] cache - initializing folder and tag caches for mailbox 5
2010-05-17 20:17:16,993 INFO  [main] [] mbxmgr - Mailbox 5 account 6899b6b4-7ecd-455a-8f84-991abc5b00d1 LOADED
2010-05-17 20:17:17,190 INFO  [main] [] redolog - Finished pre-startup crash recovery
2010-05-17 20:17:17,319 WARN  [main] [] dbconn - ignoring error while forcing mysql to flush innodb log to disk
java.sql.SQLException: Error writing file './zimbra/flush_enforcer.frm' (Errcode: 28)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:946)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2985)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1631)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1723)
        at com.mysql.jdbc.Connection.execSQL(Connection.java:3283)
        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1332)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1604)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1519)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:1504)
        at com.zimbra.cs.db.MySQL.flushToDisk(MySQL.java:205)
        at com.zimbra.cs.redolog.RedoLogManager.rollover(RedoLogManager.java:604)
        at com.zimbra.cs.redolog.RedoLogManager.forceRollover(RedoLogManager.java:642)
        at com.zimbra.cs.redolog.RedoLogManager.forceRollover(RedoLogManager.java:638)
        at com.zimbra.cs.redolog.RedoLogManager.start(RedoLogManager.java:268)
        at com.zimbra.cs.redolog.DefaultRedoLogProvider.startup(DefaultRedoLogProvider.java:41)
        at com.zimbra.cs.util.Zimbra.startup(Zimbra.java:209)
        at com.zimbra.cs.util.Zimbra.startup(Zimbra.java:122)
        at com.zimbra.soap.SoapServlet.init(SoapServlet.java:125)
        at javax.servlet.GenericServlet.init(GenericServlet.java:241)
        at org.mortbay.jetty.servlet.ServletHolder.initServlet(ServletHolder.java:440)
        at org.mortbay.jetty.servlet.ServletHolder.doStart(ServletHolder.java:263)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
        at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:685)
        at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
        at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1250)
        at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:517)
        at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:467)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
        at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
        at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
        at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
        at org.mortbay.comp
I hope this isn't too much. I'll look into the dig results now, but since it worked with these settings before, I don't think that's the reason why Zimbra doesn't start up anymore.

Thanks for all the answers so far.
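
A detail in the mailbox.log excerpt above is worth a closer look: the SQLException ends with "(Errcode: 28)". On Linux, operating system error 28 is ENOSPC, "No space left on device". The sketch below, using the hypothetical function name `extract_errcodes`, lists the distinct error codes that appear in a log so they can be looked up. If MySQL's `perror` utility is available, `perror 28` should print the description.

```shell
# Hypothetical helper: print the distinct numeric codes from "(Errcode: N)"
# fragments found in log text read from stdin, one per line, sorted.
extract_errcodes() {
    sed -n 's/.*(Errcode: \([0-9][0-9]*\)).*/\1/p' | sort -un
}

# Typical use, with the log path assumed to be the default Zimbra location:
#   extract_errcodes < /opt/zimbra/log/mailbox.log
```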

Last edited by Chodid; 05-18-2010 at 05:35 AM..
  #7 (permalink)  
Old 05-21-2010, 02:57 PM
Intermediate Member
 
Posts: 21
Default

Well, I was trying to find the cause myself, and in that process I realized that the simple reason why Zimbra didn't start anymore was that there wasn't any free disk space on the server. Over time, without logrotate, the log files just got too big. After freeing up disk space and restarting Zimbra, it suddenly worked again.
Stupid, I know, but it took me a couple of days before I realized it. It's always the last thing one thinks of.
Thanks a lot for your help!

Best regards
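
For anyone hitting the same symptoms, a quick disk check along these lines may save some time. This is a minimal sketch: the function name `use_percent` is hypothetical, and it expects the POSIX-format output of `df -P`, where the fifth column is the capacity in percent.

```shell
# Hypothetical helper: read "df -P" output on stdin and print the use
# percentage of the first listed filesystem, without the % sign.
use_percent() {
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Typical use:
#   df -P /opt/zimbra | use_percent
# To see which log files take the most space (sizes in KB, largest last):
#   du -sk /opt/zimbra/log/* /var/log/* 2>/dev/null | sort -n | tail
```

A value at or near 100 for the filesystem that holds /opt/zimbra or /var/log would match the situation described in this thread.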