#1
07-29-2009, 08:20 PM
Junior Member
Posts: 8
[SOLVED] Zimbra hangs with 100% CPU load

Hi,

I have a Zimbra LDAP server that has been running for two months. Today I found that the LDAP service had suddenly stopped.

From /var/log/messages:
Jul 30 08:21:53 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: gacf ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException svr01.forestindo.local:389)

Note: the other Zimbra services work properly; only LDAP stopped.
This has happened twice.
Do you have any idea what causes this and how to fix it?

Thanks in advance, I appreciate your help.

#2
07-29-2009, 09:36 PM
Outstanding Member
Posts: 594

> Jul 30 08:21:53 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: gacf ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException svr01.forestindo.local:389)


What error messages do you see before this one?

#3
07-30-2009, 12:18 AM
Junior Member
Posts: 8

Sorry, that was from the Zimbra log, /var/log/zimbra.log:

Jul 30 08:20:06 xxx01 zimbramon[19016]: 19016:info: 2009-07-30 08:20:01, STATUS: xxx01.serverku.local: ldap: Running
Jul 30 08:20:06 xxx01 zimbramon[19016]: 19016:info: 2009-07-30 08:20:01, STATUS: xxx01.serverku.local: snmp: Running
Jul 30 08:20:06 xxx01 zimbramon[19016]: 19016:info: 2009-07-30 08:20:01, STATUS: xxx01.serverku.local: stats: Running
Jul 30 08:21:53 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: Skipping Global system configuration update.
Jul 30 08:21:53 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: gacf ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException xxx01.serverku.local:389)
Jul 30 08:21:54 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: Skipping All Reverse Proxy URLs update.
Jul 30 08:21:54 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: Skipping getAllReverseProxyURLs ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException xxx01.serverku.local:389)
Jul 30 08:21:55 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: Skipping All Reverse Proxy Backends update.
Jul 30 08:21:55 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: Skipping getAllReverseProxyBackends ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException xxx01.serverku.local:389)
Jul 30 08:21:56 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: Skipping All Memcached Servers update.
Jul 30 08:21:56 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: Skipping getAllMemcachedServers ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException xxx01.serverku.local:389)
Jul 30 08:21:56 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: Skipping All MTA Authentication Target URLs update.
Jul 30 08:21:56 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: Skipping getAllMtaAuthURLs ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException xxx01.serverku.local:389)
Jul 30 08:21:57 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: Skipping Configuration for server xxx01.serverku.local update.
Jul 30 08:21:57 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: gs:xxx01.serverku.local ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException xxx01.serverku.local:389)
Jul 30 08:21:57 xxx01 zimbramon[28791]: 28791:info: zmmtaconfig: Sleeping...Key lookup failed.
Jul 30 08:22:04 xxx01 zimbramon[19799]: 19799:info: 2009-07-30 08:22:01, STATUS: xxx01.serverku.local: ldap: Stopped
Jul 30 08:22:04 xxx01 zimbramon[19799]: 19799:info: 2009-07-30 08:22:01, STATUS: xxx01.serverku.local: snmp: Running
Jul 30 08:22:04 xxx01 zimbramon[19799]: 19799:info: 2009-07-30 08:22:01, STATUS: xxx01.serverku.local: stats: Running
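
One way to pin down when a service dropped out is to scan the zimbramon STATUS lines for state changes. This is a minimal Python sketch, assuming the line format shown in the log above; the regex and the function name `status_changes` are illustrative and not part of Zimbra.

```python
import re

# Matches zimbramon status lines such as:
#   "... 2009-07-30 08:20:01, STATUS: xxx01.serverku.local: ldap: Running"
STATUS_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}), STATUS: "
    r"(?P<host>\S+): (?P<service>[^:]+): (?P<state>\w+)"
)


def status_changes(lines):
    """Yield (timestamp, service, old_state, new_state) whenever a service changes state."""
    last = {}  # (host, service) -> most recently seen state
    for line in lines:
        m = STATUS_RE.search(line)
        if not m:
            continue  # not a STATUS line (e.g. a zmmtaconfig error)
        key = (m["host"], m["service"])
        state = m["state"]
        if key in last and last[key] != state:
            yield m["ts"], m["service"], last[key], state
        last[key] = state


# Four lines copied from the log in this post.
sample = [
    "Jul 30 08:20:06 xxx01 zimbramon[19016]: 19016:info: 2009-07-30 08:20:01, STATUS: xxx01.serverku.local: ldap: Running",
    "Jul 30 08:20:06 xxx01 zimbramon[19016]: 19016:info: 2009-07-30 08:20:01, STATUS: xxx01.serverku.local: snmp: Running",
    "Jul 30 08:22:04 xxx01 zimbramon[19799]: 19799:info: 2009-07-30 08:22:01, STATUS: xxx01.serverku.local: ldap: Stopped",
    "Jul 30 08:22:04 xxx01 zimbramon[19799]: 19799:info: 2009-07-30 08:22:01, STATUS: xxx01.serverku.local: snmp: Running",
]

for change in status_changes(sample):
    print(change)
```

On a real server the same function could be fed the lines of /var/log/zimbra.log; the window between the last "Running" and the first "Stopped" is where to look for the underlying error.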

Last edited by yut4k4; 07-30-2009 at 12:38 AM..

#4
08-11-2009, 04:27 PM
Intermediate Member
Posts: 17
Zimbra hangs with 100% CPU load

I am running Zimbra 5.0.18. On Sunday the server suddenly went to 100% CPU load and stopped responding, even on a terminal. I have to physically power-cycle it to regain control. Zimbra then starts fine, but within two to five minutes the 100% CPU hang begins again.
zmmtaconfig.log contains this repeating pattern:
Code:
Mon Aug 10 18:54:06 2009  Skipping Global system configuration update.
Mon Aug 10 18:54:06 2009  gacf ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException mail.cfaw.info:389) 
Mon Aug 10 18:54:07 2009  Skipping All Reverse Proxy URLs update.
Mon Aug 10 18:54:07 2009  Skipping getAllReverseProxyURLs ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException mail.cfaw.info:389) 
Mon Aug 10 18:54:07 2009  Skipping All Reverse Proxy Backends update.
Mon Aug 10 18:54:07 2009  Skipping getAllReverseProxyBackends ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException mail.cfaw.info:389) 
Mon Aug 10 18:54:08 2009  Skipping All Memcached Servers update.
Mon Aug 10 18:54:08 2009  Skipping getAllMemcachedServers ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException mail.cfaw.info:389) 
Mon Aug 10 18:54:08 2009  Skipping All MTA Authentication Target URLs update.
Mon Aug 10 18:54:08 2009  Skipping getAllMtaAuthURLs ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException mail.cfaw.info:389) 
Mon Aug 10 18:54:09 2009  Skipping Configuration for server mail.cfaw.info update.
Mon Aug 10 18:54:09 2009  gs:mail.cfaw.info ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.naming.CommunicationException mail.cfaw.info:389)
I have spent a day Googling for answers and am at a loss.
Are there any other logs I should be looking at?
One possibility: I have an account with Campaign Monitor which was broken into this week, possibly resulting in a massive amount of email being sent to me. I have tried blocking all the mail ports on the router to isolate the server but this doesn't solve the problem.

#5
08-12-2009, 12:55 AM
Moderator
Posts: 7,911

Well, for some reason it appears unable to query LDAP. Is there anything else in /var/log/zimbra.log or /opt/zimbra/log/*? Has anything else changed on the server, e.g. software or patches installed? Does dmesg or /var/log/messages show anything?
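
Since every error in the log is a javax.naming.CommunicationException on port 389, a basic first check is whether that port accepts TCP connections at all. This is a minimal Python sketch; the helper name `can_connect` is made up for illustration. It only tests reachability, so a True result does not prove that slapd is answering queries.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example call, using the host and port from the log in this thread:
#   can_connect("mail.cfaw.info", 389)
```

If this returns False while the LDAP service is reported as running, the next things to look at are name resolution for the server's hostname and any firewall rules on port 389.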

#6
08-12-2009, 04:23 AM
Intermediate Member
Posts: 17

There does seem to be a problem with LDAP, but that might be a symptom rather than the cause: if the CPU is running at 100%, LDAP can't function. Here is another excerpt from /var/log/zimbra.log:
Code:
Aug 10 13:40:50 mail postfix/trivial-rewrite[16785]: fatal: proxy:ldap:/opt/zimbra/conf/ldap-vad.cf(0,lock|fold_fix): table lookup problem
Aug 10 13:40:50 mail postfix/proxymap[16788]: error: dict_ldap_connect: Unable to set STARTTLS: -1: Can't contact LDAP server
Aug 10 13:40:50 mail last message repeated 2 times
What is really strange is that Zimbra functions fine for up to five minutes, delivering and receiving email, before it dies. It goes down so suddenly that even though I am running top to check which processes are using the CPU, top dies without showing the culprit.

#7
08-12-2009, 04:24 AM
Moderator
Posts: 7,911

How much memory does the server have?

#8
08-12-2009, 04:39 AM
Intermediate Member
Posts: 17

2 GB. I have tracked free memory up to the point of the crash, and only around 1 GB is in use.

#9
08-12-2009, 04:58 AM
Moderator
Posts: 7,911

Which process is hogging the CPU?

#10
08-12-2009, 05:06 AM
Intermediate Member
Posts: 17

The problem is that the system goes to 100% so fast that I can't see which process is hogging it. I run top and watch it until it freezes, and the culprit never shows.
The machine runs as a Xen VM under CentOS, so I can watch the CPU utilization graph from the hypervisor, but the hypervisor won't tell me *why* it has gone to 100%. (By the way, the CPU is a fairly powerful dual-core AMD6400.)
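
When top freezes before it shows the culprit, one option is to write process snapshots to disk at short intervals and read the last entry after the reboot. This is a rough Python sketch, assuming a Linux `ps` that accepts `-eo pid,pcpu,comm`; the function names and the log path in the usage note are hypothetical. Note that the `pcpu` column from ps is averaged over each process's lifetime, so it is only a coarse indicator compared with top.

```python
import os
import subprocess
import time


def top_cpu(ps_output, n=5):
    """Return the n highest-CPU (pid, cpu_percent, command) rows from ps output."""
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, cpu, comm = parts
        try:
            rows.append((int(pid), float(cpu), comm))
        except ValueError:
            continue  # ignore rows that do not parse as numbers
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows[:n]


def log_snapshot(path, n=5):
    """Append a timestamped list of the busiest processes to path and force it to disk."""
    out = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    with open(path, "a") as f:
        f.write(time.strftime("%Y-%m-%d %H:%M:%S") + "\n")
        for pid, cpu, comm in top_cpu(out, n):
            f.write(f"  {pid:>7} {cpu:5.1f}% {comm}\n")
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before a possible hang
```

Calling `log_snapshot("/var/tmp/cpu-snapshots.log")` once a second from a small loop, started right after Zimbra comes up, should leave a record of the last readable state before the freeze.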