So, after doing a full cache flush in the browser, things did speed up a bit, but it was a temporary win.
I'm now able to correlate SLAPD processor utilization spikes, MTA failures to authenticate against LDAP, and slow or failed loading of the Administrative web interface.
Below are log and tool outputs pulled during a particularly ugly attempt to load the Administrative web interface; in fact, the interface never loaded on this try.
Excerpt from zimbra.log on my MTA server:
Code:
May 25 15:25:52 mta02 postfix/trivial-rewrite[6304]: warning: dict_ldap_lookup: Search error -5: Timed out
May 25 15:25:52 mta02 postfix/trivial-rewrite[6304]: fatal: ldap:/opt/zimbra/conf/ldap-vmd.cf(0,lock|fold_fix): table lookup problem
May 25 15:25:53 mta02 postfix/master[5436]: warning: process /opt/zimbra/postfix/libexec/trivial-rewrite pid 6304 exit status 1
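For anyone wanting to line these timeouts up against other metrics, here is a rough Python sketch that counts the LDAP timeout warnings per minute. It assumes the log lines look exactly like the excerpt above; the function name and regex are my own illustration, not anything shipped with Zimbra or Postfix.

```python
import re
from collections import Counter

# Matches the syslog-style prefix plus the Postfix LDAP timeout warning
# in the format seen in the excerpt above (assumed, not a general parser).
TIMEOUT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+(?P<host>\S+)\s+"
    r"postfix/(?P<daemon>[\w-]+)\[\d+\]: warning: dict_ldap_lookup: "
    r"Search error -5: Timed out"
)

def ldap_timeouts_per_minute(lines):
    """Return a Counter mapping 'Mon DD HH:MM' to the number of LDAP timeouts."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            counts[match.group("stamp")] += 1
    return counts

if __name__ == "__main__":
    sample = [
        "May 25 15:25:52 mta02 postfix/trivial-rewrite[6304]: warning: "
        "dict_ldap_lookup: Search error -5: Timed out",
    ]
    for minute, count in ldap_timeouts_per_minute(sample).items():
        print(minute, count)
```

Feeding it the lines of zimbra.log (for example via `open(path)`) gives per-minute counts that can be compared by eye against the timestamps in the top output.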
At the same time, I'm seeing SLAPD run at 99% processor utilization on an otherwise quiet machine:
(output of TOP on server running Zimbra's LDAP)
Code:
Top - 15:25:29 up 11 min, 1 user, load average: 0.22, 0.28, 0.23
Tasks: 81 total, 2 running, 79 sleeping, 0 stopped, 0 zombie
Cpu(s): 27.0% us, 73.0% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 2075928k total, 698948k used, 1376980k free, 18524k buffers
Swap: 4096440k total, 0k used, 4096440k free, 193512k cached
PID USER PR NI %CPU TIME+ %MEM VIRT RES SHR S COMMAND
3487 zimbra 18 0 98.9 0:21.98 1.4 261m 28m 4740 S slapd
5088 zimbra 20 0 1.0 0:19.80 14.0 917m 284m 38m S java
1 root 16 0 0.0 0:00.55 0.0 2644 548 468 S init
2 root 34 19 0.0 0:00.00 0.0 0 0 0 S ksoftirqd/0
I'm thinking we may have a bug with LDAP that is causing cascading issues in services that rely on LDAP data connections...
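If it helps anyone reproduce this, the slapd figure can be pulled out of saved top output with a small Python sketch like the one below. It assumes the column layout shown in my top output above (the header row tells it where %CPU and COMMAND are); the function name is made up for illustration.

```python
def cpu_by_command(header, rows):
    """Sum the %CPU column per command name from top-style process rows.

    The column positions are read from the header row, so a customized
    column order (as in the output above) is handled.
    """
    cols = header.split()
    cpu_i = cols.index("%CPU")
    cmd_i = cols.index("COMMAND")
    usage = {}
    for row in rows:
        fields = row.split()
        if len(fields) < len(cols):
            continue  # skip blank or truncated lines
        command = fields[cmd_i]
        usage[command] = usage.get(command, 0.0) + float(fields[cpu_i])
    return usage

if __name__ == "__main__":
    header = "PID USER PR NI %CPU TIME+ %MEM VIRT RES SHR S COMMAND"
    rows = [
        "3487 zimbra 18 0 98.9 0:21.98 1.4 261m 28m 4740 S slapd",
        "5088 zimbra 20 0 1.0 0:19.80 14.0 917m 284m 38m S java",
    ]
    usage = cpu_by_command(header, rows)
    # Flag anything over an arbitrary 90% threshold (my choice, not a Zimbra setting).
    print({cmd: pct for cmd, pct in usage.items() if pct > 90.0})
```

Run against batch-mode top snapshots taken every few seconds, this gives a slapd CPU series that can be compared with the timeout timestamps in zimbra.log.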
Additional information on my install . . .
- This issue occurred (and these metrics were pulled) on a freshly rebooted stack of servers, with uptime between 10 and 15 minutes when the problem recurred.
- This server is running Release 5.0.6_GA_2314.RHEL4_20080522092131 CentOS4 NETWORK edition
- This server was upgraded from 4.5.10_GA last night without any install errors (install log available).
- Minor post-install issues encountered with the MTA machine consistently inheriting an incorrect value for 'zimbraMtaAuthURL', even though the LDAP value is correct (can provide more details, but this may be an unrelated bug).
Robert