Below are the symptoms of, and the fixes for, a corrupt Zimbra LDAP database. I'm posting this to help others, because I had to raise it with Zimbra support to get an answer, and their answer worked!
-------------------- Our Issue --------------------
Our Zimbra server recently lost power and halted unexpectedly. When Zimbra restarted, the slapd database was corrupt and the LDAP server would not start at all. Investigation showed that the LDAP log file had become corrupt, so we moved the log file to a temporary directory and restarted Zimbra. Zimbra started up OK, but the errors below then appeared consistently in the Zimbra log.
I ran db_recover on the database, but this did not solve the problem.
Errors in the Zimbra log:
Jun 30 13:47:42 black clamd[6319]: Reading databases from /opt/zimbra/clamav/db
Jun 30 13:47:58 black slapd[2699]: is_entry_objectclass("", "2.5.6.1") no objectClass attribute
Jun 30 13:47:58 black slapd[2699]: bdb(): DB_ENV->log_flush: LSN of 1/1503308 past current end-of-log of 1/516427
Jun 30 13:47:58 black slapd[2699]: bdb(): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Jun 30 13:47:58 black slapd[2699]: bdb(): DB_ENV->log_flush: LSN of 1/3245384 past current end-of-log of 1/516427
Jun 30 13:47:58 black slapd[2699]: bdb(): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Jun 30 13:47:58 black slapd[2699]: bdb(): DB_ENV->log_flush: LSN of 1/1067196 past current end-of-log of 1/516427
Jun 30 13:47:58 black slapd[2699]: bdb(): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Jun 30 13:47:59 black slapd[2699]: is_entry_objectclass("", "2.5.6.1") no objectClass attribute
Jun 30 13:48:06 black zmtomcatmgr[6520]: status requested
Jun 30 13:51:09 black slapd[2699]: bdb(): DB_ENV->log_flush: LSN of 1/3279665 past current end-of-log of 1/516803
Jun 30 13:51:09 black slapd[2699]: bdb(): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
Jun 30 13:51:09 black slapd[2699]: bdb(): sn.bdb: unable to flush page: 0
Jun 30 13:51:09 black slapd[2699]: bdb(): txn_checkpoint: failed to flush the buffer cache Invalid argument
Jun 30 13:51:13 black slapd[2699]: is_entry_objectclass("", "2.5.6.1") no objectClass attribute
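To check whether your own log shows the same signature, you can count the matching lines. This is only a minimal sketch: count_bdb_errors is a hypothetical helper name, and the log path in the example comment (/var/log/zimbra.log) is an assumption, so point it at wherever your syslog writes the Zimbra messages.

```shell
#!/bin/sh
# Count log lines that report a corrupt Berkeley DB environment.
# Usage: count_bdb_errors /path/to/logfile
count_bdb_errors() {
    # -F treats the pattern as a fixed string; -c prints the number
    # of matching lines (prints 0 when there are none).
    grep -F -c 'bdb(): Database environment corrupt' "$1"
}

# Example (the log path is an assumption; adjust for your system):
# count_bdb_errors /var/log/zimbra.log
```

A non-zero count together with the "LSN ... past current end-of-log" messages suggests the same situation as described here.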
Errors when trying to recover the LDAP database:
[zimbra@black ~]$ /opt/zimbra/sleepycat/bin/db_recover -h /opt/zimbra/openldap-data/
db_recover: Log sequence error: page LSN 1 512617; previous LSN 2 6235099
db_recover: Recovery function for LSN 1 878 failed on forward pass
db_recover: PANIC: Invalid argument
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: PANIC: fatal region error detected; run recovery
db_recover: DB_ENV->open: DB_RUNRECOVERY: Fatal error, run database recovery
-------------------- Solutions --------------------
The commands below should get LDAP back to a stable state. Each line begins with # or $ to show whether the command is run as root or as the zimbra user.
If you have a large LDAP database, it will speed things up to tune LDAP performance a little first, as described in this wiki guide (specifically, adding set_cachesize to DB_CONFIG): Performance Tuning Guidelines for Large Deployments - ZimbraWiki
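Before tuning, it may help to see whether DB_CONFIG already carries a cache setting. The Berkeley DB directive has the form "set_cachesize <gbytes> <bytes> <ncache>"; for illustration only, "set_cachesize 0 268435456 1" would request a single 256 MB cache region. That value is an assumption for the example, not a recommendation, so take the real number from the wiki guide. The helper name show_cachesize is hypothetical.

```shell
#!/bin/sh
# Print any set_cachesize directive found in a Berkeley DB DB_CONFIG file.
# Usage: show_cachesize /opt/zimbra/openldap-data/DB_CONFIG
show_cachesize() {
    # grep prints the matching line(s) and succeeds if at least one exists.
    if grep '^set_cachesize' "$1"; then
        return 0
    fi
    echo "no set_cachesize directive in $1"
    return 1
}
```

If nothing is printed besides the "no set_cachesize" message, Berkeley DB is running with its default cache size.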
Solution 1 - Try this first
The following is based on this post, http://www.zimbra.com/forums/adminis...html#post91501
Code:
# su - zimbra
$ ldap stop
$ cd /opt/zimbra/openldap-data
$ /opt/zimbra/sleepycat/bin/db_recover
This will *recover* the database and checkpoint out the logs. If at that point, they still remain corrupt, then yes you would have to take the steps in your post, but that should only be done as a last resort.
Also, if it is a master, the accesslog database will also need to be recovered:
Code:
$ cd /opt/zimbra/openldap-data/accesslog/db
$ /opt/zimbra/sleepycat/bin/db_recover
If you are trying to restore from a backup on a master, you'll need to make sure the accesslog directory structure exists first (see the zmldapenablereplica script), and you'll need to make sure you select the correct database when doing slapadd (the -b '' flag).
Simply moving the log files aside and starting slapd will permanently destroy a database that might otherwise have been recoverable without resorting to backups.
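Given that risk, one cautious habit is to take a full copy of the data directory (with LDAP stopped) before trying any recovery, so a failed attempt can be undone. The following is a minimal sketch, not something from Zimbra support: snapshot_dir and the ".pre-recover" suffix are hypothetical names, and it assumes you have enough free disk space for a second copy.

```shell
#!/bin/sh
# Copy a directory to a timestamped sibling before attempting recovery.
# Usage: snapshot_dir /opt/zimbra/openldap-data
# Prints the path of the copy on success.
snapshot_dir() {
    src=${1%/}    # strip one trailing slash, if present
    dest="$src.pre-recover.$(date +%Y%m%d-%H%M%S)"
    # -R copies recursively, -p preserves ownership, mode and timestamps
    # where the invoking user is allowed to.
    cp -Rp "$src" "$dest" && echo "$dest"
}
```

Run it after "ldap stop" and before db_recover; the original directory is left untouched.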
Solution 2 - Last resort (as provided by Zimbra support)
Look for the latest LDAP backup. On my system it's from this morning; you may want to use yesterday's instead if the system was already down by backup time this morning. In this example I'm using my LDAP backup file: /opt/zimbra/backup/ldap/incr-20070704.080005.554/ldap.bak.
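If you are not sure which backup is the newest, something like the sketch below can list it. It assumes the layout seen in the example path (one ldap.bak inside each session directory under the backup root) and picks by file modification time; latest_ldap_backup is a hypothetical helper name. Check the result by eye before restoring from it.

```shell
#!/bin/sh
# Print the most recently modified ldap.bak one level below a backup root.
# Usage: latest_ldap_backup /opt/zimbra/backup/ldap
latest_ldap_backup() {
    # ls -t sorts newest first; head keeps only the first entry.
    # Errors are silenced so that no match simply prints nothing.
    ls -t "$1"/*/ldap.bak 2>/dev/null | head -n 1
}
```

As noted above, the newest file is not always the right one: if the server was already down when the backup ran, pick the previous session instead.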
Code:
# su - zimbra
$ ldap stop
$ exit
# mv /opt/zimbra/openldap-data /opt/zimbra/openldap-data-0704-crash
# mkdir /opt/zimbra/openldap-data
# cp /opt/zimbra/openldap-data-0704-crash/DB_CONFIG /opt/zimbra/openldap-data/DB_CONFIG
# chown -R zimbra:zimbra /opt/zimbra/openldap-data
# su - zimbra
$ ~/openldap/sbin/slapadd -w -q -f ~/conf/slapd.conf -l /opt/zimbra/backup/ldap/incr-20070704.080005.554/ldap.bak
$ ~/openldap/sbin/slapindex -f ~/conf/slapd.conf
$ ldap start
Thanks Zimbra support!

