
Thread: [SOLVED] LDAP / slapd - Database environment corrupt (Issue & Solution)

  1. #1
    greenrenault is offline Partner (VAR/HSP)
    Join Date
    Jul 2006
    Location
    Australia, ACT
    Posts
    197
    Rep Power
    9

    [SOLVED] LDAP / slapd - Database environment corrupt (Issue & Solution)

    Detailed below are both the issue and the solution for a corrupt LDAP database. I'm posting it here to help others, as I had to raise this with Zimbra support to get an answer, which worked!

    -------------------- Our Issue --------------------

    Our Zimbra server recently lost power and halted unexpectedly. When Zimbra restarted, the slapd database was corrupt and the LDAP server would not start at all. Investigation revealed that the LDAP log file had become corrupt, so the log file was moved to a temporary directory and Zimbra was restarted. Zimbra started up OK, but the following errors were then consistently displayed in the Zimbra log.

    I ran db_recover on the database, but this did not solve the problem.

    Errors in the Zimbra log:

    Jun 30 13:47:42 black clamd[6319]: Reading databases from /opt/zimbra/clamav/db
    Jun 30 13:47:58 black slapd[2699]: is_entry_objectclass("", "2.5.6.1") no objectClass attribute
    Jun 30 13:47:58 black slapd[2699]: bdb(): DB_ENV->log_flush: LSN of 1/1503308 past current end-of-log of 1/516427
    Jun 30 13:47:58 black slapd[2699]: bdb(): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
    Jun 30 13:47:58 black slapd[2699]: bdb(): DB_ENV->log_flush: LSN of 1/3245384 past current end-of-log of 1/516427
    Jun 30 13:47:58 black slapd[2699]: bdb(): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
    Jun 30 13:47:58 black slapd[2699]: bdb(): DB_ENV->log_flush: LSN of 1/1067196 past current end-of-log of 1/516427
    Jun 30 13:47:58 black slapd[2699]: bdb(): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
    Jun 30 13:47:59 black slapd[2699]: is_entry_objectclass("", "2.5.6.1") no objectClass attribute
    Jun 30 13:48:06 black zmtomcatmgr[6520]: status requested
    Jun 30 13:51:09 black slapd[2699]: bdb(): DB_ENV->log_flush: LSN of 1/3279665 past current end-of-log of 1/516803
    Jun 30 13:51:09 black slapd[2699]: bdb(): Database environment corrupt; the wrong log files may have been removed or incompatible database files imported from another environment
    Jun 30 13:51:09 black slapd[2699]: bdb(): sn.bdb: unable to flush page: 0
    Jun 30 13:51:09 black slapd[2699]: bdb(): txn_checkpoint: failed to flush the buffer cache Invalid argument
    Jun 30 13:51:13 black slapd[2699]: is_entry_objectclass("", "2.5.6.1") no objectClass attribute

    And the errors when trying to recover the LDAP database:

    [zimbra@black ~]$ /opt/zimbra/sleepycat/bin/db_recover -h /opt/zimbra/openldap-data/
    db_recover: Log sequence error: page LSN 1 512617; previous LSN 2 6235099
    db_recover: Recovery function for LSN 1 878 failed on forward pass
    db_recover: PANIC: Invalid argument
    db_recover: PANIC: fatal region error detected; run recovery
    [the PANIC line above was printed 34 times in total]
    db_recover: DB_ENV->open: DB_RUNRECOVERY: Fatal error, run database recovery
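
    The messages above share a few recognisable strings, so a quick count can show whether a log contains this kind of BDB failure. This is only a minimal sketch: the function name count_bdb_errors is made up for illustration, and the log file path you pass in is whatever your system uses.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that carry one of the BDB failure
# signatures quoted in this post. Prints the number of matching lines.
# Note: grep -c prints 0 and returns a non-zero status when nothing matches.
count_bdb_errors() {
    grep -c -E 'Database environment corrupt|DB_RUNRECOVERY|PANIC: fatal region error' "$1"
}
```

    For example, count_bdb_errors /path/to/zimbra.log prints how many lines matched. A non-zero count only says the signatures are present, not whether the database is recoverable.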


    -------------------- Solutions --------------------
    The commands below should get LDAP back to a stable state. Each line begins with # or $ to show whether it is run as root or as the zimbra user.

    If you have a large LDAP database, it will speed things up to tune LDAP performance a little, as described in this wiki guide (specifically, adding set_cachesize to DB_CONFIG): Performance Tuning Guidelines for Large Deployments - ZimbraWiki
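
    In Berkeley DB's DB_CONFIG file the directive takes the form set_cachesize <gbytes> <bytes> <ncache>, so a line such as "set_cachesize 0 268435456 1" would ask for a single 256 MB cache. That value is only an illustration, not a recommendation from the wiki guide. A small sketch for checking what a DB_CONFIG currently sets (the function name show_cachesize is an assumption):

```shell
#!/bin/sh
# Hypothetical helper: print the set_cachesize line(s) from a DB_CONFIG file,
# or a short notice if the directive is absent.
show_cachesize() {
    grep -E '^[[:space:]]*set_cachesize[[:space:]]' "$1" \
        || echo "set_cachesize not set in $1"
}
```

    Usage would be along the lines of show_cachesize /opt/zimbra/openldap-data/DB_CONFIG.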

    Solution 1 - Try this first
    The following is based on this post, http://www.zimbra.com/forums/adminis...html#post91501
    Code:
    # su - zimbra
    $ ldap stop
    $ cd /opt/zimbra/openldap-data
    $ /opt/zimbra/sleepycat/bin/db_recover
    This will *recover* the database and checkpoint out the logs. If the database is still corrupt at that point, move on to Solution 2, but that should only be done as a last resort.

    Also, if it is a master, the accesslog database will also need to be recovered:

    Code:
    $ cd /opt/zimbra/openldap-data/accesslog/db
    $ /opt/zimbra/sleepycat/bin/db_recover
    If you are trying to restore from a backup on a master, you'll need to make sure the accesslog directory structure exists first (see the zmldapenablereplica script), and you'll need to make sure you select the correct database when doing slapadd (The -b '' flag).

    Just moving aside the log files and starting slapd will forever destroy the database when it may otherwise have been recoverable without resorting to backups.
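
    Since a wrong move here can make things unrecoverable, it may be worth copying the whole database directory aside (with slapd stopped) before running db_recover, so the original files are still available if the attempt goes badly. A minimal sketch, assuming there is enough free disk space for a second copy; the function name snapshot_dir and the "-snapshot-<timestamp>" naming are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: copy a directory to "<dir>-snapshot-<timestamp>",
# preserving ownership and permissions where possible, and print the
# path of the copy on success.
snapshot_dir() {
    src=${1%/}                                    # drop a trailing slash
    dest="$src-snapshot-$(date +%Y%m%d%H%M%S)"
    cp -Rp "$src" "$dest" && printf '%s\n' "$dest"
}
```

    For example, snapshot_dir /opt/zimbra/openldap-data before Solution 1. This copies the log files as well; it does not move or remove anything.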

    Solution 2 - Last resort (as provided by Zimbra support)
    Look for the latest ldap backup. On my system it's from this morning; you may want to use the one from yesterday if the system was already down by backup time this morning. For the example I'm using my ldap backup filename: /opt/zimbra/backup/ldap/incr-20070704.080005.554/ldap.bak.
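
    If there are many backup sessions, finding the newest ldap.bak by hand gets tedious. A minimal sketch that picks the most recently modified one under a backup directory; the function name latest_ldap_backup is an assumption, and it relies on file modification times rather than on the session directory names:

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified ldap.bak found
# anywhere under the given directory (prints nothing if none exist).
latest_ldap_backup() {
    find "$1" -type f -name ldap.bak -exec ls -t {} + 2>/dev/null | head -n 1
}
```

    Usage would be something like latest_ldap_backup /opt/zimbra/backup. Check the date of the file it reports: if the system was already down at backup time, an older backup may be the better choice.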

    Code:
    # su - zimbra
    $ ldap stop
    $ exit
    # mv /opt/zimbra/openldap-data /opt/zimbra/openldap-data-0704-crash
    # mkdir /opt/zimbra/openldap-data
    # cp /opt/zimbra/openldap-data-0704-crash/DB_CONFIG /opt/zimbra/openldap-data/DB_CONFIG
    # chown -R zimbra:zimbra /opt/zimbra/openldap-data
    # su - zimbra
    $ ~/openldap/sbin/slapadd -w -q -f ~/conf/slapd.conf -l /opt/zimbra/backup/ldap/incr-20070704.080005.554/ldap.bak
    $ ~/openldap/sbin/slapindex -f ~/conf/slapd.conf
    $ ldap start
    Thanks Zimbra support!
    Last edited by greenrenault; 05-15-2008 at 12:50 PM.

  2. #2
    PeterH is offline Senior Member
    Join Date
    Oct 2005
    Location
    Netherlands
    Posts
    55
    Rep Power
    9


    Yesterday I upgraded from 4.5.10 to 5.0.1_GA_NE.
    The upgrade went smoothly.
    Today I tried to import a huge mailbox from Domino (2.5 GB) with the newest import wizard (1900).
    Later in the day the server threw SOAP errors on the client and then suddenly wouldn't start anymore...

    I noticed my free disk space had dropped from >20 GB to ZERO.
    That crashed slapd, so the server wouldn't start.

    Using the guidelines above I managed to get it up and running again. (I had to change the location of the log file, but I could read that from the console.) Seems OK...

    So: thanks for these instructions!

    It still remains to be answered what's eating my disk space and how to stop it...
    Could it be some temp files from the import wizard? If so, where are they located?
    Any other suggestions for where to look are welcome.
    I'll post my findings as I look further into this. I just hope my server will be running OK tomorrow when my users come in.
    regards,
    Peter
    Using ZCS Network-edition 5.0.16 on Ubuntu 6.06.2 LTS and 8.04 LTS

  3. #3
    PeterH is offline Senior Member
    Join Date
    Oct 2005
    Location
    Netherlands
    Posts
    55
    Rep Power
    9


    OK, so it was the full backup I ran right after the upgrade, as per the instructions... I could have figured that out earlier.
    I have now moved the backup location to another mount via global settings => backup/restore => backup location.

    Maybe this helps someone else in the future.
    Using ZCS Network-edition 5.0.16 on Ubuntu 6.06.2 LTS and 8.04 LTS

  4. #4
    bubarooni is offline Advanced Member
    Join Date
    Mar 2007
    Location
    Indiana
    Posts
    185
    Rep Power
    8


    OK, I had a power failure over the weekend. I was trying to replicate the solution offered here in order to fix the same problem.

    [root@mail ~]# su zimbra
    [zimbra@mail root]$ ldap stop
    slapd not running
    [zimbra@mail root]$ mv /opt/zimbra/openldap-data /opt/zimbra/openldap-data-0511-crash
    mv: cannot move `/opt/zimbra/openldap-data' to `/opt/zimbra/openldap-data-0511-crash': Permission denied
    [zimbra@mail root]$ exit
    exit
    [root@mail ~]# mv /opt/zimbra/openldap-data /opt/zimbra/openldap-data-0511-crash
    [root@mail ~]# mkdir /opt/zimbra/openldap-data
    [root@mail ~]# cp /opt/zimbra/openldap-data-0511-crash/DB_CONFIG /opt/zimbra/openldap-data/DB_CONFIG
    [root@mail ~]# chown -R zimbra:zimbra /opt/zimbra/openldap-data
    [root@mail ~]# su zimbra
    [zimbra@mail root]$ ~/openldap/sbin/slapadd -w -q -f ~/conf/slapd.conf -l /opt/zimbra/backup/sessions/incr-20080511.060003.318/ldap/ldap.bak
    The first database does not allow slapadd; using the first available one (2)
    [zimbra@mail root]$ ~/openldap/sbin/slapindex -f ~conf/slapd.conf
    could not stat config file "~conf/slapd.conf": Permission denied (13)
    slapindex: bad configuration file!
    [zimbra@mail root]$ ldap start
    Failed to start slapd. Attempting debug start to determine error.
    bdb(): PANIC: DB_RUNRECOVERY: Fatal error, run database recovery
    bdb_db_close: txn_checkpoint failed: Invalid argument (22)
    backend_startup_one: bi_db_open failed! (-30978)
    bdb_db_close: alock_close failed

    any ideas would be greatly appreciated!!!!

  5. #5
    quanah is offline Zimbra Employee
    Join Date
    May 2007
    Location
    Zimbra
    Posts
    1,271
    Rep Power
    10


    Quote Originally Posted by greenrenault View Post
    Detailed below are both the issue and the solution for a corrupt LDAP database. I'm posting it here to help others, as I had to raise this with Zimbra support to get an answer, which worked!

    -------------------- Our Issue --------------------

    Our Zimbra server recently lost power and halted unexpectedly. When Zimbra restarted, the slapd database was corrupt and the LDAP server would not start at all. Investigation revealed that the LDAP log file had become corrupt, so the log file was moved to a temporary directory and Zimbra was restarted. Zimbra started up OK, but the following errors were then consistently displayed in the Zimbra log.
    The instructions here are wrong, and what you did was incorrect. As a result, you forced yourself into a situation requiring you to restore from backup.

    What you should have done was make sure slapd wasn't running (which of course it likely wasn't), and then run:
    Code:
    cd /opt/zimbra/openldap-data
    /opt/zimbra/sleepycat/bin/db_recover
    This will *recover* the database and checkpoint out the logs. If at that point, they still remain corrupt, then yes you would have to take the steps in your post, but that should only be done as a last resort.

    Also, if it is a master, the accesslog database will also need to be recovered:

    Code:
    cd /opt/zimbra/openldap-data/accesslog/db
    /opt/zimbra/sleepycat/bin/db_recover
    in that case.

    If you are trying to restore from a backup on a master, you'll need to make sure the accesslog directory structure exists first (see the zmldapenablereplica script), and you'll need to make sure you select the correct database when doing slapadd (The -b '' flag).

    Just moving aside the log files and starting slapd will forever destroy the database when it may otherwise have been recoverable without resorting to backups.

    --Quanah
    Quanah Gibson-Mount
    Server Architect
    Zimbra, Inc
    --------------------
    Zimbra :: the leader in open source messaging and collaboration

  6. #6
    bubarooni is offline Advanced Member
    Join Date
    Mar 2007
    Location
    Indiana
    Posts
    185
    Rep Power
    8


    soooo....

    can i move everything back and take the steps you outline?

  7. #7
    bubarooni is offline Advanced Member
    Join Date
    Mar 2007
    Location
    Indiana
    Posts
    185
    Rep Power
    8


    apparently not. this is crazy. i'm gonna take this thing live in two weeks and i can't get the darn thing working.

    i'm just gonna wipe it and start from scratch.

  8. #8
    greenrenault is offline Partner (VAR/HSP)
    Join Date
    Jul 2006
    Location
    Australia, ACT
    Posts
    197
    Rep Power
    9


    Quote Originally Posted by quanah View Post
    The instructions here are wrong, and what you did was incorrect. --Quanah
    Instructions updated based on your comments, dude. If these are wrong, please update.

  9. #9
    quanah is offline Zimbra Employee
    Join Date
    May 2007
    Location
    Zimbra
    Posts
    1,271
    Rep Power
    10


    Quote Originally Posted by bubarooni View Post
    apparently not. this is crazy. i'm gonna take this thing live in two weeks and i can't get the darn thing working.

    i'm just gonna wipe it and start from scratch.
    What release are you running? What platform? What type of disks? Is /opt/zimbra/openldap-data in NFS or on a SAN rather than local disk? Are you running Xen?
    Quanah Gibson-Mount
    Server Architect
    Zimbra, Inc
    --------------------
    Zimbra :: the leader in open source messaging and collaboration

  10. #10
    jsilence is offline Intermediate Member
    Join Date
    Jan 2008
    Posts
    16
    Rep Power
    7

    Tried to recover following these instructions. No success.

    I have a corrupt LDAP database due to a server freeze last week.
    Previously I enabled replication using the zmldapenablereplica script, but I did not finish the replica server before the problems occurred.

    Right now I am in a situation where I can restore from an almost three-week-old backup, which unfortunately becomes corrupt again after a while. This might be because I upgraded from 5.0.6 to 5.0.9 after that backup. After a while I get errors like this, and the admin interface can no longer change anything in LDAP.

    Code:
    Sep  8 09:06:02 zimbra slapd[4207]: bdb(): Ignoring log file: /opt/zimbra/openldap-data/logs/log.0000000191: magic number 0, not 40988 
    Sep  8 09:06:02 zimbra slapd[4207]: bdb(): Invalid log file: log.0000000191: Invalid argument 
    Sep  8 09:06:02 zimbra slapd[4207]: bdb(): First log record not found 
    Sep  8 09:06:02 zimbra slapd[4207]: bdb(): PANIC: Invalid argument 
    Sep  8 09:06:02 zimbra slapd[4207]: bdb_db_open: Database cannot be recovered, err -30978. Restore from backup! 
    Sep  8 09:06:02 zimbra slapd[4207]: bdb(): DB_ENV->lock_id_free interface requires an environment configured for the locking subsystem 
    Sep  8 09:06:02 zimbra slapd[4207]: bdb(): txn_checkpoint interface requires an environment configured for the transaction subsystem 
    Sep  8 09:06:02 zimbra slapd[4207]: bdb_db_close: txn_checkpoint failed: Invalid argument (22) 
    Sep  8 09:06:02 zimbra slapd[4207]: backend_startup_one: bi_db_open failed! (-30978) 
    Sep  8 09:06:02 zimbra slapd[4207]: bdb_db_close: alock_close failed 
    Sep  8 09:06:02 zimbra slapd[4207]: slapd stopped.
    When I try to recover following recipe #2 from the original poster, I get the following error:

    Code:
    zimbra@zimbra:~$ ~/openldap/sbin/slapadd -w -q -f ~/conf/slapd.conf -l /opt/zimbra/backup/ldap.bak                
    The first database does not allow slapadd; using the first available one (2)
    slapadd: empty dn="" (line=5)
    greenrenault writes something about
    "and you'll need to make sure you select the correct database when doing slapadd (The -b '' flag)."
    but I don't know whether that is related and, if so, how to select the correct database.

    Any help would be appreciated.

    -jsl
