Thread: Disaster Recovery Backup of Zimbra

  1. #1
    dbruce is offline Junior Member
    Join Date
    Jul 2008
    Posts
    5
    Rep Power
    7

    Disaster Recovery Backup of Zimbra

    We are having a devil of a time getting the Zimbra backup store copied to another device for DR purposes.
    We run Veritas NetBackup 5.1 (yes, I know it's old) and have been unable to complete a backup. We hit what looks like a file-count limitation (or possibly a related memory issue) when NBU backs up the 500 million files in the backup store, which contains four full backups and their incrementals. A raw backup did complete, although it took forever, which makes it look even more like a file-count issue.
    We have also copied one of the backup stores via NDMP to our Data Domain box (we run a three-node cluster, each node with its own backup store), but we do not currently have enough space to hold all three backup stores.

    What do other institutions do regarding Disaster Recovery for Zimbra?

  2. #2
    natrixgli is offline Loyal Member
    Join Date
    Jul 2006
    Location
    Milwaukee, Wisconsin
    Posts
    81
    Rep Power
    9

    We're not an edu, we're a nonprofit, but I'm gonna chime in anyhow.

    First off, if Netbackup isn't working for you, check out Bacula. I've been using it for some time now and I love the #&@% out of it.

    I highly recommend checking out the FOSS backup guide in the Zimbra wiki. Some of the info is a tad outdated, but you can piece together a good solution for your needs.

    Open Source Edition Backup Procedure - Zimbra :: Wiki

    Basically what we do is keep it very simple:

    (Our Zimbra server is hosted off site, btw, but we still own the box, so we're fully responsible for our own backups.)

    1: Stop Zimbra services
    2: Kill leftover processes run by zimbra user
    3: Use rsync to take an incremental copy of the /opt/zimbra folder
    4: Restart Zimbra services
    5: Make a gzipped tar of the backup folder
    6: Transfer the gzipped tar file via SFTP to our main backup server on site to be written to tape.

    Zimbra is never down for very long, usually only a few minutes. Transferring the tar is time consuming, but we do it only once per day. (Our backups are for disaster recovery, not archiving.) Sometime in the future I plan to implement LVM on the ZCS server to eliminate the need to shut down, but that also makes backups more complicated.
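
    For anyone wanting to script steps 1 through 5 above, here is a rough Python sketch. It is not the poster's actual script. It assumes it runs as root, that the services are controlled with zmcontrol as the zimbra user, and that /backup/zimbra/ is a hypothetical local destination. Step 6 (the SFTP transfer) is left out. By default it only prints the commands instead of running them.

```python
import shlex
import subprocess
import tarfile
from pathlib import Path

STOP = ["su", "-", "zimbra", "-c", "zmcontrol stop"]
START = ["su", "-", "zimbra", "-c", "zmcontrol start"]
KILL = ["pkill", "-9", "-u", "zimbra"]


def run_backup(src="/opt/zimbra/", dest="/backup/zimbra/", dry_run=True):
    """Steps 1-4. Returns the commands issued, in order.

    `dest` is a hypothetical path; with dry_run=True nothing is executed.
    """
    issued = []

    def run(argv, check=True):
        issued.append(argv)
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=check)

    run(STOP)                                        # 1: stop Zimbra services
    try:
        run(KILL, check=False)                       # 2: pkill exits 1 if nothing was left
        run(["rsync", "-a", "--delete", src, dest])  # 3: incremental copy
    finally:
        run(START)                                   # 4: restart even if the copy failed
    return issued


def make_archive(backup_dir, archive_path):
    """5: gzipped tar of the copy, made after Zimbra is back up."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(backup_dir, arcname=Path(backup_dir).name)
    return archive_path
```

    Calling run_backup() with no arguments is a dry run. The try/finally is a design choice of this sketch so that a failed rsync does not leave the mail server down.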


    Cheers,

    -n8
    Last edited by natrixgli; 07-25-2008 at 05:09 PM.

  3. #3
    Rich Graves is offline Outstanding Member
    Join Date
    Jan 2007
    Location
    Minnesota
    Posts
    718
    Rep Power
    9

    Why copy the data twice or more? NFS-mount a directory on the DataDomain (should be geographically isolated from your production server) at /opt/zimbra/backup. You shouldn't use NFS for live data, but for backup I believe it's ok.

    To get storage diversity, mount different hardware (local disk, iSCSI, FC...) on /opt/zimbra/backup on alternate days, and double up the zmbackup -del line in crontab so that it will trim both backup sets.
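
    One way to picture the alternate-day rotation is the small Python sketch below. The two target names are made up for illustration, and picking the target by calendar-day parity is just one possible scheme, not something the post prescribes.

```python
from datetime import date

# Hypothetical targets; substitute whatever your site actually mounts.
TARGETS = ["datadomain:/zimbra-backup", "/dev/mapper/local-zbackup"]
MOUNT_POINT = "/opt/zimbra/backup"


def target_for(day, targets=TARGETS):
    """Rotate through the targets by calendar day."""
    return targets[day.toordinal() % len(targets)]


def mount_command(day):
    """argv that would mount the day's target on the backup directory."""
    return ["mount", target_for(day), MOUNT_POINT]


print(mount_command(date.today()))
```

    Each backup set then only receives a backup every other day, which is why the trimming has to run once against each set.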

    I would keep doing /opt/zimbra/conf, /, /var/log, etc. with NetBackup, but especially if your target is DataDomain disk, I see no point in copying /opt/zimbra/(store,db/data,index,backup) again. Zimbra's own d2d backup does a better job.
    Last edited by Rich Graves; 07-26-2008 at 07:36 AM.

  4. #4
    dbruce is offline Junior Member
    Join Date
    Jul 2008
    Posts
    5
    Rep Power
    7

    We use our backup store locally as an archive to retrieve account content for up to two months. Rather than run the backup twice to create the archive backup and the DR backup, we could snapshot the DR backup. We would still have to copy it via NFS to get it to the Data Domain. The NDMP backup took 18 hours, and the NFS copy of the 500 million files and 3 TB of data will take longer, so we are thinking NDMP is the better way to go. If we create a smaller volume to hold just the used portion of the backup store, we could shrink the 18 hours to something a little more reasonable. Then there are always iSCSI disks.
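
    To put those numbers in perspective, here is some back-of-the-envelope arithmetic in Python. It assumes 3 TB means decimal terabytes and that the 18 hours covered all 500 million files and 3 TB, which the posts do not actually confirm (the first post says only one of the three stores was copied), so treat the two rates as rough upper bounds.

```python
FILES = 500_000_000        # files in the backup store (from the thread)
DATA_BYTES = 3 * 10**12    # 3 TB, assumed to be decimal terabytes
NDMP_HOURS = 18            # duration of the NDMP copy (from the thread)

seconds = NDMP_HOURS * 3600
avg_file_bytes = DATA_BYTES / FILES
throughput_mb_s = DATA_BYTES / seconds / 10**6
files_per_s = FILES / seconds

print(f"average file size: {avg_file_bytes:.0f} bytes")   # 6000 bytes
print(f"throughput: {throughput_mb_s:.1f} MB/s")          # 46.3 MB/s
print(f"file rate: {files_per_s:.0f} files/s")            # 7716 files/s
```

    An average of about 6 KB per file fits the suspicion earlier in the thread that the sheer number of files, not the volume of data, is what hurts.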

  5. #5
    hillman is offline Moderator
    Join Date
    May 2007
    Location
    Vancouver, Canada
    Posts
    75
    Rep Power
    8

    We're still in the fairly early stages of rolling out our Zimbra deployment, so we don't have that much data yet (< 100 GB), but the NetBackup copy of the backup volume is already running out of window. We're at NetBackup 6.5, so a newer version isn't going to help you.

    I've been thinking about ways to get rid of the Zimbra backups entirely - we use NetApp via iSCSI for all of our primary volumes and will use a Thumper running ZFS and iSCSI for the HSM storage. Both of these support snapshots. Now that Zimbra has a command-line tool to replay the redolog, it should be possible to restore a snapshot and then run the redolog against it to bring it up to a point in time. It's on my "to do" list to test this thoroughly in our dev environment next month. Of course, this only works as long as you ensure your redologs are stored separately so that they can't be taken out at the same time should you suffer a storage failure.
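
    The planning half of that idea (which snapshot to restore, and what window of redolog to replay on top of it) can be sketched in a few lines of Python. The function name and the timestamps are hypothetical, and the actual redolog replay command is deliberately not shown because the post does not give its invocation.

```python
from datetime import datetime


def plan_restore(snapshot_times, target):
    """Pick the newest snapshot taken at or before `target`.

    Returns (snapshot_time, target): restore that snapshot, then replay
    the redolog entries written after snapshot_time up to target.
    """
    candidates = [t for t in snapshot_times if t <= target]
    if not candidates:
        raise ValueError("no snapshot exists at or before the target time")
    return max(candidates), target


# Made-up snapshot schedule: every six hours.
snaps = [datetime(2008, 7, 28, h) for h in (0, 6, 12, 18)]
base, upto = plan_restore(snaps, datetime(2008, 7, 28, 9, 30))
print(f"restore snapshot from {base}, replay redolog up to {upto}")
```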

    On the NetApp/Thumper side, the iSCSI LUNs are just files, so they're easy to send off to tape quickly if a spinning-disk backup isn't sufficient.

    The Zimbra engineers indicated to me that they're looking at a mechanism to quiesce the server so that a snapshot can be taken, ensuring that what's on disk is consistent. A simple script could then signal Zimbra and MySQL to flush buffers and quiesce, then trigger snapshots on all storage, then restart the servers. That would be a lot quicker than actually shutting Zimbra down.
    Steve Hillman
    IT Architect
    Simon Fraser University
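
    The quiesce, snapshot, resume sequence described in the post above could be shaped roughly like this in Python. Since the quiesce mechanism was only being considered at the time, every hook here is a hypothetical caller-supplied callable; nothing in the sketch talks to Zimbra, MySQL, or a storage array directly.

```python
def snapshot_all(quiesce, snapshot_fns, resume):
    """Quiesce, snapshot every volume, and always resume.

    quiesce / resume: hypothetical callables that flush buffers and pause
    or restart the servers. snapshot_fns: one callable per storage volume.
    """
    quiesce()
    try:
        return [fn() for fn in snapshot_fns]
    finally:
        resume()  # runs even if one of the snapshots raises


# Stand-in hooks that only record the order of events.
events = []
results = snapshot_all(
    quiesce=lambda: events.append("quiesce"),
    snapshot_fns=[lambda: events.append("snap primary"),
                  lambda: events.append("snap hsm")],
    resume=lambda: events.append("resume"),
)
print(events)  # ['quiesce', 'snap primary', 'snap hsm', 'resume']
```

    The try/finally is the point of the sketch: a failed snapshot should never leave the mail server paused.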

  6. #6
    shan is offline Active Member
    Join Date
    Feb 2008
    Posts
    26
    Rep Power
    7

    Sorry to hijack this thread, but my questions about DR seem to fit here:

    With /opt/zimbra, /opt/zimbra/store, and /opt/zimbra/backup on separate SAN disks, is it enough to back up only the /opt/zimbra/backup directory off site for disaster recovery purposes?

    Is there a DR wiki page for a 5.0.X multi-server installation? The one I found is for a 4.X single-server installation, which is very different from a 5.0.X multi-server installation, and the documentation does not seem to be certified by Zimbra for 5.0.X DR.

    We are now close to 800 GB of usage in the /opt/zimbra/backup directory with 2000 accounts, and the number of users will double. The standard daily zmbackup already takes 2 hours. The DR copy has to be performed after zmbackup completes, which will push the backup well into work hours. Has anyone had performance issues with zmbackup still running during peak time?

    BTW, we use TSM (Tivoli Storage Manager) to back up the /opt/zimbra/backup directory. It's working fine, except that it takes a long time: 3 hours for about a million files and 140 GB of transferred data. Not sure how this will scale when we have more accounts and data.
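
    A quick way to see where this is heading is the arithmetic below, in Python. The big assumption, which is mine and not the poster's, is that both the zmbackup run and the TSM copy scale linearly with the number of accounts; real behaviour may well be worse with this many small files.

```python
ZMBACKUP_HOURS = 2.0      # current daily zmbackup run (from the post)
TSM_HOURS = 3.0           # current TSM copy (from the post)
TSM_FILES = 1_000_000     # "about a million files"
TSM_BYTES = 140 * 10**9   # 140 GB, assumed decimal
GROWTH = 2.0              # the number of users is expected to double

tsm_seconds = TSM_HOURS * 3600
tsm_mb_s = TSM_BYTES / tsm_seconds / 10**6
tsm_files_s = TSM_FILES / tsm_seconds

window_now = ZMBACKUP_HOURS + TSM_HOURS
window_after = window_now * GROWTH   # linear-scaling assumption

print(f"TSM rate: {tsm_mb_s:.1f} MB/s, {tsm_files_s:.0f} files/s")  # 13.0 MB/s, 93 files/s
print(f"window now: {window_now:.0f} h, after doubling: {window_after:.0f} h")  # 5 h, 10 h
```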

    Thanks!

  7. #7
    Rich Graves is offline Outstanding Member
    Join Date
    Jan 2007
    Location
    Minnesota
    Posts
    718
    Rep Power
    9

    /opt/zimbra/backup is an efficient, reliable, consistent disk-to-disk backup of constantly changing transactional data in /opt/zimbra/{store,db/data,index,openldap-data,redolog}. You definitely don't need to back up those subdirectories again, and if you do, they won't be consistent or useful.

    zmbackup doesn't necessarily get everything in conf, and logs are also nice to back up (for forensics, not DR). I let my traditional backup software get everything with the exception of the transactional data above and temporary queue files in /opt/zimbra/data.

    I don't run a traditional backup of /opt/zimbra/backup, because it's already a perfectly good backup, and as you've noticed, traditional backup software can't handle millions of small files. Consider snapshots, replication, or simply geographic isolation instead.
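
    The include/exclude split described above can be written down as a small path filter. This is only an illustrative Python sketch, not any backup product's exclude syntax. The post also excludes temporary queue files under /opt/zimbra/data without naming the exact subdirectory, so that part is left to the caller through the excluded argument.

```python
from pathlib import PurePosixPath

ZIMBRA = PurePosixPath("/opt/zimbra")

# Transactional directories already covered by zmbackup, plus the backup store itself.
EXCLUDED = [
    ZIMBRA / "store",
    ZIMBRA / "db" / "data",
    ZIMBRA / "index",
    ZIMBRA / "openldap-data",
    ZIMBRA / "redolog",
    ZIMBRA / "backup",
]


def wanted(path, excluded=EXCLUDED):
    """True if the traditional backup job should pick up `path`."""
    p = PurePosixPath(path)
    # Match whole path components, so /opt/zimbra/storefront is not excluded.
    return not any(p == ex or ex in p.parents for ex in excluded)


for candidate in ("/opt/zimbra/conf/localconfig.xml",
                  "/opt/zimbra/db/data/ibdata1",
                  "/var/log/zimbra.log"):
    print(candidate, wanted(candidate))
```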

    To satisfy extraordinary compliance requirements, put /opt/zimbra/backup on storage with WORM/ILM features, such as a Centera, DataDomain, or NetApp NearStore.

    What's your end goal? Even if you achieve a "good" and "fast" backup of the backup in /opt/zimbra/backup, you'll still need to restore it somewhere, then zmrestore, and you can expect zmrestore to take longer than the original zmbackup. At some point it makes sense to stop spending money on traditional backup and invest in a delayed-asynchronous-replicating SAN that would let you bring up a DR site immediately, without multiple restore procedures. /opt/zimbra/backup is still necessary for recovery from user or sysadmin mistakes, but it's not a DR solution for large volumes of mail.

    If you can't afford asynchronous replication for all of /opt/zimbra (minus backup), activate Zimbra HSM and replicate your primary store, but not your HSM store. You'll be able to restore access to email newer than your Zimbra HSM migration period immediately. Older mail would require recourse to backup.

    Yes, asynchronous replication means very recent mail may be lost, but that's a heck of a lot better than rolling back to the last backup. Sites concerned about saving every last byte, regardless of dollar and potential performance cost, can do synchronous replication of redolog and data/postfix/spool only.
