Let's Talk About Zimbra DR, Not Just LDAP DR
@Quanah: Thanks for taking the time here to clarify the issues surrounding backing up LDAP. I'd be grateful for your input on widening the scope of this thread to encompass Zimbra Disaster Recovery in a virtualized environment.
Essentially, we are looking to deploy a supported DR solution for NAT'd Zimbra NE that gives shorter recovery times than the "zmrestoreldap + zmrestoreoffline" method in the Admin Guide. Specifically, we are looking at a Sandy/Irene/Katrina use case where the primary data center is taken offline and is expected to remain offline for some time. We want to be able to cut over to the secondary data center as quickly as practicable. For example, the "Server Live sync" article on the Zimbra wiki looks interesting but is of course not supported.
Challenges and Questions:
- The NE Admin Guides for 7.2 and 8.0 make no mention of the "install.sh -s + rsync" DR method in their Disaster Recovery sections, although this method is highlighted in various posts here in the forums, in Zimbra blog posts and in several non-certified wiki articles. We have used that method successfully ourselves in the past to get a replacement Zimbra server up and running quickly, with minimal downtime on the old Zimbra server. Assuming we do an LDAP export/import procedure, is this method still viable with ZCS 7.2.x and 8.0.x for DR?
- Tools exist for various hypervisors which leverage disk+RAM snapshots. If we have a disk+RAM snapshot of a running virtualised Zimbra server (e.g. available in XenServer Enterprise), is it safe to use that disk+RAM snapshot:
  - alone for DR purposes?
  - in conjunction with restoring LDAP after restoring the snapshot at the DR site?
  - not at all?
- Do the above answers change if Zimbra is stopped before taking the snapshot at the hypervisor level?
- Related to the disk+RAM snapshot question above, are "crash consistent" snapshots safe when used in conjunction with an LDAP restore? We mean snapshots such as those provided by PHD Virtual, Alike etc., which leverage the disk snapshot APIs in both vSphere and XenServer.
- The Admin Guide points out that "zmplayredo" can be used when snapshots of subfolders within /opt/zimbra are taken at different times, but suppose we just snapshot the whole virtual machine at one time or utilize SAN replication? The Admin Guide is silent on what we need to do after running "zmplayredo" to complete the DR process. Do we then boot the Zimbra DR server, restore LDAP, confirm that Zimbra starts OK and then just change public DNS at that point?
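For anyone unfamiliar with the "install.sh -s + rsync" method in the first question, the rsync part can be sketched roughly as below. This is only an illustration and not a supported procedure: the helper name zimbra_rsync_cmd, the host dr.example.com and the choice of rsync flags are all assumptions, and the function just prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Zimbra): prints, but does not run,
# the rsync command for one sync pass of /opt/zimbra to a DR host.
# The target host and the flag selection are illustrative assumptions.
zimbra_rsync_cmd() {
    dr_host="$1"
    # -a  archive mode (permissions, ownership, timestamps, symlinks)
    # -H  preserve hard links
    # --delete  remove files on the target that no longer exist on the source
    printf 'rsync -aH --delete /opt/zimbra/ root@%s:/opt/zimbra/\n' "$dr_host"
}

# Usual shape of the method as described in forum and wiki posts:
# one or more passes while Zimbra is running to move the bulk of the data,
# then a final pass with Zimbra stopped so the copy is consistent.
zimbra_rsync_cmd dr.example.com
```

The idea is that the target already has the Zimbra packages installed via "install.sh -s" (software only, no configuration), so the synced /opt/zimbra tree supplies the data and configuration.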
Thanks Quanah for any help/pointers you can provide here!
All the best,
Mark