  #1 (permalink)  
Old 01-21-2009, 06:04 PM
Moderator
 
Posts: 1,432
Default Backup mx + failover for NE

We are in final stages of evaluating Zimbra NE as a replacement for our current mail system, with Exchange as the other candidate.

The ability to provide at least a cold standby (at a remote location) is a major requirement, and I believe I've worked out procedures which will allow me to perform frequent scheduled data synchronization and then bring up the standby should it be needed. Some of this is discussed in these threads:
Mac OS X Leopard Beta: compatibility with Tiger
Of backups and restoration

However I've also been asked to provide a backup mx in the remote location. I suppose the simplest approach would be to just run something on another (third) machine, but I'd like to be able to offer a more elegant solution that minimizes hardware and physical space requirements. In short, ideally, the standby machine would be able to operate as a secondary mx, and if I need to make the sync'ed data "live", I would be able to integrate any mail that had arrived in the meantime.

My first thought on this would be to have some steps where the MTA being used as a secondary mx would be configured to stop listening on standard ports before Zimbra is made active. Then I could tell it to go ahead and deliver its mail to Zimbra on the same machine. (I'd possibly multihome and have the two MTAs on different IP addresses if necessary.)

I'm looking for advice on whether this sounds feasible and any suggestions on proceeding. This would likely be done under Mac OS X 10.4 or 10.5, so any of the common MTAs available for that platform could be used, or even Communigate, which is our current mail server and could be relegated to that role.

I also wonder if I'm making this more complicated than it needs to be. E.g. is there a way to simply submit mail directly by copying files? Or perhaps a way to run Zimbra (FOSS) as a backup mx and then integrate the outgoing mail queue when I bring up the NE?

Will the next major release of Zimbra offer anything to help with this--hopefully without requiring purchase of multiple NE licenses?

Last edited by ewilen; 01-21-2009 at 06:07 PM..
  #2 (permalink)  
Old 01-22-2009, 02:27 AM
Moderator
 
Posts: 1,432
Default

Ah, I see from

Zimbra Product Portal
Bug 11423 - disaster recovery through server to server sync (beta)
also Enterprise messaging and collaboration: Zimbra's product roadmap

that "disaster recovery through server to server sync" is planned for 6.0, but the status is currently "at risk", and the details of how the feature will work aren't too clear--e.g., whether it would require a separate NE license to implement the feature on an NE installation.
  #3 (permalink)  
Old 01-22-2009, 07:01 AM
Moderator
 
Posts: 1,209
Default

Sounds like you have two requirements there:
  1. Backup MX in a data center separate from the production Zimbra data center.
  2. Cold standby Zimbra servers in the second data center.

For the backup MX, we use a plain-jane Postfix box plus a pair of scripts. On the Zimbra box, one script exports a list of valid email addresses and domains. On the Postfix box, a separate script looks for changes from the previously exported list and makes the appropriate changes to the relevant Postfix files. You can run the scripts as often as you like; we run them several times a day.

In this way, we have automated the process of having our backup MX be completely up to date at all times.
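
A minimal sketch of the compare-and-update step on the Postfix side, written in Python. It is not the actual script described above. It assumes the Zimbra export is a plain list of addresses and that the "relevant Postfix file" is a lookup table of the kind used with relay_recipient_maps (one "address OK" line per recipient, compiled with postmap); the function names and example addresses are made up for illustration.

```python
def diff_recipients(previous, current):
    """Return (added, removed) addresses between two exports, each sorted."""
    prev, curr = set(previous), set(current)
    return sorted(curr - prev), sorted(prev - curr)


def render_relay_recipients(addresses):
    """Render lookup-table source text: one 'address OK' line per recipient."""
    return "".join(f"{addr} OK\n" for addr in sorted(set(addresses)))


if __name__ == "__main__":
    # Hypothetical exports: the previous run and the current run.
    old = ["alice@example.com", "bob@example.com"]
    new = ["alice@example.com", "carol@example.com"]
    added, removed = diff_recipients(old, new)
    print("added:", added)
    print("removed:", removed)
    # Only rewrite the table (and re-run postmap) when something changed.
    if added or removed:
        print(render_relay_recipients(new), end="")
```

Rewriting the whole table from the current export, rather than patching individual lines, keeps the backup MX from drifting if a run is missed.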

For the cold standby servers requirement, I would think the real issues for management are to clarify the amount of Zimbra downtime they are willing to tolerate in the event of an outage at the primary data center, and whether or not they can tolerate any "lost" emails when switching over to the cold standby farm at the backup data center.

The less downtime and data loss that can be tolerated, the more $$$ it will take.

We are experimenting with ldap exports, mysql dumps, and syncing all of /opt/zimbra to Amazon S3, but this is more for D/R than near real-time failover to a secondary data center.

Basically what I am saying is that satisfying the backup MX requirement is pretty straightforward and inexpensive, but satisfying the second requirement for cold standby servers I would not attempt without further refinement of the needs from management.

Hope that helps,
Mark
__________________
___________________________________
L. Mark Stone, CIO


"Uptime. All the time."

477 Congress Street | Portland, ME 04101-3431 | (207) 772-5678

proactive maintenance and monitoring | technology consulting
Zimbra groupware | EMR implementations | private cloud hosting
  #4 (permalink)  
Old 01-22-2009, 10:46 AM
Moderator
 
Posts: 1,432
Default

Hi, Mark, thanks for your reply.

Since we're doing comparative evaluations with Exchange, there's a strong implication that the benchmark will be defined in terms of Exchange 2007 SP1's Standby Continuous Replication. That said, I've suggested that bringing up the standby might not be instantaneous and that the frequency of replication might be as low as once per hour (meaning up to an hour's worth of messages could be lost), and this has been deemed acceptable.

The method I've worked out is basically to have a clone installation of Zimbra on the standby, and to rsync /opt/zimbra/backup and /opt/zimbra/redolog to it. When necessary, I perform

zmrestoreldap
zmrestoreoffline
zmplayredo

in that order (with appropriate arguments and necessary turning on/off of zimbra services before each command). I'm not sure yet whether it will be better to execute those commands periodically or to wait until the standby needs to be brought online--possibly a daily or weekly automated run to ensure that restoration won't take too long when we really need the standby.
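
The sync and restore sequence above could be laid out roughly as follows. This is only a sketch in Python that builds and prints the commands rather than running them. The standby host name and the rsync options (-a for archive mode, -z for compression over the WAN link) are assumptions, and the arguments to the three Zimbra commands are deliberately left as placeholders because they depend on the installation.

```python
import shlex

# Directories and restore order are taken from the post above.
SYNC_DIRS = ["/opt/zimbra/backup", "/opt/zimbra/redolog"]
RESTORE_STEPS = ["zmrestoreldap", "zmrestoreoffline", "zmplayredo"]


def rsync_commands(standby_host):
    """Build one rsync command per directory; trailing slashes copy contents in place."""
    return [
        ["rsync", "-az", f"{path}/", f"{standby_host}:{path}/"]
        for path in SYNC_DIRS
    ]


def restore_plan():
    """List the restore commands in order; arguments are site-specific and omitted."""
    return [f"{step} <site-specific arguments>" for step in RESTORE_STEPS]


if __name__ == "__main__":
    # "standby.example.com" is a hypothetical host name.
    for cmd in rsync_commands("standby.example.com"):
        print(shlex.join(cmd))
    for line in restore_plan():
        print(line)
```

Printing the plan first makes it easy to review before wiring it into cron or a runbook; stopping and starting Zimbra services between the restore steps would still have to be added by hand.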

Given that we have on the order of 100 users and will be pushing the data from a T1 to a 6Mb DSL connection, I believe this is feasible with rsync.
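
As a rough check on that estimate, the ceiling on data moved per sync window can be worked out from the slower of the two links. The sketch below is in Python and assumes the standard T1 line rate of 1.544 Mbit/s and takes "6Mb" to mean 6 Mbit/s; the efficiency factor is a guess at protocol overhead and competing traffic, not a measured figure.

```python
T1_BITS_PER_SECOND = 1_544_000   # standard T1 line rate
DSL_BITS_PER_SECOND = 6_000_000  # "6Mb DSL" from the post, read as 6 Mbit/s


def max_megabytes_per_window(window_seconds, efficiency=1.0):
    """Upper bound on data moved in one window, limited by the slower link."""
    bottleneck = min(T1_BITS_PER_SECOND, DSL_BITS_PER_SECOND)
    return bottleneck * efficiency * window_seconds / 8 / 1_000_000


if __name__ == "__main__":
    print(f"raw hourly ceiling: {max_megabytes_per_window(3600):.1f} MB")
    print(f"at 80% efficiency:  {max_megabytes_per_window(3600, 0.8):.1f} MB")
```

The raw ceiling works out to about 695 MB per hour, so an hourly sync is plausible as long as the backup and redo log changes in a typical hour stay well below that.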

As you can probably see, the standby's purpose is more along the lines of disaster recovery than near-realtime failover. However, in the event of extended downtime at our primary site (e.g., a cable cut on our T1), we would expect to bring up the standby at the secondary site so that users with alternate means of Internet connectivity would be able to access their data and continue their work.

With all that in mind I have to admit that we may not actually need a live secondary mx--brief outages at our primary site should simply result in the sender trying again, while anything over 4-8 hours ought to trigger our standby plan. However it's probably easier to just set up a secondary mx than to argue the point.
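
For reference, a small Python sketch of how a sending server chooses among MX hosts: it tries the lowest preference value first and falls back to higher values, and if nothing answers it queues the message and retries later. The host names and preference values are hypothetical, and real MTAs also randomize among hosts with equal preference, which this sketch does not model.

```python
def delivery_order(mx_records):
    """Sort (preference, host) pairs the way a sender tries them: lowest first."""
    return [host for _, host in sorted(mx_records)]


def first_reachable(mx_records, reachable):
    """Return the first reachable host in preference order, or None.

    None stands for the case where the sender keeps the message queued
    and retries later.
    """
    for host in delivery_order(mx_records):
        if host in reachable:
            return host
    return None


if __name__ == "__main__":
    records = [(20, "backup-mx.example.com"), (10, "mail.example.com")]
    print(delivery_order(records))
    # Primary site down, secondary up: mail lands on the backup MX.
    print(first_reachable(records, {"backup-mx.example.com"}))
    # No MX reachable: the sender queues and retries.
    print(first_reachable(records, set()))
```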

What I was hoping to avoid, though, was an extra machine in the secondary site's server room. If we end up using Exchange, then (according to my Exchange-savvy colleague) the primary and secondary servers will be able to handle replication while also jointly accepting mail as primary and secondary mx's. With Communigate, I've found that it's simple to rsync data to a "standby directory" on the secondary mx, and then if I need to make that data "live", I can also easily integrate any messages which have arrived on the secondary mx in the time since the primary became unavailable, using a feature called Foreign Queue Processing.

I suppose that having an extra machine for secondary mx isn't too bad and may be preferable to over-complicating the configuration of the standby. Nevertheless if there's a good way to save space and avoid maintaining a separate piece of hardware, I feel that would be desirable.
  #5 (permalink)  
Old 01-22-2009, 01:42 PM
Moderator
 
Posts: 1,209
Default

Exchange replication is indeed more advanced than Zimbra (we support both), and yes, the Exchange replica box can do double duty as a backup MX. You will definitely spend less setup time IMHO deploying Exchange than Zimbra in a failover, WAN-connected two-data-center deployment scenario.

One thing regarding costs: last I checked, E2K7 no longer bundles an Outlook CAL like all previous versions of Exchange did. So, unless you are an edu, those Outlook CALs can run you ~$70 per seat, and with 100 users that's another $7,000 in Exchange licensing you may not have counted on.

Zimbra makes a big deal about their deployments being much less expensive than Exchange, not only on the licensing front but also on the hardware requirements front, and that has indeed been our experience. You could run 100 users on Zimbra comfortably on a used HP DL-360 G4 with 6GB of RAM and a pair of old single-core 3.0GHz Xeons and RAID1 146GB or 300GB disks.

One "brute-force" D/R scenario we use is to have a spare identical chassis in the production rack and in the second data center. If the first data center gets an extended outage, we go to the data center, shut down the Zimbra server, rsync /opt/zimbra somewhere local just in case, and remove the disk drives. We take the drives to the secondary data center, shove them in the spare chassis there, boot the server, reconfigure the NICs, make a change to public DNS and everything is back in service.

Same as if there is a hardware issue in the primary data center; just yank the disks and put them in the spare chassis there.

If you can afford the downtime for someone to get to the data center to do this, it is a much much less expensive way to get some good redundancy. Elegant it is not! But effective it is!

Hope that helps,
Mark
  #6 (permalink)  
Old 01-22-2009, 02:25 PM
Moderator
 
Posts: 1,432
Default

Thanks again for sharing your experiences. Our data centers will be on opposite coasts, though, so I don't think the drive-swapping method will work for us.

Also, yes, we've looked at the Exchange licensing costs and those will probably be a major argument in favor of Zimbra.

I have another question but I'll send it via PM.
__________________
Elliot Wilen
Berkeley, CA

Don't forget to enter your Zimbra version in your forum profile.