Yeah, that is better. I am preparing a detailed Visio diagram to understand the scenario. I do see a few glitches, though: at the geo-dispersed locations all the boxes will be behind a firewall and so should be on a private LAN. We cannot afford to have them exposed on the Internet, and this creates the confusion. I think we can achieve load balancing or HA at the geo level with a GTM solution such as F5 or Radware, which automatically removes the entries of a down segment. Also, relying on a single ISP would not be a good option; in that case, owning our own IP addresses and load balancing with BGP would be another option. So I am evaluating all the options to make a solid HA plan.
I am making a Visio diagram and will share it with you; then we can discuss the pros and cons?
Here is the diagram and a document I found. We can try incorporating it while constructing the scenario.
Building a Scalable High-Availability E-Mail System with Active Directory and More | Linux Journal
I was not proposing to use external IPs on servers without a firewall. For sure, your servers have to be behind a firewall, whether using NAT or transparent firewalling and routing. Both ways work. However, if you deploy NAT, you have to build a dual DNS system for external and internal resolution (split DNS), or Zimbra will not work correctly. You can get by with /etc/hosts files, but the time is better spent on a correct DNS configuration. In one location we use transparent firewalling, leaving the external IPs on the actual servers. The firewall, which actually routes the packets, additionally does full packet inspection while still using the external IPs for its policies.
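To make the split DNS idea concrete, here is a minimal sketch of the decision a split-horizon resolver makes: the same name gets a different answer depending on where the query comes from. The host name, networks and addresses are made-up examples (documentation ranges), not anything from your setup, and a real deployment would do this in the DNS server itself rather than in a script.

```python
import ipaddress

# Hypothetical internal networks; queries from these get the internal view.
INTERNAL_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Hypothetical zone data: one name, two answers (internal and external view).
RECORDS = {
    "mail.example.com": {"internal": "10.0.1.20", "external": "203.0.113.20"},
}


def view_for(client_ip: str) -> str:
    """Pick the DNS view based on the address the query came from."""
    addr = ipaddress.ip_address(client_ip)
    return "internal" if any(addr in net for net in INTERNAL_NETS) else "external"


def resolve(name: str, client_ip: str) -> str:
    """Return the address a client in the given view should receive."""
    return RECORDS[name][view_for(client_ip)]
```

With NAT, the Zimbra servers themselves sit in the internal view, so they resolve their own names to the private addresses while outside clients get the public one.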
Geo load balancing
If not via DNS, then this is OK, as I doubt that DNS service providers offer region separation any more granular than US/EU/Asia. As you mentioned, both of your systems will be located in the US, so a custom or F5 load balancer would be the way. There are several commercial products available for sure; Citrix NetScaler is another. But I'd rather go with Zimbra's built-in options or a custom reverse proxy such as Nginx (for web and for IMAP/POP, with or without SSL). This significantly lowers costs and gives you more flexibility. Actually, Zimbra Proxy is Nginx based, and it copes well with redirecting HTTP, HTTPS, IMAP, IMAPS, POP3 and POP3S to the required mail server. But this is a slightly different thing, more connected with deploying several mailbox servers than with load balancing and HA. GTM in the sense of a DNS service would not give you the required results.
It depends on the class of ISP. If it is one that is available across the whole US, its network should already be in an HA state. It would also be easier to negotiate subnets and IP assignment with a single ISP, whereas different ISPs cannot operate the same IP subnet. But as you are going with a load balancing/routing layer, this is actually no longer a concern.
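Whatever product does the balancing, the core of "remove a down site from rotation" is a health probe plus a filter. A rough sketch of that idea follows; the site names, addresses and port are hypothetical, and the probe is a plain TCP connect, which is much cruder than what F5 or a tuned Nginx setup would check.

```python
import socket


def tcp_alive(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_sites(sites, probe=tcp_alive):
    """Keep only the sites whose probe succeeds.

    sites maps a site name to a (host, port) tuple. The probe is a
    parameter so it can be swapped for a stricter check.
    """
    return [name for name, (host, port) in sites.items() if probe(host, port)]


# Hypothetical example data, not real hosts.
SITES = {
    "site-a": ("10.0.1.20", 443),
    "site-b": ("10.0.2.20", 443),
}
```

Calling `healthy_sites(SITES)` on a schedule and sending traffic only to the returned sites is the basic loop; how quickly a dead site drops out depends on the probe interval and timeout you pick.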
Routing protocols & HA
If you get access to or operate your own BGP AS, then you have the option of an anycast solution. Although I'd say that this is more connected with regional balancing than with HA: it will lead your users to their closest server, not the master one. You also have to be aware that BGP and OSPF have their own timers (keepalive/hold and hello/dead intervals), which by default are far longer than HA requires to operate efficiently. I'd say these protocols are meant for other things than HA.
In an HA environment the most significant thing is to recover services in the least amount of time possible, without involving regional spreading.
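To put rough numbers on the timer point, here is a small sketch comparing untuned routing protocol timers with a failover target. The 40 s OSPF dead interval and the 90 s BGP hold time are the sample and suggested values from the RFCs; the 180 s vendor default and the 10 s HA target are assumptions of mine for illustration. Real deployments can tune these down.

```python
# Worst-case time (seconds) before a dead neighbour is declared down,
# assuming untuned timers. Values are illustrative, not measured.
DETECTION_SECONDS = {
    "OSPF dead interval (RFC 2328 sample value)": 40,
    "BGP hold time (RFC 4271 suggested value)": 90,
    "BGP hold time (assumed vendor default)": 180,
}

HA_TARGET_SECONDS = 10  # hypothetical failover target, pick your own


def slower_than_target(timers, target):
    """Return the timers whose detection time exceeds the HA target."""
    return {name: secs for name, secs in timers.items() if secs > target}


too_slow = slower_than_target(DETECTION_SECONDS, HA_TARGET_SECONDS)
for name, secs in too_slow.items():
    print(f"{name}: {secs}s > {HA_TARGET_SECONDS}s target")
```

And this is detection only; route reconvergence and client reconnects come on top of it.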
To your graph
Do you consider the two sides of the graph to be different locations? If so, it seems it would be wise to build the same system in both locations, except for the MTA servers, of which there may be one per site, both online. The extra MTA hop does prolong mail delivery, but not significantly, and users only notice new email once the mailbox receives it.
I have to say that Zimbra mailbox servers already provide IMAP and POP themselves (as far as I know through Zimbra's own implementation rather than Cyrus IMAP, although I do like Cyrus for intensive, large-scale use). So I doubt that you have to build separate Cyrus nodes just to provide IMAP/POP access to Zimbra mailboxes. I'd go as far as possible with options from one vendor, if they suit the requirements. The Cyrus nodes in your graph may be replaced with Zimbra Proxy, which still gives you unified management capabilities and serves well for both web access and IMAP/POP traffic. I'd separate the MTA from the web UI, though that might be only my suspicion. MTAs may suffer heavy load during spam attacks, or if you allow large mail attachments, since antivirus checks need resources, and it would be bad if web access suffered from this load. The only service I'd run on the MTAs in addition to Postfix/SpamAssassin/ClamAV is a Zimbra LDAP replica, to make account resolution faster and offload the other servers. It also gives you the benefit of reserve copies of your account data. In this manner you may scale your MTA stack as you wish; it's purely DNS based, with nothing else needed. There are recommendations in this forum to use tmpfs for the Amavis temp directory, which increases MTA throughput. Or use an outside spam/antivirus scanning solution, leaving Zimbra only mail delivery. All this comes down to unified management; the quality of AS/AV depends on different things, and the Zimbra solution is not the worst one. It actually works quite well. Just keep the systems up to date.
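On the DNS-based MTA scaling: assuming this means publishing one MX record per MTA, a sending server tries the lowest preference first and spreads load randomly across MTAs with equal preference. A minimal sketch of that ordering follows; the host names and preference values are invented for the example.

```python
import random
from itertools import groupby


def delivery_order(mx_records, rng=random):
    """Order MX hosts the way a sending server would try them.

    mx_records is a list of (preference, hostname) tuples. Lower
    preference is tried first; hosts with equal preference are shuffled
    so that load spreads across them.
    """
    ordered = []
    by_pref = sorted(mx_records, key=lambda rec: rec[0])
    for _, group in groupby(by_pref, key=lambda rec: rec[0]):
        hosts = [host for _, host in group]
        rng.shuffle(hosts)
        ordered.extend(hosts)
    return ordered


# Hypothetical records: two active MTAs and one lower-priority fallback.
MX = [
    (10, "mta1.example.com"),
    (10, "mta2.example.com"),
    (20, "mta-backup.example.com"),
]
```

Adding another MTA is then just one more record with preference 10; nothing else in the stack has to know about it.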
To get an idea, I need some more details:
* what is MUpdate intended for?
* what is the Master node? (you probably do routing and balancing in the front-facing nodes, the Cyrus frontends according to your graph)
* are mailstore1 & mailstore2 active/passive or active/active nodes?
* do you plan to sync at the SAN level, or by other means?
Actually, one of the best features of the Zextras Suite is that its services work regardless of the Zimbra version used (ideally the major versions should match, but this is not a requirement). You'd then deploy a starter-level SAN in each location, which would give you storage flexibility. If you use SAN-to-SAN sync, then a cluster file system should be deployed. If you just rsync the Zextras backups, you need nothing more than the rsync command line and cron, although you do not get 1:1, up-to-the-second images of the store. But that, as you mentioned, is an acceptable trade-off. SAN/DRBD sync may introduce some corruption, whereas rsync is plain and simple, yet not 1:1 up to the second.
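For the rsync route, a small wrapper that cron can call is about all you need. The sketch below only builds the command; the backup path, remote host and schedule are placeholders I made up, so substitute whatever your Zextras backup path and second site actually are.

```python
def build_rsync_command(src, dest_host, dest_path, delete=True):
    """Build an rsync command that mirrors a backup directory to another site.

    -a keeps permissions, owners and timestamps; --partial lets an
    interrupted transfer resume; --delete removes files on the target
    that no longer exist in the source.
    """
    cmd = ["rsync", "-a", "--partial"]
    if delete:
        cmd.append("--delete")
    # Trailing slash on the source copies its contents, not the directory itself.
    cmd += [src.rstrip("/") + "/", f"{dest_host}:{dest_path}"]
    return cmd


# Hypothetical paths and host, shown only as an example.
CMD = build_rsync_command(
    "/opt/zimbra/backup/zextras",
    "dr.example.com",
    "/opt/zimbra/backup/zextras/",
)

# To actually run it: subprocess.run(CMD, check=True)
# Example cron entry (nightly at 02:00, script path is hypothetical):
#   0 2 * * * /usr/bin/python3 /usr/local/bin/sync_backups.py
```

How stale the copy at the second site can get is then simply the cron interval plus the transfer time.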