We're building a production system for a liberal arts college of 3000 users: about 2500 migrating from Cyrus in July/August and 500 migrating from GroupWise 7 in November (we need to wait for a few ZCS 5 features). The design is shaping up as follows. I'm moving our first 30 users from a basic pilot system to this architecture Monday night, so feedback would be appreciated ASAP. :-)
The platform is RedHat Enterprise 4 (I'd probably prefer 5, but not if it's not really supported) running RedHat Cluster Suite (compiled from CentOS source to save $1000 annually) on two Dell 2950s. For cost and green sustainability reasons, they are configured differently. The primary node, running everything but the MTA, has four cores: two dual-core 3GHz 5160s and 16GB RAM. The failover node is lighter weight: a single dual-core 2GHz 5130 and 8GB RAM. So that the failover node has something to do and remains monitored and in use, it will also run the free VMWare Server, hosting the Zimbra MTA as a virtualized guest. A third box, a leftover Dell 1850, will run DNS and a Zimbra LDAP secondary used by the MTA (it doesn't seem possible to host this on the cluster partner, and I don't want to dedicate a lot of RAM or context switches to VMWare).
Both cluster members will have dual-ported QLogic 2462 cards connected to different SAN switches, multipathing handled by RedHat's default dm-multipath.
/opt/zimbra is local to each machine, as required. Primary LDAP, mail store, MySQL, etc. are one large SAN volume managed as a cluster resource.
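To make the layout above easier to check, here is a rough Python sketch of which services run where, before and after a failover. The host labels (`primary`, `failover`, `dell1850`) and service names are shorthand invented for this sketch, not Zimbra or RHCS identifiers, and it assumes the whole clustered service group moves as a unit.

```python
# Illustrative model only: host and service labels are shorthand, not real identifiers.
CLUSTERED = {"ldap-primary", "mailstore", "mysql"}  # live on the shared SAN volume

NORMAL_PLACEMENT = {
    "primary": set(CLUSTERED),           # two dual-core 5160s, 16GB RAM
    "failover": {"vmware-guest-mta"},    # one dual-core 5130, 8GB RAM
    "dell1850": {"dns", "ldap-secondary"},
}

def placement_after_failover(placement):
    """Return a new placement with the clustered services moved to the failover node."""
    result = {host: set(services) for host, services in placement.items()}
    moved = result["primary"] & CLUSTERED
    result["primary"] -= moved
    result["failover"] |= moved
    return result

if __name__ == "__main__":
    for host, services in placement_after_failover(NORMAL_PLACEMENT).items():
        print(host, sorted(services))
```

The point it makes is that after a failover the 8GB node carries the mail store and the MTA guest at the same time, which is the worst case for the lighter box.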
Our brand-new Compellent Storage Center has some nice thin provisioning and block-level HSM features that seem to make it a very good fit for Zimbra. We should not need to use Zimbra's own HSM, since policies on the Compellent should have a similar effect. If we later decide that gzip-compressing items older than (say) 180 days is a good idea, we could do that, provided that putting /opt/zimbra/store2 on a different LUN than /opt/zimbra/store will play nicely with clustering. Can anyone comment on that? I want it to be safe for someone other than me to apply Zimbra patches and upgrades.
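One way to judge whether an age-based policy at 180 days is worth the trouble would be to measure how much of the store is actually that old. The sketch below is plain Python 3 over file modification times; it is not Zimbra's HSM and does not talk to Zimbra at all. The function name and the use of mtime as a proxy for item age are assumptions.

```python
import os
import time

def bytes_older_than(root, days=180, now=None):
    """Walk root and return (old_bytes, total_bytes), using file mtime as a proxy for age."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    old = total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            total += st.st_size
            if st.st_mtime < cutoff:
                old += st.st_size
    return old, total

if __name__ == "__main__":
    old, total = bytes_older_than("/opt/zimbra/store")
    if total:
        print(f"{old / total:.1%} of {total} bytes is older than 180 days")
```

If only a small fraction of the store turns out to be older than the threshold, a separate store2 LUN is probably not worth the added cluster complexity yet.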
/opt/zimbra/backup goes to an Apple XServe RAID located at a partner college 5 kilometers away, over single-mode fiber we co-own. So all backups are immediately off-site. There exist SATA-to-FC boxes with better price/performance nowadays, but we got a really good price on several XServe RAIDs when they first came out, so we're sticking with them. /opt/zimbra/backup is *not* configured as a cluster resource, so when Zimbra is running on the secondary, the backup volume will be unavailable. (This could be addressed manually in the event of an extended outage.) We will use NetWorker to back up the base OS and to copy the remote disk-to-disk backups to tapes in the primary data center. /opt/zimbra/backup is for DR, and tapes are for longer-term archival.
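Because /opt/zimbra/backup sits outside the cluster, any scheduled job that writes to it may want to confirm the volume is really mounted first, so that a backup attempt on the secondary fails loudly instead of filling the local disk. A minimal Python 3 sketch, assuming the backup LUN is mounted directly at that path; the function name is made up, and how it gets hooked into the backup schedule is left open.

```python
import os

BACKUP_DIR = "/opt/zimbra/backup"  # assumed to be the mount point of the remote LUN

def backup_volume_available(path=BACKUP_DIR):
    """True only if path is a mount point, i.e. the remote volume is attached on this node."""
    return os.path.ismount(path)
```

A wrapper script could call `backup_volume_available()` and exit non-zero when it returns False, which also gives monitoring something to alert on during an extended failover.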
* Anything glaringly wrong about this?
* Is RedHat Enterprise 5 a fully viable and supported option as soon as this week?
* Is it safe/supported/upgradeable to put /opt/zimbra/backup on a different LUN on a different SAN than /opt/zimbra/store, etc.?
* Would it be safe/supported/upgradeable to put the HSM secondary /opt/zimbra/store2 on a different LUN on the same SAN as /opt/zimbra/store, *and* make it a RHCS resource that works with Zimbra Cluster?
* Will the MTA for 3000 users perform reasonably as a VMWare guest on an otherwise idle 5130 (dual core 2GHz, 4MB cache)? I know that VMWare performs surprisingly well for some workloads and surprisingly poorly for others.