Thread: Use of load balancers with a multi-box Zimbra install

  1. #1
    mcgreen (Trained Alumni; joined Apr 2008)

    An architecture question regarding the use of load balancers and/or SSL accelerators in conjunction with a multi-box Zimbra install. We are an edu with 30,000 accounts, and would especially appreciate feedback from other edus.

    We are beginning to implement our dev, test and production servers for Zimbra. We are currently planning to have a load balancer in front of three Zimbra MTAs; the MTAs will also have the Zimbra proxy installed, for IMAP, POP and HTTP. We would like to hear about the experiences of other sites that have used load balancers in front of their Zimbra systems. What special considerations may we have to resolve for this architecture?

    Our load balancers also have an SSL accelerator module for SSL decryption. In our legacy system, the load balancers front-ended the SSL traffic, performing SSL decryption; communication with the back-end servers was non-SSL. All systems involved were duly protected by a hardware firewall as well as other security mechanisms.

    We are looking for other sites with experience in either or both of these options, particularly with wise words about configuration, pitfalls, etc.

    Thanks for your comments.

  2. #2
    nickragusa (New Member; joined Apr 2008)

    mcgreen --

    Here at Brandeis University, we are in the process of rolling out a multi-server install with 7000 accounts behind the Cisco ACE load balancer module. Today we're in the process of wrapping up an opt-in period and getting underway with migrating the rest of the university. The road to here was relatively painless, but not without a bruise or two along the way.

    Our infrastructure is comprised of 12 servers -- 4 mail stores (with the potential of adding 2 more), 2 ldap (master and replica), 2 IMAP/POP3 proxies, 2 MXs, and 2 dedicated name servers (for running a split domain). All servers are Xen 3.1.3 virtual machines, running RHEL 5.1 x86. For more details about server infrastructure, please see this post by another member of our team: Zimbra and Xen.

    Our load balance configuration consists of the following:
    - 3 server farms -- zimbra-web, zimbra-proxy, and zimbra-mx
    - The zimbra-web farm consists of the 4 mail stores, with probes on ports 80 and 443
    - The zimbra-proxy farm consists of the 2 proxies with probes on 993 and 995
    - The zimbra-mx farm consists of the 2 MXs with probes on 25 and 465

    Our probes are simple TCP connections to the specified ports, executed every 5 seconds. If a probe fails, the server is immediately marked 'out of service', and a pass detect interval of 15 is set before the server can be active in the farm again (a probe must succeed 3 times before the server can rejoin the farm). Down the road we will configure better probes, such as HTTP probes on the mail stores (we'd expect a 302 on port 80 and a 200 on port 443), SMTP probes for the MXs (we can connect to the port, send a HELO and expect a 250 reply back), etc. There is some danger in doing SMTP probes since the MXs are configured to rate limit connections; however, that can easily be changed.
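
    The fail-once, pass-three-times behaviour described above can be sketched as a small state machine. This is only an illustration in plain shell, not Cisco ACE configuration; the names farm_state, status_matches, http_status and PASS_COUNT are made up for the example, and http_status assumes curl is installed.

```shell
#!/bin/sh
# Hypothetical sketch of the probe policy; this is not Cisco ACE syntax.
# One failed probe takes a server out of service at once. The server
# rejoins the farm only after PASS_COUNT consecutive successful probes.
PASS_COUNT=3

# farm_state RESULT... -> prints "in-service" or "out-of-service"
# Each RESULT is "ok" or "fail", oldest first. The server starts in service.
farm_state() {
    state="in-service"
    streak=0
    for result in "$@"; do
        if [ "$result" = "ok" ]; then
            streak=$((streak + 1))
            if [ "$state" = "out-of-service" ] && [ "$streak" -ge "$PASS_COUNT" ]; then
                state="in-service"
            fi
        else
            streak=0
            state="out-of-service"
        fi
    done
    echo "$state"
}

# status_matches EXPECTED ACTUAL -> exit 0 when the HTTP status is the expected one
status_matches() {
    [ "$1" = "$2" ]
}

# http_status URL -> prints the HTTP status code of one request.
# Needs curl and network access, so nothing in this sketch calls it.
http_status() {
    curl -s -k -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}
```

    For example, farm_state fail ok ok ok reports in-service again, while farm_state fail ok ok is still out-of-service. An HTTP probe along the lines mentioned above could be written as status_matches 302 "$(http_status http://store1.example.com/)", where the host name is a placeholder.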

    SSL termination was a consideration of ours; however, we chose not to proceed. For starters, there are a handful of known issues with terminating SSL on the Cisco ACE -- both security and stability related. Secondly, this was before the web proxy was introduced, and without it this LB scenario would have been extremely complex. For example, you need 1 private key / cert pair for your load balanced VIP, say mail.example.com. Additionally, you'll need a private key / cert pair for each mail store you have, since Zimbra redirects the user if they did not land on their own mail store (in our case you have a 1 in 4 chance of landing on the right one). Web proxying would fix all of this, though it was only introduced in v 5.0.5 and documented in v 5.0.6. It is currently delivered with a 'BETA' disclaimer, so I'm reluctant to put it into production. I've given it a shot with minimal success, so we're sticking with a certificate with SANs.

    Speaking of SANs (Subject Alternative Names), it was a bit of a to-do to get certificates working properly. SANs basically allow you to specify multiple host names in 1 certificate. Our current CA, Thawte, does not sign certificates with SANs -- nor do many of the big CAs. We did have success using DigiCert and were rather impressed with their responsiveness. Verifying our domain was easy (almost too easy) and using their web interface was a snap. One thing to be sure of is to include the CN of the certificate as 1 of the SANs, or else you'll still get a browser warning.
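
    The CN-in-SANs check mentioned above is easy to script. A minimal sketch, assuming the SAN list is in the "DNS:name, DNS:name" layout that openssl prints; the function names cn_in_sans and cert_sans and the host names are made up for the example.

```shell
#!/bin/sh
# cn_in_sans CN SAN_LIST -> exit 0 when CN appears as a DNS entry in SAN_LIST.
# SAN_LIST is expected in the "DNS:a.example.com, DNS:b.example.com" layout.
cn_in_sans() {
    cn=$1
    # Split on commas, trim surrounding spaces, then require an exact
    # whole-line match on "DNS:<cn>" so partial names do not count.
    printf '%s\n' "$2" | tr ',' '\n' | sed 's/^ *//; s/ *$//' | grep -Fqx "DNS:$cn"
}

# cert_sans FILE -> prints the SAN line of a PEM certificate.
# Needs the openssl command line tool, so nothing in this sketch calls it.
cert_sans() {
    openssl x509 -in "$1" -noout -text | grep 'DNS:'
}
```

    For example, cn_in_sans mail.example.com "DNS:mail.example.com, DNS:store1.example.com" succeeds, while the same call with a list that only holds the store names fails, which is the case that would still produce a browser warning.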

    Installing the certificates was somewhat of a pain, though that was a known bug. Our work around was to install the certificates manually to all our servers:
    Code:
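    # Run from the directory holding the cert files; $ZIMBRASERVERS is a space-separated host list. Copy the files and lock down the CA cert.
    for i in $ZIMBRASERVERS; do scp * "root@${i}:/opt/zimbra/ssl/zimbra/commercial/" && ssh "root@${i}" "chmod 700 /opt/zimbra/ssl/zimbra/commercial/commercial_ca.crt"; done
    # Deploy the commercial certificate and CA chain with zmcertmgr on each server.
    for i in $ZIMBRASERVERS; do ssh "root@${i}" "/opt/zimbra/bin/zmcertmgr deploycrt comm /opt/zimbra/ssl/zimbra/commercial/commercial.crt /opt/zimbra/ssl/zimbra/commercial/commercial_ca.crt"; done
    # Restart Zimbra on each server so the new certificate is picked up.
    for i in $ZIMBRASERVERS; do ssh "root@${i}" "su - zimbra -c 'zmcontrol stop && zmcontrol start'"; done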
    This bug I believe has been fixed, though I can't say for sure: Bug 24153

    One of the more difficult aspects of running a multi-server install is trying to follow the documentation. Even the multi-server documentation itself gets confusing -- most of it is written without specifying which server in your multi-server environment things should be executed from! Additionally, you won't find everything you need to know in the multi-server docs, so you'll have to touch base with the single-node docs and make your own judgment calls. Not a show stopper, just a few extra steps involved.

    Infrastructure aside, migrating existing users has been our primary focus of this whole process. Our original plan was to do the cutover in a weekend, with the hope that we could update DNS to point old IPs (old imap server, outbound smtp, etc.) to the new load balanced VIP so users would not have to update their existing mail clients. Ha! Zimbra recommends imapsync as the method for migrating users; it is basically a perl script which serially copies e-mails one by one between 2 different mail stores. To help automate our Zimbra account creation, we have created a custom python script which performs the following:
    - Takes in a comma delimited file of uids
    - One by one, verifies they have an existing mail account
    - Creates a zimbra account and sets their COS
    - Sends the user a kick off e-mail letting them know the process is about to begin
    - Syncs their e-mail via imapsync
    - After the first pass of imapsync, starts forwarding mail to their Zimbra account, makes their old INBOX and folders read-only, then runs a second imapsync pass (this prevents users from making any updates, moving e-mails to different folders, etc., and makes sure we have a perfect replica of their old mail)
    - Once successful, we create a new INBOX on the user's old account which contains instructions on how to confirm their account (via another web-app system we developed) as well as instructions on how to update their existing mail clients.
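
    The order of the steps above can be sketched as a dry run that only prints a plan per uid. This is not the python script mentioned in the post; it is a hypothetical shell outline, and the host names, domain, COS name and function names are placeholders. The imapsync line shows only the host and user options; authentication options are left out.

```shell
#!/bin/sh
# Hypothetical dry run of the migration order; it prints a plan and
# changes nothing. All values below are placeholders.
OLD_IMAP="oldmail.example.com"
NEW_IMAP="mail.example.com"
DOMAIN="example.com"
COS_NAME="default"

# sync_cmd UID -> prints the imapsync command for one account
# (authentication options omitted).
sync_cmd() {
    echo "imapsync --host1 $OLD_IMAP --user1 $1 --host2 $NEW_IMAP --user2 ${1}@${DOMAIN}"
}

# plan_user UID -> prints the ordered steps for one account
plan_user() {
    uid=$1
    echo "verify existing mailbox for $uid on $OLD_IMAP"
    echo "create ${uid}@${DOMAIN} in Zimbra with COS $COS_NAME"
    echo "send kick-off mail to $uid"
    sync_cmd "$uid"    # first pass: bulk copy while the old account is still live
    echo "forward new mail for $uid to Zimbra and set old folders read-only"
    sync_cmd "$uid"    # second pass: pick up anything changed during the first
    echo "create new INBOX with instructions on old account of $uid"
}

# plan_migration -> reads comma-delimited uids on stdin, prints one plan per uid
plan_migration() {
    tr ',' '\n' | while IFS= read -r uid || [ -n "$uid" ]; do
        uid=$(printf '%s' "$uid" | tr -d ' \r')   # drop spaces and CRs
        [ -n "$uid" ] || continue                 # skip empty fields
        plan_user "$uid"
    done
}
```

    Running printf 'alice,bob\n' | plan_migration prints the seven steps for alice followed by the seven steps for bob. Doing the second imapsync pass only after the old folders are read-only is what keeps the copy from drifting while it runs.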

    To date, we have not had a single complaint about this process. We did this in an incremental fashion -- starting with a subset of about 30 people to move first. Once we ironed out the kinks, we opened it up to our technology departments which was about 140 people. We then expanded our account confirmation system to allow users an opportunity to opt-in. Since we made the opt-in available (2008/05/14), about 300 people have signed up and migrated their account. We've received a lot of praise from our users -- both from a user experience during the migration / confirmation process, as well as excitement to use a high quality product like Zimbra.

    All in all we're extremely happy with our decision to go to Zimbra. Once mail has been completely migrated, we intend to migrate all of our calendar data from Oracle Calendar into Zimbra (see Oracle Calendar to Zimbra pain); if you read that post, you can see we're dealing with a culture shock more than any technical challenges.

    I'll be happy to keep you updated once we start the campus migration (scheduled to start 2008/07/07) and let you know the end result.
    Nick Ragusa
    Systems Engineer Manager
    Brandeis University

  3. #3
    mcgreen (Trained Alumni; joined Apr 2008)

    Many thanks for your excellent reply. Please do keep us posted on your progress, and we will do the same. Thanks again.

  4. #4
    phoenix (Zimbra Consultant & Moderator; joined Sep 2005; Vannes, France)

    Welcome to the forums.

    Quote Originally Posted by nickragusa View Post
    This bug I believe has been fixed, though I can't say for sure: Bug 24153
    You're correct; it has been fixed and should be included in the upcoming 5.0.7 release.
    Regards


    Bill



  5. #5
    Klug (Moderator; joined Mar 2006; Beaucaire, France)

    Quote Originally Posted by nickragusa View Post
    - The zimbra-web farm consists of the 4 mail stores, with probes on ports 80 and 443
    .../...
    If the probe fails, a server is immediately 'out of service'
    There's something I don't understand here: are you telling us you're achieving load-balancing with ZCS?

    AFAIK, this is not possible: one user account is attached to a single mailstore server.
    If you pull one mailstore server out, the users on that server won't have access to their accounts any more...

  6. #6
    su_A_ve (Advanced Member; joined Dec 2006)

    Wow - we have upwards of 11K accounts and are only using 2 boxes

    In any case, I agree, I don't see what the benefit of the load balancer is. Once a connection is made, it will be redirected to the mailstore. If it's via imap, then it would hit the proxy, which would hit the mailstore anyway. I could see some benefits here, but definitely not for the web interface...

    My .02...

  7. #7
    jwtadmin (Intermediate Member; joined Apr 2008)

    The benefit in this case is high availability with low overhead. In this config each mailstore can answer web requests and do its own redirects. This lowers the overall stress on a single mailstore acting as the redirector. It does work because the mail store will redirect the request to the correct server. In our case 1 out of 4 requests will hit the user's actual mailstore.

    As far as the IMAP proxy goes, this has great benefit as it removes the proxy as another single point of failure.

    Yes, you still have the mailstores as single points of failure, but by adding additional mailstores you spread the risk across your cluster. In our case we would only affect 1/4 of our population if we had a failure on a mailstore. For me this is a lot better than a complete or 50% failure.
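
    The 1/4 figure is simple arithmetic, sketched here under the assumption that accounts are spread evenly across the mailstores. The function name share_percent is made up for the example, and the result is rounded down to a whole percent.

```shell
#!/bin/sh
# share_percent N -> whole-number percentage that one of N equally loaded
# mailstores represents (integer division, so the result is rounded down).
share_percent() {
    echo $((100 / $1))
}
```

    With 4 mailstores, share_percent 4 gives 25: the share of users cut off by one mailstore failure, and also the share of web requests that land on the right mailstore without a redirect. With 2 mailstores the same call gives 50.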

    Also, given that this is on a Xen infrastructure, we can afford to deploy it like this, as adding additional VMs is a LOT less work than adding physical HW. Plus we remove HW as a point of contention if there is maintenance or physical issues with the HW.

    This is our approach and our mileage may vary. We have been called crazy for doing this and perhaps we are, but so far so good.
    John Turner
    Brandeis University
    Waltham MA
