03-02-2010, 10:41 AM
System specifications
Hello all,
I seem to have a conundrum on my hands, and I am hoping that the Zimbra community might be able to help me get to the bottom of it. I am currently managing a Zimbra install that is split into 2 servers. The LDAP & Store are running on one server, and the MTA is running on the other server. The server specs are as follows:
LDAP/MailStore:
- Quad Xeon 3.0 GHz
- 6G RAM
- 500G DAS SCSI Storage (RAID 5)
MTA:
- Xeon 3.0 GHz
- 4G RAM
- 40G internal storage
I'm sure that there are other stats that are relevant, so please let me know what you need.
The issue that I am facing is that it seems to me that these specs are mostly adequate for my 100 mailbox user base, and yet I have times when the load average on the mail store server gets so high (I saw the 1-minute load average cross 10 this morning) that the server becomes unresponsive. I have left the server for upwards of an hour, hoping that the load would stabilize, but it never does. I end up hard shutting down the server, and bringing it back up.
I am running both of these servers on Ubuntu 8.04 32-bit, and they are running inside VMware ESXi 3.5.
What I find the most vexing is that I am managing another ZCS install for another company using Community Edition and less beefy machines. The second company has a slightly larger userbase, and significantly more mail traffic (upwards of 3 times as much); yet the mail store server for the second company never exhibits this behavior.
While I know that the RAID5 is not ideal, it does not seem to me that I have an I/O issue, either at the ESXi level, or at the vServer level. My servers are never swapping (according to top). When this comes up, there always seems to be a java process that is tying everything up. I do not currently have an example of exactly what the command is, and it does seem to vary slightly.
Does anyone out there have any thoughts?
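One way to capture exactly which command is responsible the next time the load climbs is a small sampler along these lines. This is only a sketch under assumptions: it needs a Python 3 interpreter and the procps `ps` on the mail store, the names `parse_ps`, `snapshot` and `LOAD_THRESHOLD` are made up for illustration, and the threshold of 4.0 is a guess that should be tuned to the number of virtual CPUs.

```python
import os
import subprocess

LOAD_THRESHOLD = 4.0  # hypothetical trigger; tune to the number of vCPUs


def parse_ps(output, limit=5):
    """Return the busiest (pid, cpu_percent, command) rows from ps output."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # pid, %cpu, then the full command line
        if len(parts) < 3:
            continue
        pid, cpu, args = parts
        rows.append((int(pid), float(cpu), args))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


def snapshot():
    """Return the top CPU consumers, or [] if the 1-minute load is below the threshold."""
    one_minute, _, _ = os.getloadavg()
    if one_minute < LOAD_THRESHOLD:
        return []
    # Note: the %CPU column from ps is averaged over each process's lifetime,
    # so a short spike in a long-running JVM will look smaller than it does in top.
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,args"],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout)
```

Calling `snapshot()` from cron every minute and appending the rows to a log file would leave a record of the full java command line from before the server stops responding.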
03-02-2010, 11:25 AM
The other setup you describe as performing better: what's the HW config for it?
I know for sure that an old Xeon chip with a 533 FSB (even if you have 4 of them) will be slower than a single quad core with a 1333 FSB and faster RAM.
We used to run lots of Dell PE 1750s (2 x Xeon 3.0 with 10K SCSI RAID 5) on ESXi a year back, and those machines would run at 80% CPU.
We consolidated them onto the Dell 1950 quad core: we put 5 Zimbra servers from the Dell 1750s on one 1950 running ESXi and we still have 70% CPU free.
So from my experience, old Xeon chips on motherboards with a slower FSB somehow perform way slower, and at higher CPU, than a single quad core.
The I/O subsystem (RAID 5 and so on) does have an impact when it fails, but what you are seeing may not be a RAID problem... yet.
Raj
PS: By the way, that was my 500th post/reply, heeeeeeeee haaaaaaaaaa. I get happy with small things, lol.
__________________ i2k2 Networks
Dedicated & Shared Zimbra Hosting Provider
03-02-2010, 11:38 AM
Well I'm glad that I got to be your 500th reply. And thanks for making it such a speedy one.
All of these Zimbra servers are running on the same physical hardware. ESXi is reporting that the average CPU usage is around 80% (which is obviously higher than I would like it to be ...). The config for the setup that is performing better is:
LDAP/Store:
- Dual Xeon 3.0
- 5G RAM
- Identical storage to the other server
My goal all along has been to split these servers onto separate physical hardware, but this rash of recent problems (weekly hard shutdowns) has me concerned that something else is amiss, and I want to make sure that I don't shoot myself in the foot by assuming that new hardware will solve this problem.
For whatever it is worth, the problematic cluster pair has a history that looks like this:
- Set up using Community Edition in April 2009
- Licensed in Nov 2009
- BES installed in Nov 2009
- Upgraded from 5.X to 6.X on 2/27/2010
03-02-2010, 01:35 PM
I can attest to that... I have two Zimbra servers: one NE and one CE. The NE has 46 accounts spread across four domains. The CE has 2000 accounts across around 50 domains. Both are on Ubuntu 8.04 and both are running Zimbra version 6.04. Both are hosted on the same ESXi box that has two dual quads, 16GB RAM, and a host of mirrored SATA II drives. The NE server exhibits the same behavior as described above. Very frequently, the CPU utilization spikes and top reports that a Java process is consuming all free resources. All the while, the CE server never has these issues. The best thing is that I can reproduce the behavior! Every time I log in to the Admin Console, I'm guaranteed to have about a five-minute wait until Java settles down and the system stabilizes. During this wait, the regular web client is unresponsive. If I log in to the CE Admin Console, it comes right up, never causing any sort of delay on the user's side.
I've seen other discussions on the forums that talk about platform architecture, system sizing, load, etc. However, in all cases, nobody ever seems to give the benefit of the doubt to the original poster. I for one am convinced this is an NE issue.
If there's any additional info I can provide to help resolve this issue permanently, I'm at your service!
03-02-2010, 02:01 PM
Well, the NE version sure has more stuff than OSS, so more services are running.
i.e., NE running Mobile, backups and other stuff will for sure consume more CPU and be heavy on I/O.
* With 6.xx the logger service and other Java processes consume more CPU. The fix above may give you a temporary resolution.
I believe that at the end of the same post mmorse posted the high CPU bug too.
Raj
03-02-2010, 02:47 PM
Hey, Raj! Great to hear from you!
Agreed. NE certainly is more resource intensive than OSS. However, as discussed in other similar threads, I've already disabled the logger service. Further, I only have one mobile sync device and the backups occur overnight while these problems occur all day long.
I'm afraid that this goes much deeper than platform scaling.
03-04-2010, 10:04 AM
How very very bizarre! Perhaps it's related to VMware? Is the time stable inside the VM? Are you using emulated devices or para-virtualised devices inside the VM (does VMware even support paravirt)? Why are you running 32-bit OSes, which limits you to 4 GB of RAM? Are you using PAE to access more than 4 GB? If so, you should really look into moving to a 64-bit OS. PAE will severely hamper performance, and still limits you to 4 GB of virtual memory per process.
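The 4 GB figures come straight from pointer width, which a quick bit of arithmetic shows. This is only an illustrative sketch: the helper name `addressable_gib` is made up, and the 36-bit figure assumes classic x86 PAE, which widens physical addresses while leaving each process with 32-bit virtual addresses.

```python
def addressable_gib(address_bits):
    """Size of an address space with the given address width, in GiB."""
    return 2 ** address_bits / 2 ** 30


# 32-bit virtual addresses: the ceiling for a single process, PAE or not
print(addressable_gib(32))

# 36-bit physical addresses: what classic x86 PAE lets the kernel see in total
print(addressable_gib(36))
```

So PAE can let the 6G store VM see all of its RAM, but a single JVM on a 32-bit guest still cannot address more than 4 GB, and in practice less once the kernel's share of the address space is taken out.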
We're running Zimbra NE 5.0.13 inside a Linux-KVM virtual machine with over 1000 accounts, 90% of which use the advanced web client, with 20 or so Blackberries syncing via ZCB and a handful of iPhones syncing via ActiveSync.
The only time the server load goes above 2.0 (yes, two-point-zero) is when migrating an entire school worth of staff accounts from our old webmail server to Zimbra (via IMAP, iCal import, and CSV addressbook import). The rest of the time, the server just chugs along nicely.
Our VM host is:
- Tyan h2000M motherboard
- 2x dual-core AMD Opteron 2220 @ 2.8 GHz
- 16 GB DDR2-SDRAM @ 667 MHz
- 3Ware 9650SE-12ML RAID controller
- 12x 500 GB SATA hard drives in 3 RAID6 arrays, using LVM to stripe them together
- Intel PRO/1000MT quad-port gigabit NIC trunked using LACP
- Ubuntu Server 8.04 LTS 64-bit
The VM config is:
- 2 virtual CPUs
- 8 GB of RAM
- separate virtual disks for OS, zimbra mailstore, zimbra backups
- Ubuntu 8.04 LTS 64-bit
This VM box is also hosting separate Linux VMs for 3 websites and a separate LDAP server for Zimbra, along with 2 Windows XP VMs for remote desktop access.
IOW, you have plenty of horsepower, but it's not being passed through to your VMs.
__________________
Freddie
Last edited by fcash; 03-04-2010 at 10:16 AM..
03-04-2010, 11:51 AM
Hmm... It seems that 'Meaulnes' and I are both using Ubuntu 32-bit. Other than the obvious benefit of allowing for higher memory utilization, would I see any other performance benefits to migrating to 64-bit? I don't relish that thought at all...
I have yet to be convinced that this is server-platform related. I'm sure there are *many* of you using Ubuntu 8.04 32-bit... Is anyone else doing it with an underlying VM infrastructure and, if so, do you have any performance issues?
03-04-2011, 06:08 AM
Quote:
Originally Posted by mbristol@uplync.com
Hmm... It seems that 'Meaulnes' and I are both using Ubuntu 32-bit. Other than the obvious benefit of allowing for higher memory utilization, would I see any other performance benefits to migrating to 64-bit? I don't relish that thought at all...
I have yet to be convinced that this is server-platform related. I'm sure there are *many* of you using Ubuntu 8.04 32-bit... Is anyone else doing it with an underlying VM infrastructure and, if so, do you have any performance issues?

Just out of curiosity, why run in a 32-bit environment in the first place? There are known memory limitations when running 32-bit.