High I/O wait times lead to a slow system

We recently went through a major upgrade of our email system (about 3500 active users) in which we changed quite a few aspects of the system, all presumably helpful changes that would make it work better and/or give us greater flexibility to make changes in the future. Now we are experiencing significant I/O problems, especially when a large volume of messages gets delivered to the server all at once, as with our daily campus announcements. Briefly, here are the changes that we made:
ZCS 5.0.9 --> ZCS 5.0.12
RHEL5 32-bit --> RHEL5 64-bit
2TB SATA DAS, RAID 1-0 via fibre channel --> 5TB SATA SAN, RAID 5-0 via iSCSI
OS on a dedicated server --> OS installed on VMware
Single server running ZCS --> One server running MTA, other server running the rest (both virtual servers)
There are three things that I think may be causing this problem, but I'm not sure which is the real culprit: iSCSI, the RAID 5-0, or VMware. My first guess is the RAID 5-0. Our hope was that going from 4 disks in a RAID 0 mirrored to another 4 disks, to 14 active disks in a RAID 5-0, would increase I/O speed, since writes would be spread across many more disks at once and managed by a SAN, which should be able to run a RAID 5-0 quickly and efficiently.

Before we wait for a maintenance window to implement a change that may not even work, I'm wondering if anyone has suggestions as to what we might do to optimize our setup and improve the situation. If more detailed information on any piece of our setup would help, please let me know and I will post it. Thanks in advance for any suggestions!
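As a rough back-of-the-envelope check on the RAID 5-0 suspicion, the sketch below compares the two arrays using the common rule-of-thumb write penalties (about 2 back-end I/Os per write for mirrored layouts such as RAID 1-0, about 4 for parity layouts such as RAID 5-0). The per-disk figure of 75 IOPS is an assumed ballpark for 7200 rpm SATA drives, not a measurement from this system, and the function name `effective_iops` is made up for illustration. The estimate ignores controller cache, iSCSI latency, and VMware overhead, so it only hints at whether the extra spindles could be cancelled out by parity writes.

```python
def effective_iops(disks, iops_per_disk, write_penalty, write_fraction):
    """Estimate front-end IOPS for an array under a mixed read/write load.

    Each read costs one back-end I/O; each write costs `write_penalty`
    back-end I/Os (rule of thumb: 2 for RAID 1-0, 4 for RAID 5 / 5-0).
    """
    raw_iops = disks * iops_per_disk
    read_fraction = 1.0 - write_fraction
    return raw_iops / (read_fraction + write_fraction * write_penalty)


if __name__ == "__main__":
    ASSUMED_IOPS_PER_DISK = 75  # assumption: typical 7200 rpm SATA drive

    # Mail delivery bursts are write-heavy, so try a few write fractions.
    for write_fraction in (0.5, 0.8, 1.0):
        old = effective_iops(8, ASSUMED_IOPS_PER_DISK, 2, write_fraction)
        new = effective_iops(14, ASSUMED_IOPS_PER_DISK, 4, write_fraction)
        print(f"writes={write_fraction:.0%}  "
              f"old RAID 1-0 (8 disks): {old:.0f} IOPS  "
              f"new RAID 5-0 (14 disks): {new:.0f} IOPS")
```

Under these assumptions the 14-disk RAID 5-0 only comes out ahead when the load is read-heavy; for a pure-write burst it is estimated at slightly fewer IOPS than the old 8-disk RAID 1-0, which would be consistent with the slowdown showing up during mass deliveries. Real numbers from the SAN would be needed to confirm this.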