Thread: High CPU Load every couple of days

  1. #1
    pornsakb is offline Intermediate Member
    Join Date
    Sep 2007
    Posts
    21
    Rep Power
    8

    High CPU Load every couple of days

    Hi all,

    I'm using ZCS Community 4.5.10, and every couple of days (2-4 days) the CPU load skyrockets and the system becomes unresponsive. When this happens, top shows %wa sitting at 99% or a little lower. I'm thinking maybe this is because of Java (seeing that it takes the highest amount of resource) or something in the crontab job. For reference, I've attached a screenshot of top and catalina.out to this post.

    The system serves 15-20 mailboxes, it runs on Ubuntu6, and it is virtualized with Microsoft Virtual Server 2005 R2. Other guest systems (Windows) experience no similar symptoms and they are very stable. Because of that, I think we can safely rule out the possibility of a faulty hard disk. Any idea why this is happening?
    Attached Images
    Attached Files
    Last edited by pornsakb; 11-29-2007 at 09:51 AM.

  2. #2
    dwmtractor is offline Moderator
    Join Date
    Jul 2007
    Location
    San Jose, CA
    Posts
    1,027
    Rep Power
    10

    My first question would be about resources--specifically RAM--and those "other guest systems" to which you refer. Is the 1.5 GB of RAM that I see listed on your top screen allocated exclusively to the Ubuntu/Zimbra virtual machine, or is that everything you have on your box?

    Knowing that Windows swaps like crazy on less than 512 MB (for XP, and to a lesser extent for 2000 as well), if your whole machine has a gig and a half of RAM and is running both Ubuntu/Zim and Windows in separate virtual machines, you could well be running out of RAM when Windows and Ubuntu try to do some RAM-intensive work at the same time. Best practices here generally come down to 2GB exclusively for Linux and Zimbra as being the sweet spot for even smaller installations than yours. That 2 GB can then support a fair amount of growth; it's what I'm using on a 32 mailbox system and it is extremely responsive, but below that 2 GB floor point systems run a whole lot less efficiently.

  3. #3
    pornsakb is offline Intermediate Member
    Join Date
    Sep 2007
    Posts
    21
    Rep Power
    8

    Hi,

    The 1.5GB of ram that you see is dedicated exclusively to the virtualized Ubuntu6 machine. I also tried shutting down all other guest systems but the load average does not get reduced.

  4. #4
    dwmtractor is offline Moderator
    Join Date
    Jul 2007
    Location
    San Jose, CA
    Posts
    1,027
    Rep Power
    10

    Quote Originally Posted by pornsakb View Post
    Hi,

    The 1.5GB of ram that you see is dedicated exclusively to the virtualized Ubuntu6 machine. I also tried shutting down all other guest systems but the load average does not get reduced.
    OK, that makes hardware an unlikely candidate for the problem . . . I presume you have at least a moderate-horsepower cpu (doesn't take a killer; I run mine on a single PIII 1.4GHz).

    You might, next time you have such a high utilization, do a tail of your zimbra.log and mail.log (both are in /var/log) files. zimbra.log in particular keeps a record of the activities--with timestamp--that the zimbra software is doing and can give you some pretty useful insights.

    By way of comparison, I ran top on my system and just watched it for the last five minutes. I see java hit the top of the list whenever I log into or do any action in a web client (most of the time my users connect by IMAP, not web client), but the rest of the time it doesn't even show on the list. It does take a fair chunk of RAM when in action, but not a particularly high amount of CPU, and that usage appears to be transient.

    I presume, if java is heavily utilized by your system, that most of your users are using the web client? This is not a bad thing; many systems with hundreds or thousands of users use it heavily; just trying to sort out possibilities.

  5. #5
    dwmtractor is offline Moderator
    Join Date
    Jul 2007
    Location
    San Jose, CA
    Posts
    1,027
    Rep Power
    10

    Quote Originally Posted by pornsakb View Post
    I'm thinking maybe this is because of Java (seeing that it takes the highest amount of resource) or something in the crontab job.
    You can rule out crontab jobs real easily. Just su - zimbra and then run crontab -l, and see if any processes are time-correlated with your problem. Very few of the jobs in crontab take more than a few seconds to run.
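    A minimal sketch of that check, assuming the Zimbra account is named zimbra as above. The helper name active_cron_entries is made up for illustration; it only strips comments and blank lines so the schedule fields are easier to compare against the times the load spikes.

```shell
# Hypothetical helper: keep only active crontab lines
# (drops comment lines and blank lines).
active_cron_entries() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$'
}

# Usage, as root:
#   su - zimbra -c 'crontab -l' | active_cron_entries
```

    The first five fields of each remaining line are minute, hour, day of month, month, and day of week, so a job that lines up with the spikes should stand out.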

  6. #6
    pornsakb is offline Intermediate Member
    Join Date
    Sep 2007
    Posts
    21
    Rep Power
    8

    I think the key here is to try to find out what is waiting for IO (%wa) and take it from there. Any idea how I can do that?
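    One rough way to approach this, offered as a sketch rather than a definitive method: processes blocked on disk I/O usually show up in ps with state D (uninterruptible sleep), and on Linux those processes also count toward the load average. The function names below are invented for illustration.

```shell
# Keep only lines whose first field (the process state) starts with D.
# Reads "STATE PID COMMAND" lines on stdin.
filter_io_blocked() {
    awk '$1 ~ /^D/'
}

# Sample a few times, because a process may sit in D state only briefly.
sample_io_blocked() {
    n=0
    while [ "$n" -lt 5 ]; do
        ps -e -o state= -o pid= -o comm= | filter_io_blocked
        sleep 1
        n=$((n + 1))
    done
}

# Usage, while %wa is high:
#   sample_io_blocked
```

    If the same command keeps appearing across samples, that is a reasonable first suspect for whatever is waiting on the disk.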

  7. #7
    mdeneen is offline Active Member
    Join Date
    Jul 2007
    Posts
    45
    Rep Power
    8

    I had a problem like this once, but on Linux rather than Windows. Oddly enough, it happened because the server had a lot of RAM.

    By default, Linux lets unwritten (dirty) file data build up to about 10% of memory before it starts flushing it. The server had 16 gigabytes of memory, so file writes would buffer up to 1.6G and then everything would wait for the disk to catch up. It was a fairly speedy disk array, but it could not keep up under heavy loads.

    Anyway, in Linux, you can tune how much memory is allowed to hold buffered writes, and how frequently the buffer is written to disk. I ended up doing something like this in /etc/sysctl.conf:

    Code:
    vm.dirty_background_ratio = 1
    vm.dirty_ratio = 1
    vm.dirty_expire_centisecs = 50
    vm.dirty_writeback_centisecs = 50
    Then run sysctl -p to load the new values.

    With these settings, buffered (dirty) file data is capped at 1% of memory and is written to disk roughly every half a second. Obviously, you have to play with these numbers a bit to figure out what works best for you.
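    To put numbers on those percentages, here is a small arithmetic sketch. dirty_limit_mb is a hypothetical helper, and it uses total RAM for simplicity; the kernel works from the memory actually available for caching, so real thresholds will be somewhat lower.

```shell
# dirty_limit_mb TOTAL_MB PERCENT
# Approximate amount of buffered write data (in MB) allowed at a given ratio.
dirty_limit_mb() {
    echo $(( $1 * $2 / 100 ))
}

dirty_limit_mb 16384 10   # 16 GB server at 10%  -> 1638
dirty_limit_mb 16384 1    # same server at 1%    -> 163
dirty_limit_mb 1536 10    # 1.5 GB guest at 10%  -> 153

# To see what a machine is currently using:
#   sysctl vm.dirty_background_ratio vm.dirty_ratio
```

    On a 1.5 GB guest the buffer is already small at the default ratio, so this tuning may matter less there than it did on the 16 GB server.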

    I have never used Microsoft Virtual Server, but given the description I have to wonder if file writes are taking too long. Have you benchmarked the disk performance to see how well it can perform? The iostat command can help you see how much io the system is doing.
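    If iostat (part of the sysstat package) is not installed, the same kind of I/O wait figure that top reports as %wa can be approximated from two samples of /proc/stat. This is only a sketch; iowait_pct is a made-up name, and it assumes the usual field order on the aggregate cpu line (user, nice, system, idle, iowait, irq, softirq, steal).

```shell
# iowait_pct "FIRST_CPU_LINE" "SECOND_CPU_LINE"
# Each argument is the aggregate "cpu ..." line from /proc/stat.
# Prints the share of CPU time spent in iowait between the two samples.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Fields 2..9 are user, nice, system, idle, iowait, irq, softirq, steal.
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            t[NR] = total
            w[NR] = $6          # iowait is the fifth counter, i.e. field 6
        }
        END {
            d = t[2] - t[1]
            if (d > 0) printf "%.1f\n", 100 * (w[2] - w[1]) / d
            else print "0.0"
        }'
}

# Usage:
#   a=$(head -n 1 /proc/stat); sleep 5; b=$(head -n 1 /proc/stat)
#   iowait_pct "$a" "$b"
# With sysstat installed, per-device detail is available from:
#   iostat -x 5
```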

    Good luck!

  8. #8
    pornsakb is offline Intermediate Member
    Join Date
    Sep 2007
    Posts
    21
    Rep Power
    8

    Attached are the results produced by hdparm -tT before the CPU spikes. Are the figures normal?
    Attached Images

  9. #9
    dwmtractor is offline Moderator
    Join Date
    Jul 2007
    Location
    San Jose, CA
    Posts
    1,027
    Rep Power
    10

    I don't know if you have noticed but there are a number of threads on high CPU load that might be worth reviewing in case they give you any insight:

    Here's one suggesting that NFS may (or may not, depending on the poster) be part of the problem.

    Another traced it to zmmtaconfig syncing to LDAP.

    Still another (Bugzilla 15598) was a bug (fixed since 4.5) related to malformed MIME messages.

    It also occurs to me just now to ask: is your Ubuntu fully patched? Could also be something with an odd module. . .

    Of course your issue may well be none of the above. Unfortunately finding these things can be somewhat of a needle-in-a-haystack search. Have you checked your zimbra.log file yet?

    And if it seems that I'm scattergunning here, I am. Until more information is discovered this could take you any of a zillion ways. . .

  10. #10
    mdeneen is offline Active Member
    Join Date
    Jul 2007
    Posts
    45
    Rep Power
    8

    This is from our zimbra server. It has a pair of mirrored 750GB SATA drives which are moderately fast.

    Code:
    [root@email ~]# hdparm -Tt /dev/sda
    
    /dev/sda:
     Timing cached reads:   2296 MB in  2.00 seconds = 1145.88 MB/sec
     Timing buffered disk reads:  184 MB in  3.00 seconds =  61.30 MB/sec
    [root@email ~]# cat /proc/scsi/scsi
    Attached devices:
    Host: scsi1 Channel: 00 Id: 00 Lun: 00
      Vendor: ATA      Model: ST3750640AS      Rev: 3.AA
      Type:   Direct-Access                    ANSI SCSI revision: 05
    Host: scsi1 Channel: 00 Id: 01 Lun: 00
      Vendor: ATA      Model: ST3750640AS      Rev: 3.AA
      Type:   Direct-Access                    ANSI SCSI revision: 05
    I suppose it all depends on what kind of disk you have. Your numbers are not particularly good, but they are also not particularly bad.

    I don't have much experience with Ubuntu on servers, but do you have anything in /var/log/sa ? The files in there may provide some useful system information over the "wait" periods.
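    A hedged sketch for locating those files: /var/log/sa is where Red Hat style systems keep sysstat history, while Debian and Ubuntu packages normally use /var/log/sysstat and only collect data once it is enabled in /etc/default/sysstat. Treat both paths as assumptions to verify; first_existing_dir is a hypothetical helper.

```shell
# Print the first argument that is an existing directory; fail if none is.
first_existing_dir() {
    for d in "$@"; do
        if [ -d "$d" ]; then
            printf '%s\n' "$d"
            return 0
        fi
    done
    return 1
}

# Usage:
#   dir=$(first_existing_dir /var/log/sa /var/log/sysstat) && ls "$dir"
#   sar -u -f "$dir/sa29"    # CPU figures, including %iowait, for the 29th
```

    The daily files are named saDD after the day of the month, so pick the one covering a day when the load spiked.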

    Mark

    Quote Originally Posted by pornsakb View Post
    Attached are the results produced by hdparm -tT before the CPU spikes. Are the figures normal?
