Results 1 to 5 of 5

Thread: zmmailboxd: garbage collector stops application for several minutes

  1. #1
    cosmospt is offline Member
    Join Date
    Sep 2009
    Posts
    12
    Rep Power
    5

    Default zmmailboxd: garbage collector stops application for several minutes

    Hi All

    I am currently using zcs-6.0.2_GA_1912.UBUNTU8_64 (Open Source Edition) and sometimes the zmmailbox process stops for several minutes. The entry in zmmailboxd.out is the following:

    49663.762: [GC 49663.762: [ParNew (promotion failed): 93675K->90505K(94016K), 2.6561510 secs]49666.418: [CMS[Unloading class sun.reflect.GeneratedMethodAccessor217]
    [Unloading class sun.reflect.GeneratedMethodAccessor1]
    [Unloading class sun.reflect.GeneratedSerializationConstructorAcces sor2]
    [Unloading class javax.servlet.Servlet]
    [Unloading class sun.reflect.GeneratedConstructorAccessor9]
    [Unloading class sun.reflect.GeneratedConstructorAccessor3]
    [Unloading class sun.reflect.GeneratedConstructorAccessor87]
    [Unloading class sun.reflect.GeneratedConstructorAccessor86]
    [Unloading class org.apache.jasper.servlet.JspServlet]
    [Unloading class sun.reflect.GeneratedMethodAccessor232]
    [Unloading class sun.reflect.GeneratedMethodAccessor228]
    [Unloading class sun.reflect.GeneratedSerializationConstructorAcces sor1]
    [Unloading class sun.reflect.GeneratedConstructorAccessor2]
    [Unloading class sun.reflect.GeneratedSerializationConstructorAcces sor3]
    [Unloading class javax.servlet.GenericServlet]
    [Unloading class sun.reflect.GeneratedMethodAccessor2]
    [Unloading class sun.reflect.GeneratedConstructorAccessor6]
    [Unloading class sun.reflect.GeneratedConstructorAccessor5]
    [Unloading class sun.reflect.GeneratedMethodAccessor216]
    [Unloading class sun.reflect.GeneratedConstructorAccessor10]
    [Unloading class javax.servlet.http.HttpServlet]
    [Unloading class javax.servlet.ServletConfig]
    [Unloading class sun.reflect.GeneratedMethodAccessor233]
    [Unloading class sun.reflect.GeneratedMethodAccessor227]
    [Unloading class sun.reflect.GeneratedMethodAccessor234]
    [Unloading class sun.reflect.GeneratedConstructorAccessor4]
    : 287868K->105350K(315392K), 193.3510190 secs] 378582K->105350K(409408K), [CMS Perm : 71863K->71612K(131072K)], 196.0071700 secs] [Times: user=0.50 sys=0.12, real=196.01 secs]
    Total time for which application threads were stopped: 196.0071700 seconds


    My GC configuration is the following:

    $ zmlocalconfig mailboxd_java_options | fmt
    mailboxd_java_options = -server -XX:NewRatio=2 -Djava.awt.headless=true
    -XX:MaxPermSize=128m -XX:SoftRefLRUPolicyMSPerMB=1
    -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
    -XX:+PrintGCApplicationStoppedTime -XX:+UseConcMarkSweepGC

    $ zmlocalconfig mailboxd_java_heap_new_size_percent
    mailboxd_java_heap_new_size_percent = 25

    Can anyone help?

    Thanks

  2. #2
    jocx is offline Active Member
    Join Date
    Nov 2007
    Location
    Columbus, Ohio
    Posts
    33
    Rep Power
    7

    Default

    I have something similar. I see
    ParNew (promotion failed)
    as you do.

    I see the java process go cpu intensive after some
    time of mailboxd running (days sometimes) of
    Release 6.0.2_GA_1912.UBUNTU8 UBUNTU8 NETWORK edition.

    I get mailboxd service delays with
    CMS: abort preclean due to time

    I can do a /opt/zimbra/bin/zmmailboxdctl restart
    to restart the java process and things run better (CPU
    load drops, no CMS aborts for a while).


    Total time for which application threads were stopped: 0.0006790 seconds
    Total time for which application threads were stopped: 0.0002510 seconds
    en_US
    Total time for which application threads were stopped: 0.0019170 seconds
    Total time for which application threads were stopped: 0.0002690 seconds
    91944.382: [GC 91944.382: [ParNew (promotion failed): 554816K->554816K(554816K), 10.0587830 secs]91954.441: [CMS[Unloading class sun.reflect.GeneratedMetho\
    dAccessor1]
    [Unloading class sun.reflect.GeneratedConstructorAccessor2]
    [Unloading class sun.reflect.GeneratedConstructorAccessor93]
    [Unloading class sun.reflect.GeneratedConstructorAccessor99]
    [Unloading class sun.reflect.GeneratedConstructorAccessor92]
    [Unloading class org.apache.jasper.servlet.JspServlet]
    [Unloading class javax.servlet.http.HttpServlet]
    [Unloading class sun.reflect.GeneratedConstructorAccessor5]
    [Unloading class sun.reflect.GeneratedMethodAccessor118]
    [Unloading class sun.reflect.GeneratedMethodAccessor22]
    [Unloading class sun.reflect.GeneratedConstructorAccessor30]
    [Unloading class sun.reflect.GeneratedMethodAccessor122]
    [Unloading class sun.reflect.GeneratedConstructorAccessor4]
    [Unloading class sun.reflect.GeneratedConstructorAccessor6]
    [Unloading class javax.servlet.GenericServlet]
    [Unloading class sun.reflect.GeneratedConstructorAccessor100]
    [Unloading class sun.reflect.GeneratedConstructorAccessor10]
    [Unloading class sun.reflect.GeneratedSerializationConstructorAcces sor1]
    [Unloading class sun.reflect.GeneratedMethodAccessor119]
    [Unloading class sun.reflect.GeneratedMethodAccessor21]
    [Unloading class sun.reflect.GeneratedMethodAccessor117]
    [Unloading class sun.reflect.GeneratedSerializationConstructorAcces sor2]
    [Unloading class sun.reflect.GeneratedMethodAccessor115]
    [Unloading class sun.reflect.GeneratedConstructorAccessor11]
    [Unloading class sun.reflect.GeneratedConstructorAccessor25]
    [Unloading class sun.reflect.GeneratedConstructorAccessor8]
    [Unloading class sun.reflect.GeneratedConstructorAccessor3]
    [Unloading class sun.reflect.GeneratedConstructorAccessor26]
    [Unloading class sun.reflect.GeneratedConstructorAccessor82]
    [Unloading class javax.servlet.Servlet]
    [Unloading class sun.reflect.GeneratedConstructorAccessor95]
    [Unloading class sun.reflect.GeneratedConstructorAccessor98]
    [Unloading class sun.reflect.GeneratedMethodAccessor120]
    [Unloading class javax.servlet.ServletConfig]
    [Unloading class sun.reflect.GeneratedMethodAccessor2]
    [Unloading class sun.reflect.GeneratedMethodAccessor121]
    : 1630777K->727223K(1851392K), 6.2803740 secs] 2118124K->727223K(2406208K), [CMS Perm : 47726K->47524K(131072K)], 16.3396850 secs] [Times: user=10.26 sys=8\
    .25, real=16.34 secs]
    Total time for which application threads were stopped: 16.3410590 seconds
    91960.740: [GC [1 CMS-initial-mark: 727223K(1851392K)] 728034K(2406208K), 0.0036430 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
    Total time for which application threads were stopped: 0.0059940 seconds
    91960.744: [CMS-concurrent-mark-start]
    91963.556: [CMS-concurrent-mark: 2.812/2.812 secs] [Times: user=2.86 sys=0.21, real=2.81 secs]
    91963.557: [CMS-concurrent-preclean-start]
    91963.577: [CMS-concurrent-preclean: 0.018/0.021 secs] [Times: user=0.01 sys=0.01, real=0.02 secs]
    91963.577: [CMS-concurrent-abortable-preclean-start]
    en_US
    en_US
    Total time for which application threads were stopped: 0.0035950 seconds
    CMS: abort preclean due to time 91968.620: [CMS-concurrent-abortable-preclean: 0.325/5.042 secs] [Times: user=0.47 sys=0.23, real=5.04 secs]
    91968.658: [GC[YG occupancy: 25104 K (554816 K)]91968.658: [Rescan (parallel) , 0.0236440 secs]91968.682: [weak refs processing, 0.0000620 secs] [1 CMS-rem\
    ark: 727223K(1851392K)] 752328K(2406208K), 0.0241580 secs] [Times: user=0.05 sys=0.00, real=0.03 secs]
    Total time for which application threads were stopped: 0.0616090 seconds
    91968.683: [CMS-concurrent-sweep-start]
    Total time for which application threads were stopped: 0.0014030 seconds
    Total time for which application threads were stopped: 0.0030990 seconds
    91969.900: [CMS-concurrent-sweep: 1.217/1.217 secs] [Times: user=1.17 sys=0.05, real=1.21 secs]
    91969.900: [CMS-concurrent-reset-start]
    91969.940: [CMS-concurrent-reset: 0.040/0.040 secs] [Times: user=0.04 sys=0.00, real=0.04 secs]
    Total time for which application threads were stopped: 0.0381810 seconds
    Total time for which application threads were stopped: 0.0603880 seconds
    Total time for which application threads were stopped: 0.0018140 seconds
    Total time for which application threads were stopped: 0.0031500 seconds
    Total time for which application threads were stopped: 0.0282040 seconds
    Total time for which application threads were stopped: 0.0447860 seconds
    en_US
    en_US
    Total time for which application threads were stopped: 0.0015210 seconds
    Total time for which application threads were stopped: 0.0029940 seconds
    Total time for which application threads were stopped: 0.0289840 seconds
    en_US
    Total time for which application threads were stopped: 0.0366490 seconds
    Total time for which application threads were stopped: 0.0009820 seconds
    Total time for which application threads were stopped: 0.0003620 seconds
    Total time for which application threads were stopped: 0.0006070 seconds
    Total time for which application threads were stopped: 0.0002480 seconds

    I don't know why en_US shows up in the log. That's unrelated I think.


    Info from zmlocalconfig
    -----------------------
    mailboxd_java_options = -server -Djava.awt.headless=true -XX:+UseConcMarkSweepGC -XX:NewRatio=2 -XX:PermSize=128m -XX:MaxPermSize=128m -XX:SoftRefLRUPolicy\
    MSPerMB=1 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime
    zimbra_zmjava_options = -Xmx256m
    zimbra_mysql_connector_maxActive = 100
    zimbra_index_max_uncommitted_operations = 200
    mailboxd_java_heap_memory_percent = 50
    mailboxd_java_heap_new_size_percent = 25
    search_dbfirst_term_percentage_cutoff = 0.8

    Mysql Info
    ----------
    max_connections = 110
    innodb_buffer_pool_size = 1220404377
    innodb_max_dirty_pages_pct = 10
    innodb_flush_method = O_DIRECT

    Java Process
    ------------
    zimbra 5631 5630 28 23:46 ? 00:00:02 /opt/zimbra/java/bin/java -XX:ErrorFile=/opt/zimbra/log -client -Xmx256m -Dzimbra.home=/opt/zimbra -Djava.l\
    ibrary.path=/opt/zimbra/lib -Djava.ext.dirs=/opt/zimbra/java/jre/lib/ext:/opt/zimbra/lib/jars:/opt/zimbra/lib/ext-common:/opt/zimbra/lib/ext/backup:/opt/zi\
    mbra/lib/ext/clamscanner:/opt/zimbra/lib/ext/network:/opt/zimbra/lib/ext/zimbra-license:/opt/zimbra/lib/ext/zimbrahsm:/opt/zimbra/lib/ext/zimbrasync com.zi\
    mbra.cs.account.ProvUtil -l gacf
    zimbra 5923 4828 0 23:46 pts/1 00:00:00 grep java
    root 21729 1 0 Nov12 ? 00:00:00 /opt/zimbra/libexec/zmmailboxdmgr start -server -XX:NewRatio=2 -Djava.awt.headless=true -XX:MaxPermSize=128\
    m -XX:SoftRefLRUPolicyMSPerMB=1 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -XX:+UseConcMarkSweepGC -XX:PermS\
    ize=128m -Xss256k -Xms2409m -Xmx2409m -Xmn602m
    zimbra 21730 21729 1 Nov12 ? 00:23:54 /opt/zimbra/java/bin/java -server -XX:NewRatio=2 -Djava.awt.headless=true -XX:MaxPermSize=128m -XX:SoftRefL\
    RUPolicyMSPerMB=1 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -XX:+UseConcMarkSweepGC -XX:PermSize=128m -Xss2\
    56k -Xms2409m -Xmx2409m -Xmn602m -Djava.io.tmpdir=/opt/zimbra/mailboxd/work -Djava.library.path=/opt/zimbra/lib -Djava.endorsed.dirs=/opt/zimbra/mailboxd/c\
    ommon/endorsed -Dzimbra.config=/opt/zimbra/conf/localconfig.xml -Djetty.home=/opt/zimbra/mailboxd -DSTART=/opt/zimbra/mailboxd/etc/start.config -jar /opt/z\
    imbra/mailboxd/start.jar /opt/zimbra/mailboxd/etc/jetty.properties /opt/zimbra/mailboxd/etc/jetty-setuid.xml /opt/zimbra/mailboxd/etc/jetty.xml

    I posted earlier about this with
    CMS: abort preclean due to time/concurrent mode failure

    Let me know if you learn anything about your issue. I think I have
    more to learn about java memory / garbage collection. Something
    is not right.

    thanks
    John

  3. #3
    cosmospt is offline Member
    Join Date
    Sep 2009
    Posts
    12
    Rep Power
    5

    Default

    No solution yet. Currently I am restarting zmmailboxd every night at 5.00 AM. Even with the restart, this problem has happened one time in the last 10 days.

    Cheers

  4. #4
    veronica is offline Outstanding Member
    Join Date
    Jun 2008
    Posts
    594
    Rep Power
    8

    Default

    Can you show the charts for the day ?

    zmstat-chart -s /opt/zimbra/zmstat/<date stat> -d /tmp/charts/

  5. #5
    jocx is offline Active Member
    Join Date
    Nov 2007
    Location
    Columbus, Ohio
    Posts
    33
    Rep Power
    7

    Default

    Here's some zmstat charts for a day where java was not being nice,
    users were complaining and so
    I did a zmmailboxdctl restart at around 1:12pm. After that the rest of
    the day it was okay. It's after a few days it seems to get stupid, though
    doing an early morning preemptive zmmailboxdctl restart seems to have
    good effect.

    Summary.txt shows no values, so I don't know what's up there.
    (Maybe ubuntu thing)?

    ===========================================
    # Perf Stats #
    ===========================================
    Mail Server CPU Sys :
    Mail Server CPU User :
    Mail Server CPU Idle :
    Mail Server CPU Iowait :
    Full GC% :
    Young GC% :
    Mailbox add rate :
    Mailbox add latency :
    Mailbox get latency :
    Soap response time:
    Pop response time:
    Imap response time:


    but the graphs are nice enough. (I'm modified for hostname = zimbra)
    See attached.

    Here's a bit of zmmailboxd.out at what I think is the time of the restart.

    659891.299: [Full GC 659891.299: [CMS: 1851391K->1851391K(1851392K), 10.6088180 secs] 2406207K->2405946K(2406208K), [CMS Perm : 47486K->47482K(131072K)], 10.6090670 secs] [Times: user=10.32 sys=0.21, real=10.61 secs]
    Total time for which application threads were stopped: 10.6108320 seconds
    659901.910: [GC [1 CMS-initial-mark: 1851391K(1851392K)] 2405985K(2406208K), 1.0150460 secs] [Times: user=0.93 sys=0.01, real=1.02 secs]
    Total time for which application threads were stopped: 1.0162430 seconds
    659902.925: [CMS-concurrent-mark-start]
    659902.987: [Full GC 659902.987: [CMS659914.927: [CMS-concurrent-mark: 11.999/12.001 secs] [Times: user=8.06 sys=0.03, real=12.00 secs]
    (concurrent mode failure): 1851391K->1851391K(1851392K), 24.2305050 secs] 2406207K->2405670K(2406208K), [CMS Perm : 47482K->47480K(131072K)], 24.2308000 secs] [Times: user=18.57 sys=0.30, real=24.23 secs]
    Total time for which application threads were stopped: 24.2324320 seconds
    Heap
    par new generation total 554816K, used 554279K [0x15660000, 0x3b060000, 0x3b060000)
    eden space 493184K, 99% used [0x15660000, 0x337ffff0, 0x33800000)
    from space 61632K, 99% used [0x33800000, 0x373a9c50, 0x37430000)
    to space 61632K, 0% used [0x37430000, 0x37430000, 0x3b060000)
    concurrent mark-sweep generation total 1851392K, used 1851391K [0x3b060000, 0xac060000, 0xac060000)
    concurrent-mark-sweep perm gen total 131072K, used 47480K [0xac060000, 0xb4060000, 0xb4060000)
    659927.238: [Full GC 659927.238: [CMS: 1851391K->1851391K(1851392K), 11.5150760 secs] 2406202K->2405705K(2406208K), [CMS Perm : 47483K->47483K(13107\
    2K)], 11.5157270 secs] [Times: user=10.32 sys=0.30, real=11.52 secs]
    CompilerOracle: exclude com/zimbra/cs/session/SessionMap.putAndPrune
    CompilerOracle: exclude com/zimbra/cs/mailbox/MailItem.delete
    CompilerOracle: exclude org/apache/xerces/impl/XMLDocumentFragmentScannerImpl$FragmentContentDisp atcher.dispatch
    Total time for which application threads were stopped: 0.0004250 seconds
    0 INFO [main] log - Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
    Total time for which application threads were stopped: 0.0003980 seconds
    Total time for which application threads were stopped: 0.0004740 seconds
    Zimbra server reserving server socket port=110 bindaddr=null ssl=false
    Zimbra server reserving server socket port=995 bindaddr=null ssl=true
    Zimbra server reserving server socket port=143 bindaddr=null ssl=false
    Zimbra server reserving server socket port=993 bindaddr=null ssl=true
    Zimbra server reserving server socket port=7025 bindaddr=null ssl=false
    Total time for which application threads were stopped: 0.0009940 seconds
    5242 INFO [main] log - Setting umask=027
    5244 INFO [main] log - Current rlimit_nofiles (soft=524288, hard=524288)
    5244 INFO [main] log - Set rlimit_nofiles (soft=65536, hard=65536)
    5244 INFO [main] log - Setting GID=1005

    before that I was getting things like


    CMS: abort preclean due to time 659306.572: [CMS-concurrent-abortable-preclean: 2.999/5.137 secs] [Times: user=2.26 sys=0.08, real=5.14 secs]
    659306.573: [GC[YG occupancy: 541888 K (554816 K)]659306.573: [Rescan (parallel) , 2.0809960 secs]659308.655: [weak refs processing, 0.0000260 secs] [1 CMS-remark: 1851379K(1851392K)] 2393268K(2406208K), 2.0814040 secs] [Times: user=1.20 sys=0.72, real=2.08 secs]
    Total time for which application threads were stopped: 2.0826130 seconds

    and

    659136.042: [GC 659136.042: [ParNew: 554815K->554815K(554816K), 0.0000730 secs]659136.042: [CMS659141.193: [CMS-concurrent-mark: 7.484/7.487 secs] [Times: user=7.78 sys=0.19, real=7.48 secs]
    (concurrent mode failure): 1851378K->1851391K(1851392K), 16.7058680 secs] 2406193K->2375479K(2406208K), [CMS Perm : 47477K->47474K(131072K)], 16.7062600 secs] [Times: user=16.38 sys=0.24, real=16.71 secs]
    Total time for which application threads were stopped: 16.7076830 seconds


    thanks,
    John
    Attached Files Attached Files

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •