  #31 (permalink)  
Old 03-08-2006, 01:41 PM
Senior Member
 
Posts: 53

Quote:
Originally Posted by KevinH
What interfaces? POP? IMAP? Outlook? Web UI?
After reading that other post I started to wonder whether IPSentry could be the culprit, since it connects to port 110 every couple of minutes without really opening a session. If that's the case, I need an alternative method of monitoring that can still call or email me on failure, or a simple command and reply for IPSentry to use that won't crash the server (send QUIT to 110 and receive BYE?).
  #32 (permalink)  
Old 03-08-2006, 01:55 PM
Zimbra Employee
 
Posts: 4,792

Quote:
Originally Posted by Dux T
After reading that other post I started to wonder whether IPSentry could be the culprit, since it connects to port 110 every couple of minutes without really opening a session. If that's the case, I need an alternative method of monitoring that can still call or email me on failure, or a simple command and reply for IPSentry to use that won't crash the server (send QUIT to 110 and receive BYE?).
Or just do a HEAD on 80 to verify the server is up and responding. POP/IMAP/Web UI are all running in Tomcat, so checking one should verify that it's alive.
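
A minimal sketch of that HEAD check, assuming curl is available on the monitoring host. The function name check_http and the 10-second timeout are illustrative assumptions, not something Zimbra or IPSentry provides.

```shell
# check_http HOST [PORT]
# Sends an HTTP HEAD request and succeeds if the server returns any response.
# Hypothetical helper: the name and the timeout value are assumptions.
check_http() {
    host=$1
    port=${2:-80}
    # -I: send HEAD, -s: no progress output, -o /dev/null: discard the headers,
    # --max-time: give up if the server hangs instead of answering.
    curl -I -s -o /dev/null --max-time 10 "http://${host}:${port}/"
}

# Example: check_http mail.example.com || echo "web UI not responding"
```

The exit status is what a monitor would act on. Any HTTP response counts as "alive" here, including error pages, because the goal is only to detect a hung server.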
__________________
Bugzilla - Wiki - Downloads - Offline Client
  #33 (permalink)  
Old 03-09-2006, 08:41 AM
Senior Member
 
Posts: 53

Quote:
Originally Posted by KevinH
Or just do a HEAD on 80 to verify the server is up and responding. POP/IMAP/Web UI are all running in Tomcat, so checking one should verify that it's alive.
I got no love from my users. Same thing this morning: at 12am PST everything was happy, then sometime before 5am PST it crashed. I have users in the UK and across the US, from the east coast to the west coast.

Is it sane to assume that if catalina.out has more than 1 occurrence of:
Quote:
waiting for monitor entry
that I'm deadlocked and should restart zimbra?

Code:
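# Count lines matching the phrase; the pattern must be quoted or grep treats the extra words as file names.
if [ "$(grep -c 'waiting for monitor entry' /opt/zimbra/tomcat/logs/catalina.out)" -gt 1 ]
then
    zmcontrol stop
    zmcontrol start
fi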

Last edited by Dux T; 03-09-2006 at 08:48 AM..
  #34 (permalink)  
Old 03-09-2006, 09:16 AM
Zimbra Employee
 
Posts: 4,792

No... The waits are OK, as lots of threads wait when they are idle. The problem is when you see threads waiting on some object <lock id> and many threads waiting on the same lock ID. Do you have the log from when it stopped processing? What is your usage mix (IMAP/POP/Web UI)? At least try to narrow it down to a particular interface or user's mailbox.
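
A rough sketch of how to look for that pattern, assuming the thread dump in catalina.out uses the usual HotSpot "- waiting to lock <0x...>" lines. The function name count_lock_waiters is made up for illustration.

```shell
# count_lock_waiters FILE
# Prints "COUNT LOCK_ID" pairs, most contended lock first.
# Assumes HotSpot-style dump lines such as:
#   - waiting to lock <0x44ba1c38> (a java.lang.Object)
count_lock_waiters() {
    grep -e '- waiting to lock <' "$1" \
        | sed 's/.*<\(0x[0-9a-fA-F]*\)>.*/\1/' \
        | sort | uniq -c | sort -rn
}

# Example: count_lock_waiters /opt/zimbra/tomcat/logs/catalina.out | head
```

A single lock ID with many waiters is the thing to look at; a spread of IDs with one waiter each is closer to normal idle behaviour.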
  #35 (permalink)  
Old 03-09-2006, 11:02 AM
Senior Member
 
Posts: 53

Quote:
Originally Posted by KevinH
No... The waits are OK, as lots of threads wait when they are idle. The problem is when you see threads waiting on some object <lock id> and many threads waiting on the same lock ID. Do you have the log from when it stopped processing? What is your usage mix (IMAP/POP/Web UI)? At least try to narrow it down to a particular interface or user's mailbox.
Not scientific but:
Code:
[voir@dux100 logzimbra]$ cat opt.zimbra.log | grep ImapServer | wc -l
1526
[voir@dux100 logzimbra]$ cat opt.zimbra.log | grep Pop3Server | wc -l
961
Other than some auth failures, I don't see anything in /var/log/messages, /var/log/zimbra.log, or /opt/zimbra/log/zimbra.log that looks out of place. But I do have the time frame narrowed to between 04:17am and 04:40am. In that window I had 2 users active at the beginning and 8 users active when it crashed.
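
The same rough count can be wrapped in a small function so it is easy to rerun against each day's log. This is only a sketch: usage_mix is a hypothetical name, and the two tags are simply the ones grepped for above.

```shell
# usage_mix LOGFILE
# Prints one "TAG COUNT" line per protocol tag found in the log.
usage_mix() {
    for tag in ImapServer Pop3Server; do
        # grep -c counts matching lines and prints 0 when there are none.
        printf '%s %s\n' "$tag" "$(grep -c "$tag" "$1")"
    done
}

# Example: usage_mix opt.zimbra.log
```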
  #36 (permalink)  
Old 03-09-2006, 12:16 PM
Zimbra Employee
 
Posts: 4,792

Any cronjobs set to fire around those times?

su - zimbra
crontab -l
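
To narrow the crontab listing to one hour, something like the filter below may help. It is a simplified sketch: cron_jobs_at_hour is a hypothetical name, and it only matches an hour field that is a plain number or "*", not ranges, lists, or steps such as "2-5" or "*/2".

```shell
# cron_jobs_at_hour HOUR
# Reads crontab text on stdin and prints entries whose hour field is HOUR or "*".
cron_jobs_at_hour() {
    awk -v h="$1" '
        /^[ \t]*(#|$)/ { next }          # skip comments and blank lines
        $1 ~ /=/       { next }          # skip VAR=value lines such as MAILTO=root
        $2 == h || $2 == "*" { print }   # field 2 of a crontab entry is the hour
    '
}

# Example (as the zimbra user): crontab -l | cron_jobs_at_hour 4
```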
  #37 (permalink)  
Old 03-10-2006, 04:28 AM
Senior Member
 
Posts: 53

Quote:
Originally Posted by KevinH
Any cronjobs set to fire around those times?

su - zimbra
crontab -l
The closest thing is the message purge at 3am and then, starting this morning, a zmrestart. But I still had to restart it again at 4am.

Code:
Fri Mar 10 03:59:43 PST 2006
zmtomcatstart: info: stale pid 29118 in pid file: No such process
Mar 10, 2006 3:04:17 AM org.apache.coyote.http11.Http11Protocol init
INFO: Initializing Coyote HTTP/1.1 on http-80
Mar 10, 2006 3:04:19 AM org.apache.coyote.http11.Http11Protocol init
INFO: Initializing Coyote HTTP/1.1 on http-7071
Mar 10, 2006 3:04:19 AM org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 3273 ms
Mar 10, 2006 3:04:19 AM org.apache.catalina.core.StandardService start
INFO: Starting service Catalina
Mar 10, 2006 3:04:19 AM org.apache.catalina.core.StandardEngine start
INFO: Starting Servlet Engine: Apache Tomcat/5.5.7
Mar 10, 2006 3:04:19 AM org.apache.catalina.core.StandardHost start
INFO: XML validation disabled
log4j:WARN No appenders could be found for logger (org.apache.catalina.session.ManagerBase).
log4j:WARN Please initialize the log4j system properly.
Zimbra server reserving server socket port=110 bindaddr=null ssl=false
Zimbra server reserving server socket port=995 bindaddr=null ssl=true
Zimbra server reserving server socket port=143 bindaddr=null ssl=false
Zimbra server reserving server socket port=993 bindaddr=null ssl=true
Zimbra server process is running as root, changing to user=zimbra uid=505 gid=505
Zimbra server process, after change, is running with uid=505 euid=505 gid=505 egid=505
Mar 10, 2006 3:04:40 AM org.apache.coyote.http11.Http11Protocol start
INFO: Starting Coyote HTTP/1.1 on http-80
Mar 10, 2006 3:04:40 AM org.apache.coyote.http11.Http11Protocol start
INFO: Starting Coyote HTTP/1.1 on http-7071
Mar 10, 2006 3:04:40 AM org.apache.catalina.startup.Catalina start
INFO: Server startup in 21234 ms
  #38 (permalink)  
Old 03-11-2006, 02:58 PM
Zimbra Employee
 
Posts: 4,792

A few more ideas:

- make sure you are up2date

- check /var/log/messages if kernel OOM killer is reaping the JVM process

- memtest86 (rpm -qli memtest86+)

- is updatedb running at 4am? It usually does (check /etc/updatedb.conf)

- Are you doing a Zimbra restart at 4am and hitting a potential race condition
in the Tomcat stop/start script, where the new server cannot be started because
the old server didn't really die (for some reason)?
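
For the OOM killer item, a quick check could look like the sketch below. The exact kernel message wording varies between kernel versions, so the two patterns are an assumption about what a 2.6-era kernel logs, and oom_kills is a hypothetical helper name.

```shell
# oom_kills [LOGFILE]
# Prints log lines that look like OOM killer activity; exits non-zero if none.
# The patterns are assumptions about typical kernel wording.
oom_kills() {
    grep -i -E 'oom-killer|out of memory' "${1:-/var/log/messages}"
}

# Example: oom_kills || echo "no OOM killer messages found"
```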
  #39 (permalink)  
Old 03-11-2006, 04:01 PM
Senior Member
 
Posts: 53

Quote:
Originally Posted by KevinH
A few more ideas:

- make sure you are up2date

- check /var/log/messages if kernel OOM killer is reaping the JVM process

- memtest86 (rpm -qli memtest86+)

- is updatedb running at 4am? It usually does (check /etc/updatedb.conf)

- Are you doing a Zimbra restart at 4am and hitting a potential race condition
in the Tomcat stop/start script, where the new server cannot be started because
the old server didn't really die (for some reason)?
- The system is "up2date"
- Can't find an indication that the kernel OOM killer is reaping the JVM process in /var/log/messages
- RAM passes diagnostics
- updatedb_daily is set to "no"
- The Zimbra restart is a new thing added to try and automate the fix for this problem but didn't seem to help
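
If the automated restart stays in place, one way to guard against the stop/start race raised earlier in the thread is to wait for the old process to exit before starting the new one. This is a sketch only: wait_for_exit is a hypothetical helper, the 30-second default is arbitrary, and it must run as a user allowed to signal the process.

```shell
# wait_for_exit PID [TIMEOUT_SECONDS]
# Returns 0 once the process is gone, 1 if it is still alive after the timeout.
wait_for_exit() {
    pid=$1
    remaining=${2:-30}
    # kill -0 sends no signal; it only tests whether the process can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Example, with old_pid holding the JVM's pid read before the stop:
#   zmcontrol stop
#   wait_for_exit "$old_pid" 60 && zmcontrol start
```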

Is there any reason to believe that it is the actual INactivity of the system causing it to fail? I don't think any ACPI-type crap is running, like disks shutting down or anything, but I'm not even sure how to look for it.

*late breaking news* The kernel seems different today: it was 2.6.9-22.0.02ELsmp, now it's 2.6.9-34ELsmp
  #40 (permalink)  
Old 03-11-2006, 04:33 PM
Zimbra Employee
 
Posts: 4,792

Quote:
Originally Posted by Dux T
*late breaking news* The kernel seems different today: it was 2.6.9-22.0.02ELsmp, now it's 2.6.9-34ELsmp
up2date then reboot? Might have given you the latest update.