  #1 (permalink)  
Old 01-17-2011, 07:29 AM
Intermediate Member
 
Posts: 16
Server becoming unresponsive for minutes at a time

We are seeing our server hitch and become unresponsive more and more often. Sometimes it lasts 30 seconds, other times a couple of minutes. While it is unresponsive, webmail, mail clients, and the admin page all time out. I have yet to remember to ping the server when this happens, but I will do so the next time I notice it.
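
A ping only shows whether the machine answers on the network. A small probe left running in a terminal can also record, with timestamps, whether the mail services still accept connections during a hang. The following is a minimal sketch under stated assumptions: the port numbers (110 for POP3, 443 for webmail, 7071 for the admin console) and the hostname in the usage note are placeholders to adjust, and the function names are made up for illustration.

```python
import socket
import time
from datetime import datetime

# Assumed ports: 110 (POP3), 443 (webmail over HTTPS), 7071 (admin console).
# Adjust these to the services your server actually exposes.
PORTS = {"pop3": 110, "webmail": 443, "admin": 7071}


def probe(host, port, timeout=5.0):
    """Try a TCP connect; return (succeeded, seconds_taken)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:
        ok = False
    return ok, time.monotonic() - start


def probe_all(host, ports=PORTS, timeout=5.0):
    """Probe every named port once and return {name: (ok, seconds)}."""
    return {name: probe(host, port, timeout) for name, port in ports.items()}


def watch(host, interval=10.0):
    """Print one timestamped line per probe round until interrupted."""
    while True:
        results = probe_all(host)
        summary = "  ".join(
            f"{name}={'ok' if ok else 'FAIL'}({secs:.2f}s)"
            for name, (ok, secs) in results.items()
        )
        print(f"{datetime.now():%Y-%m-%d %H:%M:%S}  {summary}", flush=True)
        time.sleep(interval)
```

Calling `watch("mail.example.com")` (a hypothetical hostname) from another machine prints one line every ten seconds. Note that a successful TCP connect only shows that the operating system accepted the connection; the mail process behind it may still be stalled, so slow or failed rounds are the more telling signal.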

What I'm wondering is the best way to diagnose this issue. I have looked through the logs (mailboxd.csv, myslow.log, and mailbox.log). Nothing is really jumping out at me, but I'm not certain where I should be looking. Any help in pinning this down would be greatly appreciated. We are running Version 5.0.11_GA_2695.UBUNTU8.FOSS Nov 17, 2008.

Here's something that I did see in the mailbox.log at the time of our most recent disruption this morning. Thanks.

2011-01-17 08:11:58,404 INFO [Pop3Server-18419] [ip=BLANKED OUT;] pop - connected
2011-01-17 08:11:58,469 INFO [Pop3Server-18419] [ip=BLANKED OUT;] pop - -ERR invalid username/password (PASS ****)
2011-01-17 08:15:31,595 INFO [Pop3Server-18419] [ip=BLANKED OUT;] pop - disconnected without quit
2011-01-17 08:15:31,595 INFO [Pop3Server-18419] [] ProtocolHandler - Handler exiting normally
2011-01-17 08:15:31,685 INFO [Pop3Server-18420] [ip=BLANKED OUT;] pop - connected
2011-01-17 08:15:31,686 INFO [Pop3Server-18420] [ip=BLANKED OUT;] ProtocolHandler - Exception occurred while handling connection
java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
    at com.zimbra.cs.pop3.Pop3Handler.sendLine(Pop3Handler.java:373)
    at com.zimbra.cs.pop3.Pop3Handler.sendResponse(Pop3Handler.java:362)
    at com.zimbra.cs.pop3.Pop3Handler.sendOK(Pop3Handler.java:337)
    at com.zimbra.cs.pop3.Pop3Handler.startConnection(Pop3Handler.java:125)
    at com.zimbra.cs.pop3.TcpPop3Handler.setupConnection(TcpPop3Handler.java:42)
    at com.zimbra.cs.tcpserver.ProtocolHandler.run(ProtocolHandler.java:126)
    at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:595)
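
In the excerpt above, nothing is logged between 08:11:58 and 08:15:31, roughly three and a half minutes, although the excerpt may simply be trimmed. One way to look for such silent periods across a whole log is to scan for large jumps between consecutive timestamps. This is a minimal sketch, assuming every entry starts with a timestamp in the format shown in the excerpt; `find_gaps` and `parse_ts` are illustrative names, not part of Zimbra.

```python
import re
from datetime import datetime

# Matches the leading timestamp seen in the excerpt, e.g. "2011-01-17 08:11:58,404".
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})")


def parse_ts(line):
    """Return the line's timestamp as a datetime, or None if it has none."""
    m = TS.match(line)
    if not m:
        return None  # e.g. stack-trace lines, which carry no timestamp
    base = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    return base.replace(microsecond=int(m.group(2)) * 1000)


def find_gaps(lines, threshold_s=60.0):
    """Yield (previous_ts, next_ts, gap_seconds) where the log was silent too long."""
    prev = None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue
        if prev is not None:
            gap = (ts - prev).total_seconds()
            if gap >= threshold_s:
                yield prev, ts, gap
        prev = ts
```

Passing an open file handle for mailbox.log to `find_gaps` yields one tuple per silent period. A quiet log is not proof of a hang on a lightly used server, so the threshold is worth tuning against normal overnight traffic.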
  #2 (permalink)  
Old 01-17-2011, 08:22 AM
Zimbra Consultant & Moderator
 
Posts: 20,315

Some idea of the server specification and the load (i.e. # of users, webmail or IMAP, what's actually shown in things like 'top' etc.) would be a good starting point.
__________________
Regards


Bill
  #3 (permalink)  
Old 01-17-2011, 08:25 AM
Intermediate Member
 
Posts: 16

Thanks for the response. We have, roughly, a few hundred users. It's a mix of POP clients and webmail users.

Here's what I get after issuing the top command:

Code:
top - 10:32:01 up 11:41,  1 user,  load average: 1.62, 1.83, 2.11
Tasks: 110 total,   3 running, 107 sleeping,   0 stopped,   0 zombie
Cpu(s): 16.4%us, 36.5%sy,  0.0%ni, 45.2%id,  0.3%wa,  0.3%hi,  1.3%si,  0.0%st
Mem:   1583872k total,  1512452k used,    71420k free,   155208k buffers
Swap:  2097144k total,      248k used,  2096896k free,   468716k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6164 zimbra    20   0  987m 580m  41m S  5.0 37.5  34:06.05 java
 1110 zimbra    20   0  4556 2760 1484 S  2.6  0.2   0:00.08 zmstatuslog
 5215 zimbra    20   0  253m 168m 4532 S  1.3 10.9   6:22.43 mysqld
 3470 zimbra    20   0  7144 4892 1144 S  1.0  0.3   0:30.48 zmmtaconfig
 4385 zimbra    20   0  252m  41m 8960 S  0.7  2.7   3:27.86 slapd
 7484 zimbra    20   0  5244 3588 1556 S  0.7  0.2   0:27.34 zmstat-proc
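
Two things in this header are easy to misread: system time (36.5%sy) is more than double user time (16.4%us), and most of the "used" memory includes buffers and page cache that the kernel can reclaim. The sketch below pulls those figures out of the header lines. It assumes the older top layout shown here, where `cached` is printed on the Swap line; the function names are illustrative.

```python
import re


def parse_cpu(line):
    """Parse a top 'Cpu(s):' line into {field: percent}, e.g. {'us': 16.4, ...}."""
    return {name: float(val) for val, name in re.findall(r"([\d.]+)%(\w+)", line)}


def parse_kb(line):
    """Parse a top 'Mem:' or 'Swap:' line into {label: kilobytes}."""
    return {name: int(val) for val, name in re.findall(r"(\d+)k (\w+)", line)}


def memory_summary(mem_line, swap_line):
    """Return kB held by processes and kB reclaimable (free + buffers + cached)."""
    mem, swap = parse_kb(mem_line), parse_kb(swap_line)
    cache = mem["buffers"] + swap["cached"]  # this top layout prints 'cached' on the Swap line
    return {
        "used_by_processes": mem["used"] - cache,
        "reclaimable": mem["free"] + cache,
    }
```

Fed the Mem and Swap lines above, `memory_summary` reports about 888528k held by processes and about 695344k reclaimable. A sample taken while the server is responsive says little about a hang, so the same readings captured during an outage would be more useful.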

Last edited by phoenix; 01-17-2011 at 08:49 AM..
  #4 (permalink)  
Old 01-17-2011, 08:48 AM
Zimbra Consultant & Moderator
 
Posts: 20,315

.... and the answer to my other question about hardware specification?
__________________
Regards


Bill
  #5 (permalink)  
Old 01-17-2011, 11:36 AM
Intermediate Member
 
Posts: 16

This is a VM. It has been assigned 1.5 GB of RAM and a single processor. It doesn't appear that the CPU utilization gets too hammered but could we need to increase the RAM? Also, when poking around the server settings, I saw that the number of threads in the POP mail settings is set to 100. I read the help and saw that the default is 10. Should this be decreased? 100 seems high. I'm unsure if that could be causing some of our issues as well. Thanks.
  #6 (permalink)  
Old 01-17-2011, 12:41 PM
Zimbra Consultant & Moderator
 
Posts: 20,315

Quote:
Originally Posted by ryan_r_sd
This is a VM. It has been assigned 1.5 GB of RAM and a single processor. It doesn't appear that the CPU utilization gets too hammered but could we need to increase the RAM?

The minimum RAM is 2 GB, and the recommended minimum for a production server is 4 GB.

Quote:
Originally Posted by ryan_r_sd
Also, when poking around the server settings, I saw that the number of threads in the POP mail settings is set to 100. I read the help and saw that the default is 10. Should this be decreased? 100 seems high. I'm unsure if that could be causing some of our issues as well. Thanks.

That shouldn't really matter if you have enough RAM. I'd suggest you increase it to at least 2 GB, preferably the recommended 4 GB, and see how you get on. Are there many other VMs on this host, and are they also busy? What VM environment is the host server running?

I'd also suggest you consider an upgrade to the most recent Zimbra release; there have been a couple of security advisories in versions later than the one you're running.
__________________
Regards


Bill
  #7 (permalink)  
Old 01-17-2011, 12:48 PM
Intermediate Member
 
Posts: 16

Thanks, Phoenix. I looked at the requirements as well today and expected that upping it would be recommended. We will do that, likely taking it to 4 GB. The host really doesn't get hit too hard, so I don't think that's an issue.

We are running ESX Version 3.0.1. We are going to be upgrading our mail system this year, so for now we just want to clear up these hangs. We will start with upping the RAM and see where that gets us.
  #8 (permalink)  
Old 01-17-2011, 03:55 PM
Advanced Member
 
Posts: 212

What's your VM host disk usage? Disk I/O is a very common problem in any virtual environment. Also, you are running an old version of ESX. ESX/ESXi 4.1 has vast improvements in disk I/O.
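
From inside the guest, one rough indicator of disk trouble is the share of CPU time spent waiting on I/O, which is the same figure top shows as %wa. The sketch below samples it from /proc/stat over a short interval. It assumes a Linux guest with the usual field order on the aggregate `cpu` line, and the function names are illustrative. Contention on the ESX host may not show up fully in guest counters, so the host's own statistics are worth checking too.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(text):
    """Extract the aggregate 'cpu' counters from /proc/stat content."""
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return dict(zip(FIELDS, map(int, parts[1:])))
    raise ValueError("no aggregate 'cpu' line found")


def percentages(before, after):
    """Percent of elapsed CPU time spent in each state between two samples."""
    deltas = {k: after[k] - before[k] for k in before}
    total = sum(deltas.values()) or 1  # avoid division by zero
    return {k: 100.0 * v / total for k, v in deltas.items()}


def sample(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, interval seconds apart, and return the percentages."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return percentages(before, after)
```

Running `sample()["iowait"]` in a loop and logging the result with a timestamp makes it possible to line up I/O stalls against the times users report hangs.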
  #9 (permalink)  
Old 01-18-2011, 12:08 AM
Zimbra Consultant & Moderator
 
Posts: 20,315

Quote:
Originally Posted by ryan_r_sd
Thanks, Phoenix. I looked at the requirements as well today and expected that upping it would be recommended. We will do that, likely taking it to 4 GB. The host really doesn't get hit too hard, so I don't think that's an issue.

We are running ESX Version 3.0.1. We are going to be upgrading our mail system this year, so for now we just want to clear up these hangs. We will start with upping the RAM and see where that gets us.

The other question I forgot to ask: what is the HD subsystem for the Zimbra server and the VM host? Is it a RAID array and, if so, what RAID level?
__________________
Regards


Bill
  #10 (permalink)  
Old 01-18-2011, 11:49 AM
Intermediate Member
 
Posts: 16

I'm not too sure what the RAID level is, Bill; I'm guessing RAID 5. We increased the RAM to 4 GB and still have the same problems. Additional information that I can provide is that we are also seeing more and more deferred mail with these types of messages:

Delivery temporarily unavailable
Connection timed out
Mail transport unavailable

This is all pretty recent. We hadn't made any changes other than the typical new hires and removals. Does anyone have any recommendations as to where I should be digging and how I can narrow down what is happening? When the server recently became unresponsive, it stopped writing to the mailbox log for about 5 minutes. We could still SSH to the server. Also, while it was down, there were a bunch of authentication failures in the Zimbra.log. We are assuming this is a symptom and not a cause. Thanks.
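
To see whether the deferred mail clusters around the hangs, it may help to count deferrals per hour and tally which of the messages listed above appear most. This is a rough sketch, assuming the MTA writes syslog-style lines that begin with a "Mon DD HH:MM:SS" timestamp and contain Postfix's `status=deferred` marker. The function names are illustrative, and the phrase matching is a plain substring test rather than a real parser.

```python
import re
from collections import Counter

# Syslog-style prefix such as "Jan 18 11:42:07"; capture month, day and hour only.
HOUR = re.compile(r"^(\w{3}\s+\d{1,2} \d{2}):\d{2}:\d{2}")


def deferred_per_hour(lines):
    """Count 'status=deferred' lines, grouped by their 'Mon DD HH' prefix."""
    counts = Counter()
    for line in lines:
        if "status=deferred" not in line:
            continue
        m = HOUR.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts


def deferred_reasons(lines, phrases):
    """Count how many deferred lines mention each phrase (case-insensitive)."""
    counts = Counter()
    for line in lines:
        if "status=deferred" not in line:
            continue
        lower = line.lower()
        for phrase in phrases:
            if phrase.lower() in lower:
                counts[phrase] += 1
    return counts
```

Both functions take any iterable of lines, so reading the log into a list once and passing that list to each avoids exhausting a file handle. An hour with a spike in deferrals that coincides with a gap in the mailbox log would point toward one shared cause.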

Last edited by ryan_r_sd; 01-18-2011 at 12:04 PM..