01-17-2011, 07:29 AM
| | Intermediate Member | |
Posts: 16
| | Server becoming unresponsive for minutes at a time
We are starting to see our server hitch and become unresponsive more and more often. Sometimes it's for 30 seconds and other times for a couple of minutes. When it is unresponsive, webmail, mail clients, and the admin page all time out. I keep forgetting to ping the server when this happens, but I will do so the next time I notice it.
What I'm wondering is the best way to diagnose this issue. I have looked through the logs (mailboxd.csv, myslow.log, and mailbox.log). Nothing is really jumping out at me, but I'm not certain where I should be looking. Any help that you can provide so that I can pin this thing down would be greatly appreciated. We are running Version 5.0.11_GA_2695.UBUNTU8.FOSS Nov 17, 2008.
Here's something that I did see in the mailbox.log at the time of our most recent disruption this morning. Thanks.
2011-01-17 08:11:58,404 INFO [Pop3Server-18419] [ip=BLANKED OUT;] pop - connected
2011-01-17 08:11:58,469 INFO [Pop3Server-18419] [ip=BLANKED OUT;] pop - -ERR invalid username/password (PASS ****)
2011-01-17 08:15:31,595 INFO [Pop3Server-18419] [ip=BLANKED OUT;] pop - disconnected without quit
2011-01-17 08:15:31,595 INFO [Pop3Server-18419] [] ProtocolHandler - Handler exiting normally
2011-01-17 08:15:31,685 INFO [Pop3Server-18420] [ip=BLANKED OUT;] pop - connected
2011-01-17 08:15:31,686 INFO [Pop3Server-18420] [ip=BLANKED OUT;] ProtocolHandler - Exception occurred while handling connection
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at com.zimbra.cs.pop3.Pop3Handler.sendLine(Pop3Handler.java:373)
at com.zimbra.cs.pop3.Pop3Handler.sendResponse(Pop3Handler.java:362)
at com.zimbra.cs.pop3.Pop3Handler.sendOK(Pop3Handler.java:337)
at com.zimbra.cs.pop3.Pop3Handler.startConnection(Pop3Handler.java:125)
at com.zimbra.cs.pop3.TcpPop3Handler.setupConnection(TcpPop3Handler.java:42)
at com.zimbra.cs.tcpserver.ProtocolHandler.run(ProtocolHandler.java:126)
at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Thread.java:595) | 
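A note on catching the hang as it happens: rather than remembering to ping by hand, a small probe could log how long a TCP connect to each service takes. The following is a minimal Python sketch and not something posted in the thread. The host name `mail.example.com` in the usage note is a placeholder, and the ports mentioned there (110 for POP3, 443 for webmail, 7071 for the admin console) are assumptions about a typical setup that should be adjusted to match the server.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=5.0):
    """Try one TCP connect; return (succeeded, seconds_taken)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        ok = False
    return ok, time.monotonic() - start


def watch(host, ports, interval=10.0, rounds=None):
    """Probe every port each `interval` seconds, printing one line per probe.

    rounds=None loops until interrupted; a number limits the loop.
    """
    done = 0
    while rounds is None or done < rounds:
        for port in ports:
            ok, took = probe(host, port)
            status = "ok" if ok else "FAIL"
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host}:{port} {status} {took:.3f}s")
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Calling something like `watch("mail.example.com", [110, 443, 7071])` from another machine and leaving it running would give timestamps to line up against mailbox.log. A successful connect only shows that the port accepts connections, not that the service behind it is answering requests, so it is a first check rather than a full health test.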
01-17-2011, 08:22 AM
| | Zimbra Consultant & Moderator | |
Posts: 20,315
| | Some idea of the server specification and the load (i.e. # of users, webmail or IMAP, what's actually shown in things like 'top' etc.) would be a good starting point.
__________________
Regards
Bill
| 
01-17-2011, 08:25 AM
| | Intermediate Member | |
Posts: 16
| | Thanks for the response. We have, roughly, a few hundred users. It's a mix of POP clients and webmail users.
Here's what I get after issuing the top command:
Code:
top - 10:32:01 up 11:41, 1 user, load average: 1.62, 1.83, 2.11
Tasks: 110 total, 3 running, 107 sleeping, 0 stopped, 0 zombie
Cpu(s): 16.4%us, 36.5%sy, 0.0%ni, 45.2%id, 0.3%wa, 0.3%hi, 1.3%si, 0.0%st
Mem: 1583872k total, 1512452k used, 71420k free, 155208k buffers
Swap: 2097144k total, 248k used, 2096896k free, 468716k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6164 zimbra    20   0  987m 580m  41m S  5.0 37.5  34:06.05 java
 1110 zimbra    20   0  4556 2760 1484 S  2.6  0.2   0:00.08 zmstatuslog
 5215 zimbra    20   0  253m 168m 4532 S  1.3 10.9   6:22.43 mysqld
 3470 zimbra    20   0  7144 4892 1144 S  1.0  0.3   0:30.48 zmmtaconfig
 4385 zimbra    20   0  252m  41m 8960 S  0.7  2.7   3:27.86 slapd
 7484 zimbra    20   0  5244 3588 1556 S  0.7  0.2   0:27.34 zmstat-proc
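As a reading aid for the memory figures above: in this older `top` layout the `cached` value is printed on the Swap line but counts page cache held in RAM, so "used" overstates what the processes themselves hold. A small Python sketch of that arithmetic follows. It uses only the numbers from the output above, and the function name is made up for illustration.

```python
def app_memory_kb(total, free, buffers, cached):
    """Memory held by processes, excluding buffers and page cache (kB)."""
    reclaimable = free + buffers + cached
    return total - reclaimable


# Values copied from the Mem: and Swap: lines above, in kB.
total, free = 1583872, 71420
buffers, cached = 155208, 468716

held = app_memory_kb(total, free, buffers, cached)
print(f"held by processes: {held} kB ({held / total:.0%} of RAM)")
print(f"free + buffers + cached: {free + buffers + cached} kB")
```

On these numbers the processes hold 888528 kB and swap is barely touched (248k used). This is a single snapshot, so it says nothing certain about what memory looks like during a hang.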
Last edited by phoenix; 01-17-2011 at 08:49 AM..
| 
01-17-2011, 08:48 AM
| | Zimbra Consultant & Moderator | |
Posts: 20,315
| | .... and the answer to my other question about hardware specification?
__________________
Regards
Bill
| 
01-17-2011, 11:36 AM
| | Intermediate Member | |
Posts: 16
| | This is a VM. It has been assigned 1.5 GB of RAM and a single processor. It doesn't appear that the CPU utilization gets too hammered but could we need to increase the RAM? Also, when poking around the server settings, I saw that the number of threads in the POP mail settings is set to 100. I read the help and saw that the default is 10. Should this be decreased? 100 seems high. I'm unsure if that could be causing some of our issues as well. Thanks. | 
01-17-2011, 12:41 PM
| | Zimbra Consultant & Moderator | |
Posts: 20,315
| | Quote:
Originally Posted by ryan_r_sd
This is a VM. It has been assigned 1.5 GB of RAM and a single processor. It doesn't appear that the CPU utilization gets too hammered but could we need to increase the RAM?
The minimum RAM is 2GB and the recommended minimum is 4GB for a production server.
Quote:
Originally Posted by ryan_r_sd
Also, when poking around the server settings, I saw that the number of threads in the POP mail settings is set to 100. I read the help and saw that the default is 10. Should this be decreased? 100 seems high. I'm unsure if that could be causing some of our issues as well. Thanks.
That shouldn't really matter if you have enough RAM. I'd suggest you increase it to 2GB, and preferably to the recommended 4GB, and see how you get on. Are there many other VMs on this host, and are they also busy? What VM environment is the host server running?
I'd also suggest you consider an upgrade to the most recent Zimbra release; there have been a couple of security advisories in versions later than the one you're running.
__________________
Regards
Bill
| 
01-17-2011, 12:48 PM
| | Intermediate Member | |
Posts: 16
| | Thanks Phoenix, I looked at the requirements as well today and expected that upping it would be recommended. We will do that, likely taking it to 4 GB. The host really doesn't get hit too hard so I don't think that's an issue.
We are running ESX Version 3.0.1. We are going to be upgrading our mail system this year so for now, just want to clear up these hangs. We will start with upping the RAM and see where that gets us. | 
01-17-2011, 03:55 PM
| | Advanced Member | |
Posts: 212
| | What's your VM host disk usage? Disk I/O is a very common problem in any virtual environment. Also, you are running an old version of ESX. ESX/ESXi 4.1 has vast improvements in disk I/O. | 
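One rough way to look at I/O wait from inside the guest, before checking the ESX host itself, is to sample the aggregate `cpu` line of `/proc/stat` twice and compare the counters. The sketch below is Python and assumes a Linux kernel that reports at least the first five counters (user, nice, system, idle, iowait). The function names are made up for illustration.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line as a list of ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Only the first eight counters are summed; guest time, when present,
    # is already included in user and nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    # Counter order: user, nice, system, idle, iowait, ...
    return 100.0 * deltas[4] / total


def sample_iowait(seconds=5.0, path="/proc/stat"):
    """Read the counters twice, `seconds` apart, and return iowait %."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return iowait_percent(before, after)
```

Running `sample_iowait()` in a loop while a hang is in progress would show whether the guest is waiting on disk at that moment. Figures measured inside a guest are only indicative, so the host's own disk statistics remain the better source.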
01-18-2011, 12:08 AM
| | Zimbra Consultant & Moderator | |
Posts: 20,315
| | Quote:
Originally Posted by ryan_r_sd
Thanks Phoenix, I looked at the requirements as well today and expected that upping it would be recommended. We will do that, likely taking it to 4 GB. The host really doesn't get hit too hard so I don't think that's an issue.
We are running ESX Version 3.0.1. We are going to be upgrading our mail system this year so for now, just want to clear up these hangs. We will start with upping the RAM and see where that gets us.
The other question I forgot to ask: what is the HD subsystem for the Zimbra server and the VM host? Is it a RAID array and, if so, what RAID level?
__________________
Regards
Bill
| 
01-18-2011, 11:49 AM
| | Intermediate Member | |
Posts: 16
| | I'm not too sure what the RAID level is, Bill; I'm guessing RAID 5. We increased the RAM to 4 GB and still have the same problems. One more piece of information: we are also seeing more and more deferred mail with these types of messages:
Delivery temporarily unavailable
Connection timed out
Mail transport unavailable
This is all pretty recent. We hadn't made any changes other than the typical new hires and removals. Does anyone have any recommendations as to where I should be digging and how I can narrow down what is happening? When the server recently became unresponsive, it stopped writing to the mailbox log for about 5 minutes. We could still SSH to the server. Also, while it was down, there were a bunch of authentication failures in the Zimbra.log. We are assuming this is a symptom and not a cause. Thanks.
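One place to start digging is to find every stretch where mailbox.log went quiet and line those times up against other logs. Below is a minimal Python sketch that assumes each entry begins with a timestamp in the form shown in the excerpt at the top of the thread (`2011-01-17 08:11:58,404`). The 60-second threshold and the function name are arbitrary choices, and the log path in the usage note is an assumption.

```python
import re
from datetime import datetime

STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) ")


def find_gaps(lines, threshold_seconds=60.0):
    """Yield (previous_time, next_time, gap_seconds) for each silent stretch.

    Lines without a leading timestamp (stack trace lines, for example)
    are skipped.
    """
    previous = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue
        current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        current = current.replace(microsecond=int(match.group(2)) * 1000)
        if previous is not None:
            gap = (current - previous).total_seconds()
            if gap >= threshold_seconds:
                yield previous, current, gap
        previous = current


# Lines taken from the mailbox.log excerpt in the first post.
sample = [
    "2011-01-17 08:11:58,404 INFO [Pop3Server-18419] [ip=BLANKED OUT;] pop - connected",
    "2011-01-17 08:11:58,469 INFO [Pop3Server-18419] [ip=BLANKED OUT;] pop - -ERR invalid username/password (PASS ****)",
    "2011-01-17 08:15:31,595 INFO [Pop3Server-18419] [ip=BLANKED OUT;] pop - disconnected without quit",
    "java.net.SocketException: Broken pipe",
]
for start, end, seconds in find_gaps(sample):
    print(f"silent from {start} to {end} ({seconds:.1f}s)")
```

On the excerpt from the first post this reports one gap of roughly 213 seconds, between 08:11:58 and 08:15:31. Against the real file it could be called as `find_gaps(open("/opt/zimbra/log/mailbox.log"))`, assuming that is where the log lives on this install.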
Last edited by ryan_r_sd; 01-18-2011 at 12:04 PM..