Lately we have been noticing, with increasing regularity, that some of our emails are being delayed. First, a quick overview of the setup.
All our mail is filtered through an external filter (Mail Protector), which handles spam filtering etc. Our Zimbra server sits behind a hardware firewall which is NATing traffic. Split DNS is set up on the Zimbra box as per the instructions. The Zimbra machine is a VM on VMware ESX 4.1, and the mailstore is about 350 GB.
Most emails come through within a few seconds to a minute or two of being sent. Some, however, just sit somewhere out on the internet and arrive up to a week later.
I read the post about MTU and have checked all the connections within my scope; they all have the same MTU. We have spoken with Mail Protector on several occasions to try to resolve the issue. Using the packet capture on the firewall, plus Wireshark and tcpdump on the Zimbra box, we can see that the firewall is receiving the traffic and passing it through to the Zimbra box. We can see the connection attempts from Mail Protector hitting the Zimbra box, but it just doesn't respond to the handshake requests.
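To check the handshake independently of Mail Protector, a probe along these lines could be run from an external box or from another machine on the same LAN as the Zimbra server. This is only a minimal sketch in Python, not something from our actual setup; the function name `probe_handshake`, the hostname in the usage comment, and the timeout and pause values are all assumptions for illustration.

```python
import socket
import time


def probe_handshake(host, port=25, attempts=5, timeout=3.0, pause=1.0):
    """Try a plain TCP connect several times and record how each attempt ends.

    Returns a list of (outcome, seconds) tuples, where outcome is
    "connected", "timed out", or "failed: <reason>".
    """
    results = []
    for i in range(attempts):
        start = time.monotonic()
        try:
            # create_connection completes the three-way handshake or raises.
            with socket.create_connection((host, port), timeout=timeout):
                outcome = "connected"
        except socket.timeout:
            # No SYN-ACK came back within the timeout.
            outcome = "timed out"
        except OSError as exc:
            # Refused, unreachable, DNS failure, and so on.
            outcome = f"failed: {exc}"
        results.append((outcome, time.monotonic() - start))
        if i < attempts - 1:
            time.sleep(pause)
    return results


# Example (hypothetical hostname):
# for outcome, seconds in probe_handshake("mail.example.com", attempts=15):
#     print(f"{outcome:12s} {seconds:.2f}s")
```

If the probe times out from the LAN as well as from outside, that would point away from the firewall and towards the Zimbra box itself.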
I tried moving the Zimbra box onto a separate host of its own to make sure contention for resources wasn't the issue: same problem. I tried moving a couple of machines onto another firewall to reduce load on the main one: same problem. I've been testing with a PHP script on an external box which sends 15 emails, each 1 second apart. On average 8-9 get through straight away, another few arrive 5-10 minutes later, and the remainder usually arrive within a few hours.
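For anyone who wants to reproduce the test, the idea is roughly the following. This is a Python sketch of the same approach, not the actual PHP script; the function names, the relay host, and the addresses in the usage comment are placeholders I have made up for illustration.

```python
import smtplib
import time
from email.message import EmailMessage


def build_message(seq, sender, recipient):
    """Build a numbered test message so delayed ones can be identified later."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"delay test {seq:02d}"
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    msg.set_content(f"Test message {seq}, handed off at {stamp}")
    return msg


def smtp_deliver(msg, relay="localhost", port=25):
    """Hand one message to an SMTP relay (relay and port are assumptions)."""
    with smtplib.SMTP(relay, port, timeout=30) as smtp:
        smtp.send_message(msg)


def send_burst(sender, recipient, count=15, interval=1.0,
               deliver=smtp_deliver, sleep=time.sleep):
    """Send `count` messages `interval` seconds apart and record each result."""
    results = []
    for seq in range(1, count + 1):
        msg = build_message(seq, sender, recipient)
        try:
            deliver(msg)
            results.append((seq, "accepted"))
        except (smtplib.SMTPException, OSError) as exc:
            results.append((seq, f"error: {exc}"))
        if seq < count:
            sleep(interval)
    return results


# Example (placeholder addresses):
# for seq, status in send_burst("test@example.org", "me@example.com"):
#     print(seq, status)
```

Note that "accepted" here only means the local relay took the message; the delays we are seeing happen later, on the hop from Mail Protector to the Zimbra box, so the arrival times still have to be compared in the mailbox.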
The Mail Protector logs show "connection timed out", which is consistent with what I'm seeing in tcpdump on the Zimbra side. At this point I'm not sure whether the problem is the Postfix portion of Zimbra not responding, or something at a lower level in the virtual NIC drivers. I suspect the virtual NIC drivers aren't to blame, though, as they have been the same for a long time.
Has anyone experienced this before? Please let me know if there are any logs in particular you want to see which may help.