
Originally posted by Rich Graves:
My server is bare metal (Dell R710) booting from Compellent SAN.
I still get snapshots at the storage layer, and I can still use VMs for test and DR. I always test upgrades by mounting consistent snapshots of production /opt/zimbra/* as Xen VMs (the old Xen version included in RHEL 5.7). When I do this, I/O performance is noticeably worse than on bare metal, even when the Xen server is, on paper, faster and better connected.
With VMware's acquisition of Zimbra, I had intended to virtualize there -- VMware I/O is rumored to be better than Xen's, and certainly better than Red Hat's build of Xen -- but my mail store is 5 terabytes of raw LUNs, which makes our VMware guy uncomfortable. We would have needed to buy a new server anyway, so I ended up with dedicated hardware again. And modern dedicated hardware means NUMA.
I ran some simple tests (for i in `seq 1 10`; do for j in `zmprov -l gaa`; do echo sm $j; echo 'search -t message "in:trash -has:attachment"'; done | zmmailbox -z; done) that showed no NUMA cross-talk (according to numactl --hardware). So there don't seem to be any funky zero-copy handoffs between Java and mysqld. NUMA split gave a modest (7%) improvement over NUMA interleaved. I didn't expect much difference, but I wanted to confirm that splitting didn't make things worse.
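For anyone who wants to repeat that test, here is the same loop unrolled into a small script. It is only a sketch of the one-liner above: the function name gen_search_cmds is made up for illustration, and unlike the original it fetches the account list once instead of on every pass.

```shell
#!/bin/sh
# Print the zmmailbox batch commands for each account passed as an argument:
# select the mailbox (sm), then run the same search as the one-liner.
# gen_search_cmds is a hypothetical helper name, not a Zimbra tool.
gen_search_cmds() {
  for acct in "$@"; do
    printf 'sm %s\n' "$acct"
    printf '%s\n' 'search -t message "in:trash -has:attachment"'
  done
}

# Only run the workload on a host that has the Zimbra CLI tools installed.
if command -v zmprov >/dev/null 2>&1 && command -v zmmailbox >/dev/null 2>&1; then
  # Assumption: the account list is fetched once; the one-liner re-reads it each pass.
  accounts=$(zmprov -l gaa)
  pass=1
  while [ "$pass" -le 10 ]; do
    # $accounts is left unquoted on purpose so each address becomes one argument.
    gen_search_cmds $accounts | zmmailbox -z
    pass=$((pass + 1))
  done
fi
```

Run it as the zimbra user, once per NUMA policy you want to compare, and time each run.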
While the big story is I/O, we have seen memory-related performance issues, especially in the early 5.0 series, stemming from Java garbage collection. If I can help keep the Java heap local to the CPU, I expect at least a small win. Also, I put the amavis tmp directory on a RAM disk. While the orders-of-magnitude boost from keeping it off spinning platters is the big win, that RAM disk should be kept NUMA-local, too.
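To make "NUMA-local" concrete, here is a rough dry-run sketch of the kind of command lines involved. It only prints them; nothing is executed or mounted. The helper names, node number, tmpfs size, and the amavis tmp path are all assumptions for illustration, not values from my setup, and how you wire this into the Zimbra start scripts is left out. The tmpfs mpol= mount option depends on your kernel supporting it.

```shell
#!/bin/sh
# Dry-run helpers: each one only prints a command line.
# All helper names here are hypothetical.

# Pin a process's CPUs and memory to one NUMA node ("split").
numa_local() {
  node=$1; shift
  echo "numactl --cpunodebind=$node --membind=$node $*"
}

# Spread a process's memory across all nodes ("interleaved").
numa_interleaved() {
  echo "numactl --interleave=all $*"
}

# Mount a tmpfs whose pages are bound to one NUMA node.
# Arguments: node, size, mount point.
tmpfs_local() {
  echo "mount -t tmpfs -o size=$2,mpol=bind:$1 tmpfs $3"
}

# "java -version" stands in for whatever command you want to pin.
numa_local 0 java -version
numa_interleaved java -version
# Size and path are illustrative assumptions.
tmpfs_local 0 1g /opt/zimbra/data/amavisd/tmp
```

After starting a process under one of these policies, numastat shows per-node hit and miss counters, which is one way to check whether allocations are staying local.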