I am trying to plan a scalable deployment and have tried to use the multi-master replication offered by ZCS 8. After a bit of investigation it looks like it isn't a workable solution.
The multi-server installation documentation isn't clear on which topologies are supported. I read somewhere that multi-master replication supports up to four masters.
It would seem to follow that replicas can be added as well (the documentation suggests that multi-master and replicas can co-exist, and the installation scripts allow it).
However, during testing I found that each replica only really talks to one master, and if that master (in a multi-master group) is shut down, the replica stops getting updates. This leads to inconsistency: reads on the replica show no changes, whereas writes to the multi-master group succeed. The test case for this is pretty trivial and I have explained it in a previous post here.
From my limited knowledge of OpenLDAP (a couple of days of experimentation), it looks like replicas just aren't compatible with the multi-master approach. Please correct me if I'm wrong.
So it looks like Zimbra supports either a single master with zero or more replicas (where the master is a single point of failure), or a multi-master configuration with no replicas (which is limited in its ability to scale).
Also, zmreplchk seems worse than useless. In my test case it fails to report that the masters are still being updated while the replica is out of date. This seems to be because it queries contextCSN and only looks at the first value returned, whereas three values are returned for that attribute.
The Perl code in zmreplchk is doing:
$pcsn = $entry->get_value('contextCSN');
so $pcsn takes only the first value. I have modified zmreplchk to dump out all the values, but in its present state it doesn't have the logic to interpret them correctly. However, it is clear from the dumped values that the servers are not in sync!
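For what it's worth, here is a rough sketch of the comparison I think is missing. It is written in Python for brevity and is not a patch to zmreplchk (which is Perl). It assumes each contextCSN value has the usual OpenLDAP layout timestamp#count#SID#mod, and that the values are compared per server ID rather than by position. The function names and the sample values are made up for illustration.

```python
def csn_by_sid(values):
    """Map each server ID (SID) to the newest contextCSN value seen for it."""
    newest = {}
    for csn in values:
        # Assumed layout: timestamp#change-count#SID#modifier
        timestamp, count, sid, mod = csn.split("#")
        # For a given SID the leading fields are fixed-width, so plain
        # string comparison orders the values chronologically.
        if sid not in newest or csn > newest[sid]:
            newest[sid] = csn
    return newest


def lagging_sids(master_values, replica_values):
    """Return the SIDs for which the replica is behind the master."""
    master = csn_by_sid(master_values)
    replica = csn_by_sid(replica_values)
    return sorted(
        sid for sid, csn in master.items() if replica.get(sid, "") < csn
    )


# Hypothetical values: the replica is current for SID 001 only.
master_values = [
    "20120913100000.000000Z#000000#001#000000",
    "20120913100500.000000Z#000000#002#000000",
    "20120913101000.000000Z#000000#003#000000",
]
replica_values = [
    "20120913100000.000000Z#000000#001#000000",
    "20120913100100.000000Z#000000#002#000000",
    "20120913100200.000000Z#000000#003#000000",
]

print(lagging_sids(master_values, replica_values))
```

With these sample values, comparing only the first contextCSN on each side would report the replica as in sync, while the per-SID comparison flags the other two server IDs as behind.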
Is anyone at Zimbra working on multi-master? It does not seem quite ready for GA in ZCS 8.