RHCS won't bring much more...
The scripts check that ZCS is running on the "primary" node. If there's a problem with ZCS (the script finds that one of the services is down), the cluster migrates it to the "secondary" node.
It won't rebuild the partition or the data (which might be needed if ZCS went down badly), and if there's a data issue, ZCS won't start on the "secondary" node either, so you'll get:
. "primary" node down because of fencing
. broken data
. "secondary" node up but ZCS unable to start on it
. broken cluster
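To make that concrete, here's a rough sketch in Python of the kind of check such a script does. It's not the actual RHCS resource script. The status text format is an assumption modelled on what `zmcontrol status` prints, and the function names, the sample hostname and the "empty output counts as a failure" rule are all made up for illustration.

```python
# Sketch only: the status format is an assumption modelled on
# `zmcontrol status` output; adjust the parsing to what your ZCS prints.

def parse_status(text):
    """Return {service_name: state} from status text (assumed format)."""
    services = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the "Host ..." header and anything shorter than "<name> <state>".
        if len(parts) < 2 or parts[0] == "Host":
            continue
        state = parts[-1]
        # Detail lines such as "mysql.server is not running." are ignored
        # because their last word is not a known state.
        if state in ("Running", "Stopped"):
            services[" ".join(parts[:-1])] = state
    return services


def should_fail_over(text):
    """Mimic the cluster check: any service not Running triggers a move."""
    services = parse_status(text)
    # Assumption: output with no recognizable service lines is a failure too.
    if not services:
        return True
    return any(state != "Running" for state in services.values())


# Hypothetical sample output, with the mailbox service down.
SAMPLE = """Host mail.example.com
\tldap                    Running
\tmailbox                 Stopped
\t\tmysql.server is not running.
\tmta                     Running
"""

if __name__ == "__main__":
    print(should_fail_over(SAMPLE))  # True
```

Note that all the check sees is "Stopped". It can't tell a dead disk from corrupted data, so in the second case the move to the "secondary" node doesn't help.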
Today, I'd rather have a good monitoring solution and manual failover than RHCS.
Or go the "VM" way (if the primary host goes down, the VM is moved to the secondary node), because this way we're only handling hardware failures.
IMHO, the issue in RHCS is the way it triggers the move to secondary node.
If there's a hardware issue, it's OK.
If it's a software issue, automatic failover is not always good.
Oh, and BTW, don't use GFS with RHCS and ZCS.