  #1 (permalink)  
Old 07-20-2007, 08:24 AM
Active Member
 
Posts: 49
CentOS 4.5, RHCS, and Zimbra

I'm working on installing Red Hat Cluster Suite to support a 2-node Zimbra cluster in active/passive mode. I am following the Zimbra document for installing a single-node cluster using the zimbra-cluster package.

At this point, I have all the related packages installed (rgmanager, system-config-cluster, ccsd, magma, magma-plugins, cman, cman-kernel-smp, dlm, dlm-kernel-smp, fence, gulm, iddev) from the csgfs repository. The ccsd service seems to start, but cman and rgmanager fail outright:

Jul 20 09:50:15 wsl-mx1 ccsd: start succeeded
Jul 20 09:50:20 wsl-mx1 cman: FATAL: Module cman not found. failed
Jul 20 09:50:26 wsl-mx1 ccsd[5387]: Cluster is not quorate. Refusing connection.
Jul 20 09:50:26 wsl-mx1 ccsd[5387]: Error while processing connect: Connection refused
Jul 20 09:50:26 wsl-mx1 ccsd[5387]: Invalid descriptor specified (-111).
Jul 20 09:50:26 wsl-mx1 ccsd[5387]: Someone may be attempting something evil.
Jul 20 09:50:26 wsl-mx1 ccsd[5387]: Error while processing get: Invalid request descriptor
Jul 20 09:50:26 wsl-mx1 ccsd[5387]: Invalid descriptor specified (-111).
Jul 20 09:50:26 wsl-mx1 ccsd[5387]: Someone may be attempting something evil.
Jul 20 09:50:26 wsl-mx1 ccsd[5387]: Error while processing get: Invalid request descriptor
Jul 20 09:50:26 wsl-mx1 ccsd[5387]: Invalid descriptor specified (-21).
Jul 20 09:50:26 wsl-mx1 ccsd[5387]: Someone may be attempting something evil.
Jul 20 09:50:26 wsl-mx1 ccsd[5387]: Error while processing disconnect: Invalid request descriptor
Jul 20 09:50:26 wsl-mx1 clurgmgrd[5766]: Resource Group Manager Starting
Jul 20 09:50:26 wsl-mx1 clurgmgrd[5766]: Loading Service Data
Jul 20 09:50:26 wsl-mx1 ccsd[5387]: Cluster is not quorate. Refusing connection.
Jul 20 09:50:26 wsl-mx1 ccsd[5387]: Error while processing connect: Connection refused
Jul 20 09:50:26 wsl-mx1 clurgmgrd[5766]: #5: Couldn't connect to ccsd!
Jul 20 09:50:26 wsl-mx1 clurgmgrd[5766]: #8: Couldn't initialize services
Jul 20 09:50:26 wsl-mx1 rgmanager: clurgmgrd startup failed
Jul 20 09:50:39 wsl-mx1 ccsd[5387]: Unable to connect to cluster infrastructure after 150 seconds.

Since cman complains about a missing module, I tried to verify and found it right away:
[root@wsl-mx1]# ls -la /lib/modules/2.6.9-55.ELsmp/kernel/cluster/
total 700
drwxr-xr-x 2 root root 4096 Jul 20 09:19 .
drwxr-xr-x 10 root root 4096 Jul 20 09:19 ..
-rwxr-xr-x 1 root root 159744 Jun 17 18:32 cman.ko
-rwxr-xr-x 1 root root 185592 Jun 17 18:32 cman.symvers
-rwxr-xr-x 1 root root 150424 Jun 17 19:14 dlm.ko
-rwxr-xr-x 1 root root 185884 Jun 17 19:14 dlm.symvers

If I load the modules manually, they load fine, but cman and rgmanager still fail to start:

[root@wsl-mx1]# insmod /lib/modules/2.6.9-55.ELsmp/kernel/cluster/cman.ko
[root@wsl-mx1]# insmod /lib/modules/2.6.9-55.ELsmp/kernel/cluster/dlm.ko
[root@wsl-mx1]# lsmod | egrep -i "dlm|cman"
dlm 117604 0
cman 125664 1 dlm

Jul 20 11:11:52 wsl-mx1 ccsd[5387]: Cluster is not quorate. Refusing connection.
Jul 20 11:11:52 wsl-mx1 ccsd[5387]: Error while processing connect: Connection refused
Jul 20 11:11:52 wsl-mx1 ccsd[5387]: Invalid descriptor specified (-111).
Jul 20 11:11:52 wsl-mx1 ccsd[5387]: Someone may be attempting something evil.
Jul 20 11:11:52 wsl-mx1 ccsd[5387]: Error while processing get: Invalid request descriptor
Jul 20 11:11:52 wsl-mx1 ccsd[5387]: Invalid descriptor specified (-111).
Jul 20 11:11:52 wsl-mx1 ccsd[5387]: Someone may be attempting something evil.
Jul 20 11:11:52 wsl-mx1 ccsd[5387]: Error while processing get: Invalid request descriptor
Jul 20 11:11:52 wsl-mx1 ccsd[5387]: Invalid descriptor specified (-21).
Jul 20 11:11:52 wsl-mx1 ccsd[5387]: Someone may be attempting something evil.
Jul 20 11:11:52 wsl-mx1 ccsd[5387]: Error while processing disconnect: Invalid request descriptor
Jul 20 11:11:52 wsl-mx1 clurgmgrd[6362]: Resource Group Manager Starting
Jul 20 11:11:52 wsl-mx1 clurgmgrd[6362]: Loading Service Data
Jul 20 11:11:52 wsl-mx1 ccsd[5387]: Cluster is not quorate. Refusing connection.
Jul 20 11:11:52 wsl-mx1 ccsd[5387]: Error while processing connect: Connection refused
Jul 20 11:11:52 wsl-mx1 clurgmgrd[6362]: #5: Couldn't connect to ccsd!
Jul 20 11:11:52 wsl-mx1 clurgmgrd[6362]: #8: Couldn't initialize services
Jul 20 11:11:52 wsl-mx1 rgmanager: clurgmgrd startup failed

My guess is that something is amiss with the library linking, but I'm not sure where to begin looking.

[UPDATE: This issue has been resolved]

Last edited by briansrapier; 07-26-2007 at 08:25 AM. Reason: resolved
  #2 (permalink)  
Old 07-25-2007, 10:43 AM
Active Member
 
Posts: 49

I wound up rebuilding and sticking with the 2.6.9-55.ELsmp kernel as opposed to the 2.6.9-55.0.2.ELsmp one. Everything installed fine using the single-node cluster install, but the cluster service fails to start.

At first it complained that it could not locate /opt/zimbra-cluster/bin/zmcluctl. I discovered that I needed to install the zimbra-cluster rpm manually. Next it complained about OCF_RESKEY_service_name not being set, so I set it and attempted a manual restart, but got:

standard in must be a tty

If I wait a while and attempt to run it again, I get:

This node is already running Zimbra service zimba.

But there are no zimbra processes running.

Zimbra Support hasn't been any help at all in getting this resolved. If anyone has experience resolving this or similar issues, I would appreciate your assistance.
  #3 (permalink)  
Old 07-26-2007, 04:49 AM
Moderator
 
Posts: 2,207

I had the same problem with the "missing module" (on RHEL 4).

It was a bad up2date: some of the modules were updated but not all of them (and it was correct on one of the two nodes while broken on the other).

I did a new up2date and everything went OK.
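A partial update like this can be spotted by checking whether the cluster modules exist for the kernel that is actually running. The following is only a minimal sketch: the function name missing_cluster_modules is made up for illustration, and the kernel/cluster/ layout is assumed from the directory listing in the first post.

```shell
# Sketch only: report which cluster kernel modules are missing for one kernel.
# The function name is hypothetical; the kernel/cluster/ layout is taken
# from the listing in the first post.
# Usage: missing_cluster_modules <modules_root> <kernel_release>
missing_cluster_modules() {
    modules_root=$1
    kernel_release=$2
    for mod in cman dlm; do
        # Print the module name when its .ko file is absent for this kernel.
        if [ ! -f "$modules_root/$kernel_release/kernel/cluster/$mod.ko" ]; then
            echo "$mod"
        fi
    done
}

# On a live node (run on both nodes and compare the results):
#   missing_cluster_modules /lib/modules "$(uname -r)"
```

No output means both modules are present for that kernel. If the modules were installed for a different kernel release than the one reported by uname -r, both names are printed.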
  #4 (permalink)  
Old 07-26-2007, 05:43 AM
Active Member
 
Posts: 49

The resolution was twofold. First, if you previously attempted a cluster install, running 'install -u' does not remove all of the pieces. Some you will have to remove by hand. Here are the steps I followed:

1. Disable the cluster services

# clusvcadm -d [CLUSTERSVCNAME]

2. Save a copy of cluster.conf

# cp /etc/cluster/cluster.conf /etc/cluster/cluster.conf.good

3. Remove the data on the SAN/iSCSI shared device

# mount [SAN] [MOUNTPOINT]
# cd [MOUNTPOINT]
# rm -rf *

4. Erase the zimbra-cluster RPM

# rpm -e zimbra-cluster (very important!!)

5. Uninstall ZCS

# ./install.sh -u

6. Remove the Zimbra-related directories

# cd /opt
# rm -rf zimbra-cluster
# rm -rf zimbra

Alternatively:
# cd zimbra
# rm -rf .* (to remove the .saveconfig, etc.)

7. Clean up the passwd and group files, including the shadow files (zimbra, postfix and postdrop)

# userdel zimbra
# groupdel zimbra
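
To confirm nothing was left behind after steps 6 and 7, a check along these lines may help. It is a rough sketch, not part of the Zimbra tooling: the function name zimbra_leftovers and the root-prefix argument are assumptions made so it can be tried against a scratch directory first.

```shell
# Sketch only: list Zimbra remnants under a filesystem root.
# The function name and the root-prefix argument are hypothetical.
# Usage: zimbra_leftovers <root_prefix>    (pass "" for the live system)
zimbra_leftovers() {
    root=$1
    # Directories removed by hand in step 6.
    for dir in /opt/zimbra /opt/zimbra-cluster; do
        if [ -e "$root$dir" ]; then
            echo "directory: $dir"
        fi
    done
    # Account and group entries cleaned up in step 7.
    for file in /etc/passwd /etc/group; do
        for name in zimbra postfix postdrop; do
            if grep -q "^$name:" "$root$file" 2>/dev/null; then
                echo "entry: $name in $file"
            fi
        done
    done
}

# On the live system:
#   zimbra_leftovers ""
```

A postfix or postdrop entry may belong to the distribution's own postfix package, so treat those hits as something to review rather than delete blindly. For step 4, 'rpm -q zimbra-cluster' reports whether the RPM is still installed.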

Secondly, I found that CMAN does not like service names longer than 16 characters, but the zimbra-cluster install requires the service name to be the same as the cluster hostname. In my case I have 2 nodes, 'mx.domain.com' and 'mx2.domain.com'. My cluster name is 'mx.domain.com'. I used 'mx.domain.com' for all the service name related items during the zimbra installation, but during the last step, 'configure-cluster', I called it 'zimbra'.

I'm not sure whether the naming conventions were actually in conflict, but it appears to be working. I haven't run any serious tests yet.
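
For anyone who wants to check a candidate name before running 'configure-cluster', here is a small sketch. The 16-character limit is only what I observed above, not a documented value I have verified, and the function name service_name_ok is made up for illustration.

```shell
# Sketch only: succeed when a service name fits within a length limit.
# The default of 16 is the limit observed in this thread, not a verified
# CMAN constant; the function name is hypothetical.
# Usage: service_name_ok <name> [max_length]
service_name_ok() {
    name=$1
    max=${2:-16}
    [ "${#name}" -le "$max" ]
}

# Example:
#   service_name_ok zimbra || echo "service name too long"
```

The limit is an optional second argument, so it can be adjusted if the real threshold turns out to be different.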
  #5 (permalink)  
Old 07-26-2007, 07:47 AM
Moderator
 
Posts: 2,207

Quote:
Originally Posted by briansrapier
First, if you previously attempted a cluster install, running 'install -u' does not remove all of the pieces.
I should have mentioned that, sorry.
Bug 17209 - ./install.sh -u does not delete cluster RPM