#1
12-17-2007, 04:47 PM
Loyal Member
Posts: 91
DRBD & Heartbeat not quite working as expected

After several days (heartbeat and DRBD are new to me) I've gotten Zimbra working with heartbeat, mostly.

If Zimbra is running on Server-B and Server-B goes down, Zimbra fails over to Server-A. The problem is that the servers reboot so quickly during a test (less than a minute) that Zimbra is about 90% started on Server-A when it receives a heartbeat command to transfer back to Server-B. Server-A takes a while to unmount /opt, DRBD ends up Secondary/Secondary on both servers, and the shared IP is never assigned again. I end up rebooting both servers, after which everything comes back up.
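
A minimal sketch for confirming the Secondary/Secondary state described above without rebooting. It assumes a DRBD 8.x node, where /proc/drbd reports the local/peer roles in a field named `st:` (8.0 to 8.2) or `ro:` (8.3 and later). The function name `drbd_roles` is made up for illustration.

```shell
# Hypothetical helper: print the local/peer DRBD roles for each device.
# Reads /proc/drbd by default; a different file can be passed for testing.
drbd_roles() {
    # The role field is "st:" on DRBD 8.0-8.2 and "ro:" on 8.3+.
    grep -Eo '(st|ro):[A-Za-z]+/[A-Za-z]+' "${1:-/proc/drbd}" | cut -d: -f2
}

# Usage (run on each node): drbd_roles
```

If both nodes print Secondary/Secondary, neither side has been promoted, which matches the symptom of /opt never being mounted and the shared IP never coming back.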

auto_failback is set to off on both servers, and heartbeat is set to prefer Server-A to start with.
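
A quick way to double-check that setting on each node, sketched under the assumption that heartbeat's main configuration lives at /etc/ha.d/ha.cf (the logs later in this thread reference /etc/ha.d/resource.d, but the exact ha.cf path is not shown). The function name `failback_setting` is hypothetical.

```shell
# Hypothetical helper: show the active (uncommented) auto_failback directive.
# Defaults to /etc/ha.d/ha.cf, which is an assumed location.
failback_setting() {
    grep -E '^[[:space:]]*auto_failback[[:space:]]' "${1:-/etc/ha.d/ha.cf}"
}

# Usage (run on both servers and compare): failback_setting
```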

I've been pulling my hair out on this one, and these are new servers:
2.66 GHz 64-bit Pentium Ds
1 GB of RAM
1 mailbox (I was still testing heartbeat and haven't set up the mailboxes yet)

Does anyone know what I need to tweak?

Doug

#2
12-17-2007, 04:59 PM
Moderator
Posts: 6,237

Quote:
Originally Posted by DougWare View Post
The problem is that the servers reboot so quickly during a test (less than a minute) that Zimbra is about 90% started on Server-A when it receives a heartbeat command to transfer back to Server-B.
Did you remove zimbra from your runlevels on Server-A? (/etc/rc#.d/S99zimbra)
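
A minimal sketch of how to check for leftover start links like the one mentioned above. It only lists matching links; the removal command depends on the distribution, which the thread does not state, so the two commands in the comments are assumptions. The function name `zimbra_start_links` is made up for illustration.

```shell
# Hypothetical helper: list any S??zimbra start links under <root>/rc?.d.
# Defaults to /etc; a different root can be passed for testing.
zimbra_start_links() {
    root="${1:-/etc}"
    for link in "$root"/rc?.d/S??zimbra; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$link" ] || [ -L "$link" ] || continue
        printf '%s\n' "$link"
    done
}

# Usage: zimbra_start_links
# If anything is printed, init will still start Zimbra at boot, outside
# heartbeat's control. Removal depends on the distribution (assumption):
#   Debian/Ubuntu:  update-rc.d -f zimbra remove
#   Red Hat family: chkconfig --del zimbra
```

The check is worth repeating on both servers after any reinstall or upgrade of Zimbra.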

#3
12-17-2007, 05:22 PM
Loyal Member
Posts: 91

I did, but then I reinstalled Zimbra on Server-B.

I guess I forgot to remove them again. I've removed them and I am restarting now to see if that corrects the problem.

Thank you for pointing that out!

Doug

#4
12-17-2007, 05:25 PM
Loyal Member
Posts: 91

Same outcome....

Dec 17 20:23:48 mailserver1B heartbeat: [2506]: info: mailserver1a wants to go standby [foreign]
Dec 17 20:23:49 mailserver1B heartbeat: [2506]: info: standby: acquire [foreign] resources from mailserver1a
Dec 17 20:23:49 mailserver1B heartbeat: [2842]: info: acquire local HA resources (standby).
Dec 17 20:23:49 mailserver1B heartbeat: [2842]: info: local HA resource acquisition completed (standby).
Dec 17 20:23:49 mailserver1B heartbeat: [2506]: info: Standby resource acquisition done [foreign].
Dec 17 20:23:49 mailserver1B heartbeat: [2506]: info: remote resource transition completed.

Doug

#5
12-17-2007, 05:26 PM
Loyal Member
Posts: 91

Here's the corresponding log output from Server-A:

Dec 17 20:23:11 mailserver1A heartbeat: [2498]: WARN: T_STARTING received during takeover.
Dec 17 20:23:11 mailserver1A heartbeat: [2498]: info: remote resource transition completed.
Dec 17 20:23:13 mailserver1A ResourceManager[18922]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.2.20/24/bond0 stop
Dec 17 20:23:13 mailserver1A IPaddr[24657]: INFO: ifconfig bond0:0 down
Dec 17 20:23:13 mailserver1A IPaddr[24628]: INFO: Success
Dec 17 20:23:13 mailserver1A ResourceManager[18922]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /opt reiserfs stop
Dec 17 20:23:13 mailserver1A Filesystem[24719]: INFO: Running stop for /dev/drbd0 on /opt
Dec 17 20:23:13 mailserver1A Filesystem[24719]: INFO: Trying to unmount /opt
Dec 17 20:23:13 mailserver1A Filesystem[24719]: ERROR: Couldn't unmount /opt; trying cleanup with SIGTERM
Dec 17 20:23:14 mailserver1A Filesystem[24719]: INFO: Some processes on /opt were signalled
Dec 17 20:23:15 mailserver1A Filesystem[24719]: INFO: unmounted /opt successfully
Dec 17 20:23:15 mailserver1A Filesystem[24708]: INFO: Success
Dec 17 20:23:15 mailserver1A ResourceManager[18922]: info: Running /etc/ha.d/resource.d/drbddisk r0 stop
Dec 17 20:23:15 mailserver1A kernel: drbd0: role( Primary -> Secondary )
Dec 17 20:23:15 mailserver1A kernel: drbd0: Writing meta data super block now.
Dec 17 20:23:15 mailserver1A heartbeat: [18896]: info: local HA resource acquisition completed (standby).
Dec 17 20:23:15 mailserver1A heartbeat: [2498]: info: Standby resource acquisition done [all].
Dec 17 20:23:15 mailserver1A harc[24828]: info: Running /etc/ha.d/rc.d/status status
Dec 17 20:23:15 mailserver1A mach_down[24844]: info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
Dec 17 20:23:15 mailserver1A mach_down[24844]: info: mach_down takeover complete for node mailserver1b.
Dec 17 20:23:15 mailserver1A heartbeat: [2498]: info: mach_down takeover complete.
Dec 17 20:23:15 mailserver1A harc[24878]: info: Running /etc/ha.d/rc.d/status status
Dec 17 20:23:15 mailserver1A harc[24894]: info: Running /etc/ha.d/rc.d/status status
Dec 17 20:23:15 mailserver1A harc[24910]: info: Running /etc/ha.d/rc.d/status status
Dec 17 20:23:45 mailserver1A hb_standby[24946]: Going standby [foreign].
Dec 17 20:23:45 mailserver1A heartbeat: [2498]: info: mailserver1a wants to go standby [foreign]
Dec 17 20:23:45 mailserver1A heartbeat: [2498]: info: standby: mailserver1b can take our foreign resources
Dec 17 20:23:45 mailserver1A heartbeat: [24960]: info: give up foreign HA resources (standby).
Dec 17 20:23:45 mailserver1A heartbeat: [24960]: info: foreign HA resource release completed (standby).
Dec 17 20:23:45 mailserver1A heartbeat: [2498]: info: Local standby process completed [foreign].
Dec 17 20:23:46 mailserver1A heartbeat: [2498]: WARN: 1 lost packet(s) for [mailserver1b] [46:48]
Dec 17 20:23:46 mailserver1A heartbeat: [2498]: info: remote resource transition completed.
Dec 17 20:23:46 mailserver1A heartbeat: [2498]: info: No pkts missing from mailserver1b!
Dec 17 20:23:46 mailserver1A heartbeat: [2498]: info: Other node completed standby takeover of foreign resources.

#6
10-12-2010, 08:19 AM
Senior Member
Posts: 56

Can you please tell me what's in your /etc/heartbeat/haresources file?
I can't get heartbeat to mount the DRBD volume and start Zimbra.

Thanks,
Tibby
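
The original poster's haresources file never appears in this thread. As a rough reconstruction only, the Server-A log in post #5 shows heartbeat stopping IPaddr, then Filesystem, then drbddisk, and haresources entries are started left to right and stopped right to left, so a single line along these lines would be consistent with that log. The trailing `zimbra` entry (an init script that heartbeat would call) and the choice of `mailserver1a` as the preferred node are assumptions; the function name is hypothetical.

```shell
# Hypothetical reconstruction, NOT the original poster's actual file.
# Prints one haresources line for the given preferred node name
# (the name must match what `uname -n` reports on that node).
haresources_line() {
    printf '%s drbddisk::r0 Filesystem::/dev/drbd0::/opt::reiserfs IPaddr::192.168.2.20/24/bond0 zimbra\n' "$1"
}

haresources_line mailserver1a
```

The same line would need to be identical on both nodes. The resource name r0, the device, the mount point, the filesystem type, and the address are taken from the log lines; everything else is a guess.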