Remus-fu with DRBD, OCFS2 on Debian with No Redhat Clustering

My earlier blog post at http://www.loonjuice.com/2010/06/13/remus-fu-with-drbd-gfs2-redhat-clustering-on-debian/ described using Xen Remus with shared disk via Redhat Cluster & Redhat’s GFS2 on a dual-primary DRBD device between two Xen Remus Hypervisors.

But I found the overhead & workload of managing a Redhat Cluster between the 2 x Xen Hypervisors to be excessive. The failure/recovery behaviour whenever a Hypervisor host went offline was also unreliable, & under some failure modes it needed manual intervention (reset button) on the hypervisors.

I’ve now moved to using OCFS2 on dual-primary DRBD between the Hypervisors & so far this has coped far better with OCFS2 peer loss & restart.

The steps req’d to install & set up OCFS2 on the Debian-based Xen Remus Hypervisors were:

(on both hypervisor systems, after Xen Remus install)

apt-get install ocfs2-tools ocfs2console

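Set up /etc/ocfs2/cluster.conf identically on both Hypervisors. Each node’s ‘cluster’ value must match the ‘name’ in the cluster stanza, & the parameter lines under each stanza header are indented with a tab, eg:

node:
	ip_port = 7777
	ip_address = 192.168.X.Y
	number = 0
	name = debremus2
	cluster = xenocfs2cluster

node:
	ip_port = 7777
	ip_address = 192.168.X.Z
	number = 1
	name = debremus1
	cluster = xenocfs2cluster

cluster:
	node_count = 2
	name = xenocfs2cluster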
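
Since a mismatch between the node stanzas & the cluster stanza is easy to miss, here is a minimal sketch of a sanity check. The function name check_cluster_conf is my own (hypothetical), not part of ocfs2-tools, & it only compares each node’s ‘cluster’ value against the cluster ‘name’; it does not validate anything else in the file.

```shell
#!/bin/sh
# Hypothetical helper (not part of ocfs2-tools): verify that every node
# stanza's "cluster" value equals the "name" given in the cluster stanza.
check_cluster_conf() {
    awk '
        # A stanza header such as "node:" or "cluster:" starts in column one.
        /^[a-z]+:[ \t]*$/ { sub(/:.*/, "", $0); stanza = $0; next }
        # Parameter lines look like "<tab>key = value".
        /=/ {
            split($0, kv, "=")
            key = kv[1]; val = kv[2]
            gsub(/[ \t]/, "", key); gsub(/[ \t]/, "", val)
            if (stanza == "node" && key == "cluster") nodes[val]++
            if (stanza == "cluster" && key == "name") cname = val
        }
        END {
            if (cname == "") { print "no cluster name found"; exit 1 }
            for (n in nodes)
                if (n != cname) {
                    print "mismatch: node cluster=" n ", cluster name=" cname
                    exit 1
                }
            print "ok: " cname
        }
    ' "$1"
}
```

Run it on each Hypervisor as check_cluster_conf /etc/ocfs2/cluster.conf; it exits non-zero on a mismatch.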

Then initialise OCFS2 on both hosts with:

dpkg-reconfigure o2cb
/etc/init.d/o2cb restart

/etc/init.d/ocfs2 restart

Ensure that the same /etc/ocfs2/cluster.conf file is on both Hypervisors.

Ensure that DRBD is primary/primary on both Hypervisors.
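
Dual-primary needs allow-two-primaries set in the net section of the DRBD resource. To check the roles from a script, something like the sketch below should do. It assumes DRBD 8.x, where (as far as I know) ‘drbdadm role’ prints the local/peer roles as e.g. Primary/Primary, & it assumes a resource called r0, which is not from my setup above; substitute your own resource name. The function name is_dual_primary is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the role string passed in shows
# both the local and the peer side as Primary.
is_dual_primary() {
    [ "$1" = "Primary/Primary" ]
}

# Usage on a Hypervisor (resource name "r0" is an assumption):
#   is_dual_primary "$(drbdadm role r0)" && echo "dual-primary OK"
```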

Check the status (after initialisation) of OCFS2 on both hypervisors with:

/etc/init.d/o2cb status

If you see something like the following, proceed to make the OCFS2 filesystem:

Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Stack glue driver: Loaded
Stack plugin "o2cb": Loaded
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster xenocfs2cluster: Online
Heartbeat dead threshold = 31
Network idle timeout: 30000
Network keepalive delay: 2000
Network reconnect delay: 2000
Checking O2CB heartbeat: Active
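
If you would rather script that check than eyeball it, here is a rough sketch. It reads the status output on stdin & looks only for the cluster line reporting Online; it deliberately ignores the other lines. The function name o2cb_cluster_online is my own invention (hypothetical).

```shell
#!/bin/sh
# Hypothetical helper: reads "/etc/init.d/o2cb status" output on stdin and
# succeeds if the named cluster is reported as Online.
o2cb_cluster_online() {
    grep -q -- "Checking O2CB cluster $1: Online"
}

# Usage on a Hypervisor:
#   /etc/init.d/o2cb status | o2cb_cluster_online xenocfs2cluster && echo "cluster up"
```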

Run mkfs on one Hypervisor only, since /dev/drbd0 is shared between them:

mkfs -t ocfs2 -N 2 -L ocfs2_drbd0 /dev/drbd0

Then you can mount that filesystem on both Hypervisors with:

mount -t ocfs2 /dev/drbd0 /usr/ocfs2
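
To confirm the mount took on each Hypervisor before starting any guests, a small check against /proc/mounts can help. This is only a sketch: the function name is_ocfs2_mounted is hypothetical, & it just matches the device, mount point & filesystem type columns.

```shell
#!/bin/sh
# Hypothetical helper: reads /proc/mounts-style lines on stdin and succeeds
# if device $1 is mounted at mount point $2 with filesystem type ocfs2.
is_ocfs2_mounted() {
    awk -v dev="$1" -v mp="$2" '
        $1 == dev && $2 == mp && $3 == "ocfs2" { found = 1 }
        END { exit !found }
    '
}

# Usage on a Hypervisor:
#   is_ocfs2_mounted /dev/drbd0 /usr/ocfs2 < /proc/mounts && echo "mounted"
```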

Now, finally, you can start your Remus enabled guests, using a ‘disk=’ line like:

disk = ['tap:remus:192.168.X.Y:9000|aio:/usr/ocfs2/vm1-hvm.img,hda,w']

Where 192.168.X.Y is the IP address of the 2nd/failover-to Hypervisor host!
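
If you have several guests, generating the line avoids typos in the tap:remus syntax. The sketch below simply fills in the same pattern as the ‘disk=’ line above; the function name remus_disk_line & the address 192.168.0.2 in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a Remus-style disk= line.
#   $1 = IP of the failover-to Hypervisor   $2 = Remus disk port
#   $3 = path to the guest image            $4 = guest device name
remus_disk_line() {
    printf "disk = ['tap:remus:%s:%s|aio:%s,%s,w']\n" "$1" "$2" "$3" "$4"
}

# Usage (192.168.0.2 is an example address only):
#   remus_disk_line 192.168.0.2 9000 /usr/ocfs2/vm1-hvm.img hda
```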
