A DRBD Dual-Primary setup done right

Cluster filesystems like GFS2 have gained a lot of popularity lately.
Higher performance. Scalability. Fault tolerance. These features sound like heaven to any system architect. But they should be used with caution, because they add another layer of complexity to the cluster environment, and you need to know how to handle it properly.

Phil Reisner, author of DRBD, points out three major questions you should ask yourself before you start hacking away at your console:

Optionally, you may want to sleep on it between questions two and three.

Note: if you do not know what STONITH means, you definitely need to read up on the topic of dual-primary setups first. Download here:

Dual Primary: Think Twice

Rest assured that the time spent on testing is well invested. An incorrectly configured dual-primary cluster can do more than misbehave: it can destroy the high availability of your system, because the failover mechanism simply will not work when a node fails, and, even more important, it can easily corrupt your data.

You assume you have 99.99% availability? Well, you may want to check your config again.

Let us show you how it’s done right!

Our new tech-guide builds on the considerations about dual-primary setups explained in the previous tech-guides and shows the complete process of implementing GFS2 with DRBD data replication.

We guide you through the configuration of

Fencing strategies

You may ask what all the fencing and STONITH stuff is about. Why do I need all this?
Well, here comes the answer:
GFS2 lets you access your data from two different servers at the same time. To make sure that the same data is not altered by both nodes at the same time, GFS2 implements a locking mechanism called glocks. These glocks are continuously synchronized between both cluster nodes to prevent data corruption.
In case of a network outage between the nodes, this synchronization is interrupted and each node considers itself to be alone. In this situation each node can modify its data independently, as no lock synchronization is possible. You have just created a split-brain situation, which means that you have diverging datasets. To resolve this problem you will then have to decide which dataset is going to survive and drop the other one.
You can imagine that this decision won’t always be a painless one.
The only way to prevent this kind of situation is to bring your cluster into a defined state before the split brain can even occur. This is simply done by killing one node of the cluster, so that it cannot generate its own dataset. This mechanism is called STONITH (“Shoot The Other Node In The Head”).
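
To make the argument concrete, here is a minimal toy model of the scenario described above. It is only a sketch and has nothing to do with how DRBD, GFS2 or a real fencing agent are implemented: the `Node` class, the node names and the `fence` switch are hypothetical and exist purely for illustration. It shows that two nodes writing independently during a link outage end up with diverging datasets, while shooting one node first leaves exactly one dataset behind.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy cluster node holding its own copy of the replicated data."""
    name: str
    alive: bool = True
    data: list = field(default_factory=list)

    def write(self, record: str) -> bool:
        # A fenced (powered-off) node can no longer change its dataset.
        if not self.alive:
            return False
        self.data.append(record)
        return True


def replicate(a: Node, b: Node, record: str) -> None:
    """Healthy link: every write reaches both copies."""
    a.write(record)
    b.write(record)


def partition_writes(a: Node, b: Node, fence: bool) -> bool:
    """Simulate a link outage; return True if the datasets diverged."""
    if fence:
        # STONITH: the survivor shoots its peer before writing on its own.
        b.alive = False
    a.write("written on " + a.name)
    b.write("written on " + b.name)
    return b.alive and a.data != b.data


def run(fence: bool) -> bool:
    """Replicate one record, cut the link, and report whether split brain occurred."""
    a, b = Node("alpha"), Node("bravo")
    replicate(a, b, "shared record")
    return partition_writes(a, b, fence)


if __name__ == "__main__":
    print("split brain without fencing:", run(fence=False))
    print("split brain with fencing:   ", run(fence=True))
```

In this model the fenced node keeps its last replicated state and simply resynchronizes from the survivor once it is back, so nobody has to choose which dataset to throw away.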

Cluster enabled applications

The truth is that a dual-primary DRBD cluster with GFS2 on top is not fully transparent to the service. The application you are planning to deploy has to support a clustered environment. In short, the application has to know that it is not the only one working on the data. Unfortunately, only a handful of applications support that, so the use cases are fairly limited.
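
What "knowing that it is not the only one" can mean in practice is sketched below. This is an assumption-laden illustration, not a recipe from the tech-guide: it assumes the application coordinates through advisory `flock` locks and that the GFS2 filesystem is mounted in a way that makes these locks apply across nodes (to our knowledge the default, unless the `localflocks` mount option is used). The function name and the file path are hypothetical.

```python
import fcntl
import os
import tempfile


def append_exclusively(path: str, line: str) -> None:
    """Append one line while holding an exclusive advisory lock on the file."""
    with open(path, "a") as handle:
        # Blocks until no other process holds a conflicting lock.
        fcntl.flock(handle, fcntl.LOCK_EX)
        try:
            handle.write(line + "\n")
            handle.flush()
            # Push the data to storage before releasing the lock.
            os.fsync(handle.fileno())
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


if __name__ == "__main__":
    # Hypothetical path; on a real cluster this would live on the GFS2 mount.
    demo = os.path.join(tempfile.mkdtemp(), "shared.log")
    append_exclusively(demo, "entry from node A")
    append_exclusively(demo, "entry from node B")
    with open(demo) as handle:
        print(handle.read(), end="")
```

An application that writes to shared files without any such coordination may work fine on a single node and still produce interleaved or lost updates once a second node starts writing to the same files.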

Nevertheless, there are some interesting scenarios where a dual-primary DRBD setup makes a lot of sense.

Deploying DRBD with Citrix XenServer

Now the only thing left for you to do is to download our GFS2 on a dual-primary DRBD tech-guide and get to work. Keep in mind that this tech-guide focuses on this one topic only and cannot replace general knowledge of, and other considerations about, cluster configuration.


For any feedback regarding this tech-guide just drop us a line.
