Posts

Monitoring: better safe than sorry…

After stumbling upon the “Holy time-travellin’ DRBD, batman!” blog post, there’s only one thing to be said:

“Be strict in what you emit, liberal in what you accept” (thanks, Larry)

is simply not true when dealing with mission-critical systems.

It’s OK to be alerted when upgrading a machine because the “old, working” regex that did the parsing doesn’t match anymore (e.g. because /proc/drbd gained an additional field); it’s not a problem to get an email when someone adds the 100th DRBD resource and causes the grep to fail; and so on.
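To make the point concrete, here is a minimal Python sketch of a deliberately strict check. It assumes the DRBD 8.x-style status line layout of /proc/drbd (as in the example in the comment); the names STATUS_LINE, UnexpectedFormat and parse_status_lines are made up for illustration and are not part of any DRBD tooling. The idea is simply that anything the pattern does not fully match raises an error, so the monitoring system alerts instead of guessing.

```python
import re

# Assumed DRBD 8.x-style status line, for example:
#  " 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----"
# The exact field layout is an assumption of this sketch.
STATUS_LINE = re.compile(
    r"\s*(?P<minor>\d+): cs:(?P<cs>\w+) ro:(?P<ro>\w+/\w+) "
    r"ds:(?P<ds>\w+/\w+) (?P<proto>[ABC]) (?P<flags>\S+)"
)


class UnexpectedFormat(Exception):
    """Raised when a status line does not match the expected layout."""


def parse_status_lines(text):
    """Parse every '<minor>: ...' line strictly; fail loudly on any surprise."""
    results = []
    for line in text.splitlines():
        # Only lines beginning with a device minor number are status lines.
        if not re.match(r"\s*\d+:", line):
            continue
        match = STATUS_LINE.fullmatch(line)
        if match is None:
            # Being strict: an unknown layout is an alert, not a best guess.
            raise UnexpectedFormat("cannot parse status line: %r" % line)
        results.append(match.groupdict())
    return results
```

A caller would feed it the contents of /proc/drbd and turn UnexpectedFormat into an alert. The minor number is matched with `\d+`, so a 100th resource does not break the check by itself.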

DRBD resources need different monitor intervals

As briefly mentioned in Pacemaker Explained, DRBD devices need two different values set for their monitor intervals:

primitive pacemaker-resource-name ocf:linbit:drbd         \
        params drbd_resource="drbd-resource"              \
        op monitor interval="61s" role="Slave"            \
        op monitor interval="59s" role="Master"

The reason is that Pacemaker distinguishes monitor operations by their resource and their interval, but not by their role. So if the two intervals are not made different “manually”, as in the configuration above, Pacemaker will monitor only one of the two (and, with DRBD 9, more) nodes, which is usually not what you want.
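The effect can be illustrated with a toy model in Python. This is not Pacemaker’s actual code; register_ops and the resource name drbd0 are hypothetical. It only shows what happens when operations are keyed by resource and interval while the role is left out of the key.

```python
def register_ops(resource, ops):
    """Toy model: store monitor ops keyed by (resource, interval) only.

    ops is a list of (interval, role) pairs. The role is not part of the
    key, so two ops with the same interval collapse into a single entry.
    Which of the two survives here is an artifact of this sketch, not a
    statement about Pacemaker's behaviour.
    """
    table = {}
    for interval, role in ops:
        table[(resource, interval)] = role
    return table


# Same interval for both roles: only one monitor operation remains.
same = register_ops("drbd0", [("60s", "Slave"), ("60s", "Master")])

# Different intervals, as in the configuration above: both remain.
distinct = register_ops("drbd0", [("61s", "Slave"), ("59s", "Master")])

print(len(same))      # 1
print(len(distinct))  # 2
```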