LINBIT participates in the German Cloud (“Deutsche Wolke”)


Deutsche Wolke (“German Cloud”) was founded to establish Federal Cloud Infrastructure in Germany.

This infrastructure will provide additional legal and security protections for hosted data.  No longer will small businesses be exposed to the legal risk of losing their website presence without a trial (an unfortunate reality when doing business on transatlantic clouds).

The natural partner for backend storage infrastructure is LINBIT; as authors and maintainers of DRBD, we are best suited to provide the technical expertise to achieve High Availability.  Also, LINBIT DR is the obvious choice for off-site or disaster recovery replication (from the office into the cloud).

We at LINBIT look forward to seeing this project grow and prosper!

Monitoring: better safe than sorry…

Stumbling upon the “Holy time-travellin’ DRBD, batman!” blog post, there’s only one thing to be said …

Be strict in what you emit, liberal in what you accept[1. Thanks, Larry]

is simply not true when dealing with mission-critical systems.

It’s OK to be alerted when upgrading a machine because the “old, working” RegEx that did the parsing doesn’t match anymore[1. e.g. because /proc/drbd got an additional field]; it’s not a problem to get an email when someone adds the 100th DRBD resource and causes the grep to fail; and so on. Read more
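To make that concrete, here is a minimal sketch of such a strict check, written as a hypothetical Nagios-style plugin; the expected-state string is an assumption you would adapt to your own node and resources:

#!/bin/sh
# Hypothetical strict DRBD check: fail loudly unless /proc/drbd shows
# exactly the state we expect on this node.
EXPECTED='cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate'
if grep -q "$EXPECTED" /proc/drbd; then
        echo "OK - DRBD resource in expected state"
        exit 0
fi
echo "CRITICAL - /proc/drbd does not match '$EXPECTED'"
exit 2

A false alarm after an upgrade or a configuration change is far cheaper than a check that keeps reporting “OK” on data it no longer understands.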

Maximum volume size on DRBD

From time to time we get asked things like this:

“I want to use a 10TiB volume with DRBD, is that supported?”

The easiest way to answer questions like that is to say: look for yourself on the public DRBD usage page – the biggest publicly reported device size is ~220TiB, so go figure 😉 (JUNE 2019 UPDATE: The largest device is now up to 600 TiB.) Read more

Trust, but verify

DRBD tries to ensure data integrity across different computers, and it’s quite good at it.

But, as per the old saying Trust, But Verify[1. attributed either to Lenin or Kennedy], it might be a good idea to periodically test whether the nodes really have identical data, similar to the checks that are[1. or at least can be] done for RAID sets. Read more
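In DRBD terms this boils down to the online verification feature: configure a verify-alg in the resource’s net section and start a verify run periodically, for instance from cron. A sketch, with the algorithm and schedule chosen purely as examples:

resource r0 {
        net {
                verify-alg sha1;        # checksum algorithm used to compare blocks on both nodes
        }
}

# /etc/cron.d/drbd-verify -- weekly verification run (example schedule)
42 0 * * 0      root    /sbin/drbdadm verify all

Blocks found to be out of sync are only logged; resynchronizing them still takes a disconnect/connect cycle of the affected resource.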

DRBD and the sync rate controller (8.3.9 and above)

The sync-rate controller controls the bandwidth used during resynchronization (not normal replication); it runs on the node in the SyncTarget state, i.e. the (inconsistent) receiving side. Read more
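As a rough illustration, the dynamic controller is driven by a handful of settings; shown here in 8.3-style syncer-section syntax, with all numbers made up and in need of tuning for your hardware and network:

syncer {
        c-plan-ahead    20;     # look-ahead in tenths of a second; 0 disables the dynamic controller
        c-fill-target   1M;     # amount of in-flight resync data to aim for
        c-min-rate      10M;    # never throttle the resync below this
        c-max-rate      100M;   # hard upper limit for the resync rate
}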

DRBD causes too much CPU-load

The TL;DR version: don’t use data-integrity-alg in a production setup. Read more
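For reference, this is the net-section option in question; if you enable it at all, treat it as a temporary debugging aid (the algorithm shown is just an example), not something to leave on in production:

net {
        # data-integrity-alg sha1;      # checksums every replicated packet -- handy when hunting
                                        # suspected corruption, but costly in CPU on every write
}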

“al-extents” explained

There is quite a bit of confusion about the DRBD configuration value al-extents (activity log extents), so here’s another shot at explaining it. Read more
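As a quick pointer to where the knob lives: al-extents sits in the disk section, and each activity-log extent covers 4 MiB of the backing device; the value below is purely illustrative:

disk {
        al-extents 3389;        # 3389 extents * 4 MiB each = roughly 13 GiB of "hot" area
}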

Make the kernel start write-out earlier

Similar to the recent post about setting the vm.min_free_kbytes value, there’s another sysctl that might improve the behaviour: the dirty ratio. Read more
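A hedged example of nudging these knobs; the numbers are illustrative and the right values depend on RAM size and workload:

# /etc/sysctl.conf (or a snippet under /etc/sysctl.d/)
vm.dirty_background_ratio = 5   # start background write-out once 5% of memory is dirty
vm.dirty_ratio = 10             # block writers once 10% of memory is dirty

# apply the new values without rebooting
sysctl -p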

DRBD resources need different monitor intervals

As briefly mentioned in Pacemaker Explained, DRBD devices need two different values set for their monitor intervals:

primitive pacemaker-resource-name ocf:linbit:drbd         \
        params drbd_resource="drbd-resource"              \
        op monitor interval="61s" role="Slave"            \
        op monitor interval="59s" role="Master"

The reason is that Pacemaker distinguishes monitor operations by their resource and their interval – but not by their role. So, if this distinction is not done “manually”, Pacemaker will monitor only one of the two (or, with DRBD 9, more) nodes, which is usually not what you want.

Increase vm.min_free_kbytes for better OOM resistance

Depending on your setup and your workload (e.g. within a virtual machine with little memory and much I/O), you could get into the situation where the kernel has little memory left and wants to write some dirty pages to disk, but cannot, because doing so would itself need some free memory! Read more
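A minimal example of raising the reserve; the value is an assumption and should be sized to your RAM and I/O pattern:

# keep roughly 128 MiB free for the kernel's own allocations (illustrative value)
sysctl -w vm.min_free_kbytes=131072

# make it persistent across reboots
echo 'vm.min_free_kbytes = 131072' >> /etc/sysctl.conf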