DRBD tries to ensure data integrity across different computers, and it’s quite good at it.
But, as the old saying goes, "Trust, but verify" (attributed either to Lenin or Kennedy), it might be a good idea to periodically test whether the nodes really have identical data, similar to the checks that are (or at least can be) done for RAID sets.
The verify-alg setting names a digest algorithm that is used to save bandwidth during online verification. Without this setting the whole data has to be transferred (unless you opt to verify only a part of it); with a value of md5, only a 16-byte digest is needed for each 4 KiB block, resulting in bandwidth savings of about 99.6% (sha1, with its 20-byte digest, saves about 99.5%).
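As a quick sanity check of that arithmetic, here is a small Python sketch. It only compares digest size against block size and ignores any protocol overhead DRBD adds per request, so treat the numbers as an upper bound rather than a measurement. The function name bandwidth_savings is made up for this illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # bytes per verified block (4 KiB), as in the text


def bandwidth_savings(digest_name: str, block_size: int = BLOCK_SIZE) -> float:
    """Fraction of payload bandwidth saved by sending a digest instead of the block.

    Assumption: only the digest travels over the wire; packet headers are ignored.
    """
    digest_size = hashlib.new(digest_name).digest_size
    return 1.0 - digest_size / block_size


for name in ("md5", "sha1"):
    size = hashlib.new(name).digest_size
    print(f"{name}: {size} bytes per block, {bandwidth_savings(name):.2%} saved")
```

An MD5 digest is 16 bytes and a SHA-1 digest is 20 bytes, so both algorithms land well above 99% savings for 4 KiB blocks.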
If the volume you’re checking is actively used, you might see a few false positives in the log messages:
kernel: block drbd0: Out of sync: start=56079768, size=8 (sectors)
This is because the application might change data blocks in RAM after submitting the write request (but before getting it acknowledged!), so the two nodes can end up comparing different generations of the data. If you do this check e.g. every week and get different block numbers every time, you're fine. If you get the same block number(s) every time, your storage might have stuck bits and be unable to write data to these blocks correctly!
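Comparing block numbers between runs by hand gets tedious, so here is a minimal Python sketch of how it could be automated. It assumes every report looks like the sample log line above and that you have saved the kernel log of each verification run as a separate string; the helper names out_of_sync_ranges and recurring_ranges, and the sample data for week2, are invented for this example.

```python
import re
from collections import Counter

# Matches lines shaped like the sample above (assumed format):
# kernel: block drbd0: Out of sync: start=56079768, size=8 (sectors)
OOS_RE = re.compile(r"block (\S+): Out of sync: start=(\d+), size=(\d+) \(sectors\)")


def out_of_sync_ranges(log_text):
    """Return the set of (device, start, size) tuples reported in one run's log."""
    return {
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in OOS_RE.finditer(log_text)
    }


def recurring_ranges(runs, min_runs=2):
    """Return ranges that were reported in at least min_runs separate runs."""
    counts = Counter()
    for log_text in runs:
        # A set is used so each range counts at most once per run.
        counts.update(out_of_sync_ranges(log_text))
    return sorted(r for r, n in counts.items() if n >= min_runs)


# Hypothetical logs from two weekly runs.
week1 = "kernel: block drbd0: Out of sync: start=56079768, size=8 (sectors)\n"
week2 = (
    "kernel: block drbd0: Out of sync: start=56079768, size=8 (sectors)\n"
    "kernel: block drbd0: Out of sync: start=1024, size=8 (sectors)\n"
)

print(recurring_ranges([week1, week2]))
```

An empty result means no range showed up twice, which is the harmless case described above. A range that keeps coming back is the one worth investigating on the storage side.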
Please note that the verify-alg setting needed here sounds similar to the data-integrity-alg option, but serves a different purpose. data-integrity-alg means more CPU usage for every write and, like verify-alg, it is subject to false positives; see here for details on both of these points.