As an update to the earlier blog post, take a look below.
As a reminder: this is about resynchronization (i.e. recovery after a node or network problem), not about replication itself.
If you’ve got a demanding application, it may completely fill your I/O bandwidth (disk and/or network), leaving no room for the synchronization to complete. To make the synchronization slow down and let the application proceed, DRBD has the dynamically adaptive resync rate controller.
It is enabled by default with 8.4, and disabled by default with 8.3.
To explicitly enable or disable it, set c-plan-ahead to 20 (enable) or 0 (disable).
Note that, while enabled, the setting for the old fixed sync rate is used only as an initial guess for the controller. After that, only the c-* settings are used, so changing the fixed sync rate while the controller is enabled won’t have much effect.
What it does
The resync controller tries to use up as much network and disk bandwidth as it can get, but no more than c-max-rate, and throttles if either
- more resync requests are in flight than what amounts to c-fill-target (or, if c-fill-target is set to 0, if the current estimated response delay from the peer is more than c-delay-target), or
- it detects application IO (read or write), and the current estimated resync rate is above c-min-rate. The default for c-min-rate with 8.4.x is 250 kiB/sec (the old default of the fixed sync-rate); with 8.3.x it was 4 MiB/sec.
This “throttle if application IO is detected” is active even if the fixed sync rate is used. You can (but should not, see below) disable this specific throttling by setting c-min-rate to 0.
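The two throttling conditions can be written down as a small decision function. This is only an illustrative sketch of the rules as described in this post, not DRBD's actual kernel code: the names `ControllerSettings`, `should_throttle` and `capped_rate` are made up for the example, the units are arbitrary but consistent, and it assumes that a c-min-rate of 0 switches off the application-IO throttle.

```python
from dataclasses import dataclass


@dataclass
class ControllerSettings:
    """Hypothetical container mirroring the c-* settings discussed above."""
    c_max_rate: int      # upper bound for the resync rate
    c_min_rate: int      # assumption: 0 disables the application-IO throttle
    c_fill_target: int   # allowed amount of in-flight resync data; 0 = use delay target
    c_delay_target: int  # allowed estimated response delay from the peer


def should_throttle(s, in_flight, peer_delay, resync_rate, app_io_seen):
    """Return True if either throttling condition from the text applies."""
    # Condition 1: too much resync data in flight, or (with c-fill-target
    # set to 0) the peer's estimated response delay is too high.
    if s.c_fill_target > 0:
        if in_flight > s.c_fill_target:
            return True
    elif peer_delay > s.c_delay_target:
        return True
    # Condition 2: application IO was detected while the resync is
    # currently running faster than c-min-rate.
    if s.c_min_rate > 0 and app_io_seen and resync_rate > s.c_min_rate:
        return True
    return False


def capped_rate(s, wanted_rate):
    """The controller never asks for more than c-max-rate."""
    return min(wanted_rate, s.c_max_rate)
```

Note how c-min-rate acts as a floor only in the presence of application IO: as long as the resync is already at or below that rate, application activity alone does not slow it down any further.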
Tuning the resync controller
It’s hard, or next to impossible, for DRBD to detect how much activity your backend can handle. But it is very easy for DRBD to know how much resync-activity it causes itself.
So, you tune how much resync-activity you allow during periods of application activity.
To do that you should
- set c-plan-ahead to 20 (default with 8.4), or more if there’s a lot of latency on the connection (WAN link with protocol A);
- leave the fixed resync rate (the initial guess for the controller) at about 30% or less of what your hardware can handle;
- set c-max-rate to 100% (or slightly more) of what your hardware can handle;
- set c-fill-target to the minimum (just as high as necessary) that gets your hardware saturated, if the system is otherwise idle. In other words, figure out the maximum possible resync rate in your setup while the system is idle, then set c-fill-target to the minimum setting that still reaches that rate.
- And finally, while checking application request latency/responsiveness, tune c-min-rate to the maximum that still allows for acceptable responsiveness.
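As a rough starting point, the rules of thumb above can be turned into numbers. The helper below is a hypothetical sketch: `suggest_disk_section` is not part of any DRBD tooling, the output assumes the DRBD 8.4 layout where these options live in the `disk` section, and the `k` suffix is assumed to denote KiB/s (check the drbd.conf man page for your version). The c-fill-target value and the final c-min-rate cannot be computed; they have to come from your own measurements.

```python
def suggest_disk_section(hw_max_kib, c_fill_target, c_min_rate_kib=250, c_plan_ahead=20):
    """Render a starting-point `disk` section from a measured hardware maximum.

    hw_max_kib     -- resync rate the idle system can sustain, in KiB/s (measured)
    c_fill_target  -- smallest value found to still reach that rate, e.g. "1M"
    c_min_rate_kib -- to be raised step by step while watching application latency
    """
    # Fixed resync rate: about 30% of what the hardware can handle.
    resync_rate_kib = hw_max_kib * 30 // 100
    # c-max-rate: 100% of what the hardware can handle.
    c_max_rate_kib = hw_max_kib
    return "\n".join([
        "disk {",
        f"    c-plan-ahead {c_plan_ahead};",
        f"    resync-rate {resync_rate_kib}k;",
        f"    c-max-rate {c_max_rate_kib}k;",
        f"    c-fill-target {c_fill_target};",
        f"    c-min-rate {c_min_rate_kib}k;",
        "}",
    ])


# Placeholder numbers: a setup measured at 100 MiB/s while idle,
# with "1M" as an example c-fill-target found by experiment.
print(suggest_disk_section(102400, "1M"))
```

Treat the result as a first guess only; the last step (raising c-min-rate while watching application responsiveness) is inherently iterative.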
Most parts of this post were originally published as a mailing list post by Lars.