About nine years ago, LINBIT introduced the dynamic sync-rate controller in DRBD 8.3.9. The goal behind this was to tune the amount of resync traffic so as not to interfere or compete with application IO to the DRBD device. We did this by examining the current state of the resync, application IO, and network ten times a second, and then deciding how many resync requests to generate.
We knew from experimentation that we could achieve higher resync throughput if we polled the situation more than ten times a second during a resync. However, we didn’t want to shorten this interval by default, as this would uselessly consume CPU cycles for DRBD installations on slower hardware. With DRBD 9.0.17, we now have some additional logic. The resync-rate controller will poll the situation both every 100ms AND whenever all outstanding resync requests have completed, even before the 100ms timer expires.
We estimate that these improvements will only benefit storage and networks capable of going faster than 800MiB/s. For our benchmarks, I used two identical systems running CentOS 7.6.1810. Both are equipped with 3x Samsung 960 Pro M.2 NVMe drives, configured in RAID0 via Linux software RAID. My initial baseline test found that in this configuration the disk can achieve a throughput of around 3.1GiB/s (4k sequential writes). For the replication network, we have dual-port 40Gb/s Mellanox ConnectX-5 devices, configured as a bonded interface using mode 2 (balance-xor). This crossover network was benchmarked at 37.7Gb/s (4.3GiB/s) using iperf3. These are the same systems used in my earlier NVMe-oF vs. iSER comparison.
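For reference, baselines like these can be measured with standard tools. Below is a minimal sketch of such a run; the exact fio job options I used are not recorded here, so the ones shown are assumptions, and <peer-ip> is a placeholder for the bonded interface address:

# fio --name=baseline --filename=/dev/md0 --rw=write --bs=4k \
      --ioengine=libaio --iodepth=64 --direct=1 --runtime=60 \
      --time_based --group_reporting    # destructive: writes to the raw md device
# iperf3 -s                             # on the receiving node
# iperf3 -c <peer-ip> -t 30             # on the sending node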
I then ran through multiple resyncs to work out how best to tune the DRBD configuration for this environment. It was no surprise that DRBD 9.0.17 looked to be the clear winner. However, that was with full tuning applied, and I wanted to find out what would happen if we scaled the tuning back slightly. I repeated the test using the default values for sndbuf-size and rcvbuf-size. The results were fairly similar, but surprisingly 9.0.16 did a little better. For my “fully tuned” test, my configuration was the default except for the following tuning:
resync-rate 3123M; # bytes/second
c-plan-ahead 20; # 1/10 seconds (default value)
c-delay-target 10; # 1/10 seconds (default value)
c-fill-target 1M; # bytes
c-max-rate 3123M; # bytes/second
c-min-rate 250k; # bytes/second (default value)
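For context, these options live in the disk section of a DRBD resource file, while sndbuf-size and rcvbuf-size belong in the net section. The following is an illustrative sketch only; the resource name and the 10M buffer values are assumptions, not the exact values from my test:

# cat /etc/drbd.d/r0.res
resource r0 {
    net {
        sndbuf-size 10M;    # hypothetical value; 0 means auto-tune
        rcvbuf-size 10M;    # hypothetical value
    }
    disk {
        resync-rate    3123M;
        c-plan-ahead   20;
        c-delay-target 10;
        c-fill-target  1M;
        c-max-rate     3123M;
        c-min-rate     250k;
    }
    # ... volume and connection definitions omitted ...
}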
Three tests were run. All tests were done with a 500GiB LVM volume using the /dev/md0 device as the physical volume, and a blkdiscard was run against the LVM volume between each test. The first test had no application IO to the /dev/drbd0 device and omitted the sndbuf-size and rcvbuf-size tuning. The second had no application IO and was fully tuned as per the configuration above. The third test was again fully tuned with the configuration above; however, immediately after the resource was promoted to primary, I began a loop that used ‘dd’ to write 1M chunks of zeros to the disk sequentially.
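The write loop was along these lines; the exact dd flags and the LVM volume path are assumptions:

# Between runs: discard the LVM volume (VG/LV names are placeholders)
blkdiscard /dev/<vg>/<lv>

# During the third test: sequential 1M writes of zeros to the DRBD device
i=0
while true; do
    dd if=/dev/zero of=/dev/drbd0 bs=1M count=1 seek="$i" oflag=direct 2>/dev/null
    i=$((i + 1))
done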
From the graph above, we can see that – when idle – DRBD 9.0.17 can resync at the speeds the physical disk will allow, whereas 9.0.16 seems to top out around 2000MiB/s and DRBD 8.4.11 can’t push past 1500MiB/s. However, when we introduce IO to the DRBD virtual disk, both DRBD 9 versions scale the resync back to roughly the same speeds. Surprisingly, DRBD 8.4 doesn’t throttle down as much and hovers around 700MiB/s; this is most likely due to an increase in IO lockout granularity between versions 8 and 9. However, faster is not necessarily better here: it is usually preferable for the resync to throttle down and “step aside” so that application IO can take priority.
Have questions? Submit them below. We’re happy to help!
LINBIT wants to make your testing of our software easy! We’ve started creating Ansible playbooks that automate the deployment of our most commonly clustered software stacks.
The HA NFS playbook will automate the deployment of an HA NFS cluster, using DRBD 9 for replicated storage and Pacemaker as the cluster resource manager, quicker than you can brew a pot of coffee.
Email us for your free trial!
Be sure to write “Ansible” in the subject line.
Build an HA NFS Cluster using Ansible with packages from LINBIT.
- An account at https://my.linbit.com (contact [email protected]).
- The deployment environment must have Ansible installed.
- All target systems must have passwordless SSH access.
- All hostnames used in the inventory file must be resolvable (using IP addresses is preferable; see the example inventory below).
- Target systems are CentOS/RHEL 7.
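A minimal inventory might look like the following; the group name, addresses, and playbook filename are illustrative assumptions, so check the GitHub repository below for the actual layout:

# cat inventory
[nfs-cluster]
192.168.10.11
192.168.10.12

# ansible-playbook -i inventory site.yml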
More information is available: LINBIT Ansible NFS Cluster on GitHub.
LINBIT is opening up a new division to specifically address our community’s desire to turn the music up to 11! As you all know, LINBIT is famous for its DRBD, Software-Defined Storage (SDS), and Disaster Recovery (DR) solutions. In a paradigm-shifting turn of events by management, LINBIT has decided to expand into the music industry. Since there is so much business potential in playing live concerts, LINBIT has transformed five of its employees into LIN:BEAT – The Band. These concerts are of course made highly available by utilizing LINBIT’s own DRBD software.
The Band will be touring all Cloud and Linux events around the globe in 2019. Band members use self-written code to produce their unique sound design, reminiscent of drum and bass and heavily influenced by folk punk. The urge to portray all the advantages of DRBD and LINSTOR is so strong that they had to send a message to the world: their songs tell of the ups and downs of administrators who handle big storage clusters. LIN:BEAT offers a variety of styles in their musical oeuvre: while “LINSTOR” is a heavy-driven rock song and “Snapshots” speaks to funk-loving people, even “Disaster Recovery,” a love ballad, made it into their repertoire. Lead singer Phil Reisner sings sotto voce about his lost love — a RAID rack called “SDS”. Reisner told reporters, “Administrators are such underrated people. This is sadly unfortunate. We strive to give all administrators a voice. Even if it’s a musical one!”
Crowds will be jumping up and down in excitement when LIN:BEAT comes to town! Be there and code fair!
An Excerpt of the song “My first love is DRBD” written by Phil Reisner:
My first love is DRBD,
there has never been a fee
instead it serves proudly as open source
your replication has changed the course
It’s the crucial key for Linux’s destiny
Come and meet LINBIT at CloudFest 2019 in Europa-Park, Rust, Germany (23rd–29th of March 2019).
CloudFest has made a name for itself over the last few years as one of the best cloud-focused industry events in which to network and have a good time. This year more than 7,000 people are attending the event. Attendees will hear from leaders in the business and get the latest industry buzz.
The speaker line-up includes names like Dr. Ye Huang, Head of Solution Architects at Alibaba; Will Pemble, CEO at Goal Boss; Bhavin Turakhia, CEO at Flock; and Brian Behlendorf, inventor of the Apache Web server.
VISIT us at Booth H24!
LINBIT is announcing some exciting news at CloudFest: NVMe-oF with LINSTOR! This means LINSTOR can now be used as a standalone product, independent of DRBD. NVMe-oF supports InfiniBand with RDMA and allows ultrafast performance, easily handling workloads for Big Data analytics or Artificial Intelligence. Come say hello at CloudFest! Visit us at Booth H24.
We look forward to seeing you!
Booth visitors will be rewarded with a surprise that even your family will love! 🙂
A few weeks ago, LINBIT publicly released the LINSTOR CSI (Container Storage Interface) plugin. This means LINSTOR now has a standardized way of working with any container orchestration platform that supports CSI. Kubernetes is one of those platforms, so our developers put in the work to make LINSTOR integration with Kubernetes easy, and I’ll show you how!
You’ll need a couple of things to get started:
- Kubernetes Cluster (1.12.x or newer)
- LINSTOR Cluster
LINSTOR’s CSI plugin requires certain Kubernetes feature gates to be enabled on the kube-apiserver and each kubelet. Enable the KubeletPluginsWatcher and CSINodeInfo feature gates on the kube-apiserver by adding --feature-gates=KubeletPluginsWatcher=true,CSINodeInfo=true to the list of arguments passed to the kube-apiserver system pod in the /etc/kubernetes/manifests/kube-apiserver.yaml manifest. It should look something like this:
# cat /etc/kubernetes/manifests/kube-apiserver.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    ... snip ...
    - --feature-gates=KubeletPluginsWatcher=true,CSINodeInfo=true
    ... snip ...
To enable these feature gates on the kubelet, you’ll need to add the following argument to the KUBELET_EXTRA_ARGS variable located in /etc/sysconfig/kubelet: --feature-gates=CSINodeInfo=true,CSIDriverRegistry=true. Your config should look something like this:
# cat /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--feature-gates=CSINodeInfo=true,CSIDriverRegistry=true"
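After editing the file, the kubelet needs to be restarted to pick up the new arguments; on a systemd-managed node that typically means (exactly how KUBELET_EXTRA_ARGS is wired into the kubelet unit varies by distribution):

# systemctl restart kubelet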
Once you’ve modified those two configurations, you can prepare your configuration for the CSI plugin’s sidecar containers.
Use curl to download the latest version of the plugin definition:
# curl -O \
  https://raw.githubusercontent.com/LINBIT/linstor-csi/master/examples/k8s/deploy/linstor-csi.yaml
Set the value: of each instance of LINSTOR-IP in linstor-csi.yaml to the IP address of your LINSTOR Controller. The placeholder IP in the example yaml is 192.168.100.100, so we can use the following command to update this address (or you can edit it manually with an editor); simply set CON_IP to your controller’s IP address:
# CON_IP="x.x.x.x"; sed -i.example "s/192\.168\.100\.100/$CON_IP/g" linstor-csi.yaml
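As a quick sanity check (not part of the upstream instructions), you can confirm the substitution landed before applying the yaml:

# grep -n "$CON_IP" linstor-csi.yaml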
Finally, apply the yaml to the Kubernetes cluster:
# kubectl apply -f linstor-csi.yaml
You should now see the linstor-csi sidecar pods running in the kube-system namespace:

# watch -n1 -d kubectl get pods --namespace=kube-system --output=wide
Once they are running, you can define Kubernetes storage classes that point to your LINSTOR storage pools, from which you can then provision persistent volumes for your containers, optionally replicated by DRBD.
Here is an example yaml definition of a storage class that references a LINSTOR storage pool in my cluster named thin-lvm:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-autoplace-1-thin-lvm
provisioner: io.drbd.linstor-csi
parameters:
  autoPlace: "1"
  storagePool: "thin-lvm"
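Assuming that definition is saved as linstor-sc.yaml (the filename is arbitrary), apply it and confirm the storage class registered:

# kubectl apply -f linstor-sc.yaml
# kubectl get storageclass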
And here is an example yaml definition for a persistent volume claim carved out of the above storage class:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.beta.kubernetes.io/storage-class: linstor-autoplace-1-thin-lvm
  name: linstor-csi-pvc-0
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
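Apply the claim the same way and watch it become Bound (again, the filename is arbitrary):

# kubectl apply -f linstor-pvc.yaml
# kubectl get pvc linstor-csi-pvc-0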
Put it all together and you’ve got yourself an open source, high performance block device provisioner for your persistent workloads in Kubernetes!
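For example, a minimal pod that mounts the claim above could look like this; the pod name, image, and mount path are illustrative assumptions:

# kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: linstor-csi-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /data      # the LINSTOR-backed volume appears here
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: linstor-csi-pvc-0
EOF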
There are many ways to craft your storage class definitions for node selection, storage tiering, diskless attachments, or even off site replicas. We’ll be working on our documentation surrounding new features, so stay tuned, and don’t hesitate to reach out for the most UpToDate information about LINBIT’s software!
Read more: CSI Plugin for LINSTOR Complete.
With the LINSTOR volume driver for OpenStack, Linux storage created in OpenStack Cinder can be easily provisioned, managed and seamlessly replicated across a large Linux cluster.
LINSTOR is an open-source storage orchestrator designed to deliver easy-to-use software-defined storage in Linux environments. LINSTOR uses LINBIT’s DRBD to replicate block data with minimal overhead and CPU load. Managing a LINSTOR storage cluster is as easy as a few LINSTOR CLI commands or a few lines of Python code with the LINSTOR API.
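To give a sense of what that looks like, here is a sketch of creating a replicated volume from the LINSTOR CLI; the node names, addresses, pool and volume-group names, and sizes are illustrative, and subcommand syntax may differ slightly between LINSTOR versions:

# linstor node create alpha 192.168.10.11
# linstor node create bravo 192.168.10.12
# linstor storage-pool create lvm alpha pool0 vg0   # LVM-backed pool on each node
# linstor storage-pool create lvm bravo pool0 vg0
# linstor resource-definition create demo
# linstor volume-definition create demo 10G
# linstor resource create demo --auto-place 2       # place replicas on two nodes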
LINSTOR pairs with OpenStack
OpenStack paired with LINSTOR brings even greater power and flexibility by enabling Linux to become your SDS platform. Replicate storage wherever you need it with simple mouse clicks. Provision snapshots. Create new volumes with those snapshots. LINSTOR volumes can then be paired with the right compute nodes just as easily. Together, OpenStack and LINSTOR bring tremendous potential to provide robust infrastructure with ease, all powered by open-source.
Data replicated with LINSTOR can minimize downtime and data loss. Running your cloud on commodity hardware with native Linux features underneath provides the most flexible, reliable, and cost-effective solution for hosting customized OpenStack deployments anywhere.
In addition to storage management and replication, LINBIT also offers Geo-Clustering solutions that work with LINSTOR to enable long-distance data replication inside private and public cloud environments.
For a quick recap, please check out this video on deploying LINSTOR volumes with OpenStack’s Horizon GUI.
For more information about LINBIT’s DRBD and LINSTOR, visit:
- LINSTOR OpenStack drivers
- LINSTOR driver documentation
- LINBIT’s LINSTOR webpage