Kubernetes High Availability for Stateful Workloads

LINBIT® is a company with deep roots in Linux High Availability (HA). Because of this, LINBIT has some opinions on what HA is, and how it can be achieved.

Kubernetes’ approach to HA generally involves sprawling many replicas of an application across many cluster nodes, which makes the failure of a single node or application instance less impactful. This approach is great for stateless applications, or applications that can tolerate the performance of shared storage, like front-end webapps or APIs.

In contrast, I/O-demanding stateful applications and monolithic applications, like certain databases or ERP systems, often do not “sprawl” well, or at all. As a result, these applications are “on their own” in terms of achieving high availability in Kubernetes.

LINSTOR®’s High Availability Controller aims to provide high availability to pods in Kubernetes that cannot achieve this on their own.

StatefulSets, Deployments, and ReplicaSets in Kubernetes will eventually reschedule their pods from failed nodes, in accordance with their defined replica counts. The time and user intervention it takes to do that, however, is not what LINBIT typically considers highly available behavior. The pod eviction timeout in recent Kubernetes versions is 5 minutes. A “5 nines” availability target allows only about 5.26 minutes of downtime per year, so once you add the time needed to detect the failure and restart the pod, a single outage would drop your application’s uptime beneath that threshold for the year.
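To make the “5 nines” arithmetic concrete, here is a minimal Python sketch that computes the yearly downtime budget for a given number of nines and compares it with the 5 minute eviction timeout. The function name and the 365-day year are assumptions made for illustration only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(nines: int) -> float:
    """Return the downtime allowed per year, in minutes, for N nines."""
    unavailability = 10 ** -nines  # for example, 5 nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    eviction_timeout = 5.0  # minutes, the default discussed above
    for nines in (3, 4, 5):
        budget = downtime_budget_minutes(nines)
        remaining = budget - eviction_timeout
        print(f"{nines} nines: budget {budget:.2f} min, "
              f"left after one eviction wait {remaining:.2f} min")
```

For five nines, the budget works out to roughly 5.26 minutes, which leaves only about 15 seconds for failure detection and pod startup after a single 5 minute eviction wait.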

Prior to Kubernetes v1.18, I would set the --pod-eviction-timeout flag on the kube-controller-manager for more aggressive pod eviction, but that was “forever ago”, and the flag is no longer supported. Also, StatefulSets are “stickier” than Deployments or ReplicaSets, and require additional attention from their storage provider before they can be rescheduled.

LINSTOR’s HA Controller aims to improve pod eviction behavior for workloads backed by LINSTOR volumes. It does this by inspecting the quorum status of the DRBD® devices that LINSTOR provisions. If the replication network breaks, the active replica of the volume loses quorum, and LINSTOR’s HA Controller will move the StatefulSet’s pod to another worker that can access a replica of the volume.
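The quorum idea can be illustrated with a simplified majority rule. The Python sketch below is only a conceptual model and is not how DRBD or the HA Controller is implemented: the function name is hypothetical, and real DRBD quorum has additional rules that are not modeled here.

```python
def has_quorum(reachable_peers: int, total_nodes: int) -> bool:
    """Simplified majority rule.

    A node keeps quorum when it, together with the peers it can still
    reach, makes up more than half of all nodes participating in the
    volume's replication.
    """
    if total_nodes < 1 or not 0 <= reachable_peers < total_nodes:
        raise ValueError("reachable_peers must be in [0, total_nodes)")
    return (reachable_peers + 1) * 2 > total_nodes


if __name__ == "__main__":
    # Three participating nodes: a node cut off from both peers loses
    # quorum, while the two nodes that still see each other keep it.
    print("isolated node has quorum:", has_quorum(0, 3))
    print("connected pair has quorum:", has_quorum(1, 3))
```

In this model, the pod on the isolated node is the one that would need to move, because the remaining nodes still form a majority and can safely serve the volume.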

Deployment of LINSTOR’s HA Controller for Stateful Workloads

Detailed steps for deployment of LINSTOR’s HA Controller can be found in the LINSTOR User’s Guide. The quick version of those steps is as follows:

  1. Deploy the HA Controller using Helm:

       helm install linstor-ha-controller linstor/linstor-ha-controller

  2. Add the following parameters to your LINSTOR StorageClasses:

       parameters:
         property.linstor.csi.linbit.com/DrbdOptions/auto-quorum: suspend-io
         property.linstor.csi.linbit.com/DrbdOptions/Resource/on-no-data-accessible: suspend-io
         property.linstor.csi.linbit.com/DrbdOptions/Resource/on-suspended-primary-outdated: force-secondary
         property.linstor.csi.linbit.com/DrbdOptions/Net/rr-conflict: retry-connect

If you need to exclude a pod from LINSTOR’s HA Controller for any reason, marking it with the following annotation will cause the controller to ignore it:

  kubectl annotate pod <podname> drbd.linbit.com/ignore-fail-over=""

Example Using LINSTOR’s HA Controller

For example, here is a two-replica LINSTOR StorageClass with the settings needed for LINSTOR’s HA Controller:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: "linstor-csi-lvm-thin-r2"
provisioner: linstor.csi.linbit.com
parameters:
  autoPlace: "2"
  storagePool: "lvm-thin"
  property.linstor.csi.linbit.com/DrbdOptions/auto-quorum: suspend-io
  property.linstor.csi.linbit.com/DrbdOptions/Resource/on-no-data-accessible: suspend-io
  property.linstor.csi.linbit.com/DrbdOptions/Resource/on-suspended-primary-outdated: force-secondary
  property.linstor.csi.linbit.com/DrbdOptions/Net/rr-conflict: retry-connect
reclaimPolicy: Delete

Using the following StatefulSet definition, create a workload backed by the linstor-csi-lvm-thin-r2 StorageClass:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: webapp
spec:
  selector:
    matchLabels:
      app: web
  serviceName: web-svc
  replicas: 1
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: httpd:latest
        ports:
        - containerPort: 80
          hostPort: 2080
          name: http
        volumeMounts:
        - name: www
          mountPath: /usr/local/apache2/htdocs
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: linstor-csi-lvm-thin-r2
      resources:
        requests:
          storage: 1Gi

You can then “fail” the worker node running the pod by running echo b > /proc/sysrq-trigger on it as root, which immediately resets the node. Thanks to the LINSTOR HA Controller, the StatefulSet-managed pod should be migrated long before Kubernetes’ pod eviction would have kicked in.
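If you want to measure how long the move takes, one option is to poll the node that the pod is assigned to. The following Python sketch assumes kubectl is installed and configured on the machine running it; the helper names, the pod name webapp-0, and the default namespace are assumptions based on the StatefulSet above. Note that it only reports when the pod is assigned to a different node, not when the pod becomes ready.

```python
import subprocess
import time


def current_node(pod: str, namespace: str = "default") -> str:
    """Return the node the pod is assigned to, or "" if unknown."""
    result = subprocess.run(
        ["kubectl", "get", "pod", pod, "-n", namespace,
         "-o", "jsonpath={.spec.nodeName}"],
        capture_output=True, text=True,
    )
    return result.stdout.strip()


def wait_for_move(get_node, old_node, timeout=600.0, interval=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll get_node() until it reports a node other than old_node.

    Returns the elapsed time in clock units, or None on timeout.
    """
    start = clock()
    while clock() - start < timeout:
        node = get_node()
        if node and node != old_node:
            return clock() - start
        sleep(interval)
    return None


if __name__ == "__main__":
    pod = "webapp-0"  # assumed name of the StatefulSet's only pod
    old = current_node(pod)
    print(f"{pod} is on {old!r}; reset that node now")
    elapsed = wait_for_move(lambda: current_node(pod), old)
    if elapsed is None:
        print("pod did not move before the timeout")
    else:
        print(f"pod was assigned to a new node after {elapsed:.0f} seconds")
```

Run the script from a machine other than the worker you are about to reset, start it first, and then trigger the reset on the node it reports.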

Concluding Thoughts

LINSTOR’s HA Controller for Kubernetes can help lower the recovery time of stateful workloads in Kubernetes, which increases availability and helps to maintain SLAs.

The described software (LINSTOR, LINSTOR Operator for Kubernetes, and LINSTOR HA Controller) are all components of LINBIT SDS for Kubernetes. LINBIT SDS for Kubernetes is a bundle of access to prebuilt container images and 24×7 enterprise class support from the creators and maintainers of LINSTOR. The components are also freely available from the open source CNCF-sandboxed Piraeus project.

Matt Kereczman

Matt Kereczman is a Solutions Architect at LINBIT with a long history of Linux System Administration and Linux System Engineering. Matt is a cornerstone in LINBIT's technical team, and plays an important role in making LINBIT and LINBIT's customer's solutions great. Matt was President of the GNU/Linux Club at Northampton Area Community College prior to graduating with Honors from Pennsylvania College of Technology with a BS in Information Security. Open Source Software and Hardware are at the core of most of Matt's hobbies.
