
Kubernetes Operator: LINSTOR’s Little Helper

Before we describe what our LINSTOR Operator does, it is a good idea to discuss what a Kubernetes Operator actually is. If you are already familiar with Kubernetes Operators, feel free to skip the introduction.

Introduction

CoreOS describes Operators like this:

An Operator is a method of packaging, deploying and managing a Kubernetes application. A Kubernetes application is an application that is both deployed on Kubernetes and managed using the Kubernetes APIs and kubectl tooling.

That is quite a lot to grasp if you are new to the concept of Operators. Users want to get their work done without having to worry about setting up infrastructure. Still, there is more to a software lifecycle than firing up the application in a cluster once, and this is where an Operator comes into play. In my opinion, a good analogy is to think of a Kubernetes Operator as an actual human operator. So what would the responsibilities of such a human operator be?

A human operator would be an expert in the business logic of the software she runs and in the dependencies that need to be fulfilled to run it. Additionally, the operator would be responsible for configuring the software: scaling it, upgrading it, making backups, and so on. These are also the responsibilities of a Kubernetes Operator implemented in software. It is a software component built by experts for a particular containerized application, and it is executed by an administrator who is not necessarily an expert in that particular software.

A Kubernetes Operator is executed in the Kubernetes cluster itself. In contrast to shell scripts or Ansible playbooks, which are fairly generic, an Operator has one specific purpose, as well as access to Kubernetes cluster information. Additionally, Operators are managed by standard Kubernetes tools rather than by external configuration management tools. An Operator has a managed lifecycle of its own and is handled by the Operator Lifecycle Manager.

LINSTOR Operator

The CoreOS FAQ contains the following sentence:

Experience has shown that the creation of an Operator typically starts by automating an application’s installation and self-service provisioning capabilities, and then evolves to take on more complex automation.

This pretty much sums up the current state of the linstor-operator. The project is still very young, and the focus so far has been on automating the setup of a LINSTOR cluster. If you are familiar with LINSTOR, you know that there is a central component named the LINSTOR controller, and workers, named LINSTOR satellites, that actually create LVM volumes and configure DRBD to provide data redundancy.

Currently, the Operator can add new nodes/kubelets to the LINSTOR cluster by registering them, with their name and network interface configuration, with the LINSTOR controller. A second important task is making sure that a LINSTOR satellite can actually provide storage to the cluster; therefore, the linstor-operator can also register one or multiple storage pools. The operator also exposes metrics for the satellite set's storage pools. All of this saves the system administrator a lot of time, because the LINSTOR Kubernetes Operator automates many standard tasks that previously had to be performed manually.
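To give a feeling for what the operator automates under the hood, here is a minimal sketch of the equivalent manual registration using the low-level python-linstor client. The node name, address, and storage pool values are made up, and the exact call signatures may differ between python-linstor versions:

import linstor

# Connect to the LINSTOR controller (assumed to listen on localhost).
with linstor.MultiLinstor(['linstor://localhost']) as lin:
    # Register a satellite node by name and network interface address.
    lin.node_create('worker-1', 'SATELLITE', '192.168.0.11')
    # Register an LVM volume group on that node as a storage pool.
    lin.storage_pool_create('worker-1', 'pool-hdd', 'LVM', 'vg-hdd')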

Future work

There is a lot of possible future work. Some tasks are obvious; some will be driven by actual input from our users. For example, we think that configuring storage is one of the pain points for our users, so having a sidecar container that can discover and prepare storage pools to be consumed by LINSTOR might be a good idea. In a dynamic environment such as Kubernetes, it might be worthwhile to handle node failures in clever ways. We also have container images that can inject the DRBD kernel module into the running host kernel, which could help users get started with DRBD. High availability is always an important topic, and, related to that, so is using etcd as the database backend. Further, we want to tackle one of the core tasks of Operators: managing upgrades from one LINSTOR controller version to the next.

Thanks to Hayley for reviewing this blog post even though she is busy doing the actual work. This is just a subset of the capabilities we plan for the linstor-operator. Stay tuned for more information!


Roland Kammerer
Software Engineer at LINBIT
Roland Kammerer studied technical computer science at the Vienna University of Technology and graduated with distinction. Currently, he is a PhD candidate with a research focus on time-triggered real-time systems and works for LINBIT in the DRBD development team.
LINSTOR High Level Resource API

High Level Resource API – The simplicity of creating replicated volumes

In this blog post, we present one of our recent extensions to the LINSTOR ecosystem: A high-level, user-friendly Python API that allows simple DRBD resource management via LINSTOR.

Background: so far, LINSTOR components have communicated either via Protocol Buffers or via the Python API that is used in the linstor command line client. Protocol Buffers are a great way to transport serialized, structured data between LINSTOR components, but by themselves they don't provide the necessary abstraction for developers.

That is not the job of Protocol Buffers. Since the early days, we have split the command line client into the client logic (parsing configuration files, parsing command line arguments…) and a Python library (python-linstor). This Python library provides all the bits and pieces to interact with LINSTOR. For example, it provides a MultiLinstor class that handles TCP/IP communication with the LINSTOR controller. Additionally, it allows all the operations that are possible with LINSTOR (e.g. creating nodes, creating storage pools…). For perfectly valid reasons, this API is very low level and pretty close to the actual Protocol Buffer messages sent to the LINSTOR controller.
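To illustrate how close to the wire this is, here is a short sketch of listing the cluster nodes with the low-level library (the controller URI is an assumption):

import linstor

with linstor.MultiLinstor(['linstor://localhost']) as lin:
    # Each call maps closely to a Protocol Buffer message exchanged
    # with the controller and returns the raw reply objects.
    replies = lin.node_list()
    for node in replies[0].nodes:
        print(node.name)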

While developing more and more plugins to integrate LINSTOR into other projects like OpenStack, OpenNebula, Docker Volumes, and many more, we saw that there is a need for a higher-level abstraction.

Finding the Right Abstraction

The first dimension of abstraction is to abstract away LINSTOR internals. For example, it makes perfect sense that recreating an existing resource is an error on a low level (think of it as EEXIST). On a higher level, depending on the actual object, trying to recreate an object might be perfectly fine, and one simply wants to get the existing object (i.e. idempotency).
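As a small sketch of what this means in practice, using the high-level API from the demo below (the low-level behavior is only described in the comments):

import linstor

# Low level: sending a second create message for an existing resource
# definition fails, comparable to mkdir(2) returning EEXIST.
# High level: constructing the same Resource object twice is fine and
# simply refers to the existing resource (idempotency).
foo = linstor.Resource('foo')
foo_again = linstor.Resource('foo')  # not an error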

The second dimension is to abstract from DRBD and LINSTOR as a whole. Developers dealing with storage already have good knowledge of concepts like nodes, storage pools, resources, volumes, placement policies… This is the part where we can make LINSTOR and DRBD accessible to new developers.

The third goal was to provide only those objects that are important in the context of the user/developer. This, for example, means that we can assume the LINSTOR cluster is already set up, so we do not need to provide a high-level API to add nodes. For the higher-level API we can focus on [LINSTOR] resources. This allows us to satisfy the KISS (keep-it-simple-stupid) principle. A fourth goal was to introduce new, higher-level concepts like placement policies. Placement policies/templates are concepts currently being developed in core LINSTOR, but we can already provide the basics on a higher level.

Demo Time

We start by creating a 10 GB replicated LINSTOR/DRBD volume in a 3-node cluster. We want the volume to be 2 times redundant. Then we increase the size of the volume to 20 GB.

$ python
>>> import linstor
>>> foo = linstor.Resource('foo')
>>> foo.volumes[0] = linstor.Volume("10 GB")

There are multiple ways to specify the size.
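For example, a human-readable string or a raw byte count should both work (the integer form is an assumption based on the resize line further down):

>>> linstor.Volume("10 GB")        # human-readable string
>>> linstor.Volume(10 * 2 ** 30)   # raw size in bytes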

>>> foo.placement.redundancy = 2
>>> foo.autoplace()
>>> foo.volumes[0].size += 10 * (2 ** 30)

The last line is enough to resize a replicated volume cluster-wide.

We needed 5 lines of code to create a replicated DRBD volume in a cluster! Let that sink in for a moment and compare it to the steps that were necessary without LINSTOR: creating backing devices on all nodes, writing and synchronizing DRBD res(ource) files, creating metadata on all nodes, running drbdadm up for the resource, and forcing one node to the Primary role to start the initial sync.

For the next step, we assume that the volume is replicated and that we are a storage plugin developer. Our goal is to make the volume accessible on every node, because the block device should be used in a VM. So: A) make sure we can access the block device, and B) find out what the name of the block device of the first volume actually is:

>>> import socket
>>> foo.activate(socket.gethostname())
>>> print(foo.volumes[0].device_path)

The activate method is a good example of how we intend the abstraction to work. Note that we autoplaced the resource 2 times in a 3-node cluster, so LINSTOR chose the nodes that fit best. But now we want the resource to be accessible on every node, without increasing the redundancy to 3 (because that would require additional storage, and 2 times replicated data is good enough).
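Since activate only creates what is missing, making the device available on every node of a hypothetical 3-node cluster could be as simple as this (the node names are made up):

>>> for node in ['alpha', 'bravo', 'charlie']:
...     foo.activate(node)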

Diskless clients

Fortunately, DRBD has us covered, as it has the concept of diskless clients. These nodes provide a local block device as usual, but they read and write data from/to their peers over the network (i.e. they use no local storage). Creating such a diskless assignment is not necessary if the node was already part of the replication in the first place (then it already has local access to the data).

This is exactly what activate does: if the node can already access the data, fine; if not, it creates a diskless assignment. Now assume we are done and do not need access to the device anymore. We want to do some cleanup, because we do not need the diskless assignment any longer:

>>> foo.deactivate(socket.gethostname()) 

The semantics of this method are to remove the assignment if it is diskless (as it does not contribute to the actual redundancy); if the node stores actual data, deactivate does nothing and keeps the data as redundant as it was. This is only a very small subset of the functionality the high-level API provides. There is a lot more to know, like creating snapshots, converting diskless assignments to diskful ones and vice versa, or managing DRBD Proxy. For more information, check the online documentation.
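For example, creating a snapshot follows the same object-oriented style. Note that the method name below is a guess for illustration only, not confirmed API; check the online documentation for the real interface:

>>> foo.snapshot_create('before-upgrade')  # hypothetical method name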

If you want to go deeper into the LINSTOR universe, please visit our YouTube channel.
