CSI Plugin for LINSTOR Complete

This CSI plug-in allows for the use of LINSTOR volumes on Container Orchestrators that implement CSI, such as Kubernetes.

Preliminary work on the CSI plugin for LINSTOR is now complete and is capable of operating as an independent driver. CSI is a project by the Cloud Native Computing Foundation which aims to serve as an industry standard interface for container orchestration platforms. This allows storage vendors to write one storage driver and have it work across multiple platforms with little or no modification.

In practical terms, this means that LINSTOR is primed to work with current and emerging cloud technologies that implement CSI. Currently, work is being done to provide example deployments for Kubernetes, which should allow an easy way for Kubernetes admins to deploy the LINSTOR CSI plug-in.  We expect full support for Kubernetes integration in early 2019.
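To give a flavor of what this will look like for Kubernetes admins, a StorageClass using the plug-in might look roughly like the following. The provisioner string and parameter names here are assumptions based on the driver's current state and may change before the Kubernetes integration is final:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-replicated
provisioner: linstor.csi.linbit.com
parameters:
  autoPlace: "2"          # number of replicas LINSTOR should place (assumed parameter name)
  storagePool: "pool0"    # assumed LINSTOR storage pool name
```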

Get the code on GitHub.


Demo of Extending LINSTOR Managed DRBD Volume to a DR Node

In this video, Matt Kereczman from LINBIT demonstrates extending an existing LINSTOR-managed DRBD volume to a disaster recovery node, located in a geographically separated datacenter, using LINSTOR and DRBD Proxy.

Watch the video:

He’s already created a LINSTOR cluster on four nodes: linstor-a, linstor-b, linstor-c and linstor-dr.

You can see that linstor-dr is in a different network than our other three nodes. This network exists in the DR DC, which is connected to our local DC via a 40Mb/s WAN link.

He has a single DRBD resource defined, which is currently replicated synchronously between the three peers in the local datacenter. He lists his LINSTOR-managed resources and volumes; the volume is currently mounted on linstor-a:

Before he adds a replica of this volume to the DR node in his DR datacenter, he’ll quickly test the write throughput of his DRBD device, so he has a baseline of how well it should perform.

He uses dd to test.
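A baseline test along those lines might look like the sketch below. The target path is an assumption: point TARGET at the filesystem mounted on the DRBD device (it defaults to /tmp here only so the sketch runs anywhere).

```shell
# Sequential-write test with an fsync at the end so the page cache does not
# inflate the result. TARGET is an assumption -- in the demo it would be the
# mount point of the DRBD device on linstor-a.
TARGET="${TARGET:-/tmp}"
dd if=/dev/zero of="$TARGET/dd-test.bin" bs=1M count=256 conv=fsync 2>&1 | tail -n 1
rm -f "$TARGET/dd-test.bin"
```

dd reports the achieved throughput on stderr; the tail keeps only the summary line.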


Highly available LINSTOR Controller with Pacemaker

Part of the design of LINSTOR is that if the central LINSTOR Controller goes down, all the storage still remains up and accessible. This should allow ample time to repair the downed system hosting the LINSTOR Controller. Still, in the majority of cases, it is preferred to run the LINSTOR Controller in a container within your cloud or as a VM in your hypervisor platform. However, there may exist a situation where you want to keep the LINSTOR Controller up and highly available, but do not have a container or VM platform in place to rely upon.  For situations like this we can easily leverage DRBD and the Pacemaker/Corosync stack.

If you are familiar with Pacemaker, setting up a clustered LINSTOR Controller should seem pretty straightforward. The only really tricky bit is that we first need to install LINSTOR to create the DRBD storage that will, in turn, hold LINSTOR's database. Sounds a little chicken-and-egg, I know, but this allows LINSTOR to be aware of, and manage, all DRBD resources.

The below example is for only two nodes, but it could easily be adapted for more. Make sure to install both the LINSTOR Controller and LINSTOR Satellite software on both nodes. The instructions below are by no means a step-by-step guide, but rather just the "special sauce" needed for an HA LINSTOR Controller cluster.

If you are using the LINBIT-provided package repositories, an Ansible playbook is available to entirely automate the deployment of this cluster on a RHEL 7 or CentOS 7 system.

Create a DRBD resource for the LINSTOR database

We’ll name this resource linstordb, and use the already configured pool0 storage pool.

[root@linstora ~]# linstor resource-definition create linstordb
[root@linstora ~]# linstor volume-definition create linstordb 250M
[root@linstora ~]# linstor resource create linstora linstordb --storage-pool pool0
[root@linstora ~]# linstor resource create linstorb linstordb --storage-pool pool0

Stop the LINSTOR Controller and move the database to the DRBD device

Move the database temporarily, mount the DRBD device where LINSTOR expects the database, and move it back.

[root@linstora ~]# systemctl stop linstor-controller
[root@linstora ~]# rsync -avp /var/lib/linstor /tmp/
[root@linstora ~]# mkfs.xfs /dev/drbd/by-res/linstordb/0
[root@linstora ~]# rm -rf /var/lib/linstor/*
[root@linstora ~]# mount /dev/drbd/by-res/linstordb/0 /var/lib/linstor
[root@linstora ~]# rsync -avp /tmp/linstor/ /var/lib/linstor/

Cluster everything up in Pacemaker

Please note that we strongly encourage you to utilize tested and working STONITH in all Pacemaker clusters. This example omits it only because these VMs did not have any fencing devices available.

primitive p_drbd_linstordb ocf:linbit:drbd \
        params drbd_resource=linstordb \
        op monitor interval=29 role=Master \
        op monitor interval=30 role=Slave \
        op start interval=0 timeout=240s \
        op stop interval=0 timeout=100s
primitive p_fs_linstordb Filesystem \
        params device="/dev/drbd/by-res/linstordb/0" directory="/var/lib/linstor" \
        op start interval=0 timeout=60s \
        op stop interval=0 timeout=100s \
        op monitor interval=20s timeout=40s
primitive p_linstor-controller systemd:linstor-controller \
        op start interval=0 timeout=100s \
        op stop interval=0 timeout=100s \
        op monitor interval=30s timeout=100s
ms ms_drbd_linstordb p_drbd_linstordb \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
group g_linstor p_fs_linstordb p_linstor-controller
order o_drbd_before_linstor inf: ms_drbd_linstordb:promote g_linstor:start
colocation c_linstor_with_drbd inf: g_linstor ms_drbd_linstordb:Master
property cib-bootstrap-options: \
        stonith-enabled=false

We still usually advise leveraging the high-availability features already built into your cloud or VM platform if one is available, but if not, you can always use the above approach to leverage Pacemaker to make your LINSTOR Controller highly available.


Devin Vance
First introduced to Linux back in 1996, and using Linux almost exclusively by 2005, Devin has years of Linux administration and systems engineering under his belt. He has been deploying and improving clusters with LINBIT since 2011. When not at the keyboard, you can usually find Devin wrenching on an American motorcycle or down at one of the local bowling alleys.

How to migrate manually created resources to LINSTOR

In many cases, existing DRBD resources that were created manually some time in the past can be migrated, so that LINSTOR can be used to manage those resources afterwards.

While resources managed by LINSTOR can coexist with resources that were created manually, it can often make sense to migrate such existing resources. When resources are migrated, there are some settings in LINSTOR that must be carefully adjusted to match the resource’s existing configuration.

Here is an overview of the migration process. It is, however, important to note that there are countless special cases, such as existing resources backed by different storage pools on different nodes or resources that use external metadata, where additional steps may be required for a successful migration.

This article describes the migration of an existing DRBD resource with a rather typical configuration.


LINSTOR must be installed, configured and running already.

Sample configuration assumed by this article for the sample commands:

- 1 existing DRBD resource:
  - Name: legacy
  - TCP/IP port number: 15500
- 1 volume:
  - Volume number: 10
  - Minor number: 9100 (/dev/drbd9100)
  - LVM volume group: datastore
  - Logical volume name: legacy (/dev/datastore/legacy)
  - Logical volume size: 1,340 MiB
- 2 nodes:
  - Node eagle: Node ID 4
  - Node rabbit: Node ID 8
- Meta data type: internal
- Peer slots: 4

Create a resource definition for the existing resource

This may require renaming the resource in some cases, because LINSTOR’s naming rules are stricter than DRBD’s naming rules. The TCP/IP port number can be assigned automatically by LINSTOR if you don’t mind the change, and if a short disconnect/reconnect cycle during the migration is not going to cause problems. Otherwise, you can specify the port number manually to set it to the value that the resource is currently using.


resource-definition create legacy -p 15500

Adjust the DRBD peer slots count for the resource definition

If LINSTOR needs to create additional resources, or has to recreate DRBD meta data, the current peer count used by the resource’s volumes can be important, because it has a direct influence on the net size of the DRBD device that is created.


Note: If the peer count of a DRBD volume is unknown, it can be extracted from a meta data dump, which can be displayed using the drbdadm dump-md command (this requires the volume to be detached, or the resource to be stopped, and will normally require execution of the drbdadm apply-al command first).
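Sketched out for the sample resource from this article (run these only on a node where the resource is down):

```
drbdadm apply-al legacy
drbdadm dump-md legacy      # look for the peer slot count in the dump
```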


resource-definition set-property legacy PeerSlotsNewResource 4

Create a volume definition

Manual resource creation typically involves specifying the size of the backing storage device (in this case, a logical volume), with DRBD then using whatever space is left after meta data creation. LINSTOR, in contrast, works with the net size of a DRBD volume.

To figure out the correct size to use for the LINSTOR volume definition, you can query the net size of the DRBD device using the blockdev utility:

blockdev --getsize64 /dev/drbd9100

LINSTOR stores volume sizes in kiB internally, and the value yielded by the blockdev utility is in bytes, so you will have to divide by 1024. There should be no remainder (modulo division by 1024 should yield zero), unless you are dealing with a special case that might require additional steps to migrate successfully.
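To make the arithmetic concrete with this article's sample values (the byte count below stands in for the output of blockdev --getsize64 on the sample device):

```shell
# 1,404,882,944 bytes is what blockdev would report for the sample device;
# dividing by 1024 yields the 1,371,956 kiB net size used in this article.
SIZE_BYTES=1404882944
echo "net size: $(( SIZE_BYTES / 1024 )) kiB"
echo "remainder: $(( SIZE_BYTES % 1024 ))"   # must be 0
```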

In the case of the sample configuration assumed by this article, a 1,340 MiB logical volume with DRBD meta data with a peer count of 4 will result in a net size of 1,371,956 kiB. This is the value to use for the volume definition in LINSTOR.

The volume number (10) and the minor number (9100) should also be specified manually, unless you want LINSTOR to allocate new numbers automatically, which will typically require a reconfiguration of the applications or filesystem entries that are using the existing DRBD resource.

volume-definition create legacy -n 10 -m 9100 1371956KiB

If, for some reason, the backing storage volume is significantly larger than what would be required to fit the net size reported by the DRBD volume, then there is an additional property that can be set on the volume definition:

volume-definition set-property legacy 10 AllowLargerVolumeSize true

Setting this property is not normally required. It should be set if the LINSTOR satellite presents an error regarding the backing storage volume’s size being larger than expected.

Set the name of the manually created backing storage volume

LINSTOR normally generates the name for the backing storage volume, but the name of that volume can be overridden by setting a property on the volume definition as long as the backing storage volume name is the same on each node.


volume-definition set-property legacy 10 OverrideVlmId legacy

Create resources from the resource definition

Finally, resources must be created from the resource definition in LINSTOR. When creating resources, the resource’s DRBD node ID on each node must match the node ID used by the manually created resource. You can look up the node-id in the existing, manually created DRBD resource configuration file.
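For illustration, here is a trimmed-down sketch of the relevant part of such a configuration file. The excerpt is written to a sample file so the commands are self-contained; on a real node you would grep the actual /etc/drbd.d/legacy.res instead.

```shell
# Hypothetical excerpt of /etc/drbd.d/legacy.res -- the real file contains
# more sections (volume, disk, address, etc.).
cat > /tmp/legacy.res.sample <<'EOF'
resource legacy {
    on eagle  { node-id 4; }
    on rabbit { node-id 8; }
}
EOF
grep node-id /tmp/legacy.res.sample
```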

Before the final step of creating LINSTOR resources from the resource definition can be executed, the existing manually created DRBD resource configuration file must be moved away, otherwise drbdadm will complain about duplicate definitions.

mv /etc/drbd.d/legacy.res /etc/drbd.d/legacy.res.disabled

It is a good idea to stop the LINSTOR satellites before creating the resources, otherwise the DRBD resource will disconnect from other nodes as the first resource is created, and will then reconnect as the other resources (on other nodes) are added.

By creating the resources first and starting the satellites after all resources have been created, the satellites will immediately configure all connections, thereby normally avoiding a disconnect/reconnect cycle.

resource create --node-id 4 -s fatpool eagle legacy
resource create --node-id 8 -s fatpool rabbit legacy

(If the satellites are currently stopped, add the --async parameter to the command line to avoid having the client wait for the creation of each resource, which would not take place if the satellite is offline.)


After finishing the migration, LINSTOR can manage the DRBD resource just like resources that were originally created by LINSTOR.


Robert Altnoeder
Robert joined the LINBIT development team in 2013. He had worked with
DRBD at a startup company in the SaaS field before joining LINBIT. His
current primary field of work is the architecture and implementation of
LINSTOR, the cluster management component of LINBIT's SDS software.



Replicating storage volumes on Scaleway ARM with LINSTOR

I’ve been using Scaleway for a while as a platform to spin-up both personal and work machines, mainly because they’re good value and easy to use. Scaleway offers a wide selection of Aarch64 and x86 machines at various price points, however none of these VMs are replicated – not even with RAID at the hardware level – you’re expected to handle that all yourself. Since ARM servers have been making headlines for several years as a competing architecture to x86 in the data center, I thought it would be interesting to set up replication across two ARM Scaleway VMs with DRBD and LINSTOR.

It’s worth pointing out here that if you’re planning on building a production HA environment on Scaleway, you should also reach out to their support team and have them confirm that your replicated volumes aren’t actually sitting on the same spinning disk in case of drive failure, as advised in their FAQ.

Preparing VMs


First, we need a couple of VMs with additional storage volumes to replicate. The ARM64-2GB VM doesn’t allow for mounting additional volumes, so let’s go for the next one up, and add an additional 50GB LSSD volume.


I’ve gone with an Ubuntu image; if you selected an RPM-based image, substitute package manager commands accordingly. Run the following commands on all VMs (in my case I have two, and will be using the first as both my controller and a satellite node).

$ sudo apt update && sudo apt upgrade

In this case we’ll be deploying DRBD nodes with LINSTOR. We need DRBD9 to do this, but we can’t build a custom kernel module without first getting some prerequisite files for Scaleway’s custom kernel and preparing for a custom kernel module build. Scaleway provides a recommended script to run – we need to save that script and run it before installing DRBD9. I’ve put it in a file on github to make things simple:

$ sudo apt install -y build-essential libssl-dev
$ wget https://raw.githubusercontent.com/dabukalam/scalewaycustommodule/master/scalewaycustommodule
$ chmod +x scalewaycustommodule && sudo ./scalewaycustommodule


Once that’s done, we can add the LINBIT community repository and install DRBD, LINSTOR, and LVM:

$ sudo add-apt-repository -y ppa:linbit/linbit-drbd9-stack
$ sudo apt update
$ sudo apt install drbd-dkms linstor-satellite linstor-client lvm2

Now I can start the LINSTOR satellite service with:

$ sudo systemctl enable --now linstor-satellite

And make sure the VMs can see each other by adding the other node to each hosts file:

[screenshot: /etc/hosts entries]
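For example, with made-up private addresses (substitute the IPs Scaleway assigned to your VMs), the entries would look like this:

```
# /etc/hosts additions -- IP addresses are assumptions
10.4.2.1   drbd-arm
10.4.2.2   drbd-arm-2
```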

Let’s make sure LVM is running and create a volume group for LINSTOR on our additional volume:

$ sudo systemctl enable --now lvm2-lvmetad.service
$ sudo systemctl enable --now lvm2-lvmetad.socket
$ sudo vgcreate sw_ssd /dev/vdb

That’s it for commands you need to run on both nodes. From now on we’ll be running commands on our favorite VM. LINSTOR has four node types – Controller, Auxiliary, Combined, and Satellite. Since I only have two nodes, one will be Combined, and one will be a Satellite. Combined here means that the node is both a Controller and a Satellite.

Adding nodes to the LINSTOR cluster

So on our favorite VM, which we’re going to use as the combined node, we add the local host to the LINSTOR cluster as a combined node, and the other as a satellite:

$ sudo apt install -y linstor-controller
$ sudo systemctl enable --now linstor-controller
$ linstor node create --node-type Combined drbd-arm
$ linstor node create --node-type Satellite drbd-arm-2
$ linstor node list

It’s worth noting here that you can run commands to manage LINSTOR on any node; just make sure you have the controller node exported as a variable:

drbd-arm-2:~$ export LS_CONTROLLERS=drbd-arm

You should now have something that looks like this:

[screenshot: linstor node list output]

Now that we have our LINSTOR cluster set up, we can create a storage pool named ‘swpool’ on each node, referencing the node name, specifying that we want LVM, and giving the volume group name:

$ linstor storage-pool create drbd-arm swpool lvm sw_ssd
$ linstor storage-pool create drbd-arm-2 swpool lvm sw_ssd

We can then create resource and volume definitions and use them to create the resource. You can perform a whole range of operations at this point, including manual node placement and specifying storage pools. Since we only have one storage pool, LINSTOR will automatically select it for us. I only have two nodes, so I’ll just auto-place my resource across both.

$ linstor resource-definition create backups
$ linstor volume-definition create backups 40G
$ linstor resource create backups --auto-place 2

LINSTOR will now handle all the resource creation automagically across all our nodes, including dealing with LVM and DRBD. If all succeeds, you should now be able to see your resources. They’ll be inconsistent while DRBD syncs them up. You can also now see the DRBD resources by running drbdmon. Once it’s finished syncing you’ll see a list of your replicated nodes as below (only drbd-arm-2 in my case):

You can now mount the drive on any of the nodes and write to your new replicated storage cluster.

$ linstor resource list-volumes

[screenshot: linstor resource list-volumes output]

In this case the device name is /dev/drbd1000, so once we create a filesystem on it and mount it, I can write to my new replicated storage cluster.

$ sudo mkfs /dev/drbd1000
$ sudo mount /dev/drbd1000 /mnt
$ sudo touch /mnt/file



Danny Abukalam
Danny is a Solutions Architect at LINBIT based in Manchester, UK. He works in conjunction with the sales team to support customers with LINBIT's products and services. Danny has been active in the OpenStack community for a few years, organising events in the UK including the Manchester OpenStack Meetup and OpenStack Days UK. In his free time, Danny likes hunting for extremely hoppy IPAs and skiing, not at the same time.

A Highly Available LINSTOR Controller for Proxmox

For the High Availability setup we describe in this blog post, we assume that you installed LINSTOR and the Proxmox Plugin as described in the Proxmox section of the users guide or our blog post.

The idea is to execute the LINSTOR controller within a VM that is controlled by Proxmox and its HA features, where the storage resides on DRBD, managed by LINSTOR itself.

Preparing the Storage

The first step is to allocate storage for the VM by creating a VM and selecting “Do not use any media” on the “OS” section. The hard disk should reside on DRBD (e.g., “drbdstorage”). Disk space should be at least 2GB, and for RAM we chose 1GB. These are the minimal requirements for the appliance LINBIT provides to its customers (see below). If you set up your own controller VM, or resources are not constrained, increase these minimal values. In the following, we assume that the controller VM was created with ID 100, but it is fine if this VM is created later (after you have already created other VMs).

LINSTOR Controller Appliance

LINBIT provides an appliance for its customers that can be used to populate the created storage. For the appliance to work, we first create a “Serial Port”: click on “Hardware,” then on “Add,” and finally on “Serial Port.” See image below:


If everything worked as expected, the VM definition should then look like this:


The next step is to copy the VM appliance to the created storage. This can be done with qemu-img. Make sure to replace the VM ID with the correct one:

# qemu-img dd -O raw if=/tmp/linbit-linstor-controller-amd64.img \

After that, you can start the VM and connect to it via the Proxmox VNC viewer. The default user name and password are both “linbit”. Note that we kept the defaults for SSH, so you will not be able to log in to the VM via SSH and username/password. If you want to enable that (and/or “root” login), enable these settings in /etc/ssh/sshd_config and restart the ssh service. As this VM is based on “Ubuntu Bionic”, you should change your network settings (e.g., static IP) in /etc/netplan/config.yaml. After that you should be able to ssh to the VM:


Adding the Controller VM to the existing Cluster

In the next step, you add the controller VM to the existing cluster:

# linstor node create --node-type Controller \

As this special VM will not be managed by the Proxmox Plugin, make sure all hosts have access to that VM’s storage. In our test cluster, we checked the linstor resource list to confirm where the storage was already deployed, and then created further assignments via linstor resource create. In our lab consisting of four nodes, we made all resource assignments diskful, but diskless assignments are fine as well. As a rule of thumb, keep the redundancy count at “3” (more usually does not make sense) and assign the rest diskless.
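As a sketch, with hypothetical node and resource names (check linstor resource list for the real ones in your cluster):

```
# linstor resource list
# linstor resource create charlie vm-100-disk-1 --storage-pool drbdpool
# linstor resource create delta vm-100-disk-1 --diskless
```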

As the storage for this particular VM has to be made available (i.e., drbdadm up), enable the drbd.service on all nodes:

# systemctl enable drbd
# systemctl start drbd

At startup, the linstor-satellite service deletes all of its resource files (*.res) and regenerates them. This conflicts with drbd.service, which needs those resource files to bring up the storage for the controller VM. It is good enough to first bring up the resources via drbd.service and then start linstor-satellite.service. To make the necessary changes, you need to create a drop-in for linstor-satellite.service via systemctl (do not edit the unit file directly).

# systemctl edit linstor-satellite
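A minimal drop-in expressing that ordering could look like the following. This is a sketch only; consult the users guide for the exact content LINBIT recommends.

```
# /etc/systemd/system/linstor-satellite.service.d/override.conf
[Unit]
After=drbd.service
```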

Switching to the New Controller

Now, it is time for the final steps — namely switching from the existing controller to the new one in the VM. Stop the old controller service on the old host, and copy the LINSTOR controller database to the VM:

# systemctl stop linstor-controller
# systemctl disable linstor-controller
# scp /var/lib/linstor/* root@<controller-vm-ip>:/var/lib/linstor/

Finally, we can enable the controller in the VM:

# systemctl start linstor-controller # in the VM
# systemctl enable linstor-controller # in the VM

To check if everything worked as expected, you can query the cluster nodes on a host by asking the controller in the VM: linstor --controllers= node list. It is perfectly fine that the controller (which is just a controller and not “combined”) is shown as “OFFLINE”. Still, this might change in the future to something more appropriate.

As the last – but crucial – step, you need to add the “controllervm” option to /etc/pve/storage.cfg, and change the controller IP:

drbd: drbdstorage
  content images,rootdir
  redundancy 3
  controllervm 100

By setting the “controllervm” parameter, the plugin will ignore actions on the given controller VM ID (or act accordingly); basically, this VM should not be managed by the plugin. However, there is one exception: when you delete the VM in the GUI, it is removed from the GUI, and we did not find a way to prevent that. Such requests are ignored by the plugin, though, so the VM will not be deleted from the LINSTOR cluster, and it is therefore possible to later create a VM with the ID of the old controller. The plugin will just return “OK”, and the old VM with the old data can be used again. To keep it simple, be careful not to delete the controller VM.

Enabling HA for the Controller VM in Proxmox

Currently, we have the controller executed as VM, but we should make sure that one instance of the VM is started at all times. For that we use Proxmox’s HA feature. Click on the VM; then on “More”; and then on “Manage HA.” We set the following parameters for our controller VM:


Final Considerations

As long as there are surviving nodes in your Proxmox cluster, everything should be fine. In case the node hosting the controller VM is shut down or lost, Proxmox HA will make sure the controller is started on another host. The IP of the controller VM should not change. It is up to you as admin to make sure this is the case (e.g., setting a static IP, or always providing the same IP via dhcp on the bridged interface).

One limitation that is not fully handled with this setup is a total cluster outage (e.g., common power supply failure) with a restart of all cluster nodes. Proxmox is unfortunately pretty limited in that regard. You can enable the “HA Feature” for a VM, and you can define “Start and Shutdown Order” constraints. But both are completely separated from each other. Therefore it is difficult to ensure that the controller VM is up and all other VMs are started.

It might be possible to work around that by delaying VM startup in the Proxmox plugin until the controller VM is up (i.e., if the plugin is asked to start the controller VM it does it, otherwise it waits and pings the controller). While this is a nice idea, it would be a huge failure in a serialized, non-concurrent VM start/plugin call event stream where some VM should be started (which then blocks) before the controller VM is scheduled to be started. That would obviously result in a deadlock.

We will discuss options with Proxmox, but we think the presented solution is valuable in typical use cases as is, especially compared to the complexity of a Pacemaker setup. Use cases where one can expect that the whole cluster will not go down at the same time are covered. And even if it does, the only thing that would not work is automatic startup of the VMs when the whole cluster is restarted. In such a scenario, the admin just has to wait until the Proxmox HA service starts the controller VM; after that, all VMs can be started manually or via script on the command line.

Roland Kammerer
Software Engineer at Linbit
Roland Kammerer studied technical computer science at the Vienna University of Technology and graduated with distinction. Currently, he is a PhD candidate with a research focus on time-triggered realtime-systems and works for LINBIT in the DRBD development team.



The advantage of separate control and data planes

Many storage systems have a monolithic design that combines the control plane and the data plane into a single application and a single protocol, but LINBIT’s more modular solution comes with a number of advantages.

What is a control plane or a data plane?

The most important task that any storage system must perform is providing access to the storage volumes that are used for various workloads, for example, databases, file servers or virtualization environments. This is what we refer to as the data plane – all the components that are necessary to actually get data from the storage to the user and from the user to the storage.

Another task is the management of the configuration of storage volumes, which is what we refer to as the control plane. With the rise of more dynamic systems like containerization, virtualization and cloud environments, and the associated software-defined storage systems, where storage volumes are frequently reconfigured, this task is becoming increasingly important.


Why it is important: Availability

If you need to shut down part of your infrastructure, because you are updating hardware for instance, it is important that the most fundamental services remain available. Storage is probably one of those fundamental and important services, since most of the other systems rely on it.

A storage system with a modular design that provides independent control and data planes brings your infrastructure one step closer to high availability.

Independent control and data plane

Many storage systems can only provide access to storage volumes if all of their subsystems are online. The design may even be completely monolithic, so that the management functions and the storage access functions are contained within a single application that uses a single network protocol.

In LINBIT’s DRBD-based storage systems, only the most fundamental control plane functions are tightly coupled with the data plane and the operation of storage volumes. High-level control functions, like managing storage volumes and their configuration, managing cluster nodes, or providing automatic selection of cluster nodes for the creation of storage volumes, are provided by the LINSTOR storage management software. These two components, DRBD and LINSTOR, are fundamentally independent of each other.

DRBD storage volumes, even those that are managed by LINSTOR, remain accessible even if the LINSTOR software is unavailable. This means that the LINSTOR software can be shut down, restarted or upgraded while users retain access to existing storage volumes. While it is less useful, the same is even true the other way around: a LINSTOR controller does not rely on storage provided by DRBD, and it will continue to service storage management requests even if the storage system itself is unavailable. The changed configuration will simply be applied whenever the actual storage system is online again.


How to setup LINSTOR on Proxmox VE

In this technical blog post, we show you how to integrate DRBD volumes in Proxmox VE via a storage plugin developed by LINBIT. The advantages of using DRBD include a configurable number of data replicas (e.g., 3 copies in a 5 node cluster) and access to the data on every node, which enables very fast VM live-migrations (usually only a few seconds, depending on memory pressure).


The rest of this post assumes that you have already set up Proxmox VE (the LINBIT example uses 4 nodes) and have created a PVE cluster consisting of all nodes. While this post is not meant to replace the DRBD User’s Guide, we try to show a complete setup.

The setup consists of two important components:

  1. LINSTOR manages DRBD resource allocation
  2. linstor-proxmox plugin that implements the Proxmox VE storage plugin API and executes LINSTOR commands.

In order for the plugin to work, you must first create a LINSTOR cluster.


We assume here that you have already set up the LINBIT Proxmox repository as described in the User’s Guide. If you have not, execute the following commands on all cluster nodes. First, we need the low-level infrastructure (i.e., the DRBD9 kernel module and drbd-utils):

apt install pve-headers
apt install drbd-dkms drbd-utils
rmmod drbd; modprobe drbd
grep -q drbd /etc/modules || echo "drbd" >> /etc/modules

The next step is to install LINSTOR:

apt install linstor-controller linstor-satellite linstor-client
systemctl start linstor-satellite
systemctl enable linstor-satellite

Now, decide which of your hosts should be the current controller node and enable the linstor-controller service on that particular node only:

systemctl start linstor-controller
systemctl enable linstor-controller

Volume creation

Obviously, DRBD needs storage to create volumes. In this post we assume a setup where all nodes contain an LVM-thinpool called drbdpool. In our sample setup, we created it on the pve volume group, but in your setup, you might have a different storage topology. On the node that runs the controller service, execute the following commands to add your nodes:

linstor node create alpha --node-type Combined
linstor node create bravo --node-type Combined
linstor node create charlie --node-type Combined
linstor node create delta --node-type Combined

“Combined” means that this node is allowed to execute a LINSTOR controller and/or a satellite, but a node does not have to execute both. So it is safe to specify “Combined”; it does not influence the performance or the number of services started.
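With the nodes registered, you can verify that every satellite connected to the controller. The commands below are a sketch; the exact output columns depend on your client version:

```shell
# Verify that all satellites registered and are online
linstor node list

# The same command in abbreviated form
linstor n l
```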

The next step is to configure a storage pool definition. As described in the User’s guide, most LINSTOR objects consist of a “definition” and then concrete instances of such a definition:

linstor storage-pool-definition create drbdpool

Now is a good time to mention that the LINSTOR client provides handy shortcuts for its sub-commands. The previous command could also have been written as linstor spd c drbdpool. The next step is to register every node’s storage pool:

for n in alpha bravo charlie delta; do
  linstor storage-pool create $n drbdpool lvmthin pve/drbdpool
done

DRBD resource creation

After that we are ready to create our first real DRBD resource:

linstor resource-definition create first
linstor volume-definition create first 10M --storage-pool drbdpool
linstor resource create alpha first
linstor resource create bravo first

Now, check with drbdadm status that “alpha” and “bravo” contain a replicated DRBD resource called “first”. After that, this dummy resource can be deleted on all nodes by deleting its resource definition:

linstor resource-definition delete -q first

LINSTOR Proxmox VE Plugin Setup

As DRBD and LINSTOR are already set up, the only things missing are the plugin itself and its configuration.

apt install linstor-proxmox

The plugin is configured via the file /etc/pve/storage.cfg:

drbd: drbdstorage
    content images, rootdir
    redundancy 2
    controller

It is not necessary to copy that file to the other nodes, as /etc/pve is already a replicated file system. After the configuration is done, you should restart the following service:

systemctl restart pvedaemon

After this setup is done, you are able to create virtual machines backed by DRBD from the GUI. To do so, select “drbdstorage” as storage in the “Hard Disk” section of the VM. LINSTOR selects the nodes that have the most free storage to create the replicated backing devices.
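The placement decision can be pictured with a toy shell one-liner. The node names and free-space figures below are invented for illustration; the real selection happens inside LINSTOR:

```shell
# Toy model of the placement rule: given "node free-space" pairs,
# pick the two nodes with the most free space.
printf '%s\n' 'alpha 120' 'bravo 300' 'charlie 80' 'delta 250' \
  | sort -k2,2 -rn \
  | head -n 2 \
  | cut -d' ' -f1
```

Here the two candidate nodes printed would be “bravo” and “delta”.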


The interested reader can check which ones were selected via linstor resource list. While interesting, it is important to know that the storage can be accessed by all nodes in the cluster via a DRBD feature called “diskless clients”. So let’s assume “alpha” and “bravo” had the most free space and were selected, and the VM was created on node “bravo”. Via the low-level tool drbdadm status we now see that the resource is created on two nodes (i.e., “alpha” and “bravo”) and that the DRBD resource is in “Primary” role on “bravo”.

Now we want to migrate the VM from “bravo” to node “charlie”. This is again done via a few clicks in the GUI, but the interesting steps happen behind the scenes: The storage plugin realizes that it has access to the data on “alpha” and “bravo” (our two replicas) but also needs access on “charlie” to execute the VM. The plugin therefore creates a diskless assignment on “charlie”. When you execute drbdadm status on “charlie”, you see that now three nodes are involved in the overall picture:

• Alpha with storage in Secondary role
• Bravo with storage in Secondary role
• Charlie as a diskless client in Primary role

Diskless clients are created (and deleted) on demand without further user interaction, besides moving around VMs in the GUI. This means that if you now move the VM back to “bravo”, the diskless assignment on “charlie” gets deleted as it is no longer needed.

If you had moved the VM from “charlie” to “delta” instead, the diskless assignment on “charlie” would have been deleted, and a new one for “delta” would have been created.

For you it is probably even more interesting that all of this including VM migration happens within seconds without moving the actual replicated storage contents.

Next Steps

So far, we created a replicated and highly-available setup for our VMs, but the LINSTOR controller and especially its database are not highly-available. In a future blog post, we will describe how to make the controller itself highly-available by only using software already included in Proxmox VE (i.e., without introducing complex technologies like Pacemaker). This will be achieved with a dedicated controller VM that will be provided by LINBIT as an appliance.

Roland Kammerer
Software Engineer at Linbit
Roland Kammerer studied technical computer science at the Vienna University of Technology and graduated with distinction. Currently, he is a PhD candidate with a research focus on time-triggered real-time systems and works for LINBIT in the DRBD development team.

The Technology Inside LINSTOR (Part II)

In our first look into LINSTOR, you learned about its single communication protocol, transaction safety and modularity. In this second part, we dive deeper into its design.

Fault Tolerance

Keeping the software responsive is one of the more difficult problems that we have to deal with in LINSTOR’s design and implementation. The Controller/Satellite split is one fundamental part of LINSTOR’s design toward fault tolerance, but there are many other design and implementation details that improve the software’s robustness, and many of them are virtually invisible to the user.

On the Controller side, communication and persistence are the two main areas that can lead to the software becoming unresponsive. The following problems could lead to an unusable network communication service on the Controller side:

  • Stopping or reconfiguring a network interface
  • Address conflicts
  • In-use TCP/IP ports

All network I/O in LINSTOR is non-blocking, so that unresponsive network peers do not lead to a lockup of LINSTOR’s network communication service. While the network communication service has been designed to recover from many kinds of problems, it additionally allows the use of multiple independent network connectors, so that the system remains accessible even in the case where a network connector requires reconfiguration to recover. The network connectors can also stop and start independently, allowing reinitialization of failed connectors.

The Controller can obviously not continue normal operation while the database service is inoperative, which could of course happen if an external database is used, for example, due to a downtime of the database server or due to a network problem. Once the database service becomes available again, the Controller will recover automatically, without requiring any operator intervention.

Satellites in LINSTOR

The Satellite side of LINSTOR does not run a database, and a single unresponsive Satellite is less critical for the system as a whole than an unresponsive Controller. Nonetheless, if a Satellite encounters a failure during the configuration of one storage resource, that should not, even temporarily, prevent it from servicing requests for the configuration of other resources.

The biggest challenge regarding fault tolerance on the Satellite side is the fact that the Satellite interacts with lots of external programs and processes that are neither part of LINSTOR nor under the direct control of the Satellite process. These external components include system utilities required for the configuration of backend storage, such as LVM or ZFS commands, processes observing events generated by the DRBD kernel module whenever the state of a resource changes, block device files that appear or disappear when storage devices are reconfigured, and similar kinds of objects.

To achieve fault tolerance on the Satellite side, the software has been designed to deal with many possible kinds of malfunctions of the external environment that LINSTOR interacts with. This includes time-boxing and the enforcement of size limits on the amount of data that is read back when executing external processes, as well as recovery procedures that attempt to abort external processes that have become unresponsive. There is even a fallback that reports a malfunctioning operating system kernel if the operating system is unable to end an unresponsive process. The LINSTOR code also contains a mechanism that can run critical operations, such as an attempt to open a device file (which may block forever due to faulty operating system drivers), asynchronously, so that even if the operation blocks, LINSTOR would normally at least be able to detect and report the problem.


With feature richness, customizability and flexibility also comes complexity. The only real countermeasure is to make the system as intuitive, self-explanatory and unambiguous as possible.

Clarity in the naming scheme of objects turned out to be an important factor for a user’s ability to use the software intuitively. In our previous product, drbdmanage, users would typically look for commands to either create a “resource” or a “volume.” However, the corresponding commands, “new-resource” and “new-volume”, only define a resource and its volumes, but do not actually create storage resources on any of the cluster nodes. Another command, “assign”, was required to assign the resource to cluster nodes, thereby creating the actual storage resource, and users sometimes had a hard time finding this command.

For this reason, the naming of objects was changed in LINSTOR. A user looking for a command to create a resource will find the command that actually creates a storage resource, and one of the required parameters for this command is the so-called resource definition. It is quite obvious that the next step would be to look for a command that creates a resource definition. This kind of naming convention is supposed to make it easier for users to figure out how to intuitively use the application.

LINSTOR is also explicit with replies to user commands, as well as with return codes for API calls. The software typically replies with a message that describes whether or not the command was successful, what the software did, and to which objects the message refers. Error messages that include a description of the problem cause or hints for possible correction measures also follow a uniform structure.

Similar ideas also apply to return codes, which include not only the error code (e.g., Object exists), but also information on what objects the error refers to (e.g., the type of object and the identifier specified by the user).

Reporting System

To make diagnosing errors easier, LINSTOR also generates a unique identifier for every error that is logged. The traditional logging and error reporting on Unix/Linux systems basically consists of single text lines logged to one large logfile, sometimes even a single logfile for many different applications. An application could log multiple lines for each error, but support for logging multiple lines atomically (instead of interleaved with log lines for other errors, possibly from other applications) is virtually nonexistent.

For this reason, LINSTOR logs a single-line short description of the error, including the error identifier, to the system log, but also logs the details of the error to a report file that can be found using the error identifier. The detailed report also contains information such as the component where the error occurred, the exact version of the software that was used, debug information, nested errors, and many other details that may help with problem mitigation.
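These reports can be retrieved directly with the LINSTOR client. The commands below are a sketch (subcommand names reflect current linstor-client versions and may differ in older releases), and the identifier shown is a made-up example:

```shell
# List all logged error reports together with their unique identifiers
linstor error-reports list

# Show the full details of one report, using its identifier
linstor error-reports show 5C8F1A8D-00000-000001
```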

Implementation Quality

While the various design characteristics are important factors for creating a powerful and robust software system, even the best design cannot produce a reliable application if it is not implemented with high quality.

The first step, even before we wrote the code, was to choose a programming language that would be suitable for the task. While our previous product, drbdmanage, and the current LINSTOR client are implemented in Python, the LINSTOR server-side components (the Controller and Satellite) are implemented in Java. A server application that manages highly available storage systems should obviously be designed and implemented much more carefully than the typical single-user desktop application. Java is a very strict programming language that provides strong static typing, checked exceptions and allows only few implicit type conversions – which are all features that also enable IDEs to perform static checking of the code while it is being written.

Obviously, while it can make writing high quality code easier, the choice of programming language alone does not automatically lead to better code. To keep LINSTOR’s code clean, readable, self-explaining and maintainable, we apply many of the best practices that have proven successful in the creation of mission-critical software systems. This includes more important things like choosing descriptive variable names or maintaining a clear and logical control flow, but even extends to less technical details like consistent formatting of the source code. The coding standard that we apply to produce high-quality code is based on standards from the aviation industry and is among the strictest coding standards that exist today.

Easy Validity Checks

There is also a strong focus on correctness and strict checking in the way LINSTOR is implemented. As an example, the name of objects like nodes, resources or storage pools is not simply a String, but an object that can only be constructed with a name that is valid for that kind of object. It is impossible to create a resource name object that contains invalid characters, or to accidentally use a resource name object as the identifier for the creation of a storage pool. As a result, developers cannot forget to perform a validity check on a node name or on a volume number, and they also cannot apply the wrong check by accident.

All those considerations, design characteristics and implementation methods are important factors that helped us create dependable and user-friendly software that we hope will prove useful and valuable to users like you.


If you have any questions or suggestions concerning LINSTOR, please leave a comment or write an email to [email protected] .



The technology inside LINSTOR (Part I)

Spotlight on LINSTOR’s design and technology: What we do and how we do it to create a powerful, flexible and robust storage cluster management software

LINSTOR is an application that is typically integrated with highly automated systems, such as software defined storage systems or virtualization environments. Users often interact with the management interface of some other application that uses LINSTOR to manage the storage required for that application’s use case, which also means that the users may not have direct access to the storage systems or to the LINSTOR user interface.

A single storage cluster can be the backend of multiple independent application systems, so the biggest challenge for a software like LINSTOR is to remain responsive even if some actions or components of the cluster fail. At the same time, the software should be flexible enough to cover all use cases, to enable future extension or modification, and despite all the complexity that is the result of these requirements, it should at the same time be easy to understand and easy to use for the administrators who are tasked with installing and maintaining the storage system.

It is quite clear to anyone who has worked on a bigger software project as a developer that many of those requirements work against each other. Customizability, flexibility, an abundance of features cause complexity, but complexity is the natural enemy of usability, reliability and maintainability. When we started the development of LINSTOR, our challenge was to design and implement the software so that it would achieve our goals with regards to feature richness and flexibility while at the same time remaining reliable and easy to use.


Modularity

One of the most important aspects of LINSTOR’s design is its modularity. We divided the system into two components, the Controller and the Satellite, so that the Controller component could remain as independent as possible from the Satellite component – and vice versa.

Even inside those two components, many parts of the software are exchangeable – the communication layer, the serialization protocol, the database layer, all of its API calls, even all of the debug commands that we use for internal development, as well as many other implementation details are exchangeable parts of the software. This not only provides maximum flexibility for future extensions, it also acts as a sort of safety net. For example, if support for the database or the serialization protocol that we currently use were dropped by their maintainers, we could simply exchange those parts without having to modify every single source code file of the project, because implementation details are hidden behind generic interfaces that connect the various parts of our software.

Another positive side effect is that many of those components, being modular, are naturally able to run multiple differently configured instances. For example, it is possible to configure multiple network connectors in LINSTOR, each bound to different network interfaces or ports.


A single communication protocol

As a cluster software, LINSTOR must of course have some mechanism to communicate with all of the nodes that are part of the cluster. Integration with other applications also requires some means of communication between those applications and the LINSTOR processes, and the same applies to any kind of user interface for LINSTOR.

There are lots of different technologies available, but many of them are only suitable for certain kinds of communication. Some clusters use distributed key/value stores like etcd for managing their configuration, but use D-Bus for command line utilities and a REST interface for connecting other applications.

Instead of using many different technologies, LINSTOR uses a single versatile network protocol for communication with all peers. The protocol used for communication between the Controller and the Satellites is the same as the one used for communication between the Controller and the command line interface or any other application. Since this protocol is implemented on top of standard TCP/IP connections, all aspects of LINSTOR’s communication are network-transparent. An optional SSL layer can provide secure encrypted communication. Using a single mechanism for communication also means less complexity, as the same code can be used for implementing different communication channels.


Transaction safety

Even though LINSTOR keeps its configuration objects in memory, there is an obvious need for some kind of persistence. Ideally, what is kept in memory should match what is persisted, which means that any change should be a transaction, both in memory and on persistent storage.

Most Unix/Linux applications have traditionally favored line-based text files for the configuration of the software and for persisting its state, whereas LINSTOR keeps its configuration in a database. Apart from the fact that a fully ACID-compliant database is an ideal foundation for building a transaction-safe application, using a database also has other advantages. For example, if an upgrade of the software requires changes to the persistent data structures, the upgrade of the data can be performed as a single transaction, so that the result is either the old version or the new version of the data, but not some broken state in between. Database constraints also provide an additional safeguard that helps ensure the consistency of the data. If there were a bug in our software that failed to detect duplicate volume numbers being assigned to storage volumes, the database would abort the transaction for creating the volume due to constraint violations, thereby preventing inconsistencies in the corresponding data structures.

To avoid requiring users to set up and maintain a database server, LINSTOR uses its own integrated database by default – it is simply started as an integral part of the Controller component. Optionally, the Controller can also access a centralized database by means of a JDBC driver.

Read more in the second blog post! 

Find LINSTOR in our Github repository
