In this technical blog post, we show you how to integrate DRBD volumes in Proxmox VE via a storage plugin developed by LINBIT. The advantages of using DRBD include a configurable number of data replicas (e.g., 3 copies in a 5-node cluster) and access to the data on every node, which enables very fast VM live migrations (usually only a few seconds, depending on memory pressure).
The rest of this post assumes that you have already set up Proxmox VE (the LINBIT example uses 4 nodes), and have created a PVE cluster consisting of all nodes. While this post is not meant to replace the DRBD User’s Guide, we try to show a complete setup.
The setup consists of two important components:
- LINSTOR, which manages DRBD resource allocation
- the linstor-proxmox plugin, which implements the Proxmox VE storage plugin API and executes LINSTOR commands
In order for the plugin to work, you must first create a LINSTOR cluster.
We assume that you have already set up the LINBIT Proxmox repository as described in the User’s Guide. With the repository in place, execute the following commands on all cluster nodes. First, we need the low-level infrastructure (i.e., the DRBD9 kernel module and drbd-utils):
apt install pve-headers
apt install drbd-dkms drbd-utils
rmmod drbd; modprobe drbd
grep -q drbd /etc/modules || echo "drbd" >> /etc/modules
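After the commands above, it can be worth confirming that the installed module really is DRBD 9 and not an older 8.x module shipped with the kernel. The following is a minimal sketch, not part of the original setup: the helper names `drbd_major` and `is_drbd9` are hypothetical, and the usage line assumes that `modinfo` reports a version field for the drbd module on your system.

```shell
#!/bin/sh
# Print the major version of a DRBD version string such as "9.0.14-1".
drbd_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed only if the given version string belongs to DRBD 9 or later.
is_drbd9() {
    major=$(drbd_major "$1")
    case "$major" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: treat as failure
    esac
    [ "$major" -ge 9 ]
}

# Usage on a node (assumption: modinfo exposes the module version):
#   is_drbd9 "$(modinfo -F version drbd)" || echo "drbd module is older than 9" >&2
```

Note that `modinfo` describes the module installed on disk; the `rmmod drbd; modprobe drbd` step is what makes the running kernel pick it up.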
The next step is to install LINSTOR:
apt install linstor-controller linstor-satellite linstor-client
systemctl start linstor-satellite
systemctl enable linstor-satellite
Now, decide which of your hosts should be the current controller node and start the linstor-controller service on that particular node only:
systemctl start linstor-controller
Obviously, DRBD needs storage to create volumes. In this post we assume a setup where all nodes contain an LVM-thinpool called drbdpool. In our sample setup, we created it on the pve volume group, but in your setup, you might have a different storage topology. On the node that runs the controller service, execute the following commands to add your nodes:
linstor node create alpha 10.0.0.1 --node-type Combined
linstor node create bravo 10.0.0.2 --node-type Combined
linstor node create charlie 10.0.0.3 --node-type Combined
linstor node create delta 10.0.0.4 --node-type Combined
“Combined” means that this node is allowed to execute a LINSTOR controller and/or a satellite, but a node does not have to execute both. So it is safe to specify “Combined”; it does not influence the performance or the number of services started.
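For larger clusters, the node registration commands can be generated from a list of name and address pairs instead of being typed one by one. This is only a sketch: `gen_node_create` is a hypothetical helper, and it prints the commands (a dry run) so that they can be reviewed before they are executed.

```shell
#!/bin/sh
# Read "name address" pairs from stdin and print one
# "linstor node create" command per pair. Nothing is executed here.
gen_node_create() {
    while read -r name addr; do
        [ -n "$name" ] || continue    # skip blank lines
        printf 'linstor node create %s %s --node-type Combined\n' "$name" "$addr"
    done
}

# Usage (review the output, then append "| sh" on the controller node):
#   printf 'alpha 10.0.0.1\nbravo 10.0.0.2\n' | gen_node_create
```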
The next step is to configure a storage pool definition. As described in the User’s guide, most LINSTOR objects consist of a “definition” and then concrete instances of such a definition:
linstor storage-pool-definition create drbdpool
This is a good point to mention that the LINSTOR client provides handy shortcuts for its sub-commands. The previous command could have been written as linstor spd c drbdpool. The next step is to register every node’s storage pool:
for n in alpha bravo charlie delta; do \
  linstor storage-pool create $n drbdpool lvmthin pve/drbdpool; \
done
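Since the loop above assumes that every node really has a thin pool named pve/drbdpool, a quick check on each node can catch a missing or mistyped pool early. The sketch below relies on the `lv_attr` field printed by `lvs`, whose first character is "t" for a thin pool; the helper name `is_thin_pool_attr` is hypothetical.

```shell
#!/bin/sh
# Succeed if an lv_attr string (as printed by lvs) describes a thin pool.
# The first attribute character is the volume type; "t" means thin pool.
is_thin_pool_attr() {
    attr=$(printf '%s' "$1" | tr -d ' \t\n')   # lvs pads its output with spaces
    case "$attr" in
        t*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Usage on each node, with the pool name used in this post:
#   is_thin_pool_attr "$(lvs --noheadings -o lv_attr pve/drbdpool)" \
#       || echo "pve/drbdpool is missing or not a thin pool" >&2
```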
DRBD resource creation
After that we are ready to create our first real DRBD resource:
linstor resource-definition create first
linstor volume-definition create first 10M --storage-pool drbdpool
linstor resource create alpha first
linstor resource create bravo first
Now, check with drbdadm status that “alpha” and “bravo” contain a replicated DRBD resource called “first”. After that this dummy resource can be deleted on all nodes by deleting its resource definition:
linstor resource-definition delete -q first
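The drbdadm status check on the test resource can also be scripted, which is handy when the check has to run on several nodes before the resource is deleted. The following sketch reads the status text from stdin and succeeds only if every disk and peer-disk state is UpToDate. The helper name `all_uptodate` is hypothetical, and the sample output in the comments is illustrative of the DRBD 9 format rather than copied from a real run.

```shell
#!/bin/sh
# Read "drbdadm status" output on stdin. Succeed only if at least one
# disk state was seen and every disk/peer-disk state is UpToDate.
all_uptodate() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^(peer-)?disk:/) {
                    seen++
                    if ($i !~ /:UpToDate$/) bad++
                }
            }
        }
        END { exit (seen > 0 && bad == 0) ? 0 : 1 }
    '
}

# Illustrative input for a resource replicated on two nodes:
#   first role:Secondary
#     disk:UpToDate
#     bravo role:Secondary
#       peer-disk:UpToDate
#
# Usage on a node that holds a replica:
#   drbdadm status first | all_uptodate && echo "first is fully replicated"
```

A resource with diskless peers would fail this check by design, so it fits the two-replica test resource but not the diskless client scenario described later.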
LINSTOR Proxmox VE Plugin Setup
As DRBD and LINSTOR are already set up, the only things missing are the installation of the plugin itself and its configuration.
apt install linstor-proxmox
The plugin is configured via the Proxmox VE storage configuration file in /etc/pve:
drbd: drbdstorage
   content images, rootdir
   redundancy 2
   controller 10.0.0.1
It is not necessary to copy that file to the other nodes, as /etc/pve is already a replicated file system. After the configuration is done, you should restart the following service:
systemctl restart pvedaemon
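A simple sanity check on the configuration is that the redundancy value does not exceed the number of nodes that provide storage (four in this post). The sketch below extracts the value from the drbd stanza and compares it with a node count. Both helper names are hypothetical, and the configuration text is read from stdin so that no file name is assumed.

```shell
#!/bin/sh
# Print the redundancy value of the first "drbd:" stanza read from stdin.
drbd_redundancy() {
    awk '
        /^drbd:/   { in_drbd = 1; next }
        /^[^ \t]/  { in_drbd = 0 }     # any other stanza header ends it
        in_drbd && $1 == "redundancy" { print $2; exit }
    '
}

# Succeed if the replica count is at least 1 and at most the node count.
# Usage: redundancy_fits <redundancy> <node-count>
redundancy_fits() {
    [ "$1" -ge 1 ] && [ "$1" -le "$2" ]
}

# Usage, feeding in the storage configuration text:
#   r=$(drbd_redundancy < /path/to/storage/config)
#   redundancy_fits "$r" 4 || echo "redundancy $r does not fit a 4-node cluster" >&2
```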
After this setup is done, you are able to create virtual machines backed by DRBD from the GUI. To do so, select “drbdstorage” as storage in the “Hard Disk” section of the VM. LINSTOR selects the nodes that have the most free storage to create the replicated backing devices.
The interested reader can check which ones were selected via linstor resource list. More importantly, the storage can be accessed by all nodes in the cluster via a DRBD feature called “diskless clients”. So let’s assume “alpha” and “bravo” had the most free space and were selected, and the VM was created on node “bravo”. Via the low-level tool drbdadm status we now see that the resource is created on two nodes (i.e., “alpha” and “bravo”) and the DRBD resource is in “Primary” role on “bravo”.
Now we want to migrate the VM from “bravo” to node “charlie”. This is again done via a few clicks in the GUI, but the interesting steps happen behind the scenes: the storage plugin realizes that it has access to the data on “alpha” and “bravo” (our two replicas) but also needs access on “charlie” to execute the VM. The plugin therefore creates a diskless assignment on “charlie”. When you execute drbdadm status on “charlie”, you see that now three nodes are involved in the overall picture:
• “alpha” with storage in Secondary role
• “bravo” with storage in Secondary role
• “charlie” as a diskless client in Primary role
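The picture above can be confirmed from a script by pulling the local role and disk state out of the drbdadm status output. This is a sketch under assumptions: `local_state` is a hypothetical helper, the resource name `vm-100-disk-1` is made up for illustration, and the sample text mirrors the DRBD 9 status layout, so real output may carry additional fields.

```shell
#!/bin/sh
# Read "drbdadm status <resource>" output on stdin and print the local
# node's role and disk state, e.g. "Primary Diskless".
local_state() {
    awk '
        NR == 1 { role = $2; sub(/^role:/, "", role) }
        disk == "" {
            # The first "disk:" field belongs to the local node;
            # peers are reported as "peer-disk:" and do not match.
            for (i = 1; i <= NF; i++)
                if ($i ~ /^disk:/) { disk = substr($i, 6); break }
        }
        END { print role, disk }
    '
}

# Illustrative input as it might look on "charlie" after the migration:
#   vm-100-disk-1 role:Primary
#     disk:Diskless
#     alpha role:Secondary
#       peer-disk:UpToDate
#     bravo role:Secondary
#       peer-disk:UpToDate
#
# Usage:
#   drbdadm status vm-100-disk-1 | local_state
```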
Diskless clients are created (and deleted) on demand without further user interaction, besides moving around VMs in the GUI. This means that if you now move the VM back to “bravo”, the diskless assignment on “charlie” gets deleted as it is no longer needed.
If you had moved the VM from “charlie” to “delta” instead, the diskless assignment for “charlie” would have been deleted, and a new one for “delta” would have been created.
Most importantly, all of this, including the VM migration, happens within seconds and without moving the actual replicated storage contents.
So far, we have created a replicated and highly available setup for our VMs, but the LINSTOR controller and especially its database are not highly available. In a future blog post, we will describe how to make the controller itself highly available by only using software already included in Proxmox VE (i.e., without introducing complex technologies like Pacemaker). This will be achieved with a dedicated controller VM that will be provided by LINBIT as an appliance.