Posts

penguin-linux-linstor

How to migrate manually created resources to LINSTOR

In many cases, existing DRBD resources that were created manually some time in the past can be migrated, so that LINSTOR can be used to manage those resources afterwards.

While resources managed by LINSTOR can coexist with resources that were created manually, it can often make sense to migrate such existing resources. When resources are migrated to LINSTOR manually, there are some settings that must be carefully adjusted to match the resource’s existing configuration in LINSTOR.

Here is an overview of the migration process. It is, however, important to note that there are countless special cases, such as existing resources backed by different storage pools on different nodes or resources that use external metadata, where additional steps may be required for a successful migration.

This article describes the migration of an existing DRBD resource with a rather typical configuration.

Prerequisites

LINSTOR must be installed, configured and running already.

Sample configuration assumed by this article for the sample commands:

1 existing DRBD resource:

Name legacy

TCP/IP port number 15500

1 Volume

Volume number 10

Minor number 9100 (/dev/drbd9100)

LVM volume group datastore

Logical volume name legacy (/dev/datastore/legacy)

Logical volume size 1,340 MiB

2 Nodes

Node eagle Node ID 4

Node rabbit Node ID 8

Meta data type internal

Peer slots 4

Create a resource definition for the existing resource

This may require renaming the resource in some cases, because LINSTOR’s naming rules are stricter than DRBD’s naming rules. The TCP/IP port number can be assigned automatically by LINSTOR if you don’t mind the change, and if a short disconnect/reconnect cycle during the migration is not going to cause problems. Otherwise, you can specify the port number manually to set it to the value that the resource is currently using.

 

resource-definition create legacy -p 15500

Adjust the DRBD peer slots count for the resource definition

If LINSTOR needs to create additional resources, or has to recreate DRBD meta data, the current peer count used by the resource’s volumes can be important, because it has a direct influence on the net size of the DRBD device that is created.

 

Note: If the peer count of a DRBD volume is unknown, it can be extracted from a meta data dump, which can be displayed using the drbdadm dump-md command (this requires the volume to be detached, or the resource to be stopped, and will normally require execution of the drbdadm apply-al command first).

 

resource-definition set-property legacy PeerSlotsNewResource 4

Create a volume definition

While manual resource creation typically involves specifying the size of the backend storage device, in this case a logical volume, which will result in DRBD using whatever space is left after meta data creation, LINSTOR works with the net size of a DRBD volume.

To figure out the correct size to use for the LINSTOR volume definition, you can query the net size of the DRBD device using the blockdev utility:

blockdev –getsize64 /dev/drbd9100

LINSTOR stores volume sizes in kiB internally, and the value yielded by the blockdev utility is in bytes, so you will have to divide by 1024. There should be no rest (modulo-division by 1024 should yield zero), unless you are dealing with a special case that might require additional steps to migrate successfully.

In the case of the sample configuration assumed by this article, a 1,340 MiB logical volume with DRBD meta data with a peer count of 4 will result in a net size of 1,371,956 kiB. This is the value to use for the volume definition in LINSTOR.

The volume number (10) and the minor number (9100) should also be specified manually, unless you want LINSTOR to allocate new numbers automatically, which will typically require a reconfiguration of the applications or filesystem entries that are using the existing DRBD resource.

volume-definition create legacy -n 10 -m 9100 1371956KiB

If, for some reason, the backing storage volume is significantly larger than what would be required to fit the net size reported by the DRBD volume, then there is an additional property that can be set on the volume definition:

volume-definition set-property legacy 10 AllowLargerVolumeSize true

Setting this property is not normally required. It should be set if the LINSTOR satellite presents an error regarding the backing storage volume’s size being larger than expected.

Set the name of the manually created backing storage volume

LINSTOR normally generates the name for the backing storage volume, but the name of that volume can be overridden by setting a property on the volume definition as long as the backing storage volume name is the same on each node.

 

volume-definition set-property legacy 10 OverrideVlmId legacy

Create resources from the resource definition

Finally, resources must be created from the resource definition in LINSTOR. When creating resources, the resource’s DRBD node id on each node must match the node id used by the manually created resource. You can lookup the node-id in the existing, manually created DRBD resource configuration file.

Before the final step of creating LINSTOR resources from the resource definition can be executed, the existing manually created DRBD resource configuration file must be moved away, otherwise drbdadm will complain about duplicate definitions.

mv /etc/drbd.d/legacy.res /etc/drbd.d/legacy.res.disabled

It is a good idea to stop the LINSTOR satellites before creating the resources, otherwise the DRBD resource will disconnect from other nodes as the first resource is created, and will then reconnect as the other resources (on other nodes) are added.

By creating the resources first and starting the satellites after all resources have been created, the satellites will immediately configure all connections, thereby normally avoiding a disconnect/reconnect cycle.

resource create --node-id 4 -s fatpool power legacy

resource create --node-id 8 -s fatpool open legacy

(If the satellites are currently stopped, add the –async parameter to the command line to avoid having the client wait for the creation of each resource, which would not take place if the satellite is offline)

 

After finishing the migration, LINSTOR can manage the DRBD resource just like resources that were originally created by LINSTOR.

 

Robert Altnoeder on Linkedin
Robert Altnoeder
Robert joined the LINBIT development team in 2013. He had worked with
DRBD at a startup company in the SaaS field before joining LINBIT. His
current primary field of work is the architecture and implementation of
LINSTOR, the cluster management component of LINBIT's SDS software.

 

art-bridge-linstor-proxmox-plugin

How to setup LINSTOR on Proxmox VE

In this technical blog post, we show you how to integrate DRBD volumes in Proxmox VE via a storage plugin developed by LINBIT. The advantages of using DRBD include a configurable number of data replicas (e.g., 3 copies in a 5 node cluster), access to the data on every node and therefore very fast VM live-migrations (usually takes only a few seconds, depending on memory pressure). Download Linstor Proxmox Plugin

Setup

The rest of this post assumes that you have already set up Proxmox VE (the LINBIT example uses 4 nodes), and have created a PVE cluster consisting of all nodes. While this post is not meant to  replace the DRBD User’s Guide, we try to show a complete setup.

The setup consists of two important components:

  1. LINSTOR manages DRBD resource allocation
  2. linstor-proxmox plugin that implements the Proxmox VE storage plugin API and executes LINSTOR commands.

In order for the plugin to work, you must first create a LINSTOR cluster.

LINSTOR Cluster

We have assumed here that you have already set up the LINBIT Proxmox repository as described in the User’s guide. If you have not completed this set up, execute the following commands on all cluster nodes. First, we need the low-level infrastructure (i.e., the DRBD9 kernel module and drbd-utils):

apt install pve-headers
apt install drbd-dkms drbd-utils
rmmod drbd; modprobe drbd
grep -q drbd /etc/modules || echo "drbd" >> /etc/module

The next step is to install LINSTOR:

apt install linstor-controller linstor-satellite linstor-client
systemctl start linstor-satellite
systemctl enable linstor-satellite

Now, decide which of your hosts should be the current controller node and enable the linstor-controller service on that particular node only:

systemctl start linstor-controller

Volume creation

Obviously, DRBD needs storage to create volumes. In this post we assume a setup where all nodes contain an LVM-thinpool called drbdpool. In our sample setup, we created it on the pve volume group, but in your setup, you might have a different storage topology. On the node that runs the controller service, execute the following commands to add your nodes:

linstor node create alpha 10.0.0.1 --node-type Combined
linstor node create bravo 10.0.0.2 --node-type Combined
linstor node create charlie 10.0.0.3 --node-type Combined
linstor node create delta 10.0.0.4 --node-type Combined

“Combined” means that this node is allowed to execute a LINSTOR controller and/or a satellite, but a node does not have to execute both. So it is safe to specify “Combined”; it does not influence the performance or the number of services started.

The next step is to configure a storage pool definition. As described in the User’s guide, most LINSTOR objects consist of a “definition” and then concrete instances of such a definition:

linstor storage-pool-definition create drbdpool

By now it is time to mention that the LINSTOR client provides handy shortcuts for its sub-commands. The previous command could have been written as linstor spd c drbdpool. The next step is to register every node’s storage pool:

for n in alpha bravo charlie delta; do \
linstor storage-pool create $n drbdpool lvmthin pve/drbdpool; \
done

DRBD resource creation

After that we are ready to create our first real DRBD resource:

linstor resource-definition create first
linstor volume-definition create first 10M --storage-pool drbdpool
linstor resource create alpha first
linstor resource create bravo first

Now, check with drbdadm status that  “alpha” and “bravo” contain a replicated DRBD resource called “first”. After that this dummy resource can be deleted on all nodes by deleting its resource definition:

linstor resource-definition delete -q first

LINSTOR Proxmox VE Plugin Setup

As DRBD and LINSTOR are already set up, the only things missing is installing the plugin itself and its configuration.

apt install linstor-proxmox

The plugin is configured via the file /etc/pve/storage.cfg:

drbd: drbdstorage
content images, rootdir
redundancy 2 controller 10.0.0.1

It is not necessary to copy that file to the other nodes, as /etc/pve is already a replicated file system. After the configuration is done, you should restart the following service:

systemctl restart pvedaemon

After this setup is done, you are able to create virtual machines backed by DRBD from the GUI. To do so, select “drbdstorage” as storage in the “Hard Disk” section of the VM. LINSTOR selects the nodes that have the most free storage to create the replicated backing devices.

Distribution

The interested reader can check which ones were selected via LINSTOR resource list. While interesting, it is important to know that the storage can be accessed by all nodes in the cluster via a DRBD feature called “diskless clients”. So let’s assume “alpha” and “bravo” had the most free space and were selected, and the VM was created on node “bravo”. Via the low level tool drbdadm status we now see that the resource is created on two nodes (i.e., “alpha” and “bravo”) and the DRBD resource is in “Primary” role on “bravo”.

Now we want to migrate the VM from “bravo” to node “charlie”. This is again done via a few clicks in the GUI, but the interesting steps happen behind the scene: The storage plugin realizes that it has access to the data on “alpha” and “bravo” (our two replicas) but also needs access on “charlie” to execute the VM. The plugin therefore creates a diskless assignment on “charlie”. When you execute drbdadm status on “charlie”, you see that now three nodes are involved in the overall picture:

• Alpha with storage in Secondary role
• Bravo with storage in Secondary role
• Charlie as a diskless client in Primary role

Diskless clients are created (and deleted) on demand without further user interaction, besides moving around VMs in the GUI. This means that if you now move the VM back to “bravo”, the diskless assignment on “charlie” gets deleted as it is no longer needed.

If you would have moved the VM from “charlie” to “delta”, the diskless assignment for “charlie” would have been deleted, and a new one for “delta” would have been created.

For you it is probably even more interesting that all of this including VM migration happens within seconds without moving the actual replicated storage contents.

Next Steps

So far, we created a replicated and highly-available setup for our VMs, but the LINSTOR controller and especially its database are not highly-available. In a future blog post, we will describe how to make the controller itself highly-available by only using software already included in Proxmox VE (i.e., without introducing complex technologies like Pacemaker). This will be achieved with a dedicated controller VM that will be provided by LINBIT as an appliance.

Roland Kammerer
Software Engineer at Linbit
Roland Kammerer studied technical computer science at the Vienna University of Technology and graduated with distinction. Currently, he is a PhD candidate with a research focus on time-triggered realtime-systems and works for LINBIT in the DRBD development team.