
Replicating storage volumes on Scaleway ARM with LINSTOR

I’ve been using Scaleway for a while as a platform to spin up both personal and work machines, mainly because they’re good value and easy to use. Scaleway offers a wide selection of AArch64 and x86 machines at various price points; however, none of these VMs are replicated – not even with RAID at the hardware level – so you’re expected to handle that yourself. Since ARM servers have been making headlines for several years as a competing architecture to x86 in the data center, I thought it would be interesting to set up replication across two ARM Scaleway VMs with DRBD and LINSTOR.

It’s worth pointing out here that if you’re planning on building a production HA environment on Scaleway, you should also reach out to their support team and have them confirm that your replicated volumes aren’t actually sitting on the same physical disk – otherwise a single drive failure could take out both replicas – as advised in their FAQ.

Preparing VMs


First, we need a couple of VMs with additional storage volumes to replicate. The ARM64-2GB VM doesn’t allow for mounting additional volumes, so let’s go for the next one up, and add an additional 50GB LSSD volume.


I’ve gone with an Ubuntu image; if you selected an RPM-based image, substitute the package manager commands accordingly. Run the following commands on all VMs (in my case I have two, and will be using the first as both my controller and a satellite node):

$ sudo apt update && sudo apt upgrade

In this case we’ll be deploying DRBD nodes with LINSTOR. We need DRBD 9 to do this, but building its kernel module on Scaleway requires first fetching the prerequisite files for Scaleway’s custom kernel and preparing the system for an out-of-tree module build. Scaleway provides a recommended script for this – we need to save that script and run it before installing DRBD 9. I’ve put it in a file on GitHub to make things simple:

$ sudo apt install -y build-essential libssl-dev
$ wget https://raw.githubusercontent.com/dabukalam/scalewaycustommodule/master/scalewaycustommodule
$ chmod +x scalewaycustommodule && sudo ./scalewaycustommodule

Getting LINSTOR

Once that’s done, we can add the LINBIT community repository and install DRBD, LINSTOR, and LVM:

$ sudo add-apt-repository -y ppa:linbit/linbit-drbd9-stack
$ sudo apt update
$ sudo apt install drbd-dkms linstor-satellite linstor-client lvm2

Now I can start the LINSTOR satellite service with:

$ sudo systemctl enable --now linstor-satellite

And make sure the VMs can see each other by adding the other node to each node’s /etc/hosts file:

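For example, with the private IPs that appear later in this post (adjust to your own addresses), the entries could be added like this:

drbd-arm:~$ echo "10.10.25.5 drbd-arm-2" | sudo tee -a /etc/hosts
drbd-arm-2:~$ echo "10.10.43.13 drbd-arm" | sudo tee -a /etc/hosts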

Let’s make sure LVM is running and create a volume group for LINSTOR on our additional volume:

$ sudo systemctl enable --now lvm2-lvmetad.service
$ sudo systemctl enable --now lvm2-lvmetad.socket
$ sudo vgcreate sw_ssd /dev/vdb
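If you want to confirm the volume group was created before handing it to LINSTOR, the standard LVM tooling will show it:

$ sudo vgs sw_ssd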

That’s it for commands you need to run on both nodes. From now on we’ll be running commands on our favorite VM. LINSTOR has four node types – Controller, Auxiliary, Combined, and Satellite. Since I only have two nodes, one will be Combined, and one will be a Satellite. Combined here means that the node is both a Controller and a Satellite.

Adding nodes to the LINSTOR cluster

So on our favorite VM, which we’re going to use as the combined node, we add the local host to the LINSTOR cluster as a combined node, and the other as a satellite:

$ sudo apt install -y linstor-controller
$ sudo systemctl enable --now linstor-controller
$ linstor node create --node-type Combined drbd-arm 10.10.43.13
$ linstor node create --node-type Satellite drbd-arm-2 10.10.25.5
$ linstor node list

It’s worth noting here that you can run LINSTOR management commands on any node – just make sure you have the controller node exported as an environment variable:

drbd-arm-2:~$ export LS_CONTROLLERS=drbd-arm

Both nodes should now show up in the linstor node list output, with the satellites reported as online.


Now that we have our LINSTOR cluster set up, we can create a storage pool with the same name, ‘swpool’, on each node, referencing the node name, the lvm driver, and the volume group name:

$ linstor storage-pool create drbd-arm swpool lvm sw_ssd
$ linstor storage-pool create drbd-arm-2 swpool lvm sw_ssd

We can then define new resource and volume definitions, and use them to create the resource. You can perform a whole range of operations at this point, including manual node placement and specifying storage pools (a manual-placement sketch follows the commands below). Since we only have one storage pool, LINSTOR will automatically select it for us. I only have two nodes, so I’ll just autoplace my resource across both:

$ linstor resource-definition create backups
$ linstor volume-definition create backups 40G
$ linstor resource create backups --auto-place 2
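For reference, if you would rather place the resource by hand than use --auto-place, explicitly creating it on each node against the storage pool created above might look roughly like this (a sketch; option names can vary slightly between LINSTOR client versions):

$ linstor resource create drbd-arm backups --storage-pool swpool
$ linstor resource create drbd-arm-2 backups --storage-pool swpool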

Either way, LINSTOR will now handle all the resource creation automagically across our nodes, including dealing with LVM and DRBD. If all succeeds, you should now be able to see your resources; they’ll show as Inconsistent while DRBD syncs them up. You can also watch the DRBD resources by running drbdmon – once syncing has finished, it will list your replication peers as up to date (only drbd-arm-2 in my case).

You can now mount the device on any of the nodes and write to your new replicated storage cluster. To find the device name, list the volumes:

$ linstor resource list-volumes


In this case the device name is /dev/drbd1000, so once we create a filesystem on it and mount it, we can write to our new replicated storage cluster.

$ sudo mkfs /dev/drbd1000
$ sudo mount /dev/drbd1000 /mnt
$ sudo touch /mnt/file
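To convince yourself the data really is replicated, one quick check (a sketch relying on DRBD 9’s auto-promote behaviour, since with the default configuration only one node can have the volume mounted at a time) is to unmount on the first node and mount the same DRBD device on the second:

drbd-arm:~$ sudo umount /mnt
drbd-arm-2:~$ sudo mount /dev/drbd1000 /mnt
drbd-arm-2:~$ ls /mnt    # the file created on drbd-arm should show up here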

 

 

Danny Abukalam
Danny is a Solutions Architect at LINBIT based in Manchester, UK. He works in conjunction with the sales team to support customers with LINBIT's products and services. Danny has been active in the OpenStack community for a few years, organising events in the UK including the Manchester OpenStack Meetup and OpenStack Days UK. In his free time, Danny likes hunting for extremely hoppy IPAs and skiing, not at the same time.

A Highly Available LINSTOR Controller for Proxmox

For the High Availability setup we describe in this blog post, we assume that you installed LINSTOR and the Proxmox Plugin as described in the Proxmox section of the user’s guide or our blog post.

The idea is to execute the LINSTOR controller within a VM that is controlled by Proxmox and its HA features, where the storage resides on DRBD, managed by LINSTOR itself.

Preparing the Storage

The first step is to allocate storage for the VM: create a VM in Proxmox and select “Do not use any media” in the “OS” section. The hard disk should reside on DRBD (e.g., “drbdstorage”). Disk space should be at least 2GB, and for RAM we chose 1GB. These are the minimal requirements for the appliance LINBIT provides to its customers (see below). If you set up your own controller VM, or resources are not constrained, increase these minimal values. In the following, we assume that the controller VM was created with ID 100, but it is fine if this VM is created later (after you have already created other VMs).

LINSTOR Controller Appliance

LINBIT provides an appliance for its customers that can be used to populate the created storage. For the appliance to work, we first create a “Serial Port”: click on “Hardware”, then on “Add”, and finally on “Serial Port”.


If everything worked as expected, the serial port should now show up in the VM’s hardware definition.


The next step is to copy the VM appliance to the created storage. This can be done with qemu-img. Make sure to replace the VM ID with the correct one:

# qemu-img dd -O raw if=/tmp/linbit-linstor-controller-amd64.img \
 of=/dev/drbd/by-res/vm-100-disk-1/0

After that, you can start the VM and connect to it via the Proxmox VNC viewer. The default user name and password are both “linbit”. Note that we kept the defaults for SSH, so you will not be able to log in to the VM via SSH with a username/password. If you want to enable that (and/or “root” login), enable these settings in /etc/ssh/sshd_config and restart the ssh service. As this VM is based on “Ubuntu Bionic”, you should change your network settings (e.g., a static IP) in /etc/netplan/config.yaml. After that you should be able to ssh to the VM.

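For illustration, a minimal static-IP configuration in /etc/netplan/config.yaml could look like the sketch below; the interface name, gateway, and DNS server are placeholders to replace with your own values (only the controller IP 10.43.7.254 is taken from this setup). Apply it with netplan apply.

network:
  version: 2
  ethernets:
    ens18:                       # placeholder interface name
      addresses: [10.43.7.254/24]
      gateway4: 10.43.7.1        # placeholder gateway
      nameservers:
        addresses: [10.43.7.1]   # placeholder DNS server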

Adding the Controller VM to the existing Cluster

In the next step, you add the controller VM to the existing cluster:

# linstor node create --node-type Controller \
 linstor-controller 10.43.7.254

As this special VM will not be managed by the Proxmox Plugin, make sure all hosts have access to that VM’s storage. In our test cluster, we checked the linstor resource list to confirm where the storage was already deployed and then created further assignments via linstor resource create. In our lab consisting of four nodes, we made all resource assignments diskful, but diskless assignments are fine as well. As a rule of thumb, keep the redundancy count at “3” (more usually does not make sense), and assign the rest diskless.
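For example, assuming a node named pve-d that should get a diskless assignment of the controller VM’s volume (the node name is hypothetical, and the exact flag may differ between LINSTOR client versions), that could look roughly like this:

# linstor resource create pve-d vm-100-disk-1 --diskless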

As the storage for this particular VM has to be made available (i.e., drbdadm up), enable the drbd.service on all nodes:

# systemctl enable drbd
# systemctl start drbd

At startup, the linstor-satellite service deletes all of its resource files (*.res) and regenerates them. This conflicts with the drbd service, which needs these resource files to start the controller VM. It is good enough to first bring up the resources via drbd.service and then start linstor-satellite.service. To make the necessary changes, create a drop-in for the linstor-satellite.service via systemctl (do not edit the unit file directly):

# systemctl edit linstor-satellite

Then add the following lines to the drop-in and save it:

[Unit]
After=drbd.service

Switching to the New Controller

Now, it is time for the final steps — namely switching from the existing controller to the new one in the VM. Stop the old controller service on the old host, and copy the LINSTOR controller database to the VM:

# systemctl stop linstor-controller
# systemctl disable linstor-controller
# scp /var/lib/linstor/* root@10.43.7.254:/var/lib/linstor/

Finally, we can enable the controller in the VM:

# systemctl start linstor-controller # in the VM
# systemctl enable linstor-controller # in the VM

To check if everything worked as expected, you can query the cluster nodes on a host by asking the controller in the VM: linstor --controllers=10.43.7.254 node list. It is perfectly fine that the controller (which is just a controller and not “combined”) is shown as “OFFLINE”. Still, this might change in the future to something more appropriate.

As the last – but crucial – step, you need to add the “controllervm” option to /etc/pve/storage.cfg, and change the controller IP:

drbd: drbdstorage
  content images,rootdir
  redundancy 3
  controller 10.43.7.254
  controllervm 100

By setting the “controllervm” parameter, the plugin will ignore (or handle specially) actions on the controller VM. Basically, this VM should not be managed by the plugin, so the plugin ignores essentially all actions on the given controller VM ID. However, there is one exception: when you delete the VM in the GUI, it is removed from the GUI, and we did not find a way to intercept that request so that the VM stays visible there. Such requests are still ignored by the plugin, though, so the VM is not deleted from the LINSTOR cluster, and it is therefore possible to later create a VM with the ID of the old controller. The plugin will just return “OK”, and the old VM with the old data can be used again. To keep it simple: be careful not to delete the controller VM.

Enabling HA for the Controller VM in Proxmox

Currently, we have the controller running as a VM, but we should make sure that one instance of it is started at all times. For that we use Proxmox’s HA feature. Click on the VM, then on “More”, and then on “Manage HA”, and set the HA parameters for the controller VM there.

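If you prefer the command line over the GUI, a roughly equivalent sketch using Proxmox’s ha-manager would be the following; the restart and relocate values are illustrative, not necessarily the ones we used in our lab:

# ha-manager add vm:100 --state started --max_restart 1 --max_relocate 1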

Final Considerations

As long as there are surviving nodes in your Proxmox cluster, everything should be fine. In case the node hosting the controller VM is shut down or lost, Proxmox HA will make sure the controller is started on another host. The IP of the controller VM should not change. It is up to you as admin to make sure this is the case (e.g., setting a static IP, or always providing the same IP via dhcp on the bridged interface).

One limitation that is not fully handled with this setup is a total cluster outage (e.g., a common power supply failure) with a restart of all cluster nodes. Proxmox is unfortunately pretty limited in that regard. You can enable the “HA Feature” for a VM, and you can define “Start and Shutdown Order” constraints, but the two are completely separate from each other. Therefore it is difficult to guarantee that the controller VM is up before all other VMs are started.

It might be possible to work around that by delaying VM startup in the Proxmox plugin until the controller VM is up (i.e., if the plugin is asked to start the controller VM it does so; otherwise it waits and pings the controller). While this is a nice idea, it would break down in a serialized, non-concurrent VM start/plugin call event stream: if some other VM is scheduled to start (and then blocks) before the controller VM, the result is a deadlock.

We will discuss options with Proxmox, but we think the presented solution is valuable in typical use cases as is, especially compared to the complexity of a Pacemaker setup. Use cases where the whole cluster is not expected to go down at the same time are covered. And even in a total outage, it is only the automatic startup of the VMs that does not work when the whole cluster is restarted. In such a scenario, the admin just has to wait until the Proxmox HA service starts the controller VM; after that, all VMs can be started manually or by script on the command line.

Roland Kammerer
Software Engineer at LINBIT
Roland Kammerer studied technical computer science at the Vienna University of Technology and graduated with distinction. Currently, he is a PhD candidate with a research focus on time-triggered real-time systems and works for LINBIT in the DRBD development team.

 

LINBIT Supports High Availability for Microsoft Azure and Amazon Web Services Users

Proven Open-Source High Availability For Leading Cloud Platforms

Beaverton, OR, August 14, 2018 – LINBIT, the pioneer in open source High Availability (HA), Disaster Recovery (DR), and Software-Defined Storage (SDS), today announced the general availability of LINBIT HA software on Microsoft Azure and Amazon Web Services (AWS) Elastic Compute Cloud (EC2). Protecting block data and creating always-on applications in the cloud is a growing need, but it is not automatic and requires software infrastructure tools such as those provided by LINBIT. The public cloud services market is projected to grow 21.4 percent in 2018, and LINBIT HA fills a critical need in that market by supporting cloud users’ storage availability requirements without installing proprietary software.

Typical HA offerings from cloud providers generally only recover from failures at the virtual machine level. LINBIT HA goes beyond that to recover applications from any failure that impacts application execution, including failures in the application itself. This makes LINBIT’s open-source software compelling for both cloud providers and cloud end-users.

“Microsoft has worked closely with LINBIT and is excited that it has expanded its technology to support High Availability needs for Azure Linux customers,” said Hosung Song, Senior Software Engineer, working in Azure Compute, Open Source Software Workloads Innovation Team.

“LINBIT software supports always-on applications and always-on data in OS-native and VM-native, and now Cloud-native environments,” said Brian Hellman, COO of LINBIT. “We are excited to support high availability for end-users of both Azure and EC2.”

LINBIT has created a new tech guide and video which demonstrate how to cluster any Linux distribution in Azure using the LINBIT DRBD software, which has been a part of the Linux kernel for almost a decade. With a few simple clicks, users can create and configure an HA NFS cluster using Microsoft’s quickstart template.

“Digital transformation means the kind of availability and scalability that can keep up with increasing storage demands,” said Phillip Reisner, CEO of LINBIT. “With more than 1.7 million downloads of the DRBD software, LINBIT is proud to deliver robust open source solutions that are imperative to keeping the digital world running.”

About LINBIT
LINBIT is the force behind DRBD and the de facto open standard for High Availability (HA) software for enterprise and cloud computing. The LINBIT DRBD software is deployed in millions of mission-critical environments worldwide to provide High Availability (HA), Geo Clustering for Disaster Recovery (DR), and Software Defined Storage (SDS) for OpenStack and OpenNebula based clouds.

https://www.gartner.com/newsroom/id/3871416


The advantage of separate control and data planes

Many storage systems have a monolithic design that combines the control plane and the data plane into a single application and a single protocol, but LINBIT’s more modular solution comes with a number of advantages.

What is a control plane or a data plane?

The most important task that any storage system must perform is providing access to the storage volumes that are used for various workloads, for example, databases, file servers or virtualization environments. This is what we refer to as the data plane – all the components that are necessary to actually get data from the storage to the user and from the user to the storage.

Another task is the management of the configuration of storage volumes, which is what we refer to as the control plane. With the rise of more dynamic systems like containerization, virtualization and cloud environments, and the associated software-defined storage systems, where storage volumes are frequently reconfigured, this task is becoming increasingly important.


Why it is important: Availability

If you need to shut down part of your infrastructure, because you are updating hardware for instance, it is important that the most fundamental services remain available. Storage is probably one of those fundamental services, since most of the other systems rely on it.

A storage system with a modular design that provides independent control and data planes brings your infrastructure one step closer to high availability.

Independent control and data plane

Many storage systems can only provide access to storage volumes if all of their subsystems are online. The design may even be completely monolithic, so that the management functions and the storage access functions are contained within a single application that uses a single network protocol.

In LINBIT’s DRBD-based storage systems, only the most fundamental control plane functions are tightly coupled with the data plane and the operation of storage volumes. High-level control functions, like managing storage volumes and their configuration, managing cluster nodes, or providing automatic selection of cluster nodes for the creation of storage volumes, are provided by the LINSTOR storage management software. These two components, DRBD and LINSTOR, are fundamentally independent of each other.

DRBD storage volumes, even those that are managed by LINSTOR, remain accessible even if the LINSTOR software is unavailable. This means that the LINSTOR software can be shut down, restarted or upgraded while users retain access to existing storage volumes. While it is less useful, the same is even true the other way around: a LINSTOR controller that does not rely on storage provided by DRBD will continue to service storage management requests even if the storage system itself is unavailable. The changed configuration will simply be applied whenever the actual storage system is online again.
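As a quick illustration of the first point, on a satellite node with a LINSTOR-managed DRBD resource (the resource and device names below are placeholders), you could stop the LINSTOR services and observe that the volume stays usable:

# systemctl stop linstor-satellite        # and linstor-controller, where it runs
# drbdadm status my_resource              # the DRBD resource is still up
# mount /dev/drbd1000 /mnt                # data on the existing volume stays accessible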

Robert Altnoeder
Robert joined the LINBIT development team in 2013. He had worked with DRBD at a startup company in the SaaS field before joining LINBIT. His current primary field of work is the architecture and implementation of LINSTOR, the cluster management component of LINBIT's SDS software.