
Highly available LINSTOR Controller with Pacemaker

Part of the design of LINSTOR is that if the central LINSTOR Controller goes down, all the storage remains up and accessible. This should allow ample time to repair the downed system hosting the LINSTOR Controller. Still, in the majority of cases it is preferred to run the LINSTOR Controller in a container within your cloud or as a VM in your hypervisor platform. However, there may be situations where you want to keep the LINSTOR Controller up and highly available but do not have a container or VM platform in place to rely upon. For situations like this, we can easily leverage DRBD and the Pacemaker/Corosync stack.

If you are familiar with Pacemaker, setting up a clustered LINSTOR Controller should seem pretty straightforward. The only really tricky bit is that we first need to install LINSTOR to create the DRBD resource that will hold LINSTOR’s own database. Sounds a little chicken-and-egg, I know, but this allows LINSTOR to be aware of, and manage, all DRBD resources.

The example below uses only two nodes, but it could easily be adapted for more. Make sure to install both the LINSTOR Controller and LINSTOR Satellite software on both nodes. The instructions below are by no means a step-by-step guide, but rather just the “special sauce” needed for an HA LINSTOR Controller cluster.

If you are using the LINBIT-provided package repositories, an Ansible playbook is available to entirely automate the deployment of this cluster on a RHEL 7 or CentOS 7 system.

Create a DRBD resource for the LINSTOR database

We’ll name this resource linstordb and use the already configured pool0 storage pool.

# linstor resource-definition create linstordb
# linstor volume-definition create linstordb 250M
# linstor resource create linstora linstordb --storage-pool pool0
# linstor resource create linstorb linstordb --storage-pool pool0
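
Before moving the database onto the new device, it is worth confirming that the resource is healthy and UpToDate on both nodes. A quick check, using tools that are already part of any LINSTOR/DRBD 9 installation, could be:

# linstor resource list
# drbdadm status linstordb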

Stop the LINSTOR Controller and move the database to the DRBD device

Move the database temporarily, mount the DRBD device where LINSTOR expects the database, and move it back.

# systemctl stop linstor-controller
# rsync -avp /var/lib/linstor /tmp/
# mkfs.xfs /dev/drbd/by-res/linstordb/0
# rm -rf /var/lib/linstor/*
# mount /dev/drbd/by-res/linstordb/0 /var/lib/linstor
# rsync -avp /tmp/linstor/ /var/lib/linstor/

Cluster everything up in Pacemaker

Please note that we strongly encourage you to utilize tested and working STONITH in all Pacemaker clusters. This example omits it simply because these VMs did not have any fencing devices available.

primitive p_drbd_linstordb ocf:linbit:drbd \
        params drbd_resource=linstordb \
        op monitor interval=29 role=Master \
        op monitor interval=30 role=Slave \
        op start interval=0 timeout=240s \
        op stop interval=0 timeout=100s
primitive p_fs_linstordb Filesystem \
        params device="/dev/drbd/by-res/linstordb/0" directory="/var/lib/linstor" \
        op start interval=0 timeout=60s \
        op stop interval=0 timeout=100s \
        op monitor interval=20s timeout=40s
primitive p_linstor-controller systemd:linstor-controller \
        op start interval=0 timeout=100s \
        op stop interval=0 timeout=100s \
        op monitor interval=30s timeout=100s
ms ms_drbd_linstordb p_drbd_linstordb \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
group g_linstor p_fs_linstordb p_linstor-controller
order o_drbd_before_linstor inf: ms_drbd_linstordb:promote g_linstor:start
colocation c_linstor_with_drbd inf: g_linstor ms_drbd_linstordb:Master
property cib-bootstrap-options: \
        stonith-enabled=false \
        no-quorum-policy=ignore
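
Once the configuration is loaded, you can verify that Pacemaker has promoted the DRBD resource and started the controller on one of the nodes. A quick sanity check, using tools that ship with Pacemaker and LINSTOR, might look like this:

# crm_mon -1
# linstor node list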

We still usually advise leveraging the high-availability features already built into your cloud or VM platform if one is available, but if not, you can always use the configuration above to leverage Pacemaker and make your LINSTOR Controller highly available.

 

Devin Vance
First introduced to Linux back in 1996, and using Linux almost exclusively by 2005, Devin has years of Linux administration and systems engineering under his belt. He has been deploying and improving clusters with LINBIT since 2011. When not at the keyboard, you can usually find Devin wrenching on an American motorcycle or down at one of the local bowling alleys.

How to migrate manually created resources to LINSTOR

In many cases, existing DRBD resources that were created manually some time in the past can be migrated, so that LINSTOR can be used to manage those resources afterwards.

While resources managed by LINSTOR can coexist with resources that were created manually, it can often make sense to migrate such existing resources. When an existing resource is migrated manually, some settings in LINSTOR must be carefully adjusted to match the resource’s existing configuration.

Here is an overview of the migration process. It is, however, important to note that there are countless special cases, such as existing resources backed by different storage pools on different nodes or resources that use external metadata, where additional steps may be required for a successful migration.

This article describes the migration of an existing DRBD resource with a rather typical configuration.

Prerequisites

LINSTOR must be installed, configured and running already.

Sample configuration assumed by this article for the sample commands:

• 1 existing DRBD resource:
    • Name: legacy
    • TCP/IP port number: 15500
• 1 volume:
    • Volume number: 10
    • Minor number: 9100 (/dev/drbd9100)
    • LVM volume group: datastore
    • Logical volume name: legacy (/dev/datastore/legacy)
    • Logical volume size: 1,340 MiB
• 2 nodes:
    • Node eagle: node ID 4
    • Node rabbit: node ID 8
• Meta data type: internal
• Peer slots: 4

Create a resource definition for the existing resource

This may require renaming the resource in some cases, because LINSTOR’s naming rules are stricter than DRBD’s naming rules. The TCP/IP port number can be assigned automatically by LINSTOR if you don’t mind the change, and if a short disconnect/reconnect cycle during the migration is not going to cause problems. Otherwise, you can specify the port number manually to set it to the value that the resource is currently using.

 

resource-definition create legacy -p 15500

Adjust the DRBD peer slots count for the resource definition

If LINSTOR needs to create additional resources, or has to recreate DRBD meta data, the current peer count used by the resource’s volumes can be important, because it has a direct influence on the net size of the DRBD device that is created.

 

Note: If the peer count of a DRBD volume is unknown, it can be extracted from a meta data dump, which can be displayed using the drbdadm dump-md command (this requires the volume to be detached, or the resource to be stopped, and will normally require execution of the drbdadm apply-al command first).
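
For example, on a node where the legacy resource is stopped, the peer slot count can be read from the meta data dump along these lines (the exact field name in the dump output may vary between DRBD versions, so verify against your own dump):

drbdadm apply-al legacy
drbdadm dump-md legacy | grep -i peers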

 

resource-definition set-property legacy PeerSlotsNewResource 4

Create a volume definition

Manual resource creation typically involves specifying the size of the backing storage device, in this case a logical volume, and DRBD then uses whatever space is left after meta data creation. LINSTOR, in contrast, works with the net size of a DRBD volume.

To figure out the correct size to use for the LINSTOR volume definition, you can query the net size of the DRBD device using the blockdev utility:

blockdev --getsize64 /dev/drbd9100

LINSTOR stores volume sizes in KiB internally, and the value yielded by the blockdev utility is in bytes, so you will have to divide by 1024. There should be no remainder (modulo-division by 1024 should yield zero), unless you are dealing with a special case that might require additional steps to migrate successfully.

In the case of the sample configuration assumed by this article, a 1,340 MiB logical volume with DRBD meta data and a peer count of 4 will result in a net size of 1,371,956 KiB. This is the value to use for the volume definition in LINSTOR.
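
For the sample volume, the conversion can be done directly in the shell; the 1,404,882,944 byte figure below is simply the 1,371,956 KiB net size above multiplied by 1024:

blockdev --getsize64 /dev/drbd9100        # prints 1404882944
echo $(( 1404882944 / 1024 ))             # prints 1371956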

The volume number (10) and the minor number (9100) should also be specified manually, unless you want LINSTOR to allocate new numbers automatically, which will typically require a reconfiguration of the applications or filesystem entries that are using the existing DRBD resource.

volume-definition create legacy -n 10 -m 9100 1371956KiB

If, for some reason, the backing storage volume is significantly larger than what would be required to fit the net size reported by the DRBD volume, then there is an additional property that can be set on the volume definition:

volume-definition set-property legacy 10 AllowLargerVolumeSize true

Setting this property is not normally required. It should be set if the LINSTOR satellite presents an error regarding the backing storage volume’s size being larger than expected.

Set the name of the manually created backing storage volume

LINSTOR normally generates the name for the backing storage volume, but the name of that volume can be overridden by setting a property on the volume definition as long as the backing storage volume name is the same on each node.

 

volume-definition set-property legacy 10 OverrideVlmId legacy

Create resources from the resource definition

Finally, resources must be created from the resource definition in LINSTOR. When creating resources, the resource’s DRBD node ID on each node must match the node ID used by the manually created resource. You can look up the node-id in the existing, manually created DRBD resource configuration file.
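
For example, the node IDs can usually be read straight out of the existing configuration file before it is moved aside (the exact layout depends on how the file was originally written):

grep -E 'on |node-id' /etc/drbd.d/legacy.res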

Before the final step of creating LINSTOR resources from the resource definition can be executed, the existing manually created DRBD resource configuration file must be moved away, otherwise drbdadm will complain about duplicate definitions.

mv /etc/drbd.d/legacy.res /etc/drbd.d/legacy.res.disabled

It is a good idea to stop the LINSTOR satellites before creating the resources, otherwise the DRBD resource will disconnect from other nodes as the first resource is created, and will then reconnect as the other resources (on other nodes) are added.

By creating the resources first and starting the satellites after all resources have been created, the satellites will immediately configure all connections, thereby normally avoiding a disconnect/reconnect cycle.

resource create --node-id 4 -s fatpool eagle legacy
resource create --node-id 8 -s fatpool rabbit legacy

(If the satellites are currently stopped, add the --async parameter to the command line to avoid having the client wait for the creation of each resource, which would not take place if the satellite is offline.)
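
With satellites that are managed by systemd, the surrounding steps would typically look like this on each node (a sketch, assuming the standard linstor-satellite unit name):

systemctl stop linstor-satellite     # before creating the LINSTOR resources
systemctl start linstor-satellite    # after all resources have been created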

 

After finishing the migration, LINSTOR can manage the DRBD resource just like resources that were originally created by LINSTOR.

 

Robert Altnoeder
Robert joined the LINBIT development team in 2013. He had worked with DRBD at a startup company in the SaaS field before joining LINBIT. His current primary field of work is the architecture and implementation of LINSTOR, the cluster management component of LINBIT's SDS software.

 


OS driven Software-Defined Storage (SDS) with Linux

In the tech world, a “layered cake” approach allows IT shops to enable new software capabilities while minimizing disruption. Need a new feature? Add a new layer. But like “too much of a good thing,” this approach has caused the IT stack to grow over the years. Software layer bloat and complexity can lead to poor system performance.

Keeping all the layers of a tall stack version-tested and conflict-free can be both complex and costly.  This is especially the case when you deal with virtualization and containers. You want VM-native and container-native semantics and ease of use, but you also want speed, robustness, and maintainability. What’s the best path? Do you add a new layer, or add functionality to existing layers?

LINSTOR uses existing features

There is a better way. The idea is to fully leverage the functionality that already exists and has been put through its paces. So, before adding anything, users should see what features are natively supported in their environment. You can start with the OS kernel and work your way up. We went through this analysis when we built LINBIT SDS and LINSTOR.

It turns out there is a lot of SDS-type functionality already inside the Linux kernel and the OS: software RAID adds drive protection, LVM provides snapshots, and there is replication, deduplication, compression, SSD caching, and more. So there is no need to reinvent the wheel when building a storage system. But these efficient and reliable tools need to be presented in a VM-native or container-native way so that the software is easy to use and fits the way users interact with VMs or containers. This is where LINSTOR comes in.

Integrations with Kubernetes and OpenStack

LINSTOR is a Linux block device provisioner that lets you create storage pools, defined by which Linux block storage software you want to use. From these storage pools you can easily provision volumes, either through a command-line interface (CLI) or your favorite cloud/container front end (Kubernetes, OpenStack, etc.). Simply create your LINSTOR cluster, define your storage pool(s), and start creating resources where they’re needed, on demand.
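
As a rough sketch of what that flow looks like from the CLI (the node names, IP addresses, volume group, and sizes below are hypothetical placeholders):

linstor node create alpha 192.168.10.1
linstor node create bravo 192.168.10.2
linstor storage-pool create lvm alpha pool1 vg_ssd
linstor storage-pool create lvm bravo pool1 vg_ssd
linstor resource-definition create demovol
linstor volume-definition create demovol 20G
linstor resource create demovol --storage-pool pool1 --auto-place 2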

With LINSTOR, you can:

  •       Scale out by designating nodes as controllers or satellites
  •       Display usage statistics from participating nodes
  •       Define storage pools for different classes of storage
  •       Set security/encryption for volumes
  •       Deploy in Non-Converged or Hyper-Converged infrastructures
  •       Define redundancy for each volume
  •       etc.

In short: you can do everything a more complicated SDS product provides, but in a lighter weight, more maintainable, faster, and more reliable fashion.

Don’t just take our word for it. In a recent report titled, “Will the Operating System Kill SDS?“, Storage Switzerland covers this approach.

 

Try it out. Test it. Play with it. Commit code to it. Send us your feedback. As always, we’d love to hear from you about how you use the software so that we can make the software better in future revs. Our goal is the same as it has been for nearly 20 years. We are reducing the cost and complexity of enterprise storage by enabling customers to choose Linux for their back-end storage in any environment.

 

Building on open-source software and leveraging the capabilities in the Linux kernel results in lower TCO, always-on apps, and always-available data for enterprises everywhere.

Greg Eckert
In his role as the Director of Business Development for LINBIT America and Australia, Greg is responsible for building international relations, both in terms of technology and business collaboration. Since 2013, Greg has connected potential technology partners, collaborated with businesses in new territories, and explored opportunities for new joint ventures.