Migrating LVM thin {pools,volumes,snapshots} to a software RAID environment

As part of Octeron's (my VM server) migration from Debian "Wheezy" 7.9 to Debian "Jessie" 8.2 I analysed and refined my typical VM lifecycle workflow (i.e. creation/importing/cloning, usage, snapshot(s), exporting, deletion) in an effort to improve VM flexibility, maintainability, and resilience. By starting anew I was able to examine the various limitations imposed by my previous VMM (Virtual Machine Management) environment and consider viable alternatives that would better suit my needs.

Intro

My original Debian "Wheezy" 7.9 deployment employed the versatile QCOW2 virtual disk image format for housing all my VMs. The virtual disk images themselves resided upon a performance-tuned Ext4 file system (writeback caching, no barriers, etc.). This Ext4 file system was situated upon an LVM-based Logical Volume which, in turn, resided entirely within a large GPT-based partition sat atop a 250GB Samsung 840 SSD. VMs that didn't require SSD speeds and/or had notably larger space requirements would be stored on the WD Red 3TB HDD.

While I enjoyed the numerous benefits QCOW2 offered, such as saving system state, taking internal snapshots, and utilising backing disks for space-efficient clones of desired VMs (a.k.a. external snapshots), I found that my original Ext4 + LVM deployment wasn't suited to my planned growth (i.e. adding SSDs) with Octeron. While it was feasible to extend the Volume Group across newly added SSDs, grow the Logical Volume, and ultimately expand the performance-centric Ext4 filesystem, doing so wouldn't provide any resilience against disk failures or any potential performance increases. Instead these benefits would come from employing software/hardware RAID or altering LVM to perform striping/mirroring on the existing deployment.

One big "irk" however that drove me to investigating, and ultimately deploying LVM Thin pools/volumes/snapshots, was a particular external snapshot chaining limitation I found from using QCOW2 virtual disk images. Kashyap Chamarthy, an OSS developer focusing on improving the Virtualisation stack within Linux for Red Hat, explains the concept of an "External Snapshot" from the QCOW2 disk format perspective (source):

"external snapshots are a type of snapshots where, there’s a base image(which is the original disk image), and then its difference/delta (aka, the snapshot image) is stored in a new QCOW2 file. Once the snapshot is taken, the original disk image will be in a ‘read-only’ state, which can be used as backing file for other guests."

In the past I would have a minimal installation (i.e. no options chosen at the tasksel prompt) of a Debian VM serve as a "base image", providing a means of creating space-efficient clones for various Debian-driven VM environments. The main issue I encountered was that over time the base image would become out of date; consequently, utilising this stale image for new VMs would require updating the APT cache in tandem with an upgrade of an ever-growing set of outdated packages within each thinly provisioned VM.
A combination of old VM environments that required an out-of-date base image as their "backing disk", and my own futile attempts at regularly creating a new, up-to-date Debian base image (by performing a "full-fat" clone of the original base disk and updating it) for future VMs to use as a backing disk, resulted in a messy environment that was neither sustainable nor scalable for future larger-scale experimentation.
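
To make the arrangement above concrete, this is roughly how such a QCOW2 external snapshot chain is created with qemu-img (the file names here are purely illustrative):
qemu-img create -f qcow2 -b debian-base.qcow2 project-vm1.qcow2
qemu-img info project-vm1.qcow2
The second command simply confirms which backing file the new overlay points at; the base image must then remain untouched for the overlay to stay consistent.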

Ultimately I wanted a space-efficient solution that allowed a VM serving as a base disk to keep itself updated without adversely affecting other "snapshot" derived VMs. In addition to this, I wanted to be able to employ recursive snapshots when necessary in a manner similar to chaining external snapshots with QCOW2 images. Unlike chained QCOW2 images, however, I desired the ability to remove a virtual disk image from an arbitrarily long chain without rendering all "child" snapshots (from the removed snapshot's perspective) nonoperational.

I found that the capabilities offered by LVM Thin pools/volumes/snapshots adequately resolved my issues and therefore set about configuring my VM server to properly accommodate this type of storage backend.
Nevertheless my investigation and ultimate deployment of an LVM Thin storage backend revealed some additional complexities and restrictions when compared to the traditional QCOW2 approach. I've outlined these "disadvantages" in a dedicated section at the end of this guide so as to provide a complete picture of its usage in a virtualisation storage role and whether it is suitable in your environment.

Given that the LVM "Thin" environment was being considered as a replacement for my limited QCOW2 workflow on Octeron I decided to simulate the addition of future SSDs within a Debian Jessie VM. This guide therefore serves as a walkthrough for those wishing to relocate their LVM "Thin" environment to a software RAID (mdadm) storage target.
Please note that while this guide provides the example of a software RAID environment being the migration target, this need not be the restoration endpoint. The exportation process permits restoration onto various storage backends.

As usual I have included a TL;DR section at the bottom of this page for those wanting to skip the reasoning of the commands used at each stage.

Note: For brevity's sake this guide does not demonstrate the necessary creation steps for Thin pools/volumes/snapshots. Such guides are in abundance on the web (e.g. here, here, and here) as well as being outlined in sufficient depth within the respective man pages (e.g. man lvmthin).

Important Note: A bug was discovered when attempting to import a previously exported LVM Thin based environment on top of a RAID 5 environment. This appears to only affect the Jessie kernel (linux-image-3.16.0-4-amd64 -> 3.16.7-ckt11-1+deb8u3) but results in the inability to write to any of the imported LVM Thin volumes/snapshots. This issue does not appear to affect the Debian Stretch kernel (linux-image-4.2.0-1-amd64 -> 4.2.6-3) and above however.

Packages used

lvm2: 2.02.111-2.2
thin-provisioning-tools: 0.3.2-1
dmsetup: 2:1.02.09-2.2
mdadm: 3.3.2-5
kpartx: 0.5.0-6+deb8u2
gdisk: 0.8.10-2
linux-image-4.2.0-1-amd64: 4.2.6-3

LVM Thin {Pools,Volumes,Snapshots}

A brief summary of each component that makes up LVM's Thin provisioning offering is outlined here so as to give the reader a high level understanding of each component and how they interact with one another. A special mention should be given to the Gentoo Wiki writers/contributors, from whom I have derived (and in some cases directly copied) the succinctly summarised meanings of the components.

  • Thin Pool: "A special type of logical volume, which itself can host logical volumes." (Gentoo Wiki). Upon creation it dictates the total number of extents that can be consumed by hosted thin volumes/snapshots. If all available extents in a pool are consumed then "any process that would cause the thin pool to allocate more (unavailable) extents will be stuck in "killable sleep" state until either the thin pool is extended or the process receives SIGKILL" (Gentoo Wiki)
    Note: A thin pool can be grown (online/offline) but not shrunk. (source)

  • Thin Volume: "Thin volumes are to block devices what sparse files are to file systems" (Gentoo Wiki). Analogous to sparse files, blocks (extents in this case) are assigned when requested as opposed to being preallocated upon creation; thin volumes consequently adhere to the "over-committed" storage principle. This inherent characteristic means that thin volumes can be allocated more extents than are presently available in the thin pool. (source)
    Note: A thin volume can be grown (online/offline) and shrunk (online/offline) as desired. (source)

  • Thin Snapshot: Operates on the traditional COW (Copy-On-Write) behaviour, similar to traditional LVM writable snapshots, with the additional functionality of arbitrarily deep chaining of thin snapshots. If the base thin volume (origin) is removed from the thin pool the snapshot will automatically transition to a standalone thin volume. (source)
    Note: A thin snapshot is not explicitly assigned a size upon creation (as is possible with traditional LVM snapshots) as by design it will always be the same size as the thin volume it is snapshotting. (source)

For those wishing to learn more about LVM and LVM's Thin provisioning components I recommend reading the freely available RHEL 7 LVM administration documentation (here) in addition to the comprehensive Gentoo Wiki (here).
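
Purely for orientation (creation itself is outside the scope of this guide), a minimal creation sequence for the three components might look something like the following, assuming a Volume Group named vms already exists; exact option spellings can vary between lvm2 releases:
sudo lvcreate --type thin-pool --size 8G --name thinpool vms
sudo lvcreate --thin --virtualsize 2G --name thinvol1 vms/thinpool
sudo lvcreate --snapshot --name thinvol1_snap0 vms/thinvol1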

Disk array configuration

For the purpose of this guide the disk layout utilised inside the Debian GNU/Linux "Jessie" VM roughly mimics that of the next "step" in my planned storage expansion within Octeron.

  • 4 * 20GiB virtual disks are attached to a Debian GNU/Linux Jessie VM via the paravirtualised virtio bus so as to leverage the increased I/O performance and reduced CPU overhead offered by the KVM hypervisor.

  • 2 of the virtual disks: /dev/vdc and /dev/vdd are to be considered minified, virtual equivalents of the additional SSDs planned for eventual inclusion within Octeron. They share exactly the same GPT table as the first enumerated virtual disk (sgdisk was used for backing up and restoring the GPT partition table; a short sketch of this follows the list below).

  • The first enumerated virtual disk /dev/vda contains a 10GiB partition (/dev/vda3) where an example LVM "Thin" environment has been configured. This simple environment should be considered the starting point for this guide.

  • The LVM "Thin" environment, outlined in more detail below, contains:
    1 * LVM Thin Pool.
    3 * LVM Thin Volumes
    1 * LVM Thin Snapshot

  • The LVM "Thin Pool" actually consists of three distinct LVM Volumes:

    1. A Thin Pool data volume /dev/mapper/vms-thinpool_data that stores all "standard" data for both Thin Volumes and Thin Snapshots.

    2. A Thin Pool metadata volume /dev/mapper/vms-thinpool_tmeta that keeps track of block changes between Thin volumes and their respective derivative Thin snapshots. The greater the difference (i.e. block changes) between a Thin Volume and its subsequent Thin Snapshot(s) the more metadata information required for tracking said differences.

    3. A "spare" Thin Pool metadata volume (no /dev node as the spare metadata volume is not 'activated' and subsequently exposed in the same manner as a typical Logical Volume) that is automatically created (unless explicitly specified otherwise) during Thin Pool creation. It provides the means for recovering a Thin Pool should the main Thin Pool's metadata volume become corrupted/damaged.

  • 3 * Thin Volumes /dev/mapper/vms-thinvol{1,2,3} and 1 * Thin Snapshot /dev/mapper/vms-thinvol1_snap0 (an aptly named Thin snapshot of the Thin volume vms-thinvol1). Each Thin Volume and Thin Snapshot contains a GPT partition table which in turn contains a single Ext4 formatted partition. Debian GNU/Linux exposes such a structure as: /dev/mapper/vms-thinvol{1,2,3}p1 and /dev/mapper/vms-thinvol1_snap0p1 respectively.

  • Finally, the second enumerated virtual disk /dev/vdb mimics Octeron's 3TB WD Red drive and has been simplified to a single Ext4 formatted partition. This particular mount, /mnt/storage, will store the necessary LVM "Thin" environment data required for eventual restoration on the targeted software RAID environment.
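
As referenced above, replicating the GPT table from /dev/vda onto /dev/vdc and /dev/vdd can be done along these lines with sgdisk (a sketch only; note that the copied disk should be given fresh GUIDs afterwards):
sudo sgdisk --backup=/tmp/vda-gpt.bak /dev/vda
sudo sgdisk --load-backup=/tmp/vda-gpt.bak /dev/vdc
sudo sgdisk --randomize-guids /dev/vdc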

In an effort to help illustrate the aforementioned environment I've included the console output (below) for the block device listings (lsblk) and the Logical Volume setup (lvs).

[email protected]:~$ lsblk  
NAME                           MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT  
sr0                             11:0    1 1024M  0 rom  
vda                            254:0    0   20G  0 disk  
|-vda1                         254:1    0    1M  0 part  
|-vda2                         254:2    0   10G  0 part  
| `-md0                          9:0    0   10G  0 raid1 /
`-vda3                         254:3    0   10G  0 part  
  |-vms-thinpool_tmeta         252:0    0    8M  0 lvm   
  | `-vms-thinpool-tpool       252:2    0    8G  0 lvm   
  |   |-vms-thinpool           252:3    0    8G  0 lvm   
  |   |-vms-thinvol1           252:4    0    2G  0 lvm   
  |   | `-vms-thinvol1p1       252:9    0    2G  0 part  
  |   |-vms-thinvol2           252:5    0    2G  0 lvm   
  |   | `-vms-thinvol2p1       252:8    0    2G  0 part  
  |   |-vms-thinvol3           252:6    0    2G  0 lvm   
  |   | `-vms-thinvol3p1       252:7    0    2G  0 part  
  |   |-vms-thinpool-tpool1    252:10   0    2G  0 part  
  |   `-vms-thinvol1_snap0     252:12   0    2G  0 lvm   
  |     `-vms-thinvol1_snap0p1 252:13   0    2G  0 part  
  `-vms-thinpool_tdata         252:1    0    8G  0 lvm   
    |-vms-thinpool-tpool       252:2    0    8G  0 lvm   
    | |-vms-thinpool           252:3    0    8G  0 lvm   
    | |-vms-thinvol1           252:4    0    2G  0 lvm   
    | | `-vms-thinvol1p1       252:9    0    2G  0 part  
    | |-vms-thinvol2           252:5    0    2G  0 lvm   
    | | `-vms-thinvol2p1       252:8    0    2G  0 part  
    | |-vms-thinvol3           252:6    0    2G  0 lvm   
    | | `-vms-thinvol3p1       252:7    0    2G  0 part  
    | |-vms-thinpool-tpool1    252:10   0    2G  0 part  
    | `-vms-thinvol1_snap0     252:12   0    2G  0 lvm   
    |   `-vms-thinvol1_snap0p1 252:13   0    2G  0 part  
    `-vms-thinpool_tdata1      252:11   0    2G  0 part
vdb                            254:16   0   20G  0 disk  
`-vdb1                         254:17   0   20G  0 part  /mnt/storage
vdc                            254:32   0   20G  0 disk  
|-vdc1                         254:33   0    1M  0 part  
|-vdc2                         254:34   0   10G  0 part  
| `-md0                          9:0    0   10G  0 raid1 /
`-vdc3                         254:35   0   10G  0 part  
vdd                            254:48   0   20G  0 disk  
|-vdd1                         254:49   0    1M  0 part  
|-vdd2                         254:50   0   10G  0 part  
| `-md0                          9:0    0   10G  0 raid1 /
`-vdd3                         254:51   0   10G  0 part  
[email protected]:~$ sudo lvs --all --options lv_name,vg_name,attr,lv_size,data_percent  
  LV               VG   Attr       LSize Data% 
  [lvol0_pmspare]  vms  ewi------- 8.00m       
  thinpool         vms  twi-aotz-- 7.99g 6.10  
  [thinpool_tdata] vms  Twi-ao---- 7.99g       
  [thinpool_tmeta] vms  ewi-ao---- 8.00m       
  thinvol1         vms  Vwi-aotz-- 2.00g 7.35  
  thinvol1_snap0   vms  Vwi-aotz-- 2.00g 7.39  
  thinvol2         vms  Vwi-aotz-- 2.00g 8.17  
  thinvol3         vms  Vwi-aotz-- 2.00g 8.07  

1. Exporting the LVM "Thin" environment

As mentioned earlier, this particular step of the guide aims to be agnostic such that the exported LVM "Thin" environment can be migrated onto a wide range of block device backed targets.
Note: Despite having personally tested the following steps for successfully exporting an LVM "Thin" environment (i.e. without data corruption) I strongly recommend backing up any data from the LVM Thin volumes and Thin snapshots before proceeding.

1. Unmount any active (i.e. mounted) partitions that reside on LVM Thin Volume(s) or Thin Snapshot(s):
sudo umount /dev/mapper/vms-thinvol{1..3}p1
sudo umount /dev/mapper/vms-thinvol1_snap0p1

2. Remove any partition derived device mappings from both the LVM Thin Volume(s) and Thin Snapshot(s):
sudo kpartx -d /dev/mapper/vms-thinvol1
sudo kpartx -d /dev/mapper/vms-thinvol2
sudo kpartx -d /dev/mapper/vms-thinvol3
sudo kpartx -d /dev/mapper/vms-thinvol1_snap0
Note: You can skip this section if you did not utilise GPT (or MBR) within your LVM Thin Volume(s) or Thin Snapshot(s).

3. Temporarily deactivate all Thin Volume(s) and Thin Snapshot(s). While this makes each Logical Volume (LV) unavailable for use (e.g. /dev/mapper/vms-thinvol1 is no longer exposed), it ensures that any outstanding I/O to the LV is fully synced:
sudo lvchange --activate n vms/thinvol1
sudo lvchange --activate n vms/thinvol2
sudo lvchange --activate n vms/thinvol3
sudo lvchange --activate n vms/thinvol1_snap0
Note: If you were to attempt to deactivate the LVM Thin Volume or Thin Snapshot during any form of I/O the lvchange command would block until the external operation had completed. Therefore the purpose of this step is to ensure all read/write operations have completed before exporting the LVM Thin Volume(s)/Snapshot(s).

4. Re-activate all Thin Volume(s) and Thin Snapshot(s) ensuring to enable them in a readonly mode to prevent any alterations during the exporting process:
sudo lvchange --activate y --permission r vms/thinvol1
sudo lvchange --activate y --permission r vms/thinvol2
sudo lvchange --activate y --permission r vms/thinvol3
sudo lvchange --activate y --permission r vms/thinvol1_snap0
Note: You can confirm that the LVM Thin volume(s) and Thin snapshot(s) are in a read-only state by examining the second character of the Attr column in the lvs output; if it is 'r' then the Logical Volume (LV) is read-only.
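For example, a quick (optional) check after step 4 might be:
sudo lvs --options lv_name,lv_attr vms | grep thinvol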

5. Perform a sparse block copy of all LVM Thin Volume(s) and Thin Snapshot(s) saving the resulting disk images to an external storage location. To further save space I used gzip to compress the data contents stored within the LVM Thin Volume(s) and Thin Snapshot(s) before saving them:
sudo dd if=/dev/mapper/vms-thinvol1 bs=4M conv=sparse | gzip --stdout --best > /mnt/storage/images/vms-thinvol1.raw.gz
sudo dd if=/dev/mapper/vms-thinvol2 bs=4M conv=sparse | gzip --stdout --best > /mnt/storage/images/vms-thinvol2.raw.gz
sudo dd if=/dev/mapper/vms-thinvol3 bs=4M conv=sparse | gzip --stdout --best > /mnt/storage/images/vms-thinvol3.raw.gz
sudo dd if=/dev/mapper/vms-thinvol1_snap0 bs=4M conv=sparse | gzip --stdout --best > /mnt/storage/images/vms-thinvol1_snap0.raw.gz

6. Backup the LVM Volume Group (VG) "descriptor area" (i.e. metadata pertaining to the VG and consequently the Logical Volumes sat atop) containing the "Thin" environment - again saving the resulting file to an external storage location. This particular VG metadata file will be required during the import phase of the LVM "Thin" environment at the end of this guide.
sudo vgcfgbackup --verbose --file /mnt/storage/metadata/vms_backup vms

7. Export the Thin Pool metadata that keeps track of block changes between Thin Volume(s) and their respective Thin Snapshot(s):
sudo thin_dump /dev/mapper/vms-thinpool_tmeta > /mnt/storage/metadata/vms-thinpool_tmeta.xml

8. Deactivate all LVM Logical Volume (LV) components within the LVM "Thin" environment contained within a given Volume Group (VG). Assuming that none of the LVs are in use (they shouldn't be at this point!) we can have all LVs including the VG deactivated in a single command:
sudo vgchange --activate n vms

9. Proceed to remove the recently deactivated Volume Group (VG). Respond with 'y' when prompted about the volume dependencies within the "Thin" environment.
sudo vgremove vms

10. Finally remove the Physical Volume (PV) the VG and consequently the LVM "Thin" environment had previously been built upon:
sudo pvremove /dev/vda3

At this stage we have explicitly removed all LVM layout configuration. These destructive alterations should be evident when compared to the initial LVM deployment:

[email protected]:~$ lsblk  
NAME    MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT  
sr0      11:0    1 1024M  0 rom  
vda     254:0    0   20G  0 disk  
|-vda1  254:1    0    1M  0 part  
|-vda2  254:2    0   10G  0 part  
| `-md0   9:0    0   10G  0 raid1 /
`-vda3  254:3    0   10G  0 part  
vdb     254:16   0   20G  0 disk  
`-vdb1  254:17   0   20G  0 part  /mnt/storage
vdc     254:32   0   20G  0 disk  
|-vdc1  254:33   0    1M  0 part  
|-vdc2  254:34   0   10G  0 part  
| `-md0   9:0    0   10G  0 raid1 /
`-vdc3  254:35   0   10G  0 part  
vdd     254:48   0   20G  0 disk  
|-vdd1  254:49   0    1M  0 part  
|-vdd2  254:50   0   10G  0 part  
| `-md0   9:0    0   10G  0 raid1 /
`-vdd3  254:51   0   10G  0 part  
[email protected]:~$ sudo lvs --all --options lv_name,vg_name,attr,lv_size,data_percent  
No volume groups found  

2. Configuring RAID

Now that we have successfully exported the necessary configuration/metadata files in addition to the actual data content of the LVM Thin environment we can proceed to establish the software RAID target. For this particular test environment I have decided to create a basic software RAID 5 setup across the 3 virtual SSD disks (vda, vdc, and vdd) on their third partition respectively.

11. Create the software RAID 5 array across the third partition on each of three SSD virtual disks:
sudo mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/vda3 /dev/vdc3 /dev/vdd3

12. Monitor the progress of RAID 5 array setup while it is constructed:
cat /proc/mdstat

Personalities : [raid1] [raid6] [raid5] [raid4]  
md1 : active raid5 vdd3[3] vdc3[1] vda3[0]  
      20947968 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
      [=================>...]  recovery = 88.1% (9237496/10473984) finish=0.3min speed=57908K/sec
...
[3940703.593615] md: md1: recovery done.

Note: The software RAID 5 array (/dev/md1) can be interacted with during the initial build however the responsiveness and overall I/O throughput will be notably less due to the competing creation process.
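If the rebuild competes too heavily with foreground I/O, the md sync rate can be capped temporarily via the kernel's RAID sysctls (values are in KiB/s; remember to restore the default afterwards). For example:
sudo sysctl -w dev.raid.speed_limit_max=50000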

13. Examine the newly created software RAID 5 array to ensure that the Linux kernel has detected the multiple-device software RAID 5 target (/dev/md1) across the correct disks (vda, vdc, and vdd) and the correct partitions:

[email protected]:~$ lsblk  
NAME    MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT  
sr0      11:0    1 1024M  0 rom  
vda     254:0    0   20G  0 disk  
|-vda1  254:1    0    1M  0 part  
|-vda2  254:2    0   10G  0 part  
| `-md0   9:0    0   10G  0 raid1 /
`-vda3  254:3    0   10G  0 part  
  `-md1   9:1    0   20G  0 raid5 
vdb     254:16   0   20G  0 disk  
`-vdb1  254:17   0   20G  0 part  /mnt/storage
vdc     254:32   0   20G  0 disk  
|-vdc1  254:33   0    1M  0 part  
|-vdc2  254:34   0   10G  0 part  
| `-md0   9:0    0   10G  0 raid1 /
`-vdc3  254:35   0   10G  0 part  
  `-md1   9:1    0   20G  0 raid5 
vdd     254:48   0   20G  0 disk  
|-vdd1  254:49   0    1M  0 part  
|-vdd2  254:50   0   10G  0 part  
| `-md0   9:0    0   10G  0 raid1 /
`-vdd3  254:51   0   10G  0 part  
  `-md1   9:1    0   20G  0 raid5
[email protected]:~$ sudo blkid /dev/vd{a,c,d}3 | awk '{print $2}'  
UUID="88ed62e0-052b-d369-85a5-e78af70a9bed"  
UUID="88ed62e0-052b-d369-85a5-e78af70a9bed"  
UUID="88ed62e0-052b-d369-85a5-e78af70a9bed"  

14. Append the software RAID 5 target details to the system wide mdadm configuration file (/etc/mdadm/mdadm.conf) for enumeration once the root filesystem (located on /dev/md0) has been mounted:
sudo mdadm --detail --scan /dev/md1 | sudo tee --append /etc/mdadm/mdadm.conf

Note: The configured RAID 5 array will not be available during the early stage of the Linux boot process (initrd loading ~ pre-root mounting phase) given that I have not: a) Explicitly specified it be enumerated by the initrd in (/etc/default/grub.cfg), and b) subsequently regenerated the initrd (for the currently running kernel) to acknowledge the change made in a).
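If you did want the array assembled from within the initramfs (e.g. if a filesystem needed early in boot lived on it), the usual Debian approach is, roughly, to regenerate the initramfs once /etc/mdadm/mdadm.conf has been updated:
sudo update-initramfs -u
This is optional for the setup described here, as nothing on /dev/md1 is needed until well after the root filesystem is mounted.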

3. Restoring the LVM "Thin" environment

With a fully initialised software RAID 5 target we can proceed to restore the LVM "Thin" environment. I'd like to reiterate at this stage that the target migration storage backend does not have to be a software RAID setup. The motivation for a software RAID target in this guide was to understand the steps necessary for my storage expansion plans on Octeron.

15. Create a new Physical Volume (PV) on the multiple disk RAID 5 block device node (/dev/md1):
sudo pvcreate /dev/md1

16. Obtain the UUID of the newly created Physical Volume (PV). The new PV's UUID is required during the import process of the Volume Group (VG) metadata configuration file:
sudo pvdisplay /dev/md1 | grep UUID

17. With your favourite text editor alter the Volume Group (VG) metadata configuration file (/mnt/storage/metadata/vms_backup) and update both the id and device values to reflect the newly created PV:

pv0 {  
        id = "$UUID_OF_NEW_PV"
        device = "/dev/md1"

18. Restore the Physical Volume (PV) related metadata configuration from the exported Volume Group (VG) metadata configuration file using the newly created PV UUID:
sudo pvcreate --uuid $UUID_OF_NEW_PV --restorefile /mnt/storage/metadata/vms_backup /dev/md1

19. Check the Physical Volume (PV) LVM metadata to ensure that the imported metadata is consistent:
sudo pvck /dev/md1

Scanning /dev/md1  
Found label on /dev/md1, sector 1, type=LVM2 001  
Found text metadata area: offset=4096, size=1044480  

20. Restore the remaining Volume Group (VG) and Logical Volume (LV) metadata information from the exported VG metadata configuration file:
sudo vgcfgrestore --file /mnt/storage/metadata/vms_backup --force vms
Note: The --force flag is "Necessary to restore metadata with thin pool volumes." (source)

21. Check the Volume Group (VG) LVM metadata to ensure that the imported metadata is consistent:
sudo vgck --verbose vms

DEGRADED MODE. Incomplete RAID LVs will be processed.  
Using volume group(s) on command line  
Finding volume group "vms"  

23. Recover the LVM Thin pool 'thinpool':
sudo lvconvert --repair vms/thinpool
Important: If you receive an error regarding "mismatching transaction IDs" you will need to perform the steps outlined in the "Fixing mismatching transaction IDs" section (below) to resolve the mismatch before proceeding with the remainder of the guide!


Fixing mismatching transaction IDs

"Thankfully" I encountered this issue when importing the test LVM environment on to the software RAID storage target. As mentioned in step 23 you only need to perform the following 4 steps to fix this issue should you have received an error similar to:

Transaction id 6 from pool "vms/thinpool" does not match repaired transaction id 0 from /dev/mapper/vms-lvol0_pmspare.
Logical volume "lvol1" created
WARNING: If everything works, remove "vms/thinpool_meta0".
WARNING: Use pvmove command to move "vms/thinpool_tmeta" on the best fitting PV.

1. Deactivate the problematic Volume Group (VG):
sudo vgchange --activate n vms

2. Remove the problematic Volume Group (VG):
sudo vgremove --force --force vms
Note: We need to be "forceful" for this operation otherwise the same "mismatching transaction id" error resurfaces and prevents the removal!

3. With your favourite text editor alter the transaction_id's value to that of the expected transaction id (as stated in the error message) from within the problematic Volume Group's (VG) metadata configuration file /mnt/storage/metadata/vms_backup:

logical_volumes {  
    thinpool {
        ...
        transaction_id = $VALID_ID
        ...

4. Re-import the corrected Volume Group (VG) metadata configuration file:
sudo vgcfgrestore --file /mnt/storage/metadata/vms_backup --force vms


24. Remove the temporary LVM Thin Pool metadata volume as directed by the message following a successful repair:
sudo lvremove vms/thinpool_meta0

25. Grow the Physical Volume (PV) (and therefore the contained Volume Group) to take advantage of the additional storage space on the software RAID 5 target:
sudo pvresize /dev/md1
Note: By default the size of the imported LVM environment (PVs + VGs + LVs) will be that of the size witnessed at the point of Volume Group metadata export (step 6.).

26. Check that the Physical Volume (PV) has grown to occupy the available disk space:
sudo pvdisplay /dev/md1 | grep "PV Size"

27. Check that the Volume Group (VG) situated atop the recently grown Physical Volume (PV) has also grown in parallel to the PV:
sudo vgdisplay vms | grep "VG Size"

28. Activate the LVM Thin Pool:
sudo lvchange --activate y vms/thinpool

[ 7022.717819] device-mapper: thin: Data device (dm-1) discard unsupported: Disabling discard passdown.

Important: When using a software RAID target (mdadm based) for the LVM Thin environment migration you will receive the message shown above upon LVM Thin Pool activation. Based on a RHEL thread (here) it appears that mdadm is the culprit for "Discards" not being passed down from the LVM layer. This forces us to move the responsibility of "trimming" all LVM Thin volumes and Thin snapshots (in order to reclaim unused blocks) to another part of the system; for example, using the discard option when mounting a filesystem situated on the LVM Thin volume/snapshot.
Note: This particular example workaround has notable downsides (source) but would ensure that after each file deletion the unused blocks are reclaimed by the Thinpool data volume and can be used immediately by the Thin volume or Thin snapshot in question.
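If the discard mount option is not to your liking, an alternative is to periodically return unused blocks to the Thin Pool with fstrim (run by hand, from cron, or via a systemd timer) against each mounted filesystem, for example:
sudo fstrim --verbose /mnt/thinvol1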

29. Import the LVM Thin Pool metadata containing details of the Thin Volume(s)' properties and the blocks they share with subsequent Thin Snapshot(s):
sudo thin_restore --input /mnt/storage/metadata/vms-thinpool_tmeta.xml --output /dev/mapper/vms-thinpool_tmeta
Note: The metadata XML file must be imported before any Thin Pool/Volume/Snapshot resizing operation, otherwise the block mappings will be inconsistent, leading to a faulty LVM environment.

Growing the migrated LVM Thin environment & Expanding GPT Partitions

At this stage the LVM Thin environment detailed in the disk array configuration section has been successfully imported in a "skeletal" manner; the LVM layout is present but the actual content of the Logical Volumes (LVs) has not been restored. If you have no intention of growing any aspect of the migrated LVM Thin environment you will only need to perform steps 40, 41, and 47 to complete the entire process.

30. Deactivate the LVM Thin Pool so we can proceed with growing the LVM Thin Pool data volume and corresponding metadata volume:
sudo lvchange --activate n vms/thinpool

31. Grow the LVM Thin Pool data volume to take advantage of the additional space in the Volume Group (VG):
sudo lvresize --extents 80%VG vms/thinpool

32. Grow the LVM Thin Pool metadata volume to accommodate the size increase of the LVM Thin Pool data volume:
sudo lvresize --size +8M vms/thinpool_tmeta
Note: While no strict numbers exist I recommend growing the LVM Thin Pool metadata volume by the same ratio as the LVM Thin Pool data volume. In my test environment I doubled the size of LVM Thin Pool data volume and therefore doubled the size of the LVM Thin Pool metadata volume.
Important: Overflowing the LVM Thin Pool metadata volume will irrecoverably corrupt all LVM Thin Volumes and Thin Snapshots - be generous!
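Once the Thin Pool is active again, a quick way to keep an eye on metadata consumption (and therefore whether further growth is needed) is the metadata_percent field reported by lvs, for example:
sudo lvs --options lv_name,data_percent,metadata_percent vms/thinpool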

33. Remove the smaller LVM Thin Pool metadata spare volume:
sudo lvremove vms/lvol1_pmspare
Note: We perform this operation due to this particular LVM bug. Essentially the spare metadata volume does not grow to mirror the primary Thin Pool metadata volume; effectively rendering the spare metadata volume unusable during Thin Pool recovery.

34. Recreate the LVM Thin Pool metadata spare volume by "tricking" LVM into believing that the Thin Pool requires recovering. As per the HP support link (here), this will force LVM to recreate the spare metadata volume:
sudo lvconvert --repair vms/thinpool

35. Remove the extraneous LVM Thin metadata volume as instructed by the LVM Thin Pool recovery message in the previous step:
sudo lvremove vms/thinpool_meta0

36. Confirm that the recreated LVM Thin Pool metadata spare is the same size as the primary Thin Pool metadata volume:
sudo lvs --all --options lv_name,vg_name,attr,lv_size | grep 'spare\|meta'

[lvol1_pmspare]  vms  ewi------- 16.00m
[thinpool_tmeta] vms  ewi------- 16.00m

37. Activate the LVM Thin Pool and witness LVM identify the extra space for the dedicated Thin Pool data volume:
sudo lvchange --activate y vms/thinpool

38. Grow the LVM Thin Volume(s) and Thin Snapshot(s) as desired:
sudo lvresize --size +2G vms/thinvol1
sudo lvresize --size +2G vms/thinvol2
sudo lvresize --size +2G vms/thinvol3
sudo lvresize --size +2G vms/thinvol1_snap0

39. Confirm that the LVM Thin Volume(s) and Thin Snapshot(s) have grown accordingly:
sudo lvs --options lv_name,vg_name,attr,lv_size | grep thinvol

thinvol1       vms  Vwi---tz--  4.00g  
thinvol1_snap0 vms  Vwi---tz--  4.00g  
thinvol2       vms  Vwi---tz--  4.00g  
thinvol3       vms  Vwi---tz--  4.00g  

40. Activate all LVM Thin Volume(s) and Thin Snapshot(s) ensuring to include the --setactivationskip flag so as to allow the selected LVM Thin Volume(s) and Thin Snapshot(s) (Logical Volumes) to be available (i.e. exposed block device nodes) at boot:
sudo lvchange --activate y --setactivationskip n vms/thinvol1
sudo lvchange --activate y --setactivationskip n vms/thinvol2
sudo lvchange --activate y --setactivationskip n vms/thinvol3
sudo lvchange --activate y --setactivationskip n vms/thinvol1_snap0

41. Restore the compressed disk images of the activated Thin Volume(s) and Thin Snapshot(s):
gunzip --stdout /mnt/storage/images/vms-thinvol1.raw.gz | sudo dd conv=sparse bs=4M of=/dev/mapper/vms-thinvol1
gunzip --stdout /mnt/storage/images/vms-thinvol2.raw.gz | sudo dd conv=sparse bs=4M of=/dev/mapper/vms-thinvol2
gunzip --stdout /mnt/storage/images/vms-thinvol3.raw.gz | sudo dd conv=sparse bs=4M of=/dev/mapper/vms-thinvol3
gunzip --stdout /mnt/storage/images/vms-thinvol1_snap0.raw.gz | sudo dd conv=sparse bs=4M of=/dev/mapper/vms-thinvol1_snap0
Note: Forgetting to use the sparse option (as part of the dd invocation) will result in the Thin Volume or Thin Snapshot having consumed 100% of its allocated space (assuming you did not grow the Thin Volume/Snapshot). The reason for this is that while the majority of the blocks (for this particular test environment case) are empty (i.e. filled with 0's) they are being classified as "allocated" and therefore counting towards the total block usage. To remedy this perceived "full" state either run the dd utility with the sparse option again or use the fstrim utility for discarding unused blocks.

42. For each Thin Volume and Thin Snapshot relocate the backup GPT data structures (i.e. the secondary header and partition table) to the "new" end of the respective Logical Volume:
sudo sgdisk --move-second-header /dev/mapper/vms-thinvol1
sudo sgdisk --move-second-header /dev/mapper/vms-thinvol2
sudo sgdisk --move-second-header /dev/mapper/vms-thinvol3
sudo sgdisk --move-second-header /dev/mapper/vms-thinvol1_snap0
Note: You may skip this section if you did not utilise GPT within your LVM Thin Volume(s) or Thin Snapshot(s).

43. As outlined in the disk array configuration section, this particular test scenario has a single partition situated atop the GPT partition table which in turn resides on a Logical Volume (either a Thin Volume or a Thin Snapshot). My aim here was to expand the filesystem (contained within the single partition) in parallel with the growth of the underlying Logical Volume (LV).
In order to grow a GPT partition you must first remove it and then recreate it, setting the end sector value to the last physically available sector while simultaneously ensuring that the start sector aligns exactly where it had done so previously.

for id in 1 2 3 1_snap0; do
  sudo sgdisk --delete=1 /dev/mapper/vms-thinvol"${id}"
  sudo sgdisk --largest-new=1 --typecode=1:8300 /dev/mapper/vms-thinvol"${id}"
  sudo sgdisk --print /dev/mapper/vms-thinvol"${id}" | tail -n 2
done

Note: By default sgdisk will align the first partition to the 2048th sector (unless explicitly specified otherwise) which is what I used when originally creating the partition on the old LVM environment. You may skip this section if you did not utilise GPT within your LVM Thin Volume(s) or Thin Snapshot(s) - or simply do not wish to grow the single partition.

44. Recreate all device mappings to the single GPT partition located atop each of the activated LVM Thin Volumes and Thin Snapshots:
sudo kpartx -a /dev/mapper/vms-thinvol1
sudo kpartx -a /dev/mapper/vms-thinvol2
sudo kpartx -a /dev/mapper/vms-thinvol3
sudo kpartx -a /dev/mapper/vms-thinvol1_snap0
Note: You may skip this section if you did not utilise GPT within your LVM Thin Volume(s) or Thin Snapshot(s).

45. Check the integrity of the Ext4 filesystem on each Thin Volume and Thin Snapshot. This forced check was required before resize2fs would permit the expansion of the underlying Ext4 filesystem:
sudo e2fsck -f -y /dev/mapper/vms-thinvol1p1 > /dev/null
sudo e2fsck -f -y /dev/mapper/vms-thinvol2p1 > /dev/null
sudo e2fsck -f -y /dev/mapper/vms-thinvol3p1 > /dev/null
sudo e2fsck -f -y /dev/mapper/vms-thinvol1_snap0p1 > /dev/null

46. Resize the Ext4 filesystem situated within the GPT partition to consume the additional space now available within the partition:
sudo resize2fs /dev/mapper/vms-thinvol1p1
sudo resize2fs /dev/mapper/vms-thinvol2p1
sudo resize2fs /dev/mapper/vms-thinvol3p1
sudo resize2fs /dev/mapper/vms-thinvol1_snap0p1
Note: The resize2fs utility only resizes the Ext family of filesystems; do not try to use it for other filesystems (e.g. XFS). Please refer to filesystem-specific documentation/man pages with respect to resize operations for other filesystems.
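For comparison, an XFS filesystem would instead be grown while mounted, along the lines of the following (the mount point is illustrative):
sudo xfs_growfs /mnt/thinvol1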

47. Mount the block device mappings of the GPT partition that reside on the LVM Thin Volume(s) and Thin Snapshot(s) ensuring to pass the discard option so as to transparently handle the "cleanup" of unused blocks:
sudo mount --options discard /dev/mapper/vms-thinvol1p1 /mnt/thinvol1
sudo mount --options discard /dev/mapper/vms-thinvol2p1 /mnt/thinvol2
sudo mount --options discard /dev/mapper/vms-thinvol3p1 /mnt/thinvol3
sudo mount --options discard /dev/mapper/vms-thinvol1_snap0p1 /mnt/thinvol1_snap0

At this stage you should be able to navigate and manipulate files within the mount points for the newly migrated and resized LVM Thin volumes and Thin snapshot! Let's take a final look at Linux's internal representation of the LVM environment:

[email protected]:~$ lsblk  
NAME                             MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT  
vdd                              254:48   0   20G  0 disk  
|-vdd2                           254:50   0   10G  0 part
| `-md0                            9:0    0   10G  0 raid1 /
|-vdd3                           254:51   0   10G  0 part 
| `-md1                            9:1    0   20G  0 raid5
|   |-vms-thinpool_tdata         252:1    0   16G  0 lvm
|   | `-vms-thinpool-tpool       252:2    0   16G  0 lvm
|   |   |-vms-thinvol3           252:6    0    4G  0 lvm   
|   |   | `-vms-thinvol3p1       252:10   0    4G  0 part  /mnt/thinvol3
|   |   |-vms-thinvol1           252:4    0    4G  0 lvm   
|   |   | `-vms-thinvol1p1       252:8    0    4G  0 part  /mnt/thinvol1
|   |   |-vms-thinvol1_snap0     252:7    0    4G  0 lvm   
|   |   | `-vms-thinvol1_snap0p1 252:11   0    4G  0 part  /mnt/thinvol1_snap0
|   |   |-vms-thinvol2           252:5    0    4G  0 lvm   
|   |   | `-vms-thinvol2p1       252:9    0    4G  0 part  /mnt/thinvol2
|   |   `-vms-thinpool           252:3    0   16G  0 lvm   
|   `-vms-thinpool_tmeta         252:0    0   16M  0 lvm   
|     `-vms-thinpool-tpool       252:2    0   16G  0 lvm   
|       |-vms-thinvol3           252:6    0    4G  0 lvm   
|       | `-vms-thinvol3p1       252:10   0    4G  0 part  /mnt/thinvol3
|       |-vms-thinvol1           252:4    0    4G  0 lvm   
|       | `-vms-thinvol1p1       252:8    0    4G  0 part  /mnt/thinvol1
|       |-vms-thinvol1_snap0     252:7    0    4G  0 lvm   
|       | `-vms-thinvol1_snap0p1 252:11   0    4G  0 part  /mnt/thinvol1_snap0
|       |-vms-thinvol2           252:5    0    4G  0 lvm   
|       | `-vms-thinvol2p1       252:9    0    4G  0 part  /mnt/thinvol2
|       `-vms-thinpool           252:3    0   16G  0 lvm   
`-vdd1                           254:49   0    1M  0 part  
vdb                              254:16   0   20G  0 disk  
`-vdb1                           254:17   0   20G  0 part  /mnt/storage
sr0                               11:0    1 1024M  0 rom  
vdc                              254:32   0   20G  0 disk  
|-vdc2                           254:34   0   10G  0 part  
| `-md0                            9:0    0   10G  0 raid1 /
|-vdc3                           254:35   0   10G  0 part  
| `-md1                            9:1    0   20G  0 raid5 
|   |-vms-thinpool_tdata         252:1    0   16G  0 lvm   
|   | `-vms-thinpool-tpool       252:2    0   16G  0 lvm   
|   |   |-vms-thinvol3           252:6    0    4G  0 lvm   
|   |   | `-vms-thinvol3p1       252:10   0    4G  0 part  /mnt/thinvol3
|   |   |-vms-thinvol1           252:4    0    4G  0 lvm   
|   |   | `-vms-thinvol1p1       252:8    0    4G  0 part  /mnt/thinvol1
|   |   |-vms-thinvol1_snap0     252:7    0    4G  0 lvm   
|   |   | `-vms-thinvol1_snap0p1 252:11   0    4G  0 part  /mnt/thinvol1_snap0
|   |   |-vms-thinvol2           252:5    0    4G  0 lvm   
|   |   | `-vms-thinvol2p1       252:9    0    4G  0 part  /mnt/thinvol2
|   |   `-vms-thinpool           252:3    0   16G  0 lvm   
|   `-vms-thinpool_tmeta         252:0    0   16M  0 lvm   
|     `-vms-thinpool-tpool       252:2    0   16G  0 lvm   
|       |-vms-thinvol3           252:6    0    4G  0 lvm   
|       | `-vms-thinvol3p1       252:10   0    4G  0 part  /mnt/thinvol3
|       |-vms-thinvol1           252:4    0    4G  0 lvm   
|       | `-vms-thinvol1p1       252:8    0    4G  0 part  /mnt/thinvol1
|       |-vms-thinvol1_snap0     252:7    0    4G  0 lvm   
|       | `-vms-thinvol1_snap0p1 252:11   0    4G  0 part  /mnt/thinvol1_snap0
|       |-vms-thinvol2           252:5    0    4G  0 lvm   
|       | `-vms-thinvol2p1       252:9    0    4G  0 part  /mnt/thinvol2
|       `-vms-thinpool           252:3    0   16G  0 lvm   
`-vdc1                           254:33   0    1M  0 part  
vda                              254:0    0   20G  0 disk  
|-vda2                           254:2    0   10G  0 part  
| `-md0                            9:0    0   10G  0 raid1 /
|-vda3                           254:3    0   10G  0 part  
| `-md1                            9:1    0   20G  0 raid5 
|   |-vms-thinpool_tdata         252:1    0   16G  0 lvm   
|   | `-vms-thinpool-tpool       252:2    0   16G  0 lvm   
|   |   |-vms-thinvol3           252:6    0    4G  0 lvm   
|   |   | `-vms-thinvol3p1       252:10   0    4G  0 part  /mnt/thinvol3
|   |   |-vms-thinvol1           252:4    0    4G  0 lvm   
|   |   | `-vms-thinvol1p1       252:8    0    4G  0 part  /mnt/thinvol1
|   |   |-vms-thinvol1_snap0     252:7    0    4G  0 lvm   
|   |   | `-vms-thinvol1_snap0p1 252:11   0    4G  0 part  /mnt/thinvol1_snap0
|   |   |-vms-thinvol2           252:5    0    4G  0 lvm   
|   |   | `-vms-thinvol2p1       252:9    0    4G  0 part  /mnt/thinvol2
|   |   `-vms-thinpool           252:3    0   16G  0 lvm   
|   `-vms-thinpool_tmeta         252:0    0   16M  0 lvm   
|     `-vms-thinpool-tpool       252:2    0   16G  0 lvm   
|       |-vms-thinvol3           252:6    0    4G  0 lvm   
|       | `-vms-thinvol3p1       252:10   0    4G  0 part  /mnt/thinvol3
|       |-vms-thinvol1           252:4    0    4G  0 lvm   
|       | `-vms-thinvol1p1       252:8    0    4G  0 part  /mnt/thinvol1
|       |-vms-thinvol1_snap0     252:7    0    4G  0 lvm   
|       | `-vms-thinvol1_snap0p1 252:11   0    4G  0 part  /mnt/thinvol1_snap0
|       |-vms-thinvol2           252:5    0    4G  0 lvm   
|       | `-vms-thinvol2p1       252:9    0    4G  0 part  /mnt/thinvol2
|       `-vms-thinpool           252:3    0   16G  0 lvm   
`-vda1                           254:1    0    1M  0 part  
[email protected]:~$ sudo lvs --all --options lv_name,vg_name,attr,lv_size,data_percent  
  LV               VG   Attr       LSize  Data% 
  [lvol1_pmspare]  vms  ewi------- 16.00m       
  thinpool         vms  twi-a-tz-- 15.98g 4.53  
  [thinpool_tdata] vms  Twi-ao---- 15.98g       
  [thinpool_tmeta] vms  ewi-ao---- 16.00m       
  thinvol1         vms  Vwi-aotz--  4.00g 4.59  
  thinvol1_snap0   vms  Vwi-aotz--  4.00g 4.52  
  thinvol2         vms  Vwi-aotz--  4.00g 4.91  
  thinvol3         vms  Vwi-aotz--  4.00g 4.85
[email protected]:/mnt$ df -h  
Filesystem                        Size  Used Avail Use% Mounted on  
udev                               10M     0   10M   0% /dev  
tmpfs                             401M  5.5M  396M   2% /run  
/dev/md0                          9.8G  1.1G  8.2G  12% /
tmpfs                            1003M     0 1003M   0% /dev/shm  
tmpfs                             5.0M     0  5.0M   0% /run/lock  
tmpfs                            1003M     0 1003M   0% /sys/fs/cgroup  
/dev/vdb1                          20G  922M   18G   5% /mnt/storage
/dev/mapper/vms-thinvol1p1        4.0G  120M  3.7G   4% /mnt/thinvol1
/dev/mapper/vms-thinvol2p1        4.0G  105M  3.7G   3% /mnt/thinvol2
/dev/mapper/vms-thinvol3p1        4.0G  104M  3.6G   3% /mnt/thinvol3
/dev/mapper/vms-thinvol1_snap0p1  4.0G  120M  3.7G   4% /mnt/thinvol1_snap0

TL;DR

1. Exporting the LVM "Thin" environment

# Unmount active filesystems on LVM Thin Volumes/Snapshots
[email protected]:~$ sudo umount /dev/mapper/vms-thinvolXpZ  
[email protected]:~$ sudo umount /dev/mapper/vms-thinvolX_snapYpZ

# Remove partition device mappings (if any)
[email protected]:~$ sudo kpartx -d /dev/mapper/vms-thinvolX  
[email protected]:~$ sudo kpartx -d /dev/mapper/vms-thinvolX_snapY

# Temporarily deactivate Thin Volumes/Snapshots
[email protected]:~$ sudo lvchange --activate n /dev/mapper/vms-thinvolX  
[email protected]:~$ sudo lvchange --activate n /dev/mapper/vms-thinvolX_snapY

# Reactivate Thin Volumes/Snapshots in Read-Only mode
[email protected]:~$ sudo lvchange --activate y --permission r /dev/mapper/vms-thinvolX  
[email protected]:~$ sudo lvchange --activate y --permission r /dev/mapper/vms-thinvolX_snapY

# Backup the contents of the Thin Volumes/Snapshots
[email protected]:~$ sudo dd if=/dev/mapper/vms-thinvolX bs=4M conv=sparse | gzip --stdout --best > /mnt/storage/images/vms-thinvolX.raw.gz  
[email protected]:~$ sudo dd if=/dev/mapper/vms-thinvolX_snapY bs=4M conv=sparse | gzip --stdout --best > /mnt/storage/images/vms-thinvolX_snapY.raw.gz

# Backup the VG metadata
[email protected]:~$ sudo vgcfgbackup --verbose --file /mnt/storage/metadata/vms_backup vms

# Export the Thin Pool metadata
[email protected]:~$ sudo thin_dump /dev/mapper/vms-thinpool_tmeta > /mnt/storage/metadata/vms-thinpool_tmeta.xml

# Deactivate Thin Volumes/Snapshots & VG
[email protected]:~$ sudo vgchange --activate n vms

# Remove VG
[email protected]:~$ sudo vgremove vms

# Remove PV
[email protected]:~$ sudo pvremove /dev/vdaX  

2. Configuring RAID

# Create RAID 5 target
[email protected]:~$ sudo mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/vda3 /dev/vdc3 /dev/vdd3

# Register RAID 5 target for assembly post-boot
[email protected]:~$ sudo mdadm --detail --scan /dev/md1 | sudo tee --append /etc/mdadm/mdadm.conf  

3. Restoring the LVM "Thin" environment

# Create PV on RAID 5 target
[email protected]:~$ sudo pvcreate /dev/md1

# Obtain UUID of PV
[email protected]:~$ sudo pvdisplay /dev/md1 | grep UUID

# Edit the VG metadata config file
# See Step 17. in main guide

# Restore PV related metadata
[email protected]:~$ sudo pvcreate --uuid $UUID_OF_NEW_PV --restorefile /mnt/storage/metadata/vms_backup /dev/md1

# Restore VG metadata
[email protected]:~$ sudo vgcfgrestore --file /mnt/storage/metadata/vms_backup --force vms

# Check health of VG
[email protected]:~$ sudo vgck --verbose vms

# Recover Thin Pool
[email protected]:~$ sudo lvconvert --repair vms/thinpool

#
# Conditional (4 steps below) - "Fixing mismatching transaction IDs"
#

# Deactivate problematic VG
[email protected]:~$ sudo vgchange --activate n vms

# Remove problematic VG (forcefully)
[email protected]:~$ sudo vgremove --force --force vms

# Edit the VG metadata config file
# Alter transaction_id to the expected transaction id value

# Re-import corrected VG metadata
[email protected]:~$ sudo vgcfgrestore --file /mnt/storage/metadata/vms_backup --force vms

# Remove temporary metadata volume
[email protected]:~$ sudo lvremove vms/thinpool_meta0

# Grow PV
[email protected]:~$ sudo pvresize /dev/md1

# Activate LVM Thin Pool
[email protected]:~$ sudo lvchange --activate y vms/thinpool

# Import LVM Thin Pool metadata
[email protected]:~$ sudo thin_restore --input /mnt/storage/metadata/vms-thinpool_tmeta.xml --output /dev/mapper/vms-thinpool_tmeta

# Deactivate Thin Pool for resizing
[email protected]:~$ sudo lvchange --activate n vms/thinpool

# Grow LVM Thin Pool _data_ volume
[email protected]:~$ sudo lvresize --extents 80%VG vms/thinpool

# Grow LVM Thin Pool _metadata_ volume
[email protected]:~$ sudo lvresize --size +8M vms/thinpool_tmeta

# Remove LVM Thin Pool metadata spare
[email protected]:~$ sudo lvremove vms/lvol1_pmspare

# Recreate LVM Thin Pool metadata spare (correct size)
[email protected]:~$ sudo lvconvert --repair vms/thinpool

# Remove extraneous LVM Thin metadata volume
[email protected]:~$ sudo lvremove vms/thinpool_meta0

# Activate LVM Thin Pool
[email protected]:~$ sudo lvchange --activate y vms/thinpool

# Grow LVM Thin Volumes/Snapshots
[email protected]:~$ sudo lvresize --size +?G vms/thinvolX  
[email protected]:~$ sudo lvresize --size +?G vms/thinvolX_snapY

# Activate LVM Thin Volumes/Snapshots (persistently)
[email protected]:~$ sudo lvchange --activate y --setactivationskip n vms/thinvolX  
[email protected]:~$ sudo lvchange --activate y --setactivationskip n vms/thinvolX_snapY

# Restore data contents
[email protected]:~$ gunzip --stdout /mnt/storage/images/vms-thinvolX.raw.gz | sudo dd conv=sparse bs=4M of=/dev/mapper/vms-thinvolX  
[email protected]:~$ gunzip --stdout /mnt/storage/images/vms-thinvolX_snapY.raw.gz | sudo dd conv=sparse bs=4M of=/dev/mapper/vms-thinvolX_snapY

# Fix backup GPT table
[email protected]:~$ sudo sgdisk --move-second-header /dev/mapper/vms-thinvolX  
[email protected]:~$ sudo sgdisk --move-second-header /dev/mapper/vms-thinvolX_snapY

# Resize partition
# See Step 43 for necessary commands

# Recreate GPT partition device node mappings
[email protected]:~$ sudo kpartx -a /dev/mapper/vms-thinvolX  
[email protected]:~$ sudo kpartx -a /dev/mapper/vms-thinvolX_snapY

# Check Ext4 filesystem integrity
[email protected]:~$ sudo e2fsck -f -y /dev/mapper/vms-thinvolXpZ  
[email protected]:~$ sudo e2fsck -f -y /dev/mapper/vms-thinvolX_snapYpZ

# Resize Ext4 filesystem
[email protected]:~$ sudo resize2fs /dev/mapper/vms-thinvolXpZ  
[email protected]:~$ sudo resize2fs /dev/mapper/vms-thinvolX_snapYpZ

# Mount Ext4 filesystem
[email protected]:~$ sudo mount --options discard /dev/mapper/vms-thinvolXpZ /mnt/thinvolX  
[email protected]:~$ sudo mount --options discard /dev/mapper/vms-thinvolX_snapYpZ /mnt/thinvolX_snapY  

Limitations & Shortcomings

Sadly the LVM Thin environment is not without some quite notable limitations that may make you reconsider it as a space-efficient backing store for KVM VM and LXC container usage. The issues outlined below are what negatively impacted my Libvirt (KVM) VM & LXC container directed workflows:

  • Given that TRIM passdown to an mdadm target was not implemented/supported by the particular version of the utilities used, I had to ensure that VMs were either performing routine TRIM operations (e.g. fstrim.timer) or were using the questionable discard flag on supporting filesystems.

  • While the LXC toolset had the necessary arguments for creating LXC containers on LVM Thin Volumes I was unable to get easy access to the contents of the container as I would normally on a traditional filesystem installation.

  • Creating Thin Snapshots with Libvirt is simply not possible with the version of the Libvirt utilities I was using. Instead I had to a) use the lvcreate utility to create the Thin Snapshot and then b) either clone the VM (via Libvirt) or edit the XML storage stanza of the VM to point to the new Thin Snapshot block device node. The lack of a direct "internal snapshot" equivalent with LVM Thin Volumes/Snapshots resulted in a notable amount of management overhead emulating such a feature!

  • While not directly a disadvantage of LVM Thin environments, it is undeniable that an additional layer of block management introduces a greater risk of potential corruption. Although a workaround exists, the spare LVM Thin Pool metadata volume really should grow in parallel with the main LVM Thin Pool metadata volume; such an oversight led me to question the maturity and stability of thin provisioning in LVM.

  • The notably greater complexity for initial configuration and environment migration when compared to a QCOW2 setup. Simpler approaches typically allow for quicker and more straightforward recovery procedures at the cost of reduced performance or greater storage consumption (in this case).

Final Words

In closing, this guide has been written to serve as a step-by-step reference for myself in the near future when I add more SSDs to Octeron. While I had performed all the testing in the virtual machine discussed in this guide, I was only approximately halfway through writing it when I made the actual migration and expansion of the LVM Thin environment on Octeron.

Funnily enough I have once again reviewed my VM management process since writing this guide and have decided against leveraging the LVM Thin environment for my day-to-day VM storage backend. For my particular workflow I was finding that the aforementioned limitations/shortcomings were too intrusive, resulting in reduced flexibility and an increased management complexity cost for the benefit of storage gains (which I was never low on in the first place!).

To address my initial issue of "golden" Debian GNU/Linux images becoming stale I have once more adopted the QEMU virtual disk image (QCOW2) file-based backend but now perform the following procedure to ensure that the "golden" image can stay up-to-date without corrupting external snapshots:

  1. Temporarily shutdown the specified Debian GNU/Linux "golden" image via Libvirt (virsh). The virtual disk used is a raw, fully pre-allocated, 10GiB file that simply has a single Ext4 partition that houses all of '/' for simplicity.

  2. Create a single copy of the now offline Debian GNU/Linux "golden" image. Save the copy in a suitably named directory (e.g. project name) on a mounted filesystem situated upon a RAID 5 target on the SSDs.

  3. Power up the Debian GNU/Linux "golden" image once more to ensure it can stay consistently up-to-date.

  4. Create a QCOW2 external snapshot referencing the newly copied raw image as the backing image for each Debian GNU/Linux VM intended for use. If only a single Debian GNU/Linux image is required for the project in question then consider converting the raw image to the QCOW2 format for the benefits it brings (a sketch of both options follows this list).

  5. Define each Debian GNU/Linux VM utilising its corresponding QCOW2 external snapshot in Libvirt via virsh.
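
A rough sketch of step 4 with qemu-img (all paths and image names here are illustrative only):
# Space-efficient external snapshot backed by the project's copy of the "golden" raw image
qemu-img create -f qcow2 -b /mnt/raid/projectX/golden-copy.raw projectX-vm1.qcow2
# Alternatively, for a single-VM project, convert the copy outright to QCOW2
qemu-img convert -f raw -O qcow2 /mnt/raid/projectX/golden-copy.raw projectX-vm1.qcow2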

This approach does "cost" 10GiB of storage for each project as it requires a static base image that is offline for external snapshots to work correctly. One notable benefit of this approach is the simplified process of archiving a project once completed.
Using the LVM Thin environment would require me to delete the LVM Thin Volumes/Snapshots after their use and track their original size for recreation down the line. Moreover it simply may not be possible for LVM Thin Snapshots as it would require the LVM Thin Pool metadata volume to be altered to a previous state (invalidating/corrupting any project that was to be currently worked on!).