Guide on removing a Linux mdadm RAID1 array while preserving existing partition data, avoiding the need to reinstall or copy files around.
If you ever end up with a similar need, here is how I did it. It goes without saying that you should back up data before doing anything, otherwise you accept the risk of losing it all. Hic sunt dracones 🐉
A dedicated SoYouStart server with two disks, for hosting virtual machines via VMware ESXi.
Unfortunately VMware ESXi does not support software RAID: as a workaround, it is possible to attach two VMDKs (virtual disks) to each VM, with each VMDK stored on a separate datastore (hardware disk), and set up software RAID1 directly from the VM via mdadm.
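For reference, a minimal sketch of what that in-VM setup looks like, assuming the two virtual disks appear inside the VM as /dev/sda and /dev/sdb with a data partition on each (names are illustrative, not the exact ones used here):

# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
# mkfs.ext4 /dev/md0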
This annoying limitation was actually one of the main motivators for migrating from VMware ESXi to Proxmox VE, which supports software RAID1 over ZFS. With RAID handled at the host level, virtual machines no longer need to manage software RAID themselves, and only one disk needs to be kept for each VM.
We are going to remove devices from the array, then trick mdadm into being cool with a RAID1 array consisting of only one device.
First, inspect partitions and disks to identify what is where and what needs to be done:
# lsblk
NAME      MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda         8:0    0  100G  0 disk
├─sda1      8:1    0  953M  0 part  [SWAP]
└─sda2      8:2    0 99.1G  0 part
  └─md0     9:0    0 99.1G  0 raid1 /
sdb         8:16   0  100G  0 disk
├─sdb1      8:17   0  953M  0 part  /var/tmp
└─sdb2      8:18   0 99.1G  0 part
  └─md0     9:0    0 99.1G  0 raid1 /

# cat /proc/mdstat
md0 : active raid1 sda2 sdb2
      103872512 blocks super 1.2 [2/2] [UU]

# mdadm --detail /dev/md0
State : active
For example, here: md0 is spread over the sda2 and sdb2 partitions; we want to remove the sda disk and keep sdb.
Once everything is settled, fail and remove the partition from the array, then wipe its RAID superblock:
# mdadm /dev/md0 --fail /dev/sda2
mdadm: set /dev/sda2 faulty in /dev/md0

# mdadm /dev/md0 --remove /dev/sda2
mdadm: hot removed /dev/sda2 from /dev/md0

# mdadm --zero-superblock /dev/sda2
In our case, we first dropped the sda1 swap partition (not shown above: disabled with swapoff and an /etc/fstab update), and then removed sda2 from the array.
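For completeness, the swap removal roughly looks like this (a sketch; the exact /etc/fstab entry depends on the system):

# swapoff /dev/sda1
# vi /etc/fstab    # remove or comment out the swap entry for /dev/sda1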
If we stopped at this stage, mdadm would complain the array is degraded because it is missing a device:
# cat /proc/mdstat
md0 : active raid1 sdb2
      103872512 blocks super 1.2 [2/1] [_U]

# mdadm --detail /dev/md0
State : clean, degraded
Fortunately, we can trick the array into being cool with one device so that it does not complain it is degraded:
# mdadm --grow /dev/md0 --force --raid-devices=1
raid_disks for /dev/md0 set to 1

# cat /proc/mdstat
md0 : active raid1 sdb2
      103872512 blocks super 1.2 [1/1] [U]

# mdadm --detail /dev/md0
State : clean
If the removed partition / disk was also the boot partition / disk, make sure to update /etc/fstab, the bootloader and the initramfs as necessary:
# vi /etc/fstab
# grub-install /dev/sdb
# update-initramfs -u
In our case:
md0 is the boot partition, but the bootloader was only installed on sda (check for its presence in the MBR with dd if=/dev/sda bs=512 count=1 2> /dev/null | strings), so we had to reinstall grub on the remaining sdb disk to ensure it can be booted from.
The initramfs references the swap partition in its RESUME variable, so we had to remove it from the initramfs since the swap partition was gone.
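On Debian-based systems, clearing the RESUME variable is typically done through initramfs-tools; a minimal sketch, assuming the standard conf.d layout:

# echo "RESUME=none" > /etc/initramfs-tools/conf.d/resume
# update-initramfs -u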
The machine can now be shut down, and the unused disk detached from the hardware. Update the VM disk boot order if necessary and boot: voilà!
# lsblk
NAME      MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda         8:0    0  100G  0 disk
├─sda1      8:1    0  953M  0 part  /var/tmp
└─sda2      8:2    0 99.1G  0 part
  └─md0     9:0    0 99.1G  0 raid1 /
The solution above works completely fine.
However, if we’d like to go the extra mile and remove mdadm altogether, then we also need to fiddle with the remaining device: the rough idea is to rewrite the mdadm partition to strip out the RAID superblock while keeping the rest intact.
This sort of funky partition business is typically done from rescue mode, but in this article we’re going to do it directly from the live system for maximum thrill.
# blkid
/dev/sda2: UUID="<sda2_uuid>" TYPE="linux_raid_member"
/dev/md0: UUID="<md0_uuid>" TYPE="ext4"
Take note of both the linux_raid_member RAID partition (here sda2) and the underlying partition itself (here md0).
Then examine the linux_raid_member partition and take note of the version and data offset:
# mdadm --examine /dev/sda2
Version : 1.2
Data Offset : 16384 sectors
From the man page:
The different sub-versions store the superblock at different locations on the device, either at the end (for 1.0), at the start (for 1.1) or 4K from the start (for 1.2). “1” is equivalent to “1.2” (the commonly preferred 1.x format). “default” is equivalent to “1.2”.
Since we have version 1.2, everything before the data offset in the partition is the RAID superblock, everything after is the underlying partition.
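As an optional sanity check (not part of the original steps), we can read a few sectors just past the data offset and let file identify them; if the offset is right, it should report something like ext4 filesystem data with the md0 UUID:

# dd if=/dev/sda2 bs=512 skip=16384 count=8 2> /dev/null | file -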
Now launch fdisk on the disk:
# fdisk -u /dev/sda
At the fdisk prompt, print the partitions using p:
Command (m for help): p

Device     Boot   Start       End   Sectors  Size Id Type
/dev/sda1  *       2048   1953791   1951744  953M 83 Linux
/dev/sda2       1953792 209715199 207761408 99.1G fd Linux raid autodetect
Delete the RAID partition using d and its partition number.
Create a new partition using n, choose p for primary and reuse the same partition number.
Set the first sector to the old start plus the data offset (16384 + 1953792 = 1970176 using the example above) and keep the default last sector.
Finally, write the changes using w; if fdisk detects the existing ext4 signature and offers to remove it, keep it. This should position the new sda2 partition right in place of the underlying partition (formerly exposed as md0).
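For illustration, here is a condensed sketch of that interactive session (prompts paraphrased, answers taken from the steps above):

Command (m for help): d          (delete partition 2)
Command (m for help): n          (recreate it: primary, number 2)
First sector: 1970176            (old start + data offset)
Last sector: <default>           (keep the default, end of disk)
Remove the ext4 signature? No    (keep the existing filesystem data)
Command (m for help): w          (write the new table and exit)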
Spoiler: if the destroyed partition was the boot partition, then the jump scare moment is right now 😱
This is our case: we hit grub rescue because we destroyed the boot partition.
To recover from that, we are going to manually point grub rescue to our brand new partition.
At the grub rescue prompt, display the existing boot values with set:
grub rescue> set
prefix=(mduuid/<sda2_uuid>)/boot/grub
root=mduuid/<sda2_uuid>
As we can see, the issue is that grub points to the partition we just destroyed (mduuid/<sda2_uuid>). Take note of the prefix filepath (here /boot/grub).
Use ls to display the existing partitions, named following the (hd0,msdosN) scheme:
grub rescue> ls
(hd0) (hd0,msdos2) (hd0,msdos1)
Then run ls on each partition until we find the one containing the /boot/grub prefix filepath:
grub rescue> ls (hd0,msdos1)/boot/grub
error: file '/boot/grub' not found.

grub rescue> ls (hd0,msdos2)/boot/grub
./ ../ unicode.pf2 i386-pc/ locale/ fonts/ grubenv grub.cfg
Update prefix and root accordingly:

grub rescue> set prefix=(hd0,msdos2)/boot/grub
grub rescue> set root=(hd0,msdos2)
Finally, load the normal module and start it:
grub rescue> insmod normal
grub rescue> normal
This should boot the system as usual.
Once booted, we can check that the new sda2 has properly replaced the former md0 partition:
# blkid
/dev/sda2: UUID="<md0_uuid>" TYPE="ext4"
And that mdadm was properly removed:
# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0  100G  0 disk
├─sda1   8:1    0  953M  0 part /var/tmp
└─sda2   8:2    0 99.1G  0 part /

# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>

# mdadm --detail /dev/md0
mdadm: cannot open /dev/md0: No such file or directory

# mdadm --examine /dev/sda2
mdadm: No md superblock detected on /dev/sda2.
Last but not least, reinstall grub so as to point it to the right UUID and avoid hitting grub rescue again at the next reboot:
# grub-install /dev/sda