Creating a RAID 1 Array for Data Storage from an Existing External Hard Drive Without Data Loss

I have an external hard drive used to store all my pictures, media and videos. Over the years, I have had to enlarge my storage capacity to make room for new additions. Eventually, my data could no longer fit on my network-attached storage device, and I found a great deal on a much larger external hard drive. Unfortunately, this didn’t offer the RAID 1 mirroring of the two-bay NAS. I have the drive connected to a dedicated Linux computer and eventually purchased a second, identical external hard drive once the funds were available. (For some reason the external drive is cheaper than even the least expensive matching drive without housing.)

I started by pulling both drives out of their enclosures. I had one drive connected with data, and one drive connected without data. What to do now?

The typical process for this would be to back up all the data to a third drive, connect both drives to the system, and wipe them both to create a new RAID 1 array. Then I would have to copy all the data back from the third drive onto the software RAID 1 array. That requires a third drive, or multiple drives, with enough capacity to store all the data. In a massive data center, I’m sure that’s no problem. In my basement media room, I don’t have a bunch of extra TBs of space sitting around.

I needed to find a way to get the data into a RAID 1 array without destroying it in the process. It’s possible to create the RAID 1 array in a degraded state. That means it will only have one device in the array. Typically this state happens when a device fails and a new drive is inserted which then receives a complete copy of the data.

The new drive needs to be partitioned and formatted. I went with ext4 using gparted. Just install the software and follow the gparted instructions to delete all existing partitions and create a new one. Leave 100 MB unallocated at the end of the drive to avoid issues with manufacturing differences between the drives. Be VERY careful not to wipe the wrong drive. Remember they are identical, and you can quickly get confused. If you erase all your data, you have no backups and no data, which is very bad and defeats the whole point of this exercise.

sudo apt install gparted
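
If you would rather work out the partition size than eyeball it in gparted, the arithmetic is simple. This is only a rough sketch: the function name partition_end_mib is made up for illustration, and the 100 MiB default just mirrors the margin mentioned above.

```shell
# Hypothetical helper (not part of any tool): given a drive size in bytes,
# print the MiB offset where the partition should end so that a margin
# (default 100 MiB) stays unallocated at the end of the drive.
partition_end_mib() {
  pe_total_mib=$(( $1 / 1024 / 1024 ))   # integer division rounds down
  pe_reserve_mib=${2:-100}               # margin to leave free
  echo $(( pe_total_mib - pe_reserve_mib ))
}

# The size in bytes can be read with: lsblk -bdno SIZE /dev/sdX
# (replace sdX with your own drive), then for example:
#   partition_end_mib 8001563222016
```

The size in that usage line is just an example figure for an 8 TB drive, not a number from my system.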

Then you can create the RAID array with mdadm. This process is decently advanced, so don’t proceed until you know what you’re doing. Read the manual for mdadm. Be sure to change /dev/abcde to the actual partition you just created. Again if you do this wrong, you’re going to lose all your data and be very sad. The missing keyword lets you create the device in a degraded state.

sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/abcde

I got a warning about not being able to boot the device, which was fine with me because this is just for data storage. To confirm that it completed correctly, check the magical file which stores raid information.

sudo tail /proc/mdstat
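
What you are hoping to see in that file is an array that is active but one member short, usually shown as something like [2/1] [_U]. Here is a rough sketch of checking for that from a script. The name md_is_degraded is my own invention, and it assumes the usual /proc/mdstat layout where the status line with "blocks" follows the array's header line.

```shell
# Hypothetical helper: succeed if array $1 (e.g. md0) appears in the mdstat
# text ($2, default /proc/mdstat) with a missing member, e.g. "[2/1] [_U]".
md_is_degraded() {
  awk -v md="$1" '
    $1 == md && $2 == ":" { found = 1; next }   # header: "md0 : active raid1 ..."
    found && /blocks/ {                         # status line after the header
      # an underscore inside the [UU] map marks a missing or failed member
      if ($0 ~ /\[[U_]*_[U_]*\]/) degraded = 1
      exit
    }
    END { exit(degraded ? 0 : 1) }
  ' "${2:-/proc/mdstat}"
}

# Example:
#   md_is_degraded md0 && echo "md0 is running with a missing member"
```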

Once the process completes, you will have a device available at /dev/md0, but it has no file system. If you get the error below when you try to mount it, then you need to create the file system.

mount: /home/ste/mnt: wrong fs type, bad option, bad superblock on /dev/md0, missing codepage or helper program, or other error.

sudo mkfs.ext4 /dev/md0

Now that you’ve got the file system set up, you can mount the “array”, which is actually just the one drive.
I changed the owner to my user so that I could use the UI to copy the data, but that didn’t work. I was still required to type my password in a bunch, so I just launched nautilus as root for this copy which creates all the new files as root.

sudo mkdir /mnt/new-raid
sudo mount /dev/md0 /mnt/new-raid
sudo nautilus

You can spend the next 15 hours copying your data onto the new drive using the UI or the command line. If you copied as root, you will likely need to change the owner of all the new files back to your user when you’re done.

sudo chown -R myuser: /mnt/new-raid

Alternatively, use rsync to copy all the files in a more seamless way. It will still take forever if not longer. To explain, -ahH uses archive mode (a), which bundles a variety of copy options. Mostly it preserves permissions and time stamps. Human readable (h) makes the numbers look a little nicer. Hard links (H) preserves the hard links for any folders like “backintime” which make use of them. Show progress (--progress) will show progress as the sync happens. I only included this because otherwise, it won’t be apparent that anything is happening, and that’s a little unsettling on a process that takes days.

sudo rsync -ahH --progress /old_drive/ /new_drive/

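I added the array configuration to the mdadm.conf file at /etc/mdadm/mdadm.conf. Use this command to get the configuration line and add it to the end of the config file. The append has to run as root, and a plain >> redirect runs as your own user even with sudo in front of the command, so pipe the output through sudo tee -a instead.

sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf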
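
One thing to watch for: appending the scan output more than once leaves duplicate ARRAY lines in the config file. A small sketch of an append that skips lines which are already present is below. The helper name append_once is made up, and the usage line assumes you are in a root shell.

```shell
# Hypothetical helper: append line $1 to file $2 unless an identical line
# is already there. Creates the file if it does not exist yet.
append_once() {
  grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Possible use from a root shell (assumption, adjust the path to your system):
#   mdadm --detail --scan | while IFS= read -r line; do
#     append_once "$line" /etc/mdadm/mdadm.conf
#   done
```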

Next, you need to make the old drive match the new drive. My previous drive was in an enclosure and mounted using the fstab. When I removed this device, Ubuntu wouldn’t boot correctly, so I needed to remove my custom mounting before switching around the device. While there, I set up the mount for the new array. The UUID is the one for md0, which can be found with “lsblk -f”. Don’t use the ones for the drives which are involved in the array. Use “mount -av” to mount everything listed in the fstab file and confirm your new entry is correct.

sudo lsblk -f
sudo gedit /etc/fstab
UUID=2d603a18-7ed4-4335-906c-fa334f4cad93 /media/Videos ext4 defaults 0 1
sudo mount -av

At this point, you’ve got your new drive set up and mounted the way you wanted it to show up in the file system. You’ve restarted a couple of times to make sure it’s working, and you’ve copied all your files over as the new working copy of the data. Remember you’re about to fly without a dependable backup, so the files on the new RAID will be the only copy of the files. Be sure everything is there and working.
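
If you want something stronger than spot-checking a few folders, you can compare the two trees by checksum. The sketch below is one way to do that, not what I ran at the time. The name tree_digest is made up, and it assumes GNU find, sort, xargs and sha256sum as shipped with Ubuntu.

```shell
# Hypothetical helper: print one SHA-256 digest that covers the relative path
# and content of every regular file under directory $1. Two directories with
# the same files and the same contents produce the same digest.
tree_digest() {
  ( cd "$1" &&
    find . -type f -print0 |
      LC_ALL=C sort -z |              # fixed ordering so both sides line up
      xargs -0 -r sha256sum |         # per-file checksums with relative paths
      sha256sum | cut -d' ' -f1 )     # collapse the list into one digest
}

# Example, using the same paths as the rsync command:
#   [ "$(tree_digest /old_drive)" = "$(tree_digest /new_drive)" ] && echo "copies match"
```

This reads every byte on both drives, so expect it to take about as long as the copy did. It also ignores ownership and time stamps, which is fine for an "is everything there" check.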

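The next step is to wipe the old drive and add it into the RAID array, then let the array copy the files from the new drive onto the old drive to complete the process. Use sgdisk to create a backup of the partition table from the new drive (/dev/sdc here, the one already in the array). Load that backup onto the old drive (/dev/sdb here) to ensure identical partitions, then use -G to give the old drive fresh random GUIDs so the two disks don’t share identifiers. Loading the partition table is a destructive process and will wipe out all data on the target drive. The data on the drive the backup was taken from should be unaffected. Your device names will likely differ, so double-check which is which before running anything.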

sudo sgdisk --backup=table /dev/sdc
sudo sgdisk --load-backup=table /dev/sdb
sudo sgdisk -G /dev/sdb
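
Because loading the table onto the wrong disk would destroy the only copy of the data, it is worth confirming that the target is not the drive already in the array before running the commands above. A minimal sketch of such a check follows. The name disk_in_md_array is made up, and it only looks at what /proc/mdstat reports.

```shell
# Hypothetical guard: succeed if disk $1 (e.g. sdb), or one of its partitions,
# is listed as a member of any md array in the mdstat text
# ($2, default /proc/mdstat). Members appear there as e.g. "sdc1[1]".
disk_in_md_array() {
  grep -Eq "(^|[[:space:]])${1}p?[0-9]*\[[0-9]+\]" "${2:-/proc/mdstat}"
}

# Example: refuse to continue if the intended target is an array member.
#   disk_in_md_array sdb && echo "STOP: sdb is part of an array"
```

With the device names used here, the check should report sdc as a member and sdb as free before the partition table is loaded.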

Use fdisk, lsblk or blkid to confirm that the partitions match. They should be identical.

sudo fdisk -l

Restart your computer to make the new changes active in the operating system. This drive will need to be added to the RAID array like the previous drive; the file system comes along when the array syncs. At this point, my array was getting set as /dev/md127 instead of the expected /dev/md0 on restart. Stop the array and then have mdadm reassemble it to get it back to md0. On Ubuntu, running update-initramfs -u after editing mdadm.conf may also help the md0 name persist across reboots. You should get a message like: mdadm: /dev/md/0 has been started with 1 drive (out of 2)

sudo umount /dev/md127
sudo mdadm --stop /dev/md127
sudo mdadm --assemble --scan

To add the newly partitioned drive to the array use mdadm. /dev/md0 is the target array. -a is for adding a new partition and /dev/sdb1 is the partition to add.

sudo mdadm /dev/md0 -a /dev/sdb1
sudo blkid

Looking at the blkid output, you should see the 2 partitions sdb1 and sdc1, both of which should be TYPE="linux_raid_member". Additionally, you should check the details for your array to make sure everything is functioning correctly.

sudo mdadm --detail /dev/md0

You should now see 2 drives at the bottom of the output. Confirm that the Raid Level is raid1, the Array Size is what you expected, and Raid Devices and Total Devices match at 2 since this is a RAID 1 setup. The rebuilding process should have already begun when you added the new drive, so at the bottom of the screen the new drive’s state should read “spare rebuilding”.

If you would like to watch the progress in the terminal, you can use this command to get an updating progress view.

sudo watch cat /proc/mdstat
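
If you only want the percentage, for example to log it now and then, it can be pulled out of the same file. This is a rough sketch: md_rebuild_percent is a made-up name, and it assumes the progress line has the usual "recovery = 12.3%" or "resync = 12.3%" form.

```shell
# Hypothetical helper: print the rebuild percentage (e.g. "2.3%") found in
# the mdstat text ($1, default /proc/mdstat). Prints nothing when no
# recovery or resync is in progress.
md_rebuild_percent() {
  sed -nE 's/.*(recovery|resync) *= *([0-9.]+%).*/\2/p' "${1:-/proc/mdstat}"
}

# Example: print a time-stamped progress line every 10 minutes.
#   while sleep 600; do echo "$(date) $(md_rebuild_percent)"; done
```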

 

Here are some useful commands for working with the mdadm files and troubleshooting.

sudo watch cat /proc/mdstat
sudo tail /proc/mdstat
sudo nano /etc/mdadm/mdadm.conf
sudo mdadm --detail /dev/md0
sudo mdadm --stop /dev/md0
sudo mdadm --assemble --scan
sudo blkid
sudo lsblk -f
sudo mdadm -Db /dev/md0
sudo mount -av

Sources:
https://wiki.archlinux.org/index.php/Convert_a_single_drive_system_to_RAID#Create_the_RAID_device
https://askubuntu.com/questions/973632/unable-to-mount-raid1-md0-wrong-fs-type-bad-option-bad-superblock-on-dev-md
https://unix.stackexchange.com/questions/320103/whats-the-difference-between-creating-mdadm-array-using-partitions-or-the-whole
https://www.digitalocean.com/community/tutorials/how-to-manage-raid-arrays-with-mdadm-on-ubuntu-16-04
https://superuser.com/questions/287462/how-can-i-make-mdadm-auto-assemble-raid-after-each-boot
https://tech.feedyourhead.at/content/copy-partition-table-one-disk-another

4 thoughts on “Creating a RAID 1 Array for Data Storage from an Existing External Hard Drive Without Data Loss”

  1. Hello. Thanks for writing this guide. I intend to try it but I want to convert my existing Linux boot HDD to RAID 1. I understand there will be complications with the Linux partitions and with GRUB? Could you expand the guide to cover this scenario? I have 2 HDDs right now… one with an active Linux system being used for development work, and a second hard disk that is blank right now.

  2. Warning! Tutorial is wrong at the point of updating partitions with sgdisk. It’s written so confusingly that you ruin your array if you follow it. Correct steps are (taken from original article – see sources):
    Partition original disk
    Copy the partition table from /dev/sdb (newly implemented RAID disk) to /dev/sda (second disk we are adding to the array) so that both disks have exactly the same layout:

    # sfdisk -d /dev/sdb | sfdisk /dev/sda
    # sgdisk -G /dev/sda

    Second thing is missing part when fixing /dev/md127 to /dev/md0
    You need to update initramfs so it contains your mdadm.conf settings during boot.

    # update-initramfs -u

    • Hi, thank you for the feedback. These are the steps that worked for me but they are not a tutorial. Maybe your comment will help someone although it doesn’t make sense to me. All of the steps that were required for my system at that time are included.
