PBX Platforms - Rebuilding Software Raid

Rebuilding an array currently requires us to manually recreate the partition table on the fresh hard drive. We need to know which drive is still active and which one is the new one. To see this, let's run the following command:

    # cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sdb1[0]
          104320 blocks [2/1] [U_]

    md1 : active raid1 sdb2[0]
          1052160 blocks [2/1] [U_]

    md2 : active raid1 sdb3[0]
          243039232 blocks [2/1] [U_]

    unused devices: <none>

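To read this, look at the first managed disk, md0: it lists only sdb1 as a member, and [2/1] [U_] means that only one of its two members is active. The surviving drive is therefore /dev/sdb, and the replacement is /dev/sda. If you are adding back a partition on a drive that is not empty, you may have to keep track of which drive each partition is on. For the current purposes we will assume that we are installing a fresh, unpartitioned drive.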
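The same reading can be done mechanically. The following is a minimal sketch, not part of the original procedure: degraded_arrays is a hypothetical helper name, and it assumes the two-line-per-array /proc/mdstat layout shown above.

```shell
# Print the name of every md array whose status field contains "_"
# (a missing member), reading /proc/mdstat-formatted text on stdin.
# degraded_arrays is a hypothetical helper name, not an mdadm tool.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }          # remember the current array
        /\[[U_]+\]/ && name != "" {
            status = $NF                      # last field, e.g. [U_] or [UU]
            if (index(status, "_") > 0) print name
            name = ""
        }
    '
}

# Usage on a live system:
#   degraded_arrays < /proc/mdstat
```

With the output above, this would list md0, md1 and md2, since all three show [U_].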

We need to see how the current drive is partitioned, so we can duplicate the same partition table on the new drive:

    # fdisk -l /dev/sdb

    Disk /dev/sdb: 250.0 GB, 250059350016 bytes
    255 heads, 63 sectors/track, 30401 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1   *           1          13      104391   fd  Linux raid autodetect
    /dev/sdb2              14         144     1052257+  fd  Linux raid autodetect
    /dev/sdb3             145       30401   243039352+  fd  Linux raid autodetect

... and just to verify that /dev/sda is blank ...

    # fdisk -l /dev/sda

    Disk /dev/sda: 250.0 GB, 250059350016 bytes
    255 heads, 63 sectors/track, 30401 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System

Now we need to edit the partition table on /dev/sda to match exactly what we see on /dev/sdb.

    # fdisk /dev/sda

    The number of cylinders for this disk is set to 30401.
    There is nothing wrong with that, but this is larger than 1024,
    and could in certain setups cause problems with:
    1) software that runs at boot time (e.g., old versions of LILO)
    2) booting and partitioning software from other OSs
       (e.g., DOS FDISK, OS/2 FDISK)

    Command (m for help): n
    Command action
       e   extended
       p   primary partition (1-4)
    p
    Partition number (1-4): 1

For the first and last cylinder values, refer to the existing partition table, which lists "Start" and "End" for each partition. If you enter these exactly as fdisk -l shows them for the existing drive, fdisk will create identical partitions on the new one.
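As a sanity check on why copying the cylinder numbers is enough, the "Blocks" column can be recomputed from the "Start" and "End" values. This is only a sketch under the geometry fdisk reports here (16065 sectors per cylinder, 512-byte sectors); partition_blocks is a hypothetical helper, and the 63-sector offset for a partition starting at cylinder 1 is the traditional DOS layout, assumed rather than stated in the original.

```shell
# Size in 1 KiB blocks of a partition spanning cylinders START..END,
# using the geometry shown by fdisk (255 heads * 63 sectors = 16065
# sectors per cylinder). Assumption: a partition starting at cylinder 1
# begins one track (63 sectors) in, leaving room for the MBR.
partition_blocks() {
    start=$1
    end=$2
    sectors=$(( (end - start + 1) * 16065 ))
    if [ "$start" -eq 1 ]; then
        sectors=$(( sectors - 63 ))
    fi
    echo $(( sectors / 2 ))    # two 512-byte sectors per 1 KiB block
}

partition_blocks 1 13        # 104391
partition_blocks 14 144      # 1052257
partition_blocks 145 30401   # 243039352
```

The results match the Blocks column for the existing drive; the "+" that fdisk prints marks an odd sector count, which the integer division here rounds down.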

    First cylinder (1-30401, default 1): 1
    Last cylinder or +size or +sizeM or +sizeK (1-30401, default 30401): 13

    Command (m for help): p

    Disk /dev/sda: 250.0 GB, 250059350016 bytes
    255 heads, 63 sectors/track, 30401 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1               1          13      104391   83  Linux

    Command (m for help): n
    Command action
       e   extended
       p   primary partition (1-4)
    p
    Partition number (1-4): 2
    First cylinder (14-30401, default 14): 14
    Last cylinder or +size or +sizeM or +sizeK (14-30401, default 30401): 144

    Command (m for help): p

    Disk /dev/sda: 250.0 GB, 250059350016 bytes
    255 heads, 63 sectors/track, 30401 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1               1          13      104391   83  Linux
    /dev/sda2              14         144     1052257+  83  Linux

    Command (m for help): n
    Command action
       e   extended
       p   primary partition (1-4)
    p
    Partition number (1-4): 3
    First cylinder (145-30401, default 145): 145
    Last cylinder or +size or +sizeM or +sizeK (145-30401, default 30401): 30401

We need to mark the boot partition as bootable, or this drive won't be very useful if the other one dies.

    Command (m for help): a
    Partition number (1-4): 1

Now we need to set the partition type of all three partitions to 'fd', the hex code for a Linux raid autodetect partition.

    Command (m for help): t
    Partition number (1-4): 1
    Hex code (type L to list codes): fd
    Changed system type of partition 1 to fd (Linux raid autodetect)

    Command (m for help): t
    Partition number (1-4): 2
    Hex code (type L to list codes): fd
    Changed system type of partition 2 to fd (Linux raid autodetect)

    Command (m for help): t
    Partition number (1-4): 3
    Hex code (type L to list codes): fd
    Changed system type of partition 3 to fd (Linux raid autodetect)

Let's look at the partition table we created; it should be identical to the one above from the existing hard drive.

    Command (m for help): p

    Disk /dev/sda: 250.0 GB, 250059350016 bytes
    255 heads, 63 sectors/track, 30401 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *           1          13      104391   fd  Linux raid autodetect
    /dev/sda2              14         144     1052257+  fd  Linux raid autodetect
    /dev/sda3             145       30401   243039352+  fd  Linux raid autodetect
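Comparing the two listings by eye works for three partitions, but it can also be done with diff once the device names are stripped out. The helper below is a sketch only: strip_device and the /tmp file names are made up for illustration, and it assumes the fdisk -l layout shown in this article. Note that fdisk -l reads the table from disk, so the comparison only makes sense after the new table has been written.

```shell
# Remove the device name from `fdisk -l` output read on stdin, so that
# listings for two different drives can be compared line by line.
# strip_device is a hypothetical helper name.
strip_device() {
    sed -E -e 's#^Disk /dev/[a-z]+:#Disk:#' \
           -e 's#^/dev/[a-z]+([0-9]+)#part\1#'
}

# Usage (as root, after the table has been written; file names are arbitrary):
#   fdisk -l /dev/sdb | strip_device > /tmp/sdb.table
#   fdisk -l /dev/sda | strip_device > /tmp/sda.table
#   diff /tmp/sdb.table /tmp/sda.table && echo "partition tables match"
```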

If it looks good, we use w to write the table to the disk and exit.

    Command (m for help): w
    The partition table has been altered!

    Calling ioctl() to re-read partition table.
    Syncing disks.

I usually run 'cat /proc/mdstat' again so I can see the partitions and compare as I add the newly created partitions back into the raid array.

    # cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sdb1[0]
          104320 blocks [2/1] [U_]

    md1 : active raid1 sdb2[0]
          1052160 blocks [2/1] [U_]

    md2 : active raid1 sdb3[0]
          243039232 blocks [2/1] [U_]

    unused devices: <none>

Now we have to add each of the partitions back into the managed disks, one at a time. I run 'cat /proc/mdstat' again after each addition to make sure it worked.

    # mdadm /dev/md0 --add /dev/sda1
    mdadm: added /dev/sda1

    # cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sda1[1] sdb1[0]
          104320 blocks [2/2] [UU]

    md1 : active raid1 sdb2[0]
          1052160 blocks [2/1] [U_]

    md2 : active raid1 sdb3[0]
          243039232 blocks [2/1] [U_]

    unused devices: <none>

    # mdadm /dev/md1 --add /dev/sda2
    mdadm: added /dev/sda2

    # cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sda1[1] sdb1[0]
          104320 blocks [2/2] [UU]

    md1 : active raid1 sda2[2] sdb2[0]
          1052160 blocks [2/1] [U_]
          [========>............]  recovery = 43.5% (458752/1052160) finish=0.1min speed=76458K/sec

    md2 : active raid1 sdb3[0]
          243039232 blocks [2/1] [U_]

    unused devices: <none>

    # mdadm /dev/md2 --add /dev/sda3
    mdadm: added /dev/sda3

    # cat /proc/mdstat
    Personalities : [raid1]
    md0 : active raid1 sda1[1] sdb1[0]
          104320 blocks [2/2] [UU]

    md1 : active raid1 sda2[1] sdb2[0]
          1052160 blocks [2/2] [UU]

    md2 : active raid1 sda3[2] sdb3[0]
          243039232 blocks [2/1] [U_]
          [>....................]  recovery =  0.1% (308480/243039232) finish=78.6min speed=51413K/sec

    unused devices: <none>
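Instead of rereading the whole file each time, the recovery percentage can be pulled out of it. This is a rough sketch rather than part of the original steps: recovery_progress is a hypothetical helper, and it assumes the progress line format shown above (it also accepts "resync =", which the kernel may print in place of "recovery =").

```shell
# Print "<array> <percent>" for every array that is currently rebuilding,
# reading /proc/mdstat-formatted text on stdin.
# recovery_progress is a hypothetical helper name.
recovery_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }              # remember the current array
        /(recovery|resync) =/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print name, $i    # the field ending in "%"
        }
    '
}

# Usage on a live system:
#   recovery_progress < /proc/mdstat
# To block until a rebuild has finished, mdadm also offers:
#   mdadm --wait /dev/md2
```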

The last step is to refresh the grub configuration on both drives. These steps need to be run against the whole drive (e.g. /dev/sda, not /dev/sda1) of each member of the RAID array:

    # /sbin/grub
    grub> device (hd0) /dev/sda
    grub> root (hd0,0)
    grub> setup (hd0)
    grub> device (hd0) /dev/sdb
    grub> root (hd0,0)
    grub> setup (hd0)
    grub> device (hd0) /dev/sdX
    grub> root (hd0,0)
    grub> setup (hd0)

NOTE: the device name changes, but the grub values (hd0) do not. This ensures that the drives are detected properly in a failure situation.
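Because the three grub commands are identical apart from the device name, they can be generated rather than typed. The sketch below assumes the legacy grub shell used in this article and its --batch option for reading commands non-interactively; grub_commands is a hypothetical helper, and the trailing quit is added so the shell exits cleanly.

```shell
# Print the grub shell commands that install the boot loader on each
# drive given as an argument, always mapping the drive to (hd0).
# grub_commands is a hypothetical helper name.
grub_commands() {
    for dev in "$@"; do
        printf 'device (hd0) %s\nroot (hd0,0)\nsetup (hd0)\n' "$dev"
    done
    echo quit
}

# Review the generated commands first:
#   grub_commands /dev/sda /dev/sdb
# Then, as root, feed them to the grub shell (assumes legacy grub):
#   grub_commands /dev/sda /dev/sdb | /sbin/grub --batch
```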

And that is it. As the last /proc/mdstat output shows, the array is now rebuilding. In this case the largest partition should take about 80 minutes (finish=78.6min), with the smaller ones done almost as fast as you can type the commands.
