Software RAID in Linux Workstations

Tip:

You can run mdadm as a daemon to monitor a RAID array:

mdadm --monitor --mail=root@localhost --delay=1800 /dev/md0

This polls the array every 1800 seconds (30 minutes) and emails critical events and

failures to the system administrator. Many other monitoring systems are also available

for Linux software RAID.
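
As a rough complement to mdadm's monitor mode, a degraded array can also be spotted in the kernel's /proc/mdstat status text. The sketch below assumes the usual status format, in which each array member appears as U (up) or _ (missing) inside square brackets. The function name degraded_arrays and the sample input are illustrative only.

```shell
#!/bin/sh
# Print the name of every md array whose member status string, such as [U_],
# shows at least one missing disk. Reads /proc/mdstat-style text on stdin,
# so it can be tried on saved output as well as on a live system.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/   { name = $1 }     # remember the array being described
        /\[U*_[U_]*\]/  { print name }    # status string with a missing member
    '
}

# Hypothetical sample: md0 is healthy, md1 has lost one of its two mirrors.
degraded_arrays <<'EOF'
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
md1 : active raid1 sda2[0]
      2096384 blocks [2/1] [U_]
unused devices: <none>
EOF
# prints: md1
```

On a live system the function would read the real file instead: degraded_arrays < /proc/mdstat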

Multiple Disk Failure

In the case of a temporary failure of multiple disks, such as a disk controller failure or a cable

coming loose that affects multiple disks, the RAID superblocks will afterwards be out of sync and the

RAID array can no longer be initialized. Using mdadm, you can run:

mdadm --assemble --force

to try to reassemble the array. If that method doesn’t work, you can run:

mkraid --force

to rewrite the RAID superblocks. For this to work, you need a completely up-to-date /etc/raidtab

file; otherwise, if the ordering of the disks differs from what is expected, data on all disks

could be lost.

Additional Configuration Information

The Persistent Superblock

Previously, the raidtools, which are included with most major Linux distributions, would read your

/etc/raidtab file and then initialize the array. This required that the filesystem on which

/etc/raidtab resided was mounted, which was unfortunate if you wanted to boot from a RAID array.

The persistent superblock solves this problem. When an array is initialized with the

persistent-superblock option in the /etc/raidtab file, a special superblock is written to

each disk participating in the array. This allows the kernel to read the configuration of RAID devices

directly from the disks involved, instead of reading from the /etc/raidtab configuration file that

might not be available at all times. You should still maintain a consistent /etc/raidtab file, since

you may need this file for later reconstruction of the array.

The persistent superblock is mandatory if you want auto-detection of your RAID devices upon system

boot.
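
For illustration, a raidtab entry with the persistent superblock enabled might look like the fragment below. This is only a sketch: the two-disk RAID-1 layout and the device names /dev/sdb1 and /dev/sdc1 are assumptions, not taken from a real system. The raid-disk numbers record the disk ordering that a later mkraid --force depends on.

```
raiddev /dev/md0
        raid-level              1
        nr-raid-disks           2
        nr-spare-disks          0
        persistent-superblock   1
        chunk-size              4
        device                  /dev/sdb1
        raid-disk               0
        device                  /dev/sdc1
        raid-disk               1
```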

Chunk Sizes

The chunk size is defined as the smallest amount of data that can be written to a device. You can

never write completely in parallel to a set of disks. If you had two disks and wanted to write a byte,

you would have to write four bits on each disk, with every second bit going to disk 0 and the others

to disk 1. Hardware doesn’t support that, so chunk size is used instead. A write of 16kB with a chunk

size of 4kB will cause the first and the third 4kB chunks to be written to the first disk, and the second

and fourth chunks to be written to the second disk, in the RAID-0 case with two disks. Thus, for large

writes, you may see lower overhead by having fairly large chunks, whereas arrays that are primarily

holding small files may benefit more from a smaller chunk size.
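
The chunk placement described above can be sketched with a few lines of shell arithmetic. This is a simplified model, assuming RAID-0 and a write that starts on a chunk boundary at the beginning of the array; the function name stripe_layout is hypothetical.

```shell
#!/bin/sh
# Show which disk receives each chunk of a write striped across a RAID-0 set.
# Arguments: write size in kB, chunk size in kB, number of disks.
stripe_layout() {
    write_kb=$1
    chunk_kb=$2
    disks=$3
    # Number of chunks touched, rounding a partial chunk up to a whole one.
    chunks=$(( (write_kb + chunk_kb - 1) / chunk_kb ))
    i=0
    while [ "$i" -lt "$chunks" ]; do
        # Chunks are dealt out to the disks in round-robin order.
        echo "chunk $((i + 1)) -> disk $((i % disks))"
        i=$((i + 1))
    done
}

# The 16kB write with 4kB chunks on two disks from the text:
stripe_layout 16 4 2
# chunk 1 -> disk 0
# chunk 2 -> disk 1
# chunk 3 -> disk 0
# chunk 4 -> disk 1
```

With a 64kB chunk size the same 16kB write would fit in a single chunk and touch only one disk, which is one way to see the trade-off between large and small chunks.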