With Linux, the open source OS [Operating System], becoming more popular every day in its various flavors [distributions], Bright Enterprise Labs decided it would be a good idea to point out one excellent feature that Linux offers for free – in every version on the market: software based RAID (Redundant Array of Independent Disks). No fancy add-on card with driver support is needed; all you need is two or more storage connectors on your mainboard. You are not limited to SATA either – even a mix of IDE and SATA drives is possible, although we don’t recommend it because of the large difference in bandwidth between the two interfaces. Also, two IDE drives on the same port will lose performance when they have to share the available bandwidth of that link.
Most of our readers will know the difference between RAID0, RAID1, RAID (1+0) a.k.a. RAID10, RAID4, RAID5 and RAID6, but we’d like to briefly introduce them to you. In the capacity formulas below, ‘n’ stands for the capacity of a single disk drive.
RAID0 a.k.a. Drive Span or Striping (Total capacity = n * number of drives)
Requires two or more drives and stripes your data across all of them. This gives you maximum capacity – 100% of the drives’ combined space appears as one volume – but if a single drive fails, all your data is lost. The reason it is so popular is that it also speeds up performance quite a bit: reads and writes go to two disks at a time, roughly doubling the performance of a single drive.
RAID1 a.k.a. Drive Mirror or Mirroring (Usable capacity = total capacity / 2)
Requires two drives. Everything written to drive #1 is instantly written to drive #2 as well. This gives you only 50% of your total capacity, but if a single drive fails, the remaining one simply continues to handle your I/O requests.
RAID10 a.k.a. 1+0 (Striped Mirrors) (Usable capacity = total capacity / 2)
Requires at least four drives [any even number of drives]. It combines the functionality of RAID0 and RAID1: two or more drives are striped as one set, while the same number of drives forms a second set that mirrors all the data, so you also get the speed-up that RAID0 offers. This again gives you 50% of your total capacity, but it lets you use more than two disks in one array.
RAID4 (Total capacity = n * number of drives – n)
Requires at least three disk drives. The major difference from other RAID levels is that one dedicated drive holds the parity data. Every write to the data disks therefore also causes a write to the parity disk, which sees far more write accesses than the others and limits the maximum RAID performance to the performance of that one parity disk. Because of the higher number of accesses, this drive is also the one most likely to fail.
RAID5 (Total capacity = n * number of drives – n)
Requires at least three drives, with no upper limit if you want to add more. The capacity of one drive is subtracted from the total capacity; that space is used for the parity data, which is distributed across all drives. If one drive fails, the array becomes ‘degraded’ but still lets you read and write your data. If a second drive fails, all your data is lost.
RAID6 (Total capacity = n * number of drives – 2 * n)
Requires four or more drives and is similar to RAID5, but it stores two independent sets of parity data distributed across the drives. It can therefore survive the failure of any two drives and keep working; only if a third drive fails is your data lost. Think of it as RAID5 with a second layer of parity.
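To put the formulas into perspective, here is a quick worked example with four 500 GB drives (2,000 GB of raw capacity): RAID0 gives you the full 2,000 GB, RAID10 gives you 1,000 GB, RAID4 and RAID5 each give you 1,500 GB (one drive’s worth of space goes to parity), and RAID6 gives you 1,000 GB (two drives’ worth go to parity). RAID1 with two 500 GB drives gives you 500 GB.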
There are many more RAID levels out there, some of them vendor specific (NetApp, for example, has modified some RAID levels within its own product line-up, and Intel uses RAID1 and RAID0 in the Matrix RAID option found in today’s chipsets) and some simply not found frequently on the market, so we’ll stick to the most commonly used RAID levels.
The Linux RAID module (md) also lets you add one or more ‘spare’ drives to any RAID level other than RAID0, where you have no redundancy and a spare drive makes no sense at all. A spare drive jumps in and replaces a failed disk immediately, so your system doesn’t run degraded for too long.
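If you already know you want a spare, you can also hand it to mdadm at creation time via the --spare-devices option. A minimal sketch, assuming a three-disk RAID5 plus one spare built from the same example partitions we use later in this guide:

sudo mdadm --create /dev/md5 --level=5 --raid-devices=3 --spare-devices=1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1   # the last partition becomes the spare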
A downside of Linux RAID is that, while it is not impossible, it is pretty tricky to boot from a RAID setup, so you should keep a single drive to boot from; after that, the rest of the system can run off a RAID. We’re sure this will be solved in the future, so that you’ll be able to set up a RAID at Linux installation time.
Creating a RAID array
Enough of the theory – here’s how you can create a RAID via the command line on pretty much any flavor of Linux. As with most things on Linux, this is a simple operation if you follow the instructions. Don’t be afraid and just take things one step at a time. We’ll try to be as clear as possible, but this is not quite the point-and-click walk in the park you may be used to from Windows.
Our example creates a RAID array on a virtual machine with six virtual 100 MB disks. If you create a RAID5 across four or more physical SATA disks, the creation process can take anywhere from a few minutes to an hour or more, depending on your system performance. We will take a look at what this procedure can do for your virtual machine in a follow-up story.

[Screenshot: initial window, preparing for RAID initialization]
The first thing you need is the mdadm tool set used throughout this guide. Not all distributions install it by default, so run "sudo apt-get install mdadm" if you’re on a Debian based distribution (like Ubuntu), or check your package manager for the mdadm package.
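To double-check that everything is in place (the package name shown is the Debian/Ubuntu one; other distributions may differ):

sudo apt-get install mdadm   # install the RAID management tools
mdadm --version              # confirm that mdadm is available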
The second step is to create a partition on each disk and mark it with a type that Linux recognizes as a RAID partition.
admin@debian-vm:~$ sudo fdisk /dev/sdb
Command (m for help): m (Type in ‘m’) (Let’s look at the options we have here)
Command action
a toggle a bootable flag
b edit bsd disklabel
c toggle the dos compatibility flag
d delete a partition
l list known partition types
m print this menu
n add a new partition
o create a new empty DOS partition table
p print the partition table
q quit without saving changes
s create a new empty Sun disklabel
t change a partition’s system id
u change display/entry units
v verify the partition table
w write table to disk and exit
x extra functionality (experts only)
Command (m for help): n (Type in ‘n’) (Add a new partition)
Command action
e extended
p primary partition (1-4)
p (Type in ‘p’) (We want a primary partition)
Partition number (1-4): 1 (Type in ‘1’) (We want the whole drive, so we need just one partition)
First cylinder (1-102, default 1): (simply press ENTER to accept the default values)
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-102, default 102):
Using default value 102
Command (m for help): t (Type in ‘t’) (Now we need to set the type of the partition)
Selected partition 1
Hex code (type L to list codes): L (Type in ‘L’) (Just to check the options we have available)
0 Empty 1e Hidden W95 FAT1 80 Old Minix be Solaris boot
1 FAT12 24 NEC DOS 81 Minix / old Lin bf Solaris
2 XENIX root 39 Plan 9 82 Linux swap / So c1 DRDOS/sec (FAT-
3 XENIX usr 3c PartitionMagic 83 Linux c4 DRDOS/sec (FAT-
4 FAT16 <32M 40 Venix 80286 84 OS/2 hidden C: c6 DRDOS/sec (FAT-
5 Extended 41 PPC PReP Boot 85 Linux extended c7 Syrinx
6 FAT16 42 SFS 86 NTFS volume set da Non-FS data
7 HPFS/NTFS 4d QNX4.x 87 NTFS volume set db CP/M / CTOS / .
8 AIX 4e QNX4.x 2nd part 88 Linux plaintext de Dell Utility
9 AIX bootable 4f QNX4.x 3rd part 8e Linux LVM df BootIt
a OS/2 Boot Manag 50 OnTrack DM 93 Amoeba e1 DOS access
b W95 FAT32 51 OnTrack DM6 Aux 94 Amoeba BBT e3 DOS R/O
c W95 FAT32 (LBA) 52 CP/M 9f BSD/OS e4 SpeedStor
e W95 FAT16 (LBA) 53 OnTrack DM6 Aux a0 IBM Thinkpad hi eb BeOS fs
f W95 Ext’d (LBA) 54 OnTrackDM6 a5 FreeBSD ee EFI GPT
10 OPUS 55 EZ-Drive a6 OpenBSD ef EFI (FAT-12/16/
11 Hidden FAT12 56 Golden Bow a7 NeXTSTEP f0 Linux/PA-RISC b
12 Compaq diagnost 5c Priam Edisk a8 Darwin UFS f1 SpeedStor
14 Hidden FAT16 <3 61 SpeedStor a9 NetBSD f4 SpeedStor
16 Hidden FAT16 63 GNU HURD or Sys ab Darwin boot f2 DOS secondary
17 Hidden HPFS/NTF 64 Novell Netware b7 BSDI fs fd Linux raid auto
18 AST SmartSleep 65 Novell Netware b8 BSDI swap fe LANstep
1b Hidden W95 FAT3 70 DiskSecure Mult bb Boot Wizard hid ff BBT
1c Hidden W95 FAT3 75 PC/IX
Hex code (type L to list codes): fd (Type in ‘fd’) (fd – ‘Linux raid auto’ is what we need)
Changed system type of partition 1 to fd (Linux raid autodetect)
Command (m for help): p (Type in ‘p’) (Have a look at the new partition table)
Disk /dev/sdb: 107 MB, 107413504 bytes
64 heads, 32 sectors/track, 102 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Disk identifier: 0x2c47d250
Device Boot Start End Blocks Id System
/dev/sdb1 1 102 104432 fd Linux raid autodetect
If you think something went wrong and want to exit without making any changes, type in ‘q’ instead.
Command (m for help): w (all fine, let’s write that partition table to the disk)
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
admin@debian-vm:~$
Now you’ll need to repeat this for every drive you want to add to your RAID, for example:
sudo fdisk /dev/sdc
sudo fdisk /dev/sdd

Take it easy and just type in the commands that suit your needs – all of them are listed below.
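If all your disks are the same size, a handy shortcut (a common trick, not strictly required) is to copy the partition layout from the first disk to the others with sfdisk instead of walking through fdisk every time:

sudo sfdisk -d /dev/sdb | sudo sfdisk /dev/sdc   # dump sdb's partition table and write it to sdc
sudo sfdisk -d /dev/sdb | sudo sfdisk /dev/sdd   # repeat for every remaining disk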
After this is done for all the disks in your system, you can add your drives to the new RAID you want to build. All you need is this one line:
mdadm --create /dev/device_name (we use mdX) --level=Y (anything from 0, 1, 10, 4, 5 or 6) --raid-devices=Z (the number of drives you want to add) /dev/hdX1 or /dev/sdX1 (the names of the partitions you want to add – the trailing ‘1’ is there because you want the first partition of each drive)
Creating a RAID0
admin@debian-vm:~$ sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm: array /dev/md0 started.
admin@debian-vm:~$ cat /proc/mdstat (this will give us a brief info on our RAIDs)
Personalities: [raid0]
md0 : active raid0 sdc1[1] sdb1[0]
208640 blocks 64k chunks
unused devices: <none>
Creating a RAID1
admin@debian-vm:~$ sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm: array /dev/md0 started.
admin@debian-vm:~$ cat /proc/mdstat (this will give us a brief info on our RAIDs)
Personalities: [raid1]
md0 : active raid1 sdc1[1] sdb1[0]
104320 blocks 64k chunks
unused devices: <none>
Creating a RAID10
admin@debian-vm:~$ sudo mdadm --create /dev/md10 --level=10 --raid-devices=4 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
mdadm: array /dev/md10 started.
admin@debian-vm:~$ cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10]
md10 : active raid10 sde1[3] sdd1[2] sdc1[1] sdb1[0]
208640 blocks 64K chunks 2 near-copies [4/4] [UUUU]
unused devices: <none>
admin@debian-vm:~$
Creating a RAID4
admin@debian-vm:~$ sudo mdadm --create /dev/md4 --level=4 --raid-devices=4 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
mdadm: array /dev/md4 started.
admin@debian-vm:~$ cat /proc/mdstat (so it’s running in the background, let’s check it)
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md4 : active raid4 sde1[4] sdd1[2] sdc1[1] sdb1[0]
312960 blocks level 4, 64k chunk, algorithm 0 [4/3] [UUU_]
[==============>......] recovery = 72.5% (76544/104320) finish=0.0min speed=25514K/sec
unused devices: <none>
admin@debian-vm:~$ cat /proc/mdstat (do it again to check if it’s done)
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md4 : active raid4 sde1[3] sdd1[2] sdc1[1] sdb1[0]
312960 blocks level 4, 64k chunk, algorithm 0 [4/4] [UUUU]
unused devices: <none>
admin@debian-vm:~$
Creating a RAID5
admin@debian-vm:~$ sudo mdadm --create /dev/md5 --level=5 --raid-devices=4 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
mdadm: array /dev/md5 started.
admin@debian-vm:~$ cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md5 : active raid5 sde1[4] sdd1[2] sdc1[1] sdb1[0]
312960 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
[================>....] recovery = 83.3% (87552/104320) finish=0.0min speed=29184K/sec
unused devices: <none>
admin@debian-vm:~$ cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md5 : active raid5 sde1[3] sdd1[2] sdc1[1] sdb1[0]
312960 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
Creating a RAID6
admin@debian-vm:~$ sudo mdadm --create /dev/md6 --level=6 --raid-devices=4 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
mdadm: array /dev/md6 started.
admin@debian-vm:~$ cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md6 : active raid6 sde1[3] sdd1[2] sdc1[1] sdb1[0]
208640 blocks level 6, 64k chunk, algorithm 2 [4/4] [UUUU]
[=======>.............] resync = 39.2% (41600/104320) finish=0.0min speed=13866K/sec
unused devices: <none>
admin@debian-vm:~$ cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md6 : active raid6 sde1[3] sdd1[2] sdc1[1] sdb1[0]
208640 blocks level 6, 64k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
That’s fairly straightforward and easy. Now you can put any file system you want on the array – ext2 or ext3 are very common, but XFS or ReiserFS work just as well. It behaves just like a new hard drive; its name is whatever you set with the /dev/mdX parameter when the RAID was created.
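As a minimal sketch – assuming the /dev/md6 array from above and /mnt/raid as an example mount point – formatting and mounting the array looks like this:

sudo mkfs.ext3 /dev/md6        # put an ext3 file system on the array
sudo mkdir -p /mnt/raid        # create an example mount point
sudo mount /dev/md6 /mnt/raid  # from here on it behaves like any other drive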
Once you have initiated the RAID creation process, it automatically runs in the background and you can check the status by looking at "/proc/mdstat". Once the creation is done, you should either check this device regularly or set up mdadm to send you an email when something happens. The man pages of mdadm are very well written and will tell you everything about the capabilities of the md modules; we strongly recommend you take a look at them before you create a RAID.
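For the email notifications, mdadm’s monitor mode can run as a daemon. A quick sketch – the address is only a placeholder, and on Debian based systems you can achieve the same by adding a MAILADDR line to /etc/mdadm/mdadm.conf:

sudo mdadm --monitor --scan --daemonise --mail=admin@example.com   # watch all arrays and send mail when a drive fails or the array degrades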
Adding a spare drive to your RAID
admin@debian-vm:~$ sudo mdadm /dev/md6 -a /dev/sdf1
mdadm: added /dev/sdf1
admin@debian-vm:~$ cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md6 : active raid6 sdf1[4](S) sde1[3] sdd1[2] sdc1[1] sdb1[0]
208640 blocks level 6, 64k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
Here you can see that the drive sdf1 now has the status (S) next to it, marking it as a spare drive.
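If you want to see the spare in action, you can mark one of the active partitions as failed and watch the rebuild in /proc/mdstat. A hedged example using the same device names as above – only try this on a test array:

sudo mdadm /dev/md6 --fail /dev/sdb1     # pretend sdb1 has died; the spare takes over and a rebuild starts
cat /proc/mdstat                         # watch the recovery progress
sudo mdadm /dev/md6 --remove /dev/sdb1   # take the 'failed' partition out of the array afterwards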
An excellent feature that Linux RAID offers is the grow function. This works best if you also use the Logical Volume Manager (LVM for short) within Linux, so you can expand ext2, ext3 or other file systems with ease. Be aware that growing the array does not automatically grow the file system sitting on top of it, so you really need to keep that in mind.
Growing your RAID
admin@debian-vm:~$ sudo mdadm --grow /dev/md6 -n 6
mdadm: Need to backup 256K of critical section..
mdadm: … critical section passed.
admin@debian-vm:~$ cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md6 : active raid6 sdf1[4] sde1[3] sdd1[2] sdc1[1] sdb1[0]
417280 blocks level 6, 64k chunk, algorithm 2 [6/5] [UUUUU_]
unused devices: <none>
admin@debian-vm:~$
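The array itself is now larger, but the file system on it is not. For ext2/ext3 the usual follow-up – a sketch, again assuming /dev/md6 – is a single resize command once the reshape has finished:

sudo resize2fs /dev/md6   # grow the ext2/ext3 file system to fill the enlarged array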
The mdadm tools give you two simple ways to check the status of your RAID:
either:
cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md6 : active raid6 sdf1[4] sde1[3] sdd1[2] sdc1[1] sdb1[0]
312960 blocks level 6, 64k chunk, algorithm 2 [5/5] [UUUUU]
unused devices: <none>
or:
sudo mdadm --detail /dev/md6
/dev/md6:
Version : 00.90
Creation Time : Thu Apr 2 12:49:57 2009
Raid Level : raid6
Array Size : 312960 (305.68 MiB 320.47 MB)
Used Dev Size : 104320 (101.89 MiB 106.82 MB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Persistence : Superblock is persistent
Update Time : Thu Apr 2 12:50:04 2009
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Chunk Size : 64K
UUID : 5cb84945:1c101341:47489f1c:03bfb314 (local to host debian-vm)
Events : 0.4
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1
2 8 49 2 active sync /dev/sdd1
3 8 65 3 active sync /dev/sde1
4 8 81 4 active sync /dev/sdf1
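One more housekeeping step worth mentioning: record the array in mdadm’s configuration file so it is assembled automatically at boot. The paths and the initramfs step shown here are the Debian/Ubuntu defaults; other distributions may use /etc/mdadm.conf instead:

sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf   # append the array definition to the config file
sudo update-initramfs -u                                         # rebuild the initramfs so the array is known early at boot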
Now, the reason for this tutorial is not only to show you how easy it is, but also to point out a few brilliant features that we really enjoy about software-based RAID.
First: The ability to grow a RAID. This is rare on other RAID systems, where you usually have to move all the data off the array, reconfigure it at the larger size and copy the data back on – which sadly causes a long downtime. mdadm lets you grow the array without moving the data off and back on. Of course, we still recommend making a backup anyway, but you save the restore time and don’t have to reboot your whole system. All you need to do is stop the services and applications that access the RAID’ed drive and unmount it, so you can grow it. To be honest, you can even grow your RAID without unmounting it, and people have documented that it works, but our official stance is that we do not recommend it – especially if you don’t have a backup of all your data. Save yourself the worries.
Second: Forcing a reassembly of a RAID. It sounds unlikely, but it can happen quite easily: two drives in your RAID5 drop out because something stupid happens – maybe you bump the case, the power splitter they share disconnects for a moment, and your RAID is gone. Not all hope is lost: if you’re lucky, you can force a reassembly and everything works again. When it comes to hardware RAID, unfortunately, most RAID controllers don’t have this option – if your company has a product with a reassemble option, let us know and we’ll be glad to test it.
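A sketch of what that rescue attempt can look like, using the device names from our RAID5 example – stop the array first if it is still half-running:

sudo mdadm --stop /dev/md5                                                       # make sure the broken array is stopped
sudo mdadm --assemble --force /dev/md5 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1   # try to bring it back together
cat /proc/mdstat                                                                 # check whether the array came back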
This will, of course, only work if no data was being written to the disks at the moment it happened. That’s also why we recommend the ‘deadline’ I/O scheduler in the kernel: you may lose some write performance, but if you don’t have a journaling file system (which would be a bad idea in itself), it can help save your data.
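The scheduler is selected per block device via sysfs. A quick sketch – the device name is just an example, and the change does not survive a reboot unless you add it to your boot parameters:

cat /sys/block/sdb/queue/scheduler                        # list the available schedulers; the active one is shown in brackets
echo deadline | sudo tee /sys/block/sdb/queue/scheduler   # switch sdb to the deadline scheduler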
Third: If you run out of onboard ports when you want to grow your RAID, you can plug in almost any SATA controller (from a simple two-port card upwards) and add the drives hanging off it to your RAID as well.
Fourth: Once you understand how mdadm works, you’ll find it astonishingly easy to use. It really is that simple.
Conclusion
If you are using Linux, we would recommend this RAID method for all your data. It’s free, fast, simple and pretty well proven. In our follow-up article, coming in the days following Easter Monday, we will take a close look at software RAID vs. hardware RAID and show you some hard data on performance and ease of setup.