Short Guide to RAID

Many of today's computers store data on hard drives. Hard drives read and write data mechanically, so they wear out and will eventually fail, and the data stored on a failed drive is lost. To protect against data loss from a drive failure, we can set up multiple hard drives in a RAID.

RAID stands for Redundant Array of Independent Disks. An array in this case means a group of disks that work together. The following terms are used when talking about RAID:

JBOD: Just a Bunch of Disks [No redundancy, data is stored as is]

The RAID types described below use one or more of three ways to save data:

Mirroring: The simplest form. Every disk holds an exact copy of the data, so a two-disk mirror needs twice as much storage as the data. For example, 1 GB of data requires 2 GB of disk space.

Striping: Data is split into blocks, and consecutive blocks are written to different disks in turn. With two disks, each disk holds half of the blocks, and the two disks together contain all the data.
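
As a rough illustration of striping, the sketch below splits a byte string into fixed-size blocks and deals them out to disks in round-robin order, then reads them back. The function names stripe and unstripe, the block size, and the in-memory lists standing in for disks are all assumptions made for this example, not how any particular RAID controller works.

```python
def stripe(data: bytes, num_disks: int, block_size: int) -> list[list[bytes]]:
    """Distribute fixed-size blocks of data across disks in round-robin order."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Block 0 goes to disk 0, block 1 to disk 1, and so on, wrapping around.
        disks[(offset // block_size) % num_disks].append(block)
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the data by reading one block from each disk in turn."""
    out = []
    max_blocks = max(len(disk) for disk in disks)
    for row in range(max_blocks):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGH", num_disks=2, block_size=2)
# disks[0] holds blocks AB and EF; disks[1] holds blocks CD and GH.
restored = unstripe(disks)
```

Neither list alone contains the full data, which is why losing one disk of a purely striped array loses everything.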

Parity: Data for a third parity disk is calculated based on two data disks. In practice the calculation is a bitwise XOR, not multiplication: Disk A XOR Disk B = Disk C. If any one drive fails, the data for that drive can be recreated from the other two. That is, if Disk A fails, then the data for Disk A is given by Disk C XOR Disk B.
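
Real RAID implementations typically compute single-drive parity with a bitwise XOR, because XOR-ing the parity with the surviving data gives back the missing data exactly. The sketch below shows this on two small byte strings standing in for Disk A and Disk B. The helper name xor_blocks and the sample contents are made up for illustration.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Return the byte-wise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)


disk_a = b"\x0f\xa5hello"
disk_b = b"\xf0\x5aworld"

# The parity disk stores A XOR B.
disk_c = xor_blocks(disk_a, disk_b)

# If Disk A fails, XOR-ing the parity with Disk B rebuilds it.
recovered_a = xor_blocks(disk_c, disk_b)

# The same works the other way round if Disk B fails instead.
recovered_b = xor_blocks(disk_c, disk_a)
```

The same function works for more than two data disks: the parity is the XOR of all of them, and any single missing disk is the XOR of everything that is left.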

Fault Tolerance: The number of drives that can fail at the same time without data loss.

RAID0: Uses Striping. Data is split in half, with one half stored on each drive. Can be up to twice as fast as a single drive, since data is read from and written to two disks at once. There is no redundancy: when either drive fails, all data is LOST. The advantage is higher read and write speed, not data protection. Takes the same amount of storage as the data.

RAID1: Uses Mirroring. When one drive fails, the array continues to read from the other drive.

RAID2: Uses bit-level striping with Hamming-code error correction. The error-correction data is stored on dedicated drives, separate from the data drives. Rarely used in practice.

RAID3: Byte-level striping with dedicated parity. Uses a single parity drive.

RAID4: Block-level striping with dedicated parity. Uses a single parity drive.

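RAID5: Block-level striping with distributed parity. There is no dedicated parity drive; the parity blocks are spread across all drives in the array. Minimum 3 drives needed. Can only take 1 drive failure, since the missing data must be recalculated from parity. One drive's worth of capacity goes to parity: in a RAID5 array of four 1 TB drives, storing 3 TB of data takes 4 TB of disk space. Read speeds are fast, but writes are slower because parity has to be recalculated and rewritten on every write.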
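
To picture what distributed parity means, the sketch below prints which drive holds the parity block in each stripe. The rotation used here (parity starts on the last drive and moves one drive to the left per stripe) is only one possible scheme, chosen as an assumption for this example; actual implementations offer several layouts. The function name raid5_layout is hypothetical.

```python
def raid5_layout(num_disks: int, num_stripes: int) -> list[list[str]]:
    """Return one row per stripe; 'P' marks parity, 'D<n>' the n-th data block."""
    if num_disks < 3:
        raise ValueError("RAID5 needs at least 3 disks")
    layout = []
    data_index = 0
    for stripe in range(num_stripes):
        # Assumed rotation: parity moves one disk to the left on each stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{data_index}")
                data_index += 1
        layout.append(row)
    return layout


for row in raid5_layout(num_disks=4, num_stripes=4):
    print("  ".join(f"{cell:>3}" for cell in row))
```

Every stripe has exactly one parity block, so over many stripes each drive ends up holding an equal share of parity and no single drive becomes a write bottleneck.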

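RAID6: Block-level striping with double distributed parity. Minimum 4 drives needed. Any two drives can fail. Two drives' worth of capacity goes to parity, so with four 1 TB drives, 2 TB of data needs 4 TB of storage. As you add more drives, the share of storage spent on parity shrinks, but it always remains one drive more than RAID5 uses.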
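
The storage cost of parity is easiest to see as a fraction of the whole array. The sketch below assumes equally sized drives and treats single parity (RAID5) as costing one drive's worth of space and double parity (RAID6) as costing two. The function name usable_fraction is made up for this example.

```python
def usable_fraction(num_disks: int, parity_disks: int) -> float:
    """Fraction of total capacity left for data when parity uses parity_disks drives."""
    if num_disks <= parity_disks:
        raise ValueError("need more drives than parity drives")
    return (num_disks - parity_disks) / num_disks


for n in (4, 6, 8, 12):
    raid5 = usable_fraction(n, parity_disks=1)
    raid6 = usable_fraction(n, parity_disks=2)
    print(f"{n:2d} drives: RAID5 {raid5:.0%} usable, RAID6 {raid6:.0%} usable")
```

With 4 drives, RAID6 leaves half of the space for data; with 8 drives it leaves three quarters, so the relative cost of the second parity block falls as the array grows.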

RAID01: Data is striped across drives as in RAID0, and the whole striped set is then mirrored as in RAID1. Needs twice the capacity, just like RAID1, but can be up to twice as fast as an individual drive, just like RAID0.

RAID10: The reverse of the above. Data is mirrored first, as in RAID1, and each RAID1 array then acts as a single drive in a RAID0 array, which stripes the data across them. Data is duplicated, but the whole system is about as fast as RAID0. With two drives per RAID1 array, usable capacity stays at 50% of the total no matter how many RAID1 arrays are added.
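
Because every block in RAID10 lives on both drives of one mirror pair, the array keeps its data as long as no pair loses both of its drives. The sketch below checks that condition. It assumes two drives per RAID1 array and an invented numbering in which drives 2k and 2k+1 form pair k; the function name raid10_survives is hypothetical.

```python
def raid10_survives(num_pairs: int, failed: set[int]) -> bool:
    """Return True if every mirror pair still has at least one working drive."""
    for pair in range(num_pairs):
        # Assumed numbering: drives 2*pair and 2*pair + 1 mirror each other.
        if {2 * pair, 2 * pair + 1} <= failed:
            return False
    return True


# Four drives: pair 0 is drives 0 and 1, pair 1 is drives 2 and 3.
one_from_each_pair = raid10_survives(2, {0, 2})
both_from_same_pair = raid10_survives(2, {0, 1})
```

Here one_from_each_pair is True and both_from_same_pair is False: the array is only guaranteed to survive one failure, but it can survive more when the failures land in different pairs.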

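RAID50: Combines the striping of RAID0 with the distributed parity of RAID5: data is striped across two or more RAID5 arrays. Needs at least 6 drives (two RAID5 arrays of 3 drives each). Better write performance and fault tolerance than a single RAID5 array, since each RAID5 array can lose one drive.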