How to Use MDADM Linux Raid
A highly resilient raid solution!
By Red Squirrel


Raid intro
Before we start, here is a quick introduction to what raid is and why you should use it. If you are familiar with raid, you may skip to the 2nd page of this article. Raid stands for Redundant Array of Independent Disks. Some also say it stands for Redundant Array of Inexpensive Disks. The disks are not always inexpensive, so the first is usually the accepted term. Raid takes multiple hard drives or other storage devices and combines them into a single volume. This is done for different reasons, such as performance and redundancy.

Software vs Hardware Raid
For raid to happen, something needs to take the disks and process/encode the information. This requires a "program" and some resources. There are two main places where raid is implemented: at the hardware level or at the software level. Hardware raid happens before the operating system sees the disks, and thus is usually independent of the OS. Basically, the drives are connected to a raid card. Most raid cards have a bios or other management utility that lets you configure the raid without the operating system, or you can use a utility in the OS to talk to the card directly. Hardware raid is also said to offer higher performance, though this is debatable with today's high end CPUs. I have not done any benchmarks myself so I cannot comment further on that.

Software raid does not require any special raid card, and is handled by the operating system. The operating system sees all the disks individually, then presents a new "raided" volume to itself, which can be formatted and treated as a hard drive. The advantage is that it costs much less, and in the case of Linux raid, actually offers features that even the most expensive raid cards won't have, such as being able to convert or grow raid volumes on the fly while the data remains live. The disadvantage is having to rely on the operating system. You cannot easily put the operating system itself on a raid, because the operating system needs to be started for the raid to work. If you use 3rd party software such as Acronis outside of the operating system, it will not see the raid because the operating system is not started. Software raid also uses the system's cpu, while a raid card has its own. That said, today's cpus are so fast that raid will barely put a dent in them, and performance may even be better.

Another advantage of software raid is that there are fewer points of failure. If a hardware raid card fails, it may be very hard to recover the array, as it will be encoded in that specific raid card's format. In some cases you may be able to buy the same card and recover the array, but other times even that won't work. With software raid, any controller will work, as long as it can present the disks to the OS. You can even have raids that span multiple controllers. Ideally a non raid card is used, but a raid card is fine as most will just present the drives to the OS without raid.

If you do go with hardware raid, beware of cheap raid cards, as they are often what is referred to as "fake raid". Basically this is a card that requires the OS to have the proper drivers, and is actually somewhat software controlled. If you're going to use such a card, you may as well just use software raid. If you want real hardware raid, be ready to pay upwards of $1,000 for a good card. This is where Linux raid is nice, as you can get a solid solution without paying that kind of money.

One more thing to note with raid, compared to other systems like JBOD, is that in most cases all disks need to be the same size. If they are not, they are all treated as if they were the size of the smallest disk. Because of the way raid works, this makes sense, as the disks are used evenly at the same time.
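To put numbers on the smallest-disk rule, here is a minimal Python sketch. The function name `usable_per_disk` and the sizes in GB are made up for illustration and are not part of any raid tool; it only shows the arithmetic of how much of each disk gets used and how much is left over.

```python
def usable_per_disk(sizes_gb):
    """Return (space used on each disk, total unused space) for mixed-size disks.

    Hypothetical helper for illustration: every disk is treated as if it
    were the size of the smallest one, as described above.
    """
    if not sizes_gb:
        raise ValueError("need at least one disk")
    smallest = min(sizes_gb)
    # Anything beyond the smallest disk's size is left unused on each disk.
    unused = sum(size - smallest for size in sizes_gb)
    return smallest, unused


# Two 1000GB disks and one 2000GB disk: each is treated as 1000GB,
# so 1000GB of the larger disk goes unused.
per_disk, unused = usable_per_disk([1000, 1000, 2000])
print(per_disk, unused)
```

With these example sizes, the larger disk contributes no more than the two smaller ones, which is why matching disk sizes is usually the sensible choice.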

Raid Levels
There are many different ways of combining drives to create a single volume. Some are geared toward performance, while others are geared toward redundancy, which means that if a drive fails the raid remains online with all data intact. There are many levels, but here are the most widely used:

The images on the right represent multiple hard drives, with the red one being a failed drive. The string ABCDEF is written to the array. Depending on the raid level, the failed disk's data may also exist on another disk, which is what redundancy means.

Raid 0: Also known as stripe. This raid level takes the data and spreads it evenly across all disks. This can be done with 2 or more disks. Performance is very good, but there is no redundancy. In fact, the risk of total failure goes up with the number of disks, because if a single disk fails, the entire array is lost. This is great for temporary, non critical data that requires high performance access. Space of a volume is equal to the total of all the disks. Ex: 2 1TB drives = 2TB of space.
Raid 1: Also known as mirror. This raid level simply mirrors the data across all disks. This can be done with 2 or more disks, but is usually done with only 2. Performance is not greatly improved, though in some cases read speeds can be faster, as both disks can be accessed at once to get different parts of the data. Different raid systems handle this differently. Space of a volume is equal to a single drive. Ex: 2 1TB drives = 1TB of space.
Raid 5: Also known as stripe + parity. There are related levels such as raid 3 and 4 which are slightly different, but they are rarely used. This raid level stripes data across disks, similar to raid 0, except that parity is also calculated using an XOR algorithm and written alongside the data. The parity is not kept on one dedicated disk; it is spread across all disks. If one disk fails, the data that was lost is recoverable from the other disks. This raid level requires at least 3 drives. Performance is decent but not as good as raid 0 because of the parity calculation. Performance is increased by adding more disks. Space of a volume is equal to all disks minus 1. Ex: 3 1TB drives = 2TB of space.

Note: The image represents how it works from a user's perspective. The data is not actually duplicated across disks as shown. The diagram is only meant to demonstrate that no matter which disk is removed, the entire ABCDEF string can be recovered. There is much more going on in the background.
Raid 6: This is very similar to raid 5, except the parity is done twice, so two sets of parity are kept for the data. This allows you to lose up to 2 drives without losing any data. Performance is slightly decreased, however raid 6 only makes sense when there are a lot of drives, so performance is usually fine because of the bigger number of drives. The reason to use raid 6 is that with many drives, the chance of two failing at once is higher. Space of a volume is equal to all disks minus 2. Ex: 4 1TB drives = 2TB of space.
Raid 10: This is raid 1 and 0 combined. Picture two raid 1's that are then raided 0, or two raid 0's that are raided 1. Performance is very good and so is redundancy. You can always lose at least one drive, and possibly more, depending on which drives fail and where they fall in the array logic. If the right drives fail you could actually lose half of the drives. You could also do something crazy like 3 raid 0's that are mirrored to increase the number of drives that can fail. Space of a volume is equal to half of all the disks. Ex: 4 1TB drives = 2TB. This raid level is not used that often, unless very high performance is required.
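The XOR parity idea behind raid 5 can be demonstrated in a few lines of Python. This is only a sketch of the principle on a single stripe of a 3 disk array, using the ABCDEF string from the diagrams. It is not how MDADM lays out data on disk, and the helper name `xor_blocks` is made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 3 disk array: two data blocks plus one parity block.
disk_a = b"ABC"
disk_b = b"DEF"
parity = xor_blocks([disk_a, disk_b])

# Simulate losing disk_a: XOR the surviving block with the parity
# block to rebuild what was on the failed disk.
rebuilt = xor_blocks([disk_b, parity])
print(rebuilt)  # b'ABC'
```

The same trick works no matter which of the three blocks is lost, because XORing any two of them always gives back the third. A real array also rotates which disk holds the parity from stripe to stripe, which this sketch leaves out.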

Note that this is a very basic explanation of raid levels, as this article is more focused on MDADM raid than on explaining raid itself. There are also many other raid levels such as raid 2, raid 3, raid 4, raid 5e (enhanced) and even raid 5ee (extra enhanced??), but these are the main ones.
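The space rules for the levels above can be summed up in a short Python sketch. The function name `usable_capacity` is made up for illustration, all disks are assumed to be the same size, and the minimum disk counts for raid 6 and raid 10 (4 each, with an even count for raid 10) are assumptions based on the usual layouts rather than something stated in this article.

```python
def usable_capacity(level, disk_count, disk_size_tb):
    """Usable space in TB for equal-size disks (hypothetical helper)."""
    # Minimum disks per level. The values for raid 6 and raid 10 are
    # assumptions for this sketch, not taken from the article.
    minimum = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}
    if level not in minimum:
        raise ValueError(f"unsupported raid level: {level}")
    if disk_count < minimum[level]:
        raise ValueError(f"raid {level} needs at least {minimum[level]} disks")
    if level == 0:
        return disk_count * disk_size_tb        # total of all disks
    if level == 1:
        return disk_size_tb                     # a single drive
    if level == 5:
        return (disk_count - 1) * disk_size_tb  # all disks minus 1
    if level == 6:
        return (disk_count - 2) * disk_size_tb  # all disks minus 2
    if disk_count % 2:
        raise ValueError("this sketch expects an even disk count for raid 10")
    return (disk_count // 2) * disk_size_tb     # half of all disks


for level, count in [(0, 2), (1, 2), (5, 3), (6, 4), (10, 4)]:
    print(f"raid {level} with {count} x 1TB = {usable_capacity(level, count, 1)}TB")
```

For example, three 1TB drives in raid 5 come out to 2TB, matching the example given for that level.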







© Copyright 2017 Ryan Auclair/IceTeks, All rights reserved