[H-GEN] Linux file server

David Jericho davidj at tucanatech.com
Fri Jul 2 00:46:24 EDT 2004


Russell Stuart wrote:

> 1.  Linux software Raid is faster than hardware Raid,
>     providing you have the spare CPU cycles.  This is
>     the consensus on the Dell PowerEdge list.  

It depends entirely on the number of drives in the array, the bus, the 
array configuration, and of course the RAID controller in question. As 
my previous post indicated, I've got 80% utilisation writing using 
read() at 170MBytes/s on a 1.2TB Ext3 file system optimised for random 
I/O using a SCSI-style controller.

If your system does heavy disk I/O, I highly recommend a SCSI-style 
controller or a second processor.

> 2.  Software Raid is more reliable than hardware Raid.  This
>     is primarily because of bugs in Raid cards and firmware
>     bugs/incompatibilities in the SCSI disk drives.  

I beg to differ here. If the research has been done properly, using 
vendor-recommended hardware, hardware RAID is as good as, if not better 
than software RAID. I have had corrupted software RAID arrays, as well 
as corrupted hardware RAID arrays.

> 3.  Raid is less reliable than a single disk spindle.  In
>     other words, if all your data fits on a single drive, then
>     you will be better off reliability-wise with just one drive
>     instead of Raid.  

You're way off base here, even with the qualification of your original 
post. The likelihood of a single drive death in a RAID array increases 
linearly with the number of drives in the array. The likelihood of a 
single drive death resulting in lost or corrupted data on a properly 
designed RAID array, however, decreases faster than linearly.
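
As a rough illustration of that arithmetic, here is a minimal Python 
sketch. It assumes every drive fails independently with the same 
probability p over some fixed period, and that an n-way mirror loses 
data only if all n copies fail in that period. The value of p and the 
function names are made up for illustration, and rebuild windows and 
correlated failures are ignored, so treat it as a back-of-the-envelope 
model rather than a reliability calculation.

```python
def p_any_drive_fails(p: float, n: int) -> float:
    """Probability that at least one of n independent drives fails.

    For small p this is close to n * p, i.e. roughly linear in n.
    """
    return 1.0 - (1.0 - p) ** n


def p_mirror_loses_data(p: float, n: int) -> float:
    """Probability that all n copies of an n-way mirror fail (p ** n)."""
    return p ** n


if __name__ == "__main__":
    p = 0.03  # assumed per-drive failure probability, illustrative only
    for n in (1, 2, 3, 4):
        print(f"{n} drive(s): "
              f"any failure {p_any_drive_fails(p, n):.6f}, "
              f"mirror data loss {p_mirror_loses_data(p, n):.8f}")
```

Under these assumptions the chance of seeing some drive fail grows 
roughly in proportion to the number of drives, while the chance of all 
mirrored copies failing together shrinks geometrically.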

A paranoid RAID 1 controller will read both disks and compare the 
results, informing you if they differ. For performance reasons, some 
RAID controllers may skip this step.
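
The read-and-compare idea can be sketched in a few lines of Python. 
This is not how any particular controller implements it; the block 
size, the function name, and the use of in-memory buffers standing in 
for the two disks are all assumptions made for illustration.

```python
import io
from typing import BinaryIO, List

BLOCK_SIZE = 4096  # assumed block size, illustrative only


def find_mismatched_blocks(a: BinaryIO, b: BinaryIO,
                           block_size: int = BLOCK_SIZE) -> List[int]:
    """Return the indices of blocks that differ between two mirror copies."""
    mismatches = []
    index = 0
    while True:
        block_a = a.read(block_size)
        block_b = b.read(block_size)
        if not block_a and not block_b:
            break  # both copies exhausted
        if block_a != block_b:
            mismatches.append(index)
        index += 1
    return mismatches


if __name__ == "__main__":
    good = bytes(3 * BLOCK_SIZE)      # three zero-filled blocks
    bad = bytearray(good)
    bad[BLOCK_SIZE + 10] ^= 0xFF      # corrupt one byte in the second block
    print(find_mismatched_blocks(io.BytesIO(good), io.BytesIO(bytes(bad))))
```

A comparison like this only reveals that the copies differ; by itself 
it cannot say which copy is the correct one.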

The other benefit of RAID 1 and similar schemes is that the backup, if 
you'll permit me to call it that, is generated on the fly rather than 
at a given instant in time.

>     It might seem this should not hold for Raid 1 (mirroring).
>     It does.

I would be very interested to see your methodology for coming to this 
conclusion. If your application or operating system is corrupting data 
then you've got bigger issues to worry about than a drive dying.

> If you are mirroring the damage is reflected on all drives immediately.  

RAID is not a backup solution; it is an uptime solution and a risk 
mitigation solution.

> The fallacy younger admins fall into (well I did and I assume
> others do as well) is that since big sites use Raid to
> get good reliability, their small site should also use Raid
> for the same reason.  While it seems reasonable, it is wrong.

The mistake is assuming that RAID is there to provide backup. It is 
not, and any well-read administrator knows that.

> The other thing Raid can give you is speed, but again if you
> are a small site with a high read/write ratio spending the
> money on RAM will give you more speed than Raid - without
> the reliability concerns.

Quite simply, you cannot talk about data reliability and integrity 
without talking about RAID. Thirty minutes with a notepad and a pen, 
using high school mathematics, will illustrate that. It's not the only 
solution, but it does make up part of the solution.

To talk of serious data integrity and storage on a single spindle is 
irresponsible at best.

-- 
David Jericho
Senior Systems Administrator, Tucana Technologies



