[H-GEN] Linux file server
David Jericho
davidj at tucanatech.com
Fri Jul 2 00:46:24 EDT 2004
Russell Stuart wrote:
> 1. Linux software Raid is faster than hardware Raid,
> providing you have the spare CPU cycles. This is
> the consensus on the Dell PowerEdge list.
It entirely depends on the number of drives in the array and the bus and
configuration of the array, and of course the RAID controller in
question. As my previous post indicated, I've got 80% utilisation
writing using read() at 170MBytes/s on a 1.2TB Ext3 file system
optimised for random I/O using a SCSI-style controller.
If your system does heavy disk I/O, I highly recommend a SCSI-style
controller, or a second processor.
> 2. Software Raid is more reliable that hardware Raid. This
> is primarily because of bugs in Raid cards and firmware
> bugs/incompatibilities in the SCSI disk drives.
I beg to differ here. If the research has been done properly, using
vendor recommended hardware, hardware RAID is as good as, if not better
than software RAID. I have had corrupted software RAID arrays, as well
as corrupted hardware RAID arrays.
> 3. Raid is less reliable than a single disk spindle. In
> other words, if all you data fits on a single drive, then
> you will better off reliability wise to just one drive
> instead of Raid.
You're way off base here, even with the qualification of your original
post. The likelihood of a single drive death in a RAID array increases
roughly linearly with the number of drives in the array. The likelihood
of a single drive death resulting in lost or corrupted data on a
properly designed RAID array, however, decreases faster than linearly.
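As a rough illustration of that arithmetic, here is a minimal sketch in
Python. It assumes drives fail independently with the same probability p
over some fixed period, and it ignores rebuild windows and correlated
failures. The value of p and the function names are made up for the
example, not measured figures.

```python
def p_any_drive_fails(p: float, n: int) -> float:
    """Probability that at least one of n independent drives fails."""
    return 1.0 - (1.0 - p) ** n


def p_mirror_loses_data(p: float, n: int) -> float:
    """Probability that every drive in an n-way mirror fails (data lost)."""
    return p ** n


if __name__ == "__main__":
    p = 0.03  # assumed per-drive failure probability, illustration only
    for n in (1, 2, 3, 4):
        # Columns: drives, chance of seeing some failure, chance of data loss
        print(n, p_any_drive_fails(p, n), p_mirror_loses_data(p, n))
```

For small p the first figure grows close to n * p, while the second
shrinks geometrically with each extra mirror. In practice a failed drive
gets replaced, so the real exposure is a second failure during the
rebuild, which this sketch does not model.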
A paranoid RAID 1 controller will read both disks, and compare the
results informing you if they differ. For performance reasons, some RAID
controllers may skip this step.
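The compare-on-read idea can be sketched in user space as follows. This
is only an illustration of the principle, not how any particular
controller or the Linux md driver implements it. The block size and the
function name are assumptions, and in-memory buffers stand in for the
two halves of the mirror.

```python
import io
from typing import BinaryIO, Iterator

BLOCK_SIZE = 4096  # assumed block size, illustration only


def mismatched_blocks(a: BinaryIO, b: BinaryIO,
                      block_size: int = BLOCK_SIZE) -> Iterator[int]:
    """Yield the index of every block where the two mirror halves differ.

    Assumes read() returns full blocks until end of data, which holds for
    in-memory buffers and regular files but not for every device.
    """
    index = 0
    while True:
        block_a = a.read(block_size)
        block_b = b.read(block_size)
        if not block_a and not block_b:
            return  # both halves exhausted
        if block_a != block_b:
            yield index
        index += 1


if __name__ == "__main__":
    good = bytes(2 * BLOCK_SIZE)
    bad = bytearray(good)
    bad[5000] = 1  # flip one byte in the second block of one half
    print(list(mismatched_blocks(io.BytesIO(good), io.BytesIO(bytes(bad)))))
```

Note that a mismatch only tells you the halves disagree. Without a
checksum or a third copy there is no way to tell which half is correct.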
The other benefit of RAID 1 and similar schemes is that the backup, if
you'll permit me to call it that, is generated on the fly, rather than
at a given instant in time.
> It might seem this does should not hold for Raid 1 (mirroring).
> It does.
I would be very interested to see your methodology for coming to this
conclusion. If your application or operating system is corrupting data
then you've got bigger issues to worry about than a drive dying.
> If you are mirroring the damage is reflected on all drives immediately.
RAID is not a backup solution, it is an uptime solution, and a risk
mitigation solution.
> The fallacy younger admins fall into (well I did and I assume
> others do as well) is that since big sites use Raid to
> get good reliability, their small site should also use Raid
> for the same reason. While it seems reasonable, it is wrong.
The mistake made is assuming that RAID is there to provide backup. It is
not, and any well-read administrator knows that.
> The other thing Raid can give you is speed, but again if you
> are a small site with a high read/write ratio spending the
> money on RAM will give you more speed than Raid - without
> the reliability concerns.
Quite simply you cannot talk about data reliability and integrity
without talking about RAID. 30 minutes with a notepad and a pen using
high school mathematics will illustrate that. It's not the only
solution, but it does make up part of the solution.
To talk of serious data integrity and storage on a single spindle is
irresponsible at best.
--
David Jericho
Senior Systems Administrator, Tucana Technologies