[H-GEN] Linux file server

Russell Stuart russell at stuart.id.au
Thu Jul 1 21:38:45 EDT 2004


Tim Browne wrote:
> I have easy access to both Promise and Adaptec controllers.
> and
> Sony, LG, Mitsubishi DVDs.

My experience is that:

1.  Linux software Raid is faster than hardware Raid,
     provided you have the spare CPU cycles.  This is
     the consensus on the Dell PowerEdge list.  Dell
     seems reluctant to publicise this, probably because
     it means they will sell fewer high-end Raid cards.
     Be that as it may, when pressed, the verdict of the
     Dell engineers seems to be unanimous.  Qualification:
     I have not done the tests myself.

2.  Software Raid is more reliable than hardware Raid.  This
     is primarily because of bugs in Raid cards and firmware
     bugs/incompatibilities in the SCSI disk drives.  The
     main trouble area seems to be subtle incompatibilities
     between drives and card giving rise to timing and protocol
     errors.  The errors have, in my case anyway, invariably
     led to catastrophic loss of all data in the disk array.  This
     may seem odd, but it arises because there are several
     manufacturers of Raid cards, and there aren't that many
     of each type sold.  The Linux Raid software gets used
     by a lot of people.  The HDD drivers it depends on get
     thrashed on every Linux box.

3.  Raid is less reliable than a single disk spindle.  In
     other words, if all your data fits on a single drive, then
     you will be better off reliability-wise using just one drive
     instead of Raid.  I do not know if this holds true for
     software Raid - I haven't used it enough.  It is definitely
     true for hardware Raid.  Every Raid array I have administered
     has failed (as in required me to go to a backup).  They have
     always been on high-end servers, of course, and their
     failure has caused a great deal of pain.  Sometimes the
     failure is due to hardware (cards / cables / drives / power
     supplies), but more often it is due to the aforementioned bugs.

     Note the "single spindle" qualification.  If you have to go
     to multiple spindles for size or speed reasons, then
     the outcomes start to favour Raid.  If you have a large
     number of spindles you won't be able to survive without
     Raid - I have seen sites that average one drive failure
     per week, because of the number of drives they use.
     Without Raid the downtime would be simply unacceptable.

     It might seem this should not hold for Raid 1 (mirroring).
     It does.  Another way to use the second drive is as a hot
     spare - doing an image backup each night after checking
     that all your data looks consistent.  This is more reliable
     because data is damaged more often by applications (bugs
     corrupting files) and users (e.g. deleting the wrong file) than
     by hardware failures.  If you are mirroring, the damage is
     reflected on all drives immediately.  If you do a nightly
     backup, you have a reasonable chance of recovering the data
     while the system is online, without messing around with backup
     media.
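
A rough sketch of the nightly copy to the second drive might look
like the following.  It is only an illustration: it does a file-level
copy rather than a true image backup, and the function name
nightly_copy, the looks_consistent check and any paths passed in are
made-up placeholders, not anything I actually run.

```python
import os
import shutil


def nightly_copy(source_dir, backup_dir, looks_consistent):
    """Refresh backup_dir from source_dir, but only if the data passes a check.

    looks_consistent is a caller-supplied (hypothetical) function that takes
    the source directory and returns True when the live data looks sane.
    Returns True if the backup was refreshed, False if it was left alone.
    """
    backup_dir = os.path.normpath(backup_dir)

    if not looks_consistent(source_dir):
        # Keep last night's good copy rather than overwrite it with damage.
        return False

    # Copy into a staging directory first, so a failed copy cannot
    # destroy the existing backup.
    staging = backup_dir + ".new"
    if os.path.exists(staging):
        shutil.rmtree(staging)
    shutil.copytree(source_dir, staging)

    # Swap the fresh copy into place.
    if os.path.exists(backup_dir):
        shutil.rmtree(backup_dir)
    os.rename(staging, backup_dir)
    return True
```

The point of the check is the difference from mirroring: if the data
looks damaged, the copy is skipped and the previous night's files
stay on the second drive.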

The fallacy younger admins fall into (well I did and I assume
others do as well) is that since big sites use Raid to
get good reliability, their small site should also use Raid
for the same reason.  While it seems reasonable, it is wrong.
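
A back-of-envelope sketch of why the number of drives changes the
picture.  The 3% annual failure rate below is an assumed figure for
illustration only, not a measurement, and the model counts only
independent drive failures.  It says nothing about the card and
firmware bugs described above, which is exactly what it leaves out
for a small site.

```python
def expected_failures_per_week(n_drives, annual_failure_rate):
    """Expected number of drive failures per week across the whole site."""
    return n_drives * annual_failure_rate / 52.0


def prob_any_failure_in_year(n_drives, annual_failure_rate):
    """Chance that at least one drive fails in a year.

    Assumes each drive fails independently with the same annual rate.
    """
    return 1.0 - (1.0 - annual_failure_rate) ** n_drives


if __name__ == "__main__":
    afr = 0.03  # assumed annual failure rate per drive, illustration only
    for n in (1, 2, 1700):
        per_week = expected_failures_per_week(n, afr)
        any_in_year = prob_any_failure_in_year(n, afr)
        print(n, "drives:", round(per_week, 3), "failures/week,",
              round(any_in_year, 3), "chance of any failure in a year")
```

Under that assumed rate it takes well over a thousand drives to see
about one failure a week, while a one- or two-drive site will rarely
see a drive die at all, so the other failure modes dominate.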

The other thing Raid can give you is speed, but again, if you
are a small site with a high read/write ratio, spending the
money on RAM will give you more speed than Raid - without
the reliability concerns.

--
Russell



