[H-GEN] Linux file server

Russell Stuart russell at stuart.id.au
Thu Jul 1 23:13:36 EDT 2004


Harry Phillips wrote:
> Russell Stuart wrote:
>>     Another way to use the second drive is as a hot
>>     spare - doing an image backup each night after checking
>>     that all your data looks consistent.
> 
> I have been considering this myself, so what would you use to image one 
> drive to another each night? dd?

Dd is certainly one way.  I did use it for a while as it is
the simplest.  But it does require an identically partitioned
drive, and it faithfully copies across all hidden file system
corruption.  The method I now use is to mkfs the drive each
night, then pax[1] the file system across, and finally run
grub to make the file system bootable.  You end up with a
nice, clean, unfragmented partition.  Should the primary
drive fail catastrophically, you can just swap the hot
spare in.  Indeed, swapping the hot spare in occasionally
is a wonderful way to test your backup, and defragment
your drive at the same time.

I use a shell script to do this.  It is attached.  It
won't be immediately useful to you as it requires a
complex configuration file (complex because it configures
a lot of other things) - but it should be a start.
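
A minimal sketch of the mkfs / pax / grub sequence described
above.  It is not the attached script.  The device names, the
mount point, the ext3 file system type and the grub-install
invocation are all assumptions made for illustration, and by
default the commands are only printed rather than run.

```shell
#!/bin/sh
# Sketch of a nightly hot-spare refresh: mkfs, copy with pax, reinstall grub.
# SPARE_DEV, SPARE_DISK, SPARE_MNT and the ext3 choice are hypothetical.
# With DRY_RUN=1 (the default) each command is printed instead of executed.

SPARE_DEV=${SPARE_DEV:-/dev/sdb1}     # partition on the spare drive (assumed)
SPARE_DISK=${SPARE_DISK:-/dev/sdb}    # the spare drive itself (assumed)
SPARE_MNT=${SPARE_MNT:-/mnt/spare}    # where the spare is mounted (assumed)
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

refresh_spare() {
    # Start from a fresh, empty file system on the spare.
    run mkfs -t ext3 "$SPARE_DEV" &&
    run mount "$SPARE_DEV" "$SPARE_MNT" &&
    # pax copy mode: -rw copies the hierarchy, -pe preserves ownership,
    # modes and times, -X keeps pax from crossing onto other devices
    # (including the spare mounted under /).
    ( cd / && run pax -rw -pe -X . "$SPARE_MNT" ) &&
    # Make the spare bootable; the exact grub step depends on the
    # grub version and is an assumption here.
    run grub-install --root-directory="$SPARE_MNT" "$SPARE_DISK" &&
    run umount "$SPARE_MNT"
}

refresh_spare
```

Running it with DRY_RUN=0 as root would really reformat
SPARE_DEV, so the defaults want checking against the actual
disk layout first.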

The failure mode Paul mentioned (the primary failing while
being copied to the secondary) has never happened to me, but
it is a concern.  Remember the discussion was about mirroring
versus taking an image backup each night.  In terms of
recovery, it is more likely application software will
damage your data (thus rendering a mirror useless) than
the hardware failing.  It is even less likely the hardware
will fail while you are doing the image copy.  Both failure
modes are possible of course, and so neither mirroring nor
a hot spare is a replacement for an off-line backup (tape or
whatever).

The rsync technique that Brendan recommended has its own
advantages and disadvantages.  I do use it to do a backup
of the main server to an off-site machine each night via
the internet.  But rsync comes with its own set of bugs -
most of these have to do with deciding that a file hasn't
been modified by just looking at the inode (times, length,
etc.).  I haven't had a serious failure with rsync when I
use the --ignore-times flag, but that wipes out the speed
advantage rsync has.  Rsync also then smashes all
access times on the system being backed up - Tridge
seems to have some religious objection to backup programs
modifying the file system (and apparently he deems
restoring the file access time as modifying the file
system).  I use the access time as a means of finding
junk files, so I wasn't happy with that.  It does
overcome Paul's objection however.
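
Two points from the paragraph above, sketched in shell.  The
host name, the paths and the 180-day threshold are hypothetical,
and the function names are made up for this example.  The first
function only prints the rsync command line rather than running
it.

```shell
#!/bin/sh
# Sketch only: paths, host name and thresholds are illustrative assumptions.

# Print an rsync command that does not skip files whose size and
# modification time match (--ignore-times), so every file is re-examined.
# $1 = source directory, $2 = destination (for example host:/path).
offsite_cmd() {
    printf '%s\n' "rsync -a --ignore-times $1/ $2/"
}

# List regular files under $1 whose access time is more than $2 days old.
# This is only meaningful if nothing (such as a backup run) has read
# every file and so reset the access times.
stale_files() {
    find "$1" -type f -atime +"$2" -print
}

offsite_cmd /srv/data backup.example.org:/srv/backup
```

A call such as stale_files /home 180 would then list candidate
junk files, provided the access times have been left alone.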

--
Russell

[1] I use pax instead of cpio because cpio has a horrible bug,
     at least in the -p mode.  Very occasionally (once a year
     for me) it gets confused about the boundaries between the
     files.  To understand the problem, you have to know that
     internally the -p mode of cpio converts the files into a
     bit stream (just like cpio -o) then unpacks it (just like
     cpio -i).  The file headers get inserted into the stream,
     between each file.  What happens is that cpio loses track
     of where the boundary between the files is, and shifts
     the file header along a few bytes in the stream.
     Effectively that removes a few bytes from the end of each
     file in the stream and tacks them onto the next file.
     The result is about as helpful as corruption on your
     SCSI bus.

     I have never been able to reproduce the bug reliably, nor
     has staring at the source code given me any insights.  Pax
     did have a few bugs and did some things in ways I didn't
     like.  But I was able to fix pax.  I did submit the patches
     back to OpenBSD.  They liked the changes in pax's
     functionality, and the bug fixes, but didn't like the fact
     that I made the source compatible with both BSD's libc and
     GNU's libc. I didn't bother chasing up to see whether they
     got applied to CVS.

     By the by, GNU tar doesn't seem to have the same bugs GNU
     cpio does.  It has always worked for me.  But I prefer to
     use find/grep/sed to select the files I want to back up,
     rather than relying on tar's complex command line options.
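
A minimal sketch of the find/grep selection style mentioned
above, feeding pax in copy mode.  The exclusion patterns and
the function names are hypothetical, and file names containing
newlines are not handled.

```shell
#!/bin/sh
# Sketch only: the patterns below are illustrative, not a recommended set.

# Print the paths under $1 (relative, starting with ./) to be copied,
# staying on one file system and dropping a few example patterns.
# The directories ./tmp and ./proc themselves are kept, their contents are not.
select_files() {
    ( cd "$1" && find . -xdev -print |
        grep -v -e '^\./tmp/' -e '^\./proc/' -e '\.bak$' )
}

# Copy the selected paths from $1 into the existing directory $2.
# With no file operands, pax -rw reads the list of paths from standard input.
copy_selected() {
    dest=$(cd "$2" && pwd) || return 1
    ( cd "$1" && select_files . | pax -rw -pe "$dest" )
}
```

For example, copy_selected / /mnt/spare would copy the root
file system to a spare mounted at the assumed /mnt/spare, with
sed slotted into the pipeline if path names need editing.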

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: backup-disk.sh
URL: <http://lists.humbug.org.au/pipermail/general/attachments/20040702/4bb78e4e/attachment.ksh>

