[H-SASIG] rdiff-image-cron /etc/rdiff-image/rdiff-image.conf FAILED!
Russell Stuart
russell-humbug at stuart.id.au
Sun Jan 31 19:05:57 EST 2010
On Sun, 2010-01-31 at 17:44 -0500, root wrote:
> The backup attempted by the command:
> /usr/bin/rdiff-image-cron /etc/rdiff-image/rdiff-image.conf
> failed. Sorry.
I am looking into this. It smells like a bug to me.
However, just so you all understand why the message is appearing at all,
here is an explanation of why it was added.
So far, every time we have moved excalibur, we have managed to balls up
the backup. It wasn't that the backups weren't running - they were running
just fine. It was that they were running just fine on both the old VM
host and the new one. The result was a mishmash of backups on S3 from
both machines. Some were valid (i.e. from the new machine), some were
not, and there was absolutely no way to know from just looking at S3.
To detect this I added a check. Every time the backup runs it records
the state of S3 after it has run, and then on the next run verifies it
hasn't changed. If it has, you get a message like the one below. The
check is strong: it includes MD5 sums of the files on S3.
The idea is if another copy of this program is running, this check will
pick it up and send a warning email. Hopefully this will get the
problem fixed quickly. As it was, we had two hosts stomping all over the
backups for days before I noticed. And even that was more a matter of
luck than diligence on my part.
> rdiff-image-s3.py: S3 changed by someone else - 'humbug-excalibur-backup_secret/20100130-133904_20100129-063903_secret.rdiff.gz.gpg' added.
> rdiff-image-s3.py: (continuing with backup.)
> rdiff-image-s3.py: S3 changed by someone else - 'humbug-excalibur-backup_base/20100130-183903_20100129-063903_base.rdiff.gz' added.
> rdiff-image-s3.py: (continuing with backup.)
> rdiff-image-cron: Backup completed, but S3 modified by someone else.
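For anyone curious how a check like this can work, here is a rough sketch in
Python. It is not the code in rdiff-image-s3.py: the function names, the JSON
state file, and the idea of passing the bucket listing in as a plain
{key: md5} dictionary are all assumptions made for illustration. The listing
itself would come from whatever S3 client is in use.

```python
# Hypothetical sketch only; none of these names come from rdiff-image-s3.py.
import json
from pathlib import Path


def diff_listing(recorded, current):
    """Describe how two {s3_key: md5} mappings differ."""
    changes = []
    for key in sorted(current.keys() - recorded.keys()):
        changes.append(f"'{key}' added")
    for key in sorted(recorded.keys() - current.keys()):
        changes.append(f"'{key}' removed")
    for key in sorted(recorded.keys() & current.keys()):
        if recorded[key] != current[key]:
            changes.append(f"'{key}' modified")
    return changes


def check_unchanged(state_file, current):
    """Compare the current listing with the one saved by the previous run."""
    path = Path(state_file)
    if not path.exists():
        return []  # first run: nothing recorded yet, so nothing to compare
    recorded = json.loads(path.read_text())
    return diff_listing(recorded, current)


def record_state(state_file, current):
    """Save the listing as it stands after this run has finished uploading."""
    Path(state_file).write_text(json.dumps(current, sort_keys=True))
```

A run would call check_unchanged() before uploading, warn about anything it
returns, carry on with the backup, and then call record_state() with the new
listing. Comparing MD5 sums as well as names means a file overwritten in
place by the other host shows up as modified rather than going unnoticed.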