[H-GEN] file recovery tools
David Makepeace
zzdmakep at uqconnect.net
Tue Dec 30 07:45:05 EST 2003
On Tue, 30 Dec 2003, Peter Arnold wrote:
> For those interested I've had a *reasonably* successful recovery of most
> of the files. I'm sure I could have done this better somehow and I'd be
> interested in people's comments on how they'd do some of the steps but
> for now I've managed.
>
> Basically the steps are as follows
>
> I placed the compact flash card in a CF reader and attached it to my
> linux laptop
> From there I could dd the contents of the CF card using
> "dd if=/dev/sda1 of=cflash.image"
> I used ghex to look through the file and a known working jpeg to confirm
> start and end hex strings as FFD8 and FFD9 respectively
That's usually the correct approach.
When you first notice that a FAT filesystem is corrupted it is probably
the FAT itself (i.e. the File Allocation Table) which is corrupted.
The FAT is a data structure stored near the start of the filesystem that
determines which blocks belong to which files and in what order. It is an
array of entries, one per cluster in the filesystem, where each entry
specifies the next cluster (i.e. the one that follows the current cluster)
in the chain of clusters that make up the content of a file. ("cluster"
refers to a logical block of the file system and is one or more physical
disk blocks.) That means files don't have to consist of consecutive
blocks - i.e. they can be "fragmented". BTW, the names of the different
types of FAT filesystem, i.e. FAT12, FAT16, etc., refer to the size of the
FAT entries in bits.
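
To make the chain idea concrete, here is a minimal sketch in C of walking one file's cluster chain through a FAT16 table. It assumes the table has already been read into memory as an array of 16-bit values in host byte order; the function name fat16_chain and its parameters are made up for illustration and are not taken from any real tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FAT16 entries of 0xFFF8 and above mark the last cluster of a file. */
#define FAT16_EOC_MIN 0xFFF8u

/*
 * Walk the cluster chain that starts at `first` in an in-memory FAT16
 * table `fat` holding `nentries` entries. Cluster numbers are stored
 * in `out` (at most `max` of them). Returns the chain length, or -1 if
 * the chain leaves the table, hits a free/reserved entry, or is longer
 * than `max`, which is roughly what a corrupted FAT looks like.
 */
int fat16_chain(const uint16_t *fat, size_t nentries, uint16_t first,
                uint16_t *out, size_t max)
{
    size_t n = 0;
    uint16_t c = first;

    assert(fat != NULL && out != NULL);
    while (c < FAT16_EOC_MIN) {
        /* Clusters 0 and 1 are reserved; data clusters start at 2. */
        if (c < 2 || c >= nentries || n >= max)
            return -1;
        out[n++] = c;
        c = fat[c];     /* the entry for cluster c names the next one */
    }
    return (int)n;
}
```

With a table in which entry 2 holds 3, entry 3 holds 5, entry 5 holds 4 and entry 4 holds an end-of-chain value, a file starting at cluster 2 occupies clusters 2, 3, 5, 4 in that order, i.e. it is fragmented. Once those entries are damaged, that ordering is lost.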
The problem is that once the FAT is corrupted there is not much other
information left that can be used to reconstruct the filesystem. (There
is a "backup FAT" - but that's another story.) As a result, the best
approach is usually to ignore the filesystem structure entirely and search
for the content. Fortunately you can almost always assume that files in a
FAT filesystem take up consecutive blocks if the files were added to a
freshly formatted filesystem and no files have been deleted before adding
additional files.
For JPEGs there is no need to look for the end marker either as most
programs that manipulate JPEGs ignore anything which follows the end
marker anyway. You can get rid of the trailing garbage by loading the
JPEGs into a program such as xv and saving each one out to a new file.
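
If you would rather trim the garbage without re-encoding, the end marker can be located by stepping through the JPEG marker segments instead of searching for the first FFD9. A plain search tends to stop too early, because camera files often carry a thumbnail inside the first segment with its own FFD8/FFD9 pair. The following is only a sketch under those assumptions; jpeg_length is a made-up name and the code has not been run against real camera files.

```c
#include <stddef.h>

/*
 * Return the length in bytes of the JPEG stream at the start of buf,
 * including the closing FFD9, or 0 if no complete stream is found
 * within len bytes.
 */
size_t jpeg_length(const unsigned char *buf, size_t len)
{
    size_t i = 2;

    if (len < 4 || buf[0] != 0xFF || buf[1] != 0xD8)
        return 0;                           /* no start marker */

    while (i + 1 < len) {
        unsigned char m;
        size_t seglen;

        if (buf[i] != 0xFF)
            return 0;                       /* expected a marker here */
        m = buf[i + 1];
        if (m == 0xFF) { i++; continue; }   /* fill byte before marker */
        if (m == 0xD9)
            return i + 2;                   /* end marker found */
        if (m == 0x01 || (m >= 0xD0 && m <= 0xD7)) {
            i += 2;                         /* markers with no payload */
            continue;
        }
        /* Other markers carry a 2-byte big-endian length that counts
         * the length field itself but not the marker. */
        if (i + 3 >= len)
            return 0;
        seglen = ((size_t)buf[i + 2] << 8) | buf[i + 3];
        if (seglen < 2)
            return 0;
        i += 2 + seglen;

        if (m == 0xDA) {
            /* Start of scan: compressed data follows. Inside it, FF00
             * is an escaped data byte and FFD0..FFD7 are restart
             * markers, so skip those and stop at any other marker. */
            while (i + 1 < len) {
                unsigned char n = buf[i + 1];
                if (buf[i] == 0xFF && n != 0x00 && n != 0xFF &&
                    !(n >= 0xD0 && n <= 0xD7))
                    break;
                i++;
            }
        }
    }
    return 0;                               /* ran out of data */
}
```

A recovered file could then be truncated to the returned length (when it is non-zero), which leaves the image data itself untouched.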
Almost two years ago a colleague of mine was in a panic when he corrupted
the memory stick he used in his Sony camera - a DV camcorder with still
capability. (Sony memory sticks use FAT as well.) I plugged his USB
memory stick reader into my Linux notebook and used dd to create an image
file. I then quickly knocked up a C program to split the JPEGs at the
start markers - see source below (yeah, it was a one-off program - sorry
about the lack of comments, etc.). The start markers were always at the
start of a 512 byte block since they were always at the start of the
files. My colleague was greatly relieved when I managed to recover all
his photos.
> Actually the Canon files seem to start with FFD8FFE1 which was a
> better start string to search for as FFD8 was also in the content.
Sony files also appear to start with FFD8FFE1.
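
For reference, FFD8 is the JPEG start-of-image marker and FFE1 is the APP1 marker, which is where Exif data lives. JFIF files written by other software usually begin with FFD8FFE0 (APP0) instead. A slightly more tolerant start test, assuming you want to accept any APPn marker after the start marker, might look like the sketch below; looks_like_jpeg_start is a made-up name, and a looser match means more false positives on other cards.

```c
#include <stddef.h>

/*
 * Return 1 if the block starts with the JPEG start marker (FFD8)
 * immediately followed by an APPn marker (FFE0..FFEF), else 0.
 */
int looks_like_jpeg_start(const unsigned char *block, size_t len)
{
    return len >= 4 &&
           block[0] == 0xFF && block[1] == 0xD8 &&
           block[2] == 0xFF && (block[3] & 0xF0) == 0xE0;
}
```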
> CF cards on Canon store as FAT16. I'm not sure if that is peculiar to
> Canon or standard.
I expect most if not all would use FAT16.
> Some of the files restored were actually deleted by me on the camera and
> some of these were partially overwritten. I'm not sure how this sits
> with the sequential saving of files but I know I didn't delete any after
> a known OK state and the subsequent addition/corruption of files.
Once you delete a file, any additional files added may be fragmented.
> I'm not sure why turning off the camera before it had finished saving
> corrupted other files that *were* ok. Comments?
I guess that the files that *were* ok were just stored fragmented (due to
deletions) and so couldn't be recovered using this technique.
Cheers,
David.
-----------------------------------------------------------------
David Makepeace | Plugged In Software
mailto:david at PIsoftware.com | Telephone: +61 7 3876 2188
http://www.PIsoftware.com | Facsimile: +61 7 3876 4899
-----------------------------------------------------------------
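#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

/* Split a raw card image read from stdin into recoveredNNN.jpg files,
 * starting a new file at each 512-byte block that begins with FFD8FFE1. */
int main(void)
{
    static const unsigned char marker[4] = { 0xff, 0xd8, 0xff, 0xe1 };
    unsigned char buf[512];
    char fname[80];
    int fd = -1;
    int fnum = 0;

    /* A trailing partial block is ignored. */
    while (read(0, buf, sizeof buf) == (ssize_t)sizeof buf) {
        /* Compare bytes rather than casting to unsigned, so the test
         * does not depend on host byte order or alignment. */
        if (memcmp(buf, marker, sizeof marker) == 0) {
            /* Start marker: close the previous file, open the next. */
            if (fd != -1) close(fd);
            snprintf(fname, sizeof fname, "recovered%03d.jpg", fnum++);
            fd = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0666);
            if (fd == -1) {
                perror(fname);
                exit(1);
            }
        }
        /* Blocks before the first start marker are discarded. */
        if (fd != -1 && write(fd, buf, sizeof buf) != (ssize_t)sizeof buf) {
            perror(fname);
            exit(1);
        }
    }
    if (fd != -1) close(fd);
    return 0;
}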