[H-GEN] any faster way?
Anthony Towns
aj at azure.humbug.org.au
Tue Jun 16 08:48:09 EDT 1998
On Tue, Jun 16, 1998 at 10:58:15AM +0000, Michael Dooley wrote:
> I've got database (read 'directory') of around 4000 files that
> I'm trying to tidy up (remove files with <5 lines) and see which files
> I've deleted since I created my master database list. This is what I
> came up with, but the find is s-l-o-w. Is there another faster
> approach?
Let's assume you've got the following layout:
$DBDIR/
    file1
    file2
    reallylongfile1
    subdir/
        file3
        reallylongfile2
        reallylongfile3
Let's further assume you have a (sorted) list of expected databases, called
$DB:
file1
file2
file3
file4
reallylongfile1
reallylongfile2
And you want the program to tell you that:
S: file1
S: file2
S: subdir/file3
M: [[ file4 ]]
E: subdir/reallylongfile3
S -- short filename
M -- missing database [unknown directory]
E -- unknown file
We can thus do something like:
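#!/bin/sh
# Tag each file found with "2" and each expected name with "1", keyed on
# the basename, then sort so an expected name lands just before its file.
# LC_ALL=C forces plain byte order; some locales ignore spaces when sorting.
(find "$DBDIR" -type f | awk -F/ '{ print $NF " 2 " $0 }'
 sed 's,$, 1,' "$DB"
) | LC_ALL=C sort | tee ~/test.out |
awk 'BEGIN { CURR="" }
     { if ( $2 == "1" ) {
           # New expected name; the previous one never matched a file.
           if ( CURR != "" ) print "M: [[ " CURR " ]]";
           CURR=$1;
       } else if ( CURR == $1 ) {
           # File matches the pending expected name.
           CURR = "";
           if ( length($1) <= 5 ) print "S: " $3;
       } else {
           # File with no matching expected name.
           if ( CURR != "" ) print "M: [[ " CURR " ]]";
           CURR="";
           print "E: " $3;
       }
     }
     # Without this, a missing name that sorts last is never reported.
     END { if ( CURR != "" ) print "M: [[ " CURR " ]]" }'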
cat <<EOF
S -- short filename
M -- missing file [unknown directory]
E -- extra file
EOF
Example usage is:
$ DB=mydbfile DBDIR=~/wherever ./test.sh
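To try this without touching a real database directory, the sample layout from earlier can be rebuilt in a scratch directory. What follows is only a sketch: WORK and classify are made-up names, classify wraps the same tag, sort and scan idea (with an END rule so a trailing missing entry is reported, and with paths printed relative to the directory), and it assumes mktemp -d is available.

```shell
#!/bin/sh
# Scratch copy of the sample layout; WORK and classify are illustrative names.
WORK=$(mktemp -d)
DBDIR=$WORK/db
DB=$WORK/db.list
mkdir -p "$DBDIR/subdir"
touch "$DBDIR/file1" "$DBDIR/file2" "$DBDIR/reallylongfile1" \
      "$DBDIR/subdir/file3" "$DBDIR/subdir/reallylongfile2" \
      "$DBDIR/subdir/reallylongfile3"
printf '%s\n' file1 file2 file3 file4 reallylongfile1 reallylongfile2 > "$DB"

# classify DIR LIST: tag found files with 2 and expected names with 1,
# sort them together, then scan for short, missing and extra entries.
classify() {
    { (cd "$1" && find . -type f | sed 's,^\./,,') |
          awk -F/ '{ print $NF " 2 " $0 }'
      sed 's,$, 1,' "$2"
    } | LC_ALL=C sort |
    awk '{ if ( $2 == "1" ) {
               if ( CURR != "" ) print "M: [[ " CURR " ]]";
               CURR = $1;
           } else if ( CURR == $1 ) {
               CURR = "";
               if ( length($1) <= 5 ) print "S: " $3;
           } else {
               if ( CURR != "" ) print "M: [[ " CURR " ]]";
               CURR = "";
               print "E: " $3;
           }
         }
         END { if ( CURR != "" ) print "M: [[ " CURR " ]]" }'
}

RESULT=$(classify "$DBDIR" "$DB")
printf '%s\n' "$RESULT"
rm -rf "$WORK"
```

Run as is, this should print the same five S/M/E lines as the expected output listed earlier.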
Working out wtf the above does is left as an exercise to the interested
reader.
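For readers who would rather skip the exercise, a hint: both inputs are tagged (1 for a name expected from $DB, 2 for a file actually found) and sorted together, so each expected name lands directly before any file with the same name. A small sketch of that merged stream, using made-up entries (alpha, beta, gamma) instead of a real directory:

```shell
#!/bin/sh
# Hypothetical inputs: names expected in $DB, and paths as find would print them.
expected='alpha
beta'
found='db/alpha
db/sub/gamma'

# Tag expected names with 1 and found files with 2 (keyed on the basename),
# then sort so lines for the same name sit next to each other.
MERGED=$(
    { printf '%s\n' "$found" | awk -F/ '{ print $NF " 2 " $0 }'
      printf '%s\n' "$expected" | sed 's,$, 1,'
    } | LC_ALL=C sort
)
printf '%s\n' "$MERGED"
```

In the merged stream, a 1 line followed by a 2 line with the same name is a match (alpha), a 1 line with no partner is a missing file (beta), and a 2 line with no preceding 1 is an extra file (gamma). The final awk only has to remember the last unmatched 1 line to tell these cases apart.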
Cheers,
aj
--
Anthony Towns <aj at humbug.org.au> <http://azure.humbug.org.au/~aj/>
I don't speak for anyone save myself. PGP encrypted mail preferred.
``It's not a vision, or a fear. It's just a thought.''