[H-GEN] one for an awk guru

Anthony Towns aj at humbug.org.au
Wed Feb 4 03:03:05 EST 1998


-----BEGIN PGP SIGNED MESSAGE-----


>    nawk ' BEGIN {i=1;m=1}
>    $1 ~ /SSBOND/{cysteine[i] = $4;i++;cysteine[i] = $6;i++}
>    END {for (m in cysteine)
>                    printf "%s\n",cysteine[m]}
>    ' $1


#!/usr/bin/perl
while(<>) {                                  # $_ = next line
    if ( /^SSBOND\b/ ) {                     # is the 1st word SSBOND?
        push( @cysteine, (split ' ')[3,5] ); # push the 4th and 6th
    }
}
foreach (@cysteine) { print $_, "\n" }       # print out in order added
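
One caveat with the quoted awk script: "for (m in cysteine)" visits the
indices in an unspecified order, so the values need not come out in the
order they were stored. A minimal sketch of an order-preserving variant,
assuming a POSIX awk; the function name ssbond_in_order and the counter
n are made up for illustration, and the field numbers are taken from
the quoted script:

```shell
#!/bin/sh
# Hypothetical helper: collect fields 4 and 6 of every SSBOND record,
# then print them in the order they were stored.
ssbond_in_order() {
    awk '
        BEGIN { n = 0 }
        # Same match as the quoted script: first field contains SSBOND.
        $1 ~ /SSBOND/ { cysteine[++n] = $4; cysteine[++n] = $6 }
        # A counted loop walks the indices 1..n in order, unlike "for (m in ...)".
        END { for (k = 1; k <= n; k++) print cysteine[k] }
    ' "$@"
}
```

Called with a file name (ssbond_in_order file) or with no arguments to
read standard input.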


FWIW, the Perl version takes about 10 seconds on an 80,000-line file
(about 1.3MB) while the gawk version takes just under a minute.
They're about equal at 1,000 lines.  Curiously, gawk seems to end up
swapping, i.e. it's using about 20MB just to cope with a 1.3MB
file.

Rewriting both so they just print each value as they read it, rather
than storing it first, puts gawk at about 2 seconds for the 80k-line
file, while Perl still takes about 7 seconds.
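
A minimal sketch of what the print-as-you-go awk variant might look
like, assuming a POSIX awk; this is not the exact script that was timed,
and the function name ssbond_stream is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print fields 4 and 6 of each SSBOND record as soon
# as the record is read, so nothing is held in an array.
ssbond_stream() {
    awk '$1 ~ /SSBOND/ { print $4; print $6 }' "$@"
}
```

Since nothing is stored, memory use no longer grows with the input, and
the output order is simply the input order.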

Using awk, mawk, or nawk in place of gawk yields similar times for
the on-the-fly output, and equal or slightly better times than Perl
for the original program.

Cheers,
aj, who is learning a new language

- --
Anthony Towns <aj at humbug.org.au> <http://azure.humbug.org.au/~aj/>
I don't speak for anyone save myself. PGP encrypted mail preferred.

On Netscape GPLing their browser: ``How can you trust a browser that
ANYONE can hack? For the secure choice, choose Microsoft.''
	-- <oryx at pobox.com> in a comment on slashdot.org

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3ia
Charset: ascii
Comment: Key available at http://student.uq.edu.au/~s343676/aj_key.asc

iQCVAwUBNNggvORRvX9xctrtAQHKogP9E+bt3yYw/uV2L0odq8ld3ajkUplnXYX5
AsFF9lJvF3O5INH8bTfpa5XZjTCE4PlrlhwK3EBPvO+clDLNZsIC6x2i9bt2VqRo
pbLYw8prC2B6Fd2/j3sjH0Cn+zctPDUFnncXlpqkgt1983ccBgkdRpFWno6CZ45a
JZ6MRWBogZ0=
=LzRV
-----END PGP SIGNATURE-----
