[H-GEN] Re: One-liner hackery (was Re: YARWILP) (fwd)
Martin Pool
mbp at linuxcare.com.au
Sun Apr 30 07:58:26 EDT 2000
[ Humbug *General* list - semi-serious discussions about Humbug and ]
[ Unix-related topics. Please observe the list's charter. ]
I thought some people here would like this.
---------- Forwarded message ----------
Date: Sun, 30 Apr 2000 03:17:09 -0700 (PDT)
From: Rasmus Lerdorf <rasmus at linuxcare.com>
Subject: Re: One-liner hackery (was Re: YARWILP)
> perl -e'$i=0;$/="";while(<>){if(/^From/&&!(/^Status:\s+R/mi)){$i++}}print"$i unread message(s)\n"'
> - given a mbox-format file as an argument, prints the number of
> unread messages
This one doesn't look right to me. You should be checking for /^From /
rather than /^From/, and only a /^From / that follows a blank line should
count as the start of a message. If I run this on a large mbox where I
know there are 15231 unread messages, your perl one-liner reports 16174,
because it picks up lines that aren't real message separators and also
picks up stray "Status: " lines in the bodies of messages.
A correct and quicker awk script that does this is:
BEGIN{m=0;b=1}/^From /{h=b?1:0}/^Status: [^R]+/{h?m++:0}{b=!length();h=b?0:h}END{print m " unread message(s)"}
b is true when the previous line was blank
h is true when we are currently in the header of a message
m is the unread message counter
Here is the same script with the whitespace put back. It is much easier
to understand this way:
BEGIN { m=0; b=1 }
/^From / { h=b?1:0 }
/^Status: [^R]+/ { h?m++:0 }
{ b=!length(); h=b?0:h }
END { print m " unread message(s)" }
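For readers who find the awk dense, the same state machine can be sketched in Python. This is only an illustrative restatement, not part of the original message: the function name count_unread and the sample mailbox are made up for the example, and the variable names map onto m, b and h as described above.

```python
import re

def count_unread(lines):
    """Mirror the awk script: count header "Status: " lines whose
    flags do not begin with R. (Hypothetical helper for illustration.)"""
    unread = 0          # m: unread message counter
    prev_blank = True   # b: start of file counts as "after a blank line"
    in_header = False   # h: currently inside a message header
    for raw in lines:
        line = raw.rstrip("\n")
        # A "From " line starts a header only if the previous line was blank.
        if line.startswith("From "):
            in_header = prev_blank
        # Same pattern as the awk: "Status: " followed by a non-R character.
        if in_header and re.match(r"Status: [^R]+", line):
            unread += 1
        # A blank line ends the header and arms the next "From " separator.
        prev_blank = (line == "")
        if prev_blank:
            in_header = False
    return unread

# Made-up two-message mailbox: the first message is read (RO) and has
# "Fromage" and a stray "Status: O" in its body; the second is unread.
SAMPLE = """From alice@example.org Sun Apr 30 03:17:09 2000
From: alice@example.org
Status: RO

Fromage is not a separator.
Status: O

From bob@example.org Sun Apr 30 03:20:00 2000
From: bob@example.org
Status: O

Hello
"""

if __name__ == "__main__":
    print(count_unread(SAMPLE.splitlines()), "unread message(s)")
```

Like the awk version, this only counts messages that actually carry a "Status: " header. Whether a message with no "Status: " header at all should count as unread depends on the mail client, and neither version attempts to decide that.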
Awk deserves more respect than it gets.
-Rasmus
--
* This is list (humbug) general handled by majordomo at lists.humbug.org.au .
* Postings to this list are only accepted from subscribed addresses of
* lists 'general' or 'general-post'.