[H-GEN] Re: One-liner hackery (was Re: YARWILP) (fwd)

Martin Pool mbp at linuxcare.com.au
Sun Apr 30 07:58:26 EDT 2000

I thought some people here would like this.

---------- Forwarded message ----------
Date: Sun, 30 Apr 2000 03:17:09 -0700 (PDT)
From: Rasmus Lerdorf <rasmus at linuxcare.com>
Subject: Re: One-liner hackery (was Re: YARWILP)

> perl -e'$i=0;$/="";while(<>){if(/^From/&&!(/^Status:\s+R/mi)){$i++}}print"$i unread message(s)\n"'
> 	- given a mbox-format file as an argument, prints the number of
> 	  unread messages

This one doesn't look right to me.  You should be checking for /^From /
rather than /^From/, and only a /^From / that follows a blank line (or
starts the file) should count as the start of a message.  If I run this
on a large mbox where I know there are 15231 unread messages, your Perl
one-liner gives me 16174, because it counts things that aren't real
messages and also picks up stray "Status: " lines in the bodies of
messages.

A correct and quicker awk script that does this is:

BEGIN{m=0;b=1}/^From /{h=b?1:0}/^Status: [^R]+/{h?m++:0}{b=!length();h=b?0:h}END{print m " unread message(s)"}

b is true when the previous line was blank
h is true when we are currently in the header of a message
m is the unread message counter

Here is the same thing with the whitespace left in, which is much
easier to understand:

BEGIN               { m=0; b=1 }
/^From /            { h=b?1:0 }
/^Status: [^R]+/    { h?m++:0 }
                    { b=!length(); h=b?0:h }
END                 { print m " unread message(s)" } 
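To try the script out, here is a minimal sketch that runs it against a
tiny made-up mbox.  The file contents, the addresses and the
count_unread wrapper name are assumptions for illustration only.  The
awk body is the script above with two spellings changed for
portability: "if (h) m++" in place of "h?m++:0" and "length($0)" in
place of "length()".  The sample has one read message whose body
contains a line starting with "From " and a stray "Status: O" line,
followed by one unread message.

```shell
# Hypothetical sample mbox, written to a temporary file.
mbox=$(mktemp)
cat > "$mbox" <<'EOF'
From alice@example.com Sun Apr 30 03:17:09 2000
From: alice@example.com
Subject: first
Status: RO

Hello,
From what I can tell this is not a separator (no blank line before it).
Status: O in the body must be ignored.

From bob@example.com Sun Apr 30 04:00:00 2000
From: bob@example.com
Subject: second
Status: O

Still unread.
EOF

# count_unread is a hypothetical wrapper around the awk script.
#   b: previous line was blank   h: inside a message header
#   m: unread message counter
count_unread() {
    awk '
    BEGIN            { m = 0; b = 1 }
    /^From /         { h = b ? 1 : 0 }
    /^Status: [^R]+/ { if (h) m++ }
                     { b = !length($0); h = b ? 0 : h }
    END              { print m " unread message(s)" }
    ' "$1"
}

count_unread "$mbox"
```

On this sample the body lines are skipped because h is already back to
0 by the time they are read.  Note that, as written, the counter only
moves on a Status: header, so a message with no Status: line at all
would not be counted.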

Awk deserves more respect than it gets.


