[H-GEN] Reading signed emails using OE

Greg Black gjb at gbch.net
Wed Jun 4 02:18:48 EDT 2003


[ Humbug *General* list - semi-serious discussions about Humbug and     ]
[ Unix-related topics. Posts from non-subscribed addresses will vanish. ]

On 2003-06-04, Michael Anthon wrote:
> I was hesitant to use the regex as you wrote it (pretty much what I
> originally had in my first pass) since I've been stung before with the
> greediness factor in regex.  For example, suppose we had a header like
> 
> Content-Type: multipart/signed; boundary="Qo8f1a4rgWw9S/zY";
> micalg="pgp-sha1"
> 
> Will the first .* in your regex match Qo8f1a4rgWw9S/zY, Qo8f1a4rgWw9S/zY";
> micalg= or Qo8f1a4rgWw9S/zY"; micalg="pgp-sha1

Ah, good point.  I was going to say that such a header would be
wrong, but a quick grep in my mail folders shows that something
like that is common enough.
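
To make the greediness concern concrete, here is a minimal sketch.
It assumes the header sits on a single line, and it assumes a
pattern whose capture group is a bare `.*'; the pattern actually
under discussion is not quoted in this message, so treat that one
as a hypothetical stand-in.  The second command uses the
negated-class form, which cannot run past a double quote.

```shell
# Hypothetical one-line form of the header from the quoted message.
header='Content-Type: multipart/signed; boundary="Qo8f1a4rgWw9S/zY"; micalg="pgp-sha1"'

# Assumed greedy pattern: the capture group is .* and so it extends
# to the LAST double quote on the line.
greedy=$(printf '%s\n' "$header" | sed 's/.*boundary="\(.*\)".*/\1/')

# Negated class: the capture group cannot contain a double quote,
# so it ends at the first closing quote after boundary=".
bounded=$(printf '%s\n' "$header" | sed 's/.*boundary="\([^"]\{1,\}\)".*/\1/')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

Under those assumptions the greedy group captures
Qo8f1a4rgWw9S/zY"; micalg="pgp-sha1 (the third of the three
possibilities listed), while the negated class captures just
Qo8f1a4rgWw9S/zY.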

> Have you got any examples of where my original pattern fails, and does the
> new one fix those?

Naturally, I can't remember which of my folders I found those
examples in.  Part of the problem is that you're using a syntax
that is not guaranteed to work for all versions of sed.

Try one of these equivalents:

  [1]  's/.*boundary="\([^"]\{1,\}\)".*/\1/'
  [2]  's/.*boundary="\([^"][^"]*\)".*/\1/'

The one true sed uses "basic" REs rather than "extended" REs,
and so is missing the `+' and `?' operators.  But they can be
simulated by `\{1,\}' and `\{0,1\}' respectively (in basic REs
the braces must be backslash-escaped).  See [1] above.  The
alternative for `+' is to put the atom twice with an asterisk
after the second one, as in [2] above.
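
A quick way to check that [1] and [2] behave the same is to run
both against one sample line.  The sketch below assumes the header
has been unfolded onto a single line.  The third command is an
illustration only and is not a pattern from this thread: it shows
`\{0,1\}' standing in for `?' by making the quotes around the
value optional, with a made-up unquoted header as input.

```shell
# Sample header, assumed to be on one line.
header='Content-Type: multipart/signed; boundary="Qo8f1a4rgWw9S/zY"; micalg="pgp-sha1"'

# [1] interval expression \{1,\}: one or more non-quote characters.
r1=$(printf '%s\n' "$header" | sed 's/.*boundary="\([^"]\{1,\}\)".*/\1/')

# [2] doubled atom: one non-quote character, then zero or more of them.
r2=$(printf '%s\n' "$header" | sed 's/.*boundary="\([^"][^"]*\)".*/\1/')

# Illustration of \{0,1\} in place of `?': optional quotes around the
# value.  Both the pattern and the input line are hypothetical.
bare='Content-Type: multipart/mixed; boundary=simple-boundary'
r3=$(printf '%s\n' "$bare" | sed 's/.*boundary="\{0,1\}\([^";]\{1,\}\)"\{0,1\}.*/\1/')

printf '[1]: %s\n[2]: %s\nopt: %s\n' "$r1" "$r2" "$r3"
```

Both [1] and [2] should print Qo8f1a4rgWw9S/zY here, and the
optional-quote variant should print simple-boundary.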

I expect that either of the above will do the right thing in any
version of sed.  Most people will probably find [2] easier to
read, although I think [1] is a bit more elegant, albeit one
character longer.  Of course, if you *know* you'll only ever use
a sed that handles the syntax you proposed, it's OK to do it
that way.  However, I always like to use the portable method if
it can do what I want.  Leads to less grief in the long run.

Greg

-- 
Greg Black <gjb at gbch.net> <http://www.gbch.net/gjb.html>
GPG signed mail preferred; further information in headers.



