[H-GEN] Reading signed emails using OE
Greg Black
gjb at gbch.net
Wed Jun 4 02:18:48 EDT 2003
[ Humbug *General* list - semi-serious discussions about Humbug and ]
[ Unix-related topics. Posts from non-subscribed addresses will vanish. ]
On 2003-06-04, Michael Anthon wrote:
> I was hesitant to use the regex as you wrote it (pretty much what I
> originally had in my first pass) since I've been stung before with the
> greediness factor in regex. For example, suppose we had a header like
>
> Content-Type: multipart/signed; boundary="Qo8f1a4rgWw9S/zY";
> micalg="pgp-sha1"
>
> Will the first .* in your regex match Qo8f1a4rgWw9S/zY, Qo8f1a4rgWw9S/zY";
> micalg= or Qo8f1a4rgWw9S/zY"; micalg="pgp-sha1
Ah, good point. I was going to say that such a header would be
wrong, but a quick grep in my mail folders shows that something
like that is common enough.
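
To make the greediness question concrete, here is a minimal sketch.
It assumes the pattern under discussion has the general form
's/.*boundary="\(.*\)".*/\1/' (the exact original is not quoted in
this message) and that the folded header has been joined onto a
single line, since sed works one line at a time. The variable names
are illustrative only.

```shell
# Hypothetical header, unfolded onto one line for illustration.
header='Content-Type: multipart/signed; boundary="Qo8f1a4rgWw9S/zY"; micalg="pgp-sha1"'

# Assumed greedy form: the inner .* runs on to the LAST double quote
# on the line, so the capture swallows the micalg parameter as well.
greedy=$(printf '%s\n' "$header" | sed 's/.*boundary="\(.*\)".*/\1/')

# Bracket expression: [^"]* cannot cross a double quote, so the
# capture stops at the first closing quote.
bounded=$(printf '%s\n' "$header" | sed 's/.*boundary="\([^"]*\)".*/\1/')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

The first form should return the third of the three candidates in
the question; the second returns only the boundary string.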
> Have you got any examples of where my original pattern fails, and does the
> new one fix those?
Naturally, I can't remember which of my folders I found those
examples in. Part of the problem is that you're using a syntax
that is not guaranteed to work for all versions of sed.
Try one of these equivalents:
[1] 's/.*boundary="\([^"]\{1,\}\)".*/\1/'
[2] 's/.*boundary="\([^"][^"]*\)".*/\1/'
The one true sed uses "basic" REs rather than "extended" REs,
and so is missing the `+' and `?' operators. But they can be
simulated by `\{1,\}' and `\{0,1\}' respectively (note that the
braces must be backslash-escaped in a basic RE). See [1]
above. The alternative for `+' is to write the atom twice with
an asterisk after the second one, as in [2] above.
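
As a quick check, the two portable forms can be run against the
sample header from the question. This is only a sketch: the header
is joined onto one line, and the variable names and the
color/colour input are made up for illustration.

```shell
# Hypothetical header, unfolded onto one line for illustration.
header='Content-Type: multipart/signed; boundary="Qo8f1a4rgWw9S/zY"; micalg="pgp-sha1"'

# [1] interval expression \{1,\}: one or more non-quote characters
one=$(printf '%s\n' "$header" | sed 's/.*boundary="\([^"]\{1,\}\)".*/\1/')

# [2] the atom written twice, with an asterisk after the second
two=$(printf '%s\n' "$header" | sed 's/.*boundary="\([^"][^"]*\)".*/\1/')

# \{0,1\} standing in for `?': the u is optional (made-up input)
opt=$(printf '%s\n' 'color colour' | sed 's/colou\{0,1\}r/X/g')

printf '[1] %s\n[2] %s\nopt %s\n' "$one" "$two" "$opt"
```

Both [1] and [2] should print just the boundary string, and the
last command replaces both spellings.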
I expect that either of the above will do the right thing in any
version of sed. Most people will probably find [2] easier to
read, although I think [1] is a bit more elegant, albeit one
character longer. Of course, if you *know* you'll only ever use
a sed that handles the syntax you proposed, it's OK to do it
that way. However, I always like to use the portable method if
it can do what I want. Leads to less grief in the long run.
Greg
--
Greg Black <gjb at gbch.net> <http://www.gbch.net/gjb.html>
GPG signed mail preferred; further information in headers.