[H-GEN] REGEX pattern required

Stephen Thorne stephen at thorne.id.au
Wed Dec 8 01:24:14 EST 2004


On 08 Dec 2004 16:14:58 +1000, Russell Stuart <russell-humbug at stuart.id.au> wrote:
> On Wed, 2004-12-08 at 15:40, Gary Curtis wrote:
> > This would not be a problem so long as the digits are always
> > in [say] the third element AND elements one and two, together,
> > represent the section of the string to the left of the digits.
> 
> I have not tested any of this, and from experience
> any non-trivial regex has bugs.  It's sort of a
> corollary to "every non-trivial program has bugs".
> 
> That said, a preg_match() should give you:
> 
> Input           $match[0]   $match[1]  $match[2]  $match[3]
> --------------  ----------  ---------  ---------  ---------
> "12"            "12"        ""         "12"       ""
> "12aaa"         "12aaa"     ""         "12"       "aaa"
> "aaa12"         "aaa12"     "aaa"      "12"       ""
> "1aaa12"        "1aaa12"    "1aaa"     "12"       ""
> "1aaa12xx"      "1aaa12xx"  "1aaa"     "12"       "xx"

Going by the inputs and desired outputs listed here, I can immediately see two ways to extract the last run of digits:

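// Option 1: reverse the string so the last run of digits is matched first.
if (preg_match('/\d+/', strrev($str), $matches)) {
    $num = strrev($matches[0]);
}

// Option 2: capture digits followed only by non-digits up to the end.
if (preg_match('/(\d+)\D*$/', $str, $matches)) {
    $num = $matches[1];
}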

The conclusion I draw from this discussion, this problem, and my experiences with regex in general, is that it is more important to have a comprehensive set of test cases than it is to be able to write good regex.
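
To make that concrete, here is a minimal sketch of such a test set, using the inputs and the expected $match[2] values from the quoted table. The function names last_digits_reversed() and last_digits_anchored() are hypothetical wrappers introduced only for this illustration; they wrap the two approaches shown above and return null when no digits are found.

```php
<?php
// Hypothetical wrapper: reverse the input so the last run of digits
// becomes the first one matched, then reverse the match back.
function last_digits_reversed(string $str): ?string
{
    return preg_match('/\d+/', strrev($str), $m) ? strrev($m[0]) : null;
}

// Hypothetical wrapper: capture a run of digits that is followed only
// by non-digits up to the end of the string.
function last_digits_anchored(string $str): ?string
{
    return preg_match('/(\d+)\D*$/', $str, $m) ? $m[1] : null;
}

// [input, expected digits] pairs taken from the quoted table ($match[2]).
$cases = [
    ['12',       '12'],
    ['12aaa',    '12'],
    ['aaa12',    '12'],
    ['1aaa12',   '12'],
    ['1aaa12xx', '12'],
];

// Run every case through both approaches and count mismatches.
$failures = 0;
foreach ($cases as [$input, $expected]) {
    foreach (['last_digits_reversed', 'last_digits_anchored'] as $fn) {
        $actual = $fn($input);
        if ($actual !== $expected) {
            $failures++;
            echo "FAIL $fn(\"$input\"): got " . var_export($actual, true) . "\n";
        }
    }
}
echo "$failures failure(s)\n";
```

Inputs with no digits at all, or an empty string, are not covered by the table; they would be the next cases worth adding.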

Regards,
Stephen Thorne.



