[H-GEN] REGEX pattern required

Andrae Muys andrae at tucanatech.com
Tue Dec 7 22:29:12 EST 2004


Gary Curtis wrote:
> [ Humbug *General* list - semi-serious discussions about Humbug and     ]
> [ Unix-related topics. Posts from non-subscribed addresses will vanish. ]
> 
> Whenever I bring up the subject of parsing text all the regex 
> fiends jump up and down, singing its virtues. So here is your
> chance to show me how good it is...
> 
> What I want is a pattern to extract the RIGHT-MOST set
> of numeric digits in the following test strings. I plan to use
> the PHP function reg_split (or preg_split), and hope to get
> a three element array.   array[1] will contain all text to the left
> of the digits (perhaps null),  array[2] will contain the digits, and
> array[3] will contain all text to the right of the digits (or null).
> 
> You may assume there is always at least one digit somewhere
> in the string and NO spaces.
> 
> Examples:
> 123              null       123    null
> 123abc           null       123    abc
> jk234            jk         234    null
> abc34fgh         abc        34     fgh
> a9bc456          a9bc       456    null
> K+v5678aa=x      K+v        5678   aa=x
> aa00001-123b     aa00001-   123    b
> 
> Can this be accomplished with just one pattern?
> 

No memory required, so yes, a regex can do this.  (If one part of the 
pattern is defined in terms of another part of the pattern, you can't use 
a regex.  The canonical example is matched parentheses --- the number 
of closing parens is defined wrt. the number of preceding opening parens.)
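
To make the "memory" point concrete: checking matched parentheses needs a
counter that can grow without bound, which is what a plain regular
expression lacks. A minimal Python sketch (the function name
parens_balanced is made up for illustration):

```python
def parens_balanced(s):
    """Return True if every ')' closes an earlier '(' and none stay open."""
    depth = 0  # the unbounded "memory" a true regular expression lacks
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a closer with no matching opener
                return False
    return depth == 0
```

Extracting a run of digits needs no such counter, which is why a single
pattern is enough for the question asked here.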

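What you want is (text)(numeric text)(all non-numeric text), anchored at the
end of the string so that the numeric text is the right-most run of digits.

This becomes ^(.*?)([0-9]+)([^0-9]*)$ in a Perl-style engine (the preg
functions); escape and substitute appropriate character class labels to suit.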
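
As a quick check against the examples in the question, here is a minimal
sketch in Python, whose re module stands in for PHP's preg functions (the
helper name split_last_digits is made up for illustration). It assumes a
Perl-style engine with lazy quantifiers and uses the end-anchored form
^(.*?)([0-9]+)([^0-9]*)$ so that the digit group is the right-most run.
Read left to right without anchors, ([^0-9]*)([0-9]*)(.*) would pick the
left-most run instead, splitting a9bc456 as a, 9, bc456.

```python
import re

# Lazy prefix, then the last run of digits, then a digit-free tail.
PATTERN = re.compile(r"^(.*?)([0-9]+)([^0-9]*)$")

def split_last_digits(s):
    """Return (left, digits, right) around the right-most run of digits."""
    m = PATTERN.match(s)
    if m is None:
        raise ValueError("no digits in %r" % s)
    return m.groups()

# The test strings from the question.
for s in ["123", "123abc", "jk234", "abc34fgh",
          "a9bc456", "K+v5678aa=x", "aa00001-123b"]:
    print(s, split_last_digits(s))
```

Empty parts come back as empty strings rather than null. In PHP the same
pattern should work with preg_match, which fills a matches array with the
three groups at indexes 1 to 3.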

Andrae

-- 
Andrae Muys                do { ... }
andrae at tucanatech.com      while (low != (high - 1));
Senior Engineer            assert low == (high - 1);
Tucana Technologies                -- Can you be more defensive?
