[H-GEN] REGEX pattern required
Andrae Muys
andrae at tucanatech.com
Tue Dec 7 22:29:12 EST 2004
Gary Curtis wrote:
> Whenever I bring up the subject of parsing text all the regex
> fiends jump up and down, singing its virtues. So here is your
> chance to show me how good it is...
>
> What I want is a pattern to extract the RIGHT-MOST set
> of numeric digits in the following test strings. I plan to use
> the PHP function reg_split (or preg_split), and hope to get
> a three element array. array[1] will contain all text to the left
> of the digits (perhaps null), array[2] will contain the digits, and
> array[3] will contain all text to the right of the digits (or null).
>
> You may assume there is always at least one digit somewhere
> in the string and NO spaces.
>
> Examples:
> 123             null       123    null
> 123abc          null       123    abc
> jk234           jk         234    null
> abc34fgh        abc        34     fgh
> a9bc456         a9bc       456    null
> K+v5678aa=x     K+v        5678   aa=x
> aa00001-123b    aa00001-   123    b
>
> Can this be accomplished with just one pattern?
>
No memory is required, so yes, a regex can do this. (If one part of the
pattern is defined in terms of another part of the pattern, you can't use
a regex. The canonical example is matched parentheses: the number
of closing parens is defined with respect to the number of preceding
opening parens.)
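To make the "memory" point concrete, here is a minimal sketch in Python
(not the PHP from the question) of the matched-parentheses check. The
function name parens_balanced is made up for illustration. The counter
is the memory: it can grow without bound, and that is exactly the part a
plain regular expression cannot track.

```python
def parens_balanced(text):
    """Return True if every ')' closes an earlier '(' and none stay open."""
    depth = 0  # the "memory": how many parens are currently open
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A closing paren appeared with no opener before it.
                return False
    # Balanced only if every opener was eventually closed.
    return depth == 0
```

For example, parens_balanced("(a(b)c)") is True, while "(()" and ")("
both give False.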
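What you want is (any text)(numeric text)(all non-numeric text), anchored
to the end of the string so the digits matched are the right-most run.
This becomes ^(.*?)([0-9]+)([^0-9]*)$; escape and substitute appropriate
character class labels to suit. The non-greedy .*? needs a Perl-compatible
(preg_*) function.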
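A minimal sketch in Python (rather than the PHP mentioned in the question)
that checks a three-group pattern against the example strings. The pattern
used here, ^(.*?)([0-9]+)([^0-9]*)$, is an assumption chosen so that the
digit group lands on the right-most run: the third group may only hold
non-digits up to the end of the string, and the lazy first group takes
whatever is left in front. The helper name split_on_last_digits is made up
for illustration, and empty groups come back as "" rather than null.

```python
import re

# Assumed pattern: lazy prefix, right-most digit run, non-digit tail.
RIGHTMOST_DIGITS = re.compile(r"^(.*?)([0-9]+)([^0-9]*)$")


def split_on_last_digits(text):
    """Split text into (left, digits, right) around its last digit run."""
    match = RIGHTMOST_DIGITS.match(text)
    if match is None:
        # Only happens when the string contains no digit at all.
        raise ValueError("no digits in %r" % text)
    return match.groups()


# The example strings from the question, with "null" written as "".
EXAMPLES = {
    "123": ("", "123", ""),
    "123abc": ("", "123", "abc"),
    "jk234": ("jk", "234", ""),
    "abc34fgh": ("abc", "34", "fgh"),
    "a9bc456": ("a9bc", "456", ""),
    "K+v5678aa=x": ("K+v", "5678", "aa=x"),
    "aa00001-123b": ("aa00001-", "123", "b"),
}

for text, expected in EXAMPLES.items():
    assert split_on_last_digits(text) == expected
```

The cases to watch are a9bc456 and aa00001-123b, where an earlier run of
digits must end up in the left-hand group rather than being matched.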
Andrae
--
Andrae Muys do { ... }
andrae at tucanatech.com while (low != (high - 1));
Senior Engineer assert low == (high - 1);
Tucana Technologies -- Can you be more defensive?