[H-GEN] ASCII Characters

Andrew Pullin andrew at hotspurbgc.com.au
Fri Oct 24 09:33:09 EDT 2003


[ Humbug *General* list - semi-serious discussions about Humbug and     ]
[ Unix-related topics. Posts from non-subscribed addresses will vanish. ]

Hi There,
    What characters do you want to keep? It sounds to me like all you want
is ASCII text from IRC and nothing else, so what you need is a script that
looks at each line and, for each character in that line, checks that it is
[a-zA-Z], [0-9] or a punctuation mark. From memory, those characters are
all below 127 (0x7F), so an initial match on that range should strip most of
the unwanted characters. Check out man ascii to find which control
characters (the codes below 32, like bell) you also don't want and narrow
your search. You could write a Perl script yourself fairly easily, but this
kind of job is so common that a quick search on your favorite search engine
would probably find one with little effort.

    In fact this message is probably longer than the Perl script you will
end up with, and I probably should have just written it for you - but then it
wouldn't be a learning experience. If you need help, almost any Perl
tutorial will do: in most cases the first thing you do after "Hello World"
is some kind of line count script, which is half of your script already,
and good tutorials have a few variants that are very close to what you want.
    Good Luck.
        Andrew.
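
The per-line filter described above could be sketched roughly as follows.
This is only an illustration, not a script from the thread: the name
strip_controls and the choice to keep tabs and newlines are assumptions.
It keeps printable ASCII (0x20 to 0x7E) and drops everything else, which
includes the 0002 and 0003 characters mentioned in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: remove every character that is not printable ASCII
# (0x20-0x7E), a tab, or a newline. Control characters such as 0x02, 0x03
# and 0x07 (bell) are dropped, as is anything above 0x7E.
sub strip_controls {
    my ($line) = @_;
    $line =~ s/[^\x20-\x7E\t\n]//g;
    return $line;
}

# When file names are given on the command line, filter them line by line
# to standard output. With no arguments the script does nothing.
if (@ARGV) {
    while (my $line = <>) {
        print strip_controls($line);
    }
}
```

It could be run as "perl strip_controls.pl irc.log > clean.log", where both
file names are placeholders. If the 0003 characters turn out to be IRC
colour codes, which are commonly followed by colour digits, those digits
would be left behind and the pattern would need extending.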

----- Original Message -----
From: Joe Skilton
To: general at lists.humbug.org.au
Sent: Friday, October 24, 2003 9:09 PM
Subject: [H-GEN] ASCII Characters


Hi All,
    I have some ASCII characters I need to strip from some output, the only
problem being I don't know the characters (sounds strange I know, but it
will become clear).

The output itself shows nothing but blank space where these characters are,
but if I cut and paste it into gedit they come up as little boxes containing
the numbers 0002 and 0003. I know it's ASCII, as it's IRC output (and a guy
on a scripting channel explained that it is ASCII, but he wasn't sure what).
I'm not sure what these characters are, and therefore I'm not sure how to
use my Perl program to strip them.

Any Perl heads out there with an idea? *cough* Jason :) *innocent look* Who
said that?
Maybe grep or some such program could strip them for me?

Pre-emptive apology to anyone this annoys (I always seem to piss someone
off),
Cheers,
        Joe


--
* This is list (humbug) general handled by majordomo at lists.humbug.org.au .
* Postings to this list are only accepted from subscribed addresses of
* lists 'general' or 'general-post'.  See http://www.humbug.org.au/


