[H-GEN] Scripting guide

McBofh McBofh at jmcp.id.au
Thu May 25 03:05:28 EDT 2006


Kelvin Heng wrote:
> I am to write a script to retrieve the IP address from a list of log 
> files, and if the IP address matches the ones I need then I will take 
> them out.
> But I am facing some problem with the dot (.) between IP address. Can 
> someone help me with this?
> 
> testfile:
> a
> aa
> aaa
> .aa
> a.a
> aa.
> 
> Test 1:
> $ grep -w a testfile
> a
> a.a
> 
> Test 2:
> $ grep -w "a." testfile
> aa
> .aa
> aa.
> 
> Test 3:
> $ grep "a.a" testfile
> aaa
> a.a

Hi Kelvin,
David Ash has already pointed out that "." is a regular expression
metacharacter (it matches any single character), so to match a literal
period you need to escape it with the \ character.
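
As a quick illustration, here is a rough sketch that re-runs your Test 3
with and without the escape. The temporary file and the variable names
are only for the demonstration; -F is the fixed-string mode (the same
idea as fgrep), where no character is special.

```shell
# Recreate the testfile from the question in a temporary location.
testfile=$(mktemp)
printf '%s\n' a aa aaa .aa a.a aa. > "$testfile"

# Unescaped: "." matches any single character, so "aaa" matches too.
unescaped=$(grep 'a.a' "$testfile")

# Escaped: "\." matches only a literal period.
escaped=$(grep 'a\.a' "$testfile")

# Fixed-string mode: nothing is a metacharacter, so no escaping needed.
fixed=$(grep -F 'a.a' "$testfile")

printf 'unescaped:\n%s\n' "$unescaped"
printf 'escaped: %s\n' "$escaped"
printf 'fixed:   %s\n' "$fixed"
rm -f "$testfile"
```

Single quotes keep the shell from touching the backslash, which is why
I have used them here.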

Might I suggest that you have a read of the following tutorials on
the subject:

http://en.wikipedia.org/wiki/Regexp
http://www.zytrax.com/tech/web/regex.htm
http://www.regular-expressions.info/reference.html



A simple approach to your problem is something along these lines,
where A, B, C and D stand for the four numbers of the address you want:

$ grep "A\.B\.C\.D" file
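
One thing to watch for: an escaped pattern can still match a longer
address that merely starts with the one you want. The sketch below
shows that, and one possible way to check several wanted addresses at
once. The log lines and addresses are made up for the example, and -w
is not available in every grep, so please check your manpage.

```shell
# Hypothetical sample log lines, for illustration only.
log=$(mktemp)
cat > "$log" <<'EOF'
[client 10.1.1.2] File does not exist: /a
[client 10.1.1.20] File does not exist: /b
[client 192.168.1.20] File does not exist: /c
EOF

# Escaped dots alone still match 10.1.1.20, because the pattern is
# found at the start of the longer address.
loose=$(grep -c '10\.1\.1\.2' "$log")

# -w requires the match to be bounded by non-word characters.
exact=$(grep -c -w '10\.1\.1\.2' "$log")

# Several wanted addresses at once: one per line in a file, matched as
# fixed strings (-F) so the dots need no escaping.
wanted=$(mktemp)
printf '%s\n' 10.1.1.2 192.168.1.20 > "$wanted"
hits=$(grep -F -w -f "$wanted" "$log")

echo "loose=$loose exact=$exact"
echo "$hits"
rm -f "$log" "$wanted"
```

To take the matching lines out instead of printing them, adding -v to
the last grep inverts the selection.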

Or, since you're wading through logfiles, you could use (g|n)awk and
only print out specific fields. Here's an example:


$ awk '/[0-9].*\.[0-9].*\.[0-9].*\.[0-9].*\./ {print $8,$13}' \
	/var/apache/logs/error_log


192.168.1.20] /scratch/web/htdocs/favicon.ico
192.168.1.20] /scratch/web/htdocs/favicon.ico
192.168.1.20] /scratch/web/htdocs/favicon.ico
192.168.1.20] /scratch/web/htdocs/favicon.ico
192.168.1.20] /scratch/web/htdocs/favicon.ico
192.168.1.20] /scratch/web/htdocs/favicon.ico
192.168.1.20] /scratch/web/htdocs/favicon.ico
10.1.1.20] /scratch/web/htdocs/favicon.ico
10.1.1.20] /scratch/web/htdocs/favicon.ico
10.1.1.20] /scratch/web/htdocs/favicon.ico
10.1.1.20] /scratch/web/htdocs/favicon.ico
10.1.1.20] /scratch/web/htdocs/favicon.ico
192.168.1.20] /scratch/web/htdocs/favicon.ico
192.168.1.20] /scratch/web/htdocs/favicon.ico
192.168.1.20] /scratch/web/htdocs/favicon.ico
192.18.43.5] /scratch/web/htdocs/favicon.ico
192.18.43.5] /scratch/web/htdocs/favicon.ico
192.18.43.5] headers:


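This searches for IP addresses in my Apache error_log file and, for
each matching line, prints the client IP (field 8) and the file that
wasn't found (field 13). awk splits fields on whitespace, so the
trailing "]" from the log line stays attached to the IP.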

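I'm using the bracket expression [0-9] to match a single character
from the set {0,1,2,3,4,5,6,7,8,9}, ".*" to match any number of
characters (including none), and "\." to match the period character.
Because ".*" is so permissive, the pattern is loose: it can also match
lines containing other dotted numbers, such as version strings, so
treat it as a first-pass filter.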
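
If you need something stricter, one option (a sketch, not the only way
to do it) is an extended regular expression that insists on four
groups of one to three digits. The sample lines and variable names
below are invented for the comparison.

```shell
# Hypothetical sample lines, for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[client 192.168.1.20] File does not exist: /scratch/web/htdocs/favicon.ico
version 1.2.3 released on 2006.05.25
no address here
EOF

# The loose pattern from the awk example, used with grep: it also
# matches the version/date line because ".*" spans the gaps.
loose=$(grep -c '[0-9].*\.[0-9].*\.[0-9].*\.[0-9].*\.' "$sample")

# Tighter: three "digits plus period" groups, then a final digit group.
# -E selects extended regular expressions (the same idea as egrep).
ip_re='([0-9]{1,3}\.){3}[0-9]{1,3}'
tight=$(grep -E "$ip_re" "$sample")

echo "loose pattern matched $loose lines"
echo "$tight"
rm -f "$sample"
```

This still accepts impossible addresses such as 999.999.999.999, and
some older awk implementations do not understand the {1,3} notation,
so test it against your own tools and logs first.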

Manpages you might want to bury yourself in include:
* awk / nawk / gawk
* grep / egrep / fgrep
* sed / gsed


Hope that's enough to get you started.


cheers,
James





