[H-GEN] Edit massive XML files

Russell Stuart russell-humbug at stuart.id.au
Mon Sep 19 01:13:00 EDT 2011


On Mon, 2011-09-19 at 14:58 +1000, Mick wrote: 
> what I'm trying to say (badly it seems) is I want to delete the whole 
> line if it begins with 
>     <tag k="postal_code"
> regardless of what the rest of the line contains.

In that case it is trivial.

The -v flag to fgrep makes it pass through every line that *doesn't*
match.  The -f flag to fgrep allows you to supply multiple lines to
search for; just list each on a separate line.  See "man fgrep".  By the
sound of it you have a file very close to the required format now.
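
A quick way to see the two flags in action, as a sketch on made-up data
(the file names sample.xml, patterns.txt and kept.xml are hypothetical, and
grep -F is used here as the equivalent of fgrep):

```shell
# Work in a scratch directory so nothing real gets overwritten.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Made-up sample input: two lines to drop, two to keep.
printf '%s\n' \
    '<tag k="name" v="Brisbane"/>' \
    '<tag k="postal_code" v="4000"/>' \
    '<tag k="highway" v="residential"/>' \
    '<tag k="source" v="survey"/>' >sample.xml

# One fixed string per line.  Avoid blank lines in this file: an empty
# pattern matches every line, so with -v nothing would be passed through.
printf '%s\n' '<tag k="postal_code"' '<tag k="source"' >patterns.txt

# -F: fixed strings (same as fgrep), -v: invert the match,
# -f: read the strings to search for from a file.
grep -F -v -f patterns.txt <sample.xml >kept.xml
cat kept.xml
```

Only the "name" and "highway" lines should come out the other end.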

So, create a file containing the start of each line you want to delete,
one per line, and then run fgrep with it.  Eg:

$ cat <<===== >lines-to-delete
<tag k="postal_code"
<tag k="source"
=====
$ fgrep -v -f lines-to-delete <orig.xml >new.xml
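
One thing to be aware of: fgrep matches the string anywhere in the line,
not only at the beginning.  If that matters for your data, a possible
variant is to drop the fixed-string mode and anchor each pattern with a
regular expression instead.  The following is only a sketch; the sample
contents of orig.xml and the file name anchored-patterns are assumptions
made up for illustration:

```shell
# Work in a scratch directory so nothing real gets overwritten.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Made-up input.  The third line has "source" in the middle of the line,
# so a fixed-string match would delete it along with the "name" tag.
cat <<'EOF' >orig.xml
<node id="1">
    <tag k="postal_code" v="4000"/>
    <tag k="name" v="Brisbane"/><tag k="source" v="survey"/>
</node>
EOF

# Plain grep (no -F), so these are regular expressions:
# ^ anchors at the start of the line, [[:space:]]* allows indentation.
cat <<'EOF' >anchored-patterns
^[[:space:]]*<tag k="postal_code"
^[[:space:]]*<tag k="source"
EOF

grep -v -f anchored-patterns <orig.xml >new.xml
cat new.xml
```

Here only the postal_code line is dropped; the line that merely contains
the "source" tag further along is kept.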



