[H-GEN] Edit massive XML files
Johannes Sprigode
johannes at paradise.net.nz
Mon Sep 19 00:34:08 EDT 2011
Mick,
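awk is your friend.
A very basic awk script is below. It drops every line that contains either
of the two tags and prints every other line unchanged.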
---> cut below <---
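#!/usr/bin/awk -f
# Drop every line that contains one of the unwanted tags; print the rest.
BEGIN {
    # Substrings to look for; the v="..." value is deliberately not matched.
    s1 = "<tag k=\"source\"";
    s2 = "<tag k=\"postal_code\"";
}
{
    # index() returns 0 when the substring is absent from the line.
    if (index($0, s1) == 0 && index($0, s2) == 0)
        print $0;
}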
---> cut above <---
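Call it with awk -f mick.awk mick.xml
where mick.awk is the script above and mick.xml is your input file.
The filtered XML goes to standard output, so redirect it to a new file
if you want to keep it.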
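
Since the original question was about sed, here is a rough sketch of the same
line filter as a sed command and as an awk one-liner. It assumes, as in the
sample, that every <tag .../> element sits on a line of its own; a tag that
shares a line with other markup would take that markup with it. The function
names strip_tags_sed and strip_tags_awk and the file name mick-clean.xml are
made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers: both read XML on stdin and write the filtered
# result to stdout, deleting lines that contain either unwanted tag.

strip_tags_sed() {
    # One delete command per tag; the v="..." value is not matched.
    sed -e '/<tag k="source"/d' -e '/<tag k="postal_code"/d'
}

strip_tags_awk() {
    # Print only lines that do NOT match the pattern (default action: print).
    awk '!/<tag k="(source|postal_code)"/'
}

# Small demonstration on a cut-down version of the sample node.
printf '%s\n' \
    '<node id="290344" version="7">' \
    '<tag k="name" v="Oakhill"/>' \
    '<tag k="postal_code" v="BA3 5"/>' \
    '<tag k="source" v="npe"/>' \
    '</node>' | strip_tags_sed
```

On the real file this would be something like
strip_tags_sed < mick.xml > mick-clean.xml. Both versions read the input one
line at a time, so the size of the file should not matter much.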
Cheers
Johannes
On Mon, 19 Sep 2011 15:11:12 Mick wrote:
> [ Humbug *General* list - semi-serious discussions about Humbug and ]
> [ Unix-related topics. Posts from non-subscribed addresses will vanish. ]
>
> On Mon, 19 Sep 2011 12:51:39 +1000
>
> Russell Stuart <russell-humbug at stuart.id.au> wrote:
> > On Mon, 2011-09-19 at 12:16 +1000, Mick wrote:
> > > I am sure a combination of grep & sed or similar could automate this
> > > but I can't get my head around the sed part.
> >
> > Maybe it could, but without actual samples showing the data you are
> > trying to edit and what you are trying to accomplish it is hard to
> > say.
>
> mea culpa
>
> small snips
> ===========
> ...
> <node id="290343" version="3" timestamp="2011-07-01T10:12:42Z"
> uid="61216" user="UniEagle" changeset="8597887" lat="51.2238425"
> lon="-2.5264778"/>
> <node id="290344" version="7" timestamp="2011-06-01T11:50:39Z"
> uid="109205" user="Jack Stringer" changeset="8309592"
> lat="51.2237079" lon="-2.5236032">
> <tag k="name" v="Oakhill"/>
> <tag k="place" v="village"/>
> <tag k="postal_code" v="BA3 5"/>
> <tag k="source" v="npe"/>
> </node>
> <node id="290345" version="7" timestamp="2009-10-29T15:49:02Z"
> uid="90943" user="chris_debian" changeset="2982115"
> lat="51.2222188" lon="-2.5236527"/>
> ...
>
> should become
> =============
> ...
> <node id="290343" version="3" timestamp="2011-07-01T10:12:42Z"
> uid="61216" user="UniEagle" changeset="8597887" lat="51.2238425"
> lon="-2.5264778"/>
> <node id="290344" version="7" timestamp="2011-06-01T11:50:39Z"
> uid="109205" user="Jack Stringer" changeset="8309592"
> lat="51.2237079" lon="-2.5236032">
> <tag k="name" v="Oakhill"/>
> <tag k="place" v="village"/>
> </node>
> <node id="290345" version="7" timestamp="2009-10-29T15:49:02Z"
> uid="90943" user="chris_debian" changeset="2982115"
> lat="51.2222188" lon="-2.5236527"/>
> ...
> ==========
>
> in this instance the tags:
> <tag k="postal_code" v="BA3 5"/>
> <tag k="source" v="npe"/>
> need to be deleted along with EVERY other instance whatever the v="
> value is
>
> mick
> _______________________________________________
> General mailing list
> General at lists.humbug.org.au
> http://lists.humbug.org.au/mailman/listinfo/general