[H-GEN] Edit massive XML files

Michael Anthon michael at anthon.net
Sun Oct 2 19:38:46 EDT 2011


Your other alternative is a web-based tool; the only one that springs to mind
right now is Yahoo's "Pipes" (http://pipes.yahoo.com/pipes/).  I've only ever
had a bit of a fiddle with it, but it's pretty easy to use.  I'm sure there
are other similar tools out there, but I've never needed to use them.

The Java-based open source BI suite from Pentaho also has a pretty powerful
ETL tool called Kettle that could be used for this kind of work.  This one I
have used quite a lot, and its performance is pretty good:
http://kettle.pentaho.com/

I've tried writing XML parsers in various languages using various libraries
before, but have often run into difficulties with the size of the files I
needed to work with.  The main issue was that a lot of the libraries I was
using attempted to build the whole DOM as an in-memory object... which can
be a bit of a problem at times :-)
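
For what it's worth, the usual way around the in-memory DOM problem is a
streaming parser.  Below is a rough sketch in Python using the standard
library's xml.etree.ElementTree.iterparse.  It is not something I have run
against your data: the function name iter_top_level and the sample document
are made up for illustration, and it assumes the interesting records are
direct children of the root element (as nodes and ways are in an OSM export).

```python
import io
import xml.etree.ElementTree as ET


def iter_top_level(source):
    """Yield each direct child of the root element, one at a time.

    `source` is a filename or a binary file object. Finished children are
    detached from the root after use, so memory stays roughly proportional
    to the largest single record rather than to the whole file.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    depth = 0
    for event, elem in context:
        if event == "start":
            depth += 1
        else:
            depth -= 1
            if depth == 0:  # a direct child of the root has just closed
                yield elem
                root.clear()  # drop processed children from the root


if __name__ == "__main__":
    # Hypothetical sample data, standing in for a much larger file.
    sample = (
        b"<osm>"
        b"<node id='1'><tag k='name' v='x'/></node>"
        b"<node id='2'/>"
        b"<way id='3'/>"
        b"</osm>"
    )
    for elem in iter_top_level(io.BytesIO(sample)):
        print(elem.tag, elem.get("id"), len(elem))
```

Each element has to be dealt with (filtered, rewritten, written out) inside
the loop, because it is detached from the root as soon as the loop moves on.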

Cheers,
Michael

On 20 September 2011 17:42, Mick <bareman at tpg.com.au> wrote:

> [ Humbug *General* list - semi-serious discussions about Humbug and     ]
> [ Unix-related topics. Posts from non-subscribed addresses will vanish. ]
>
> On Mon, 19 Sep 2011 16:34:08 +1200
> Johannes Sprigode <johannes at paradise.net.nz> wrote:
>
> Many thanks to all who offered advice.  Johannes' offering provided what
> I was looking for: it removed what I wanted and only what I wanted.
>
> Unfortunately, I was trying to take an unsuitable shortcut to filter
> the data and was minimally successful in advancing towards my final
> goal of eliminating irrelevant data and rectifying inconsistent tagging
> of nodes.
>
> I guess my only option is to learn Python & Ruby in order to tweak
> QGIS' OpenStreetMap importer to meet my needs.
>
> mick