[H-GEN] Edit massive XML files

Benjamin Fowler ben.fowler.bjf at gmail.com
Mon Oct 3 12:01:03 EDT 2011


What are your primary technical objections to using DOM?  Memory consumption?  Too low level?

It's been a while since I've needed to diddle XML in code (I usually use XSLT, or a binding library, e.g. XMLBeans or JAXB), but I think DOM is okay for small documents where XSLT can't be used for some reason.

Cheers, Ben.


On 3 Oct 2011, at 16:40, Paul Gearon <gearon at ieee.org> wrote:

> On Sun, Oct 2, 2011 at 7:38 PM, Michael Anthon <michael at anthon.net> wrote:
> <snip/>
>> I've tried writing XML parsers in various types of languages using various libraries before but have often run into difficulties with the size of files I needed to work with.  The main issue I ran into is that a lot of the libraries I was using attempted to build the whole DOM as an in-memory object... which can be a bit of a problem at times :-)
> 
> Ugh. *Never* build a DOM. The only time I break this rule is in a web
> browser, since the browser has already built it for me.
> 
> SAX is useful here, if you're prepared to process the data as it's
> coming in. However, for really large datasets, that can be akin to
> drinking from the fire hose. It depends on the application. In a lot
> of cases, a more useful API is StAX, which lets you pull things off
> the stream as you need them.
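
To make the push/pull distinction above concrete, here is a minimal sketch in Java using only the JDK's built-in SAX and StAX APIs. The class name StreamingSketch, the helper names countWithSax and firstTextWithStax, and the sample document are all made up for illustration; nobody in this thread posted this code. With SAX the parser drives and calls your handler for every element, so you see the whole stream. With StAX your loop drives, so you can stop pulling as soon as you have what you need.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not from the original thread.
class StreamingSketch {

    // SAX (push): the parser walks the entire input and calls back for
    // every start tag. We only keep a counter, never a tree.
    static int countWithSax(Reader in, String name) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // The default factory is not namespace-aware, so qName
                // carries the element name as written in the document.
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(in), handler);
        } catch (Exception e) {
            // Wrapped to keep the sketch's signature simple.
            throw new IllegalStateException("SAX parse failed", e);
        }
        return count[0];
    }

    // StAX (pull): we ask for the next event ourselves and return as soon
    // as the first matching element is found; the rest is never read.
    // Assumes the matching element contains text only.
    static String firstTextWithStax(Reader in, String name) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(in);
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && reader.getLocalName().equals(name)) {
                        return reader.getElementText();
                    }
                }
                return null; // no such element in the stream
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("StAX parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<list><item>a</item><item>b</item></list>";
        System.out.println(countWithSax(new StringReader(xml), "item"));
        System.out.println(firstTextWithStax(new StringReader(xml), "item"));
    }
}
```

For a genuinely massive file you would pass a buffered file stream or reader instead of a StringReader; neither method holds more than the current event in memory.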
> 
> Either SAX or StAX is fine, but never DOM.  :-)
> 
> Regards,
> Paul


