[H-GEN] Processing large text files in perl
Jason Parker-Burlingham
jasonp at uq.net.au
Wed Mar 19 00:18:16 EST 2003
[ Humbug *General* list - semi-serious discussions about Humbug and ]
[ Unix-related topics. Posts from non-subscribed addresses will vanish. ]
Christopher Biggs <chris at epipe.com.au> writes:
> Michael Anthon <michael at anthon.net> wrote:
> > My real question is how I should go about this if I were to
> > rewrite it in perl. My first thought was to have mysql/postgresql
> > installed on the machine that will be running the process (to
> > avoid network traffic) and use perl DBI but I don't know if that
> > is the "best" way to do it. Is there some other simple and fast
> > DB system I could use instead?
>
> If you do need an RDBMS, then it doesn't /get/ much simpler than Perl,
> DBI and mysql (maybe excepting msql, which is optimized for simple
> operations on fairly uncomplicated tables).

Byron Ellacott <bje at apnic.net> writes:
> If you don't need particularly complex querying, you could try out the
> perl berkeley db interface. In particular, BerkeleyDB.pm allows you to
> use:

A third alternative (well, really an alternative to Chris's
suggestion, which is what I'd use) is the spanking-new Class::DBI,
which is what I use to manage my USENET archive.

Basically you set up your tables and write a couple of classes, one
for each table; each object of a class represents a row in that
table. The classes all inherit from Class::DBI itself, and you get
search methods, etc. etc. for free! Hooray! All the knowledge about
how to connect to the database, whether it's MySQL or PostgreSQL or
Oracle or whatever, is abstracted away and life is good again.
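
To give a feel for the shape of that, here is a rough sketch of the
pattern. It is written in Python with the standard sqlite3 module
rather than Perl, purely so it runs with nothing extra installed, and
it is not Class::DBI's API: the Row and Article classes, the create and
search methods and the article table are all made up for illustration.

```python
import sqlite3


class Row:
    """Hypothetical base class: one subclass per table, one object per row."""

    db = sqlite3.connect(":memory:")  # the only place that knows how to connect
    table = None
    columns = ()

    def __init__(self, **values):
        for name in self.columns:
            setattr(self, name, values.get(name))

    @classmethod
    def create(cls, **values):
        """Insert a row and return the object that represents it."""
        names = [n for n in cls.columns if n in values]
        sql = "INSERT INTO %s (%s) VALUES (%s)" % (
            cls.table, ", ".join(names), ", ".join("?" for _ in names))
        cls.db.execute(sql, [values[n] for n in names])
        return cls(**values)

    @classmethod
    def search(cls, **where):
        """Return every row whose columns equal the given values."""
        unknown = set(where) - set(cls.columns)
        if unknown:
            raise KeyError("no such column(s): %s" % ", ".join(sorted(unknown)))
        clause = " AND ".join("%s = ?" % n for n in where) or "1 = 1"
        sql = "SELECT %s FROM %s WHERE %s" % (
            ", ".join(cls.columns), cls.table, clause)
        rows = cls.db.execute(sql, list(where.values()))
        return [cls(**dict(zip(cls.columns, row))) for row in rows]


class Article(Row):
    # Everything table-specific lives here; the rest is inherited.
    table = "article"
    columns = ("message_id", "newsgroup", "subject")


Row.db.execute(
    "CREATE TABLE article "
    "(message_id TEXT PRIMARY KEY, newsgroup TEXT, subject TEXT)")
Article.create(message_id="<1@example>", newsgroup="comp.lang.perl.misc",
               subject="DBI question")
Article.create(message_id="<2@example>", newsgroup="alt.test",
               subject="ignore")

for article in Article.search(newsgroup="alt.test"):
    print(article.message_id, article.subject)
```

The point is that Article itself is three lines; the inserting and
searching come from the base class, and swapping databases would only
touch the one connect call.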
"has_a" and "has_many" relationships can be declared between classes
so that objects are automatically created from foreign keys.
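
Roughly what that buys you, again as a Python/sqlite3 sketch of the
idea and not Class::DBI's actual interface. The has_a and has_many
helpers below only borrow the names; their signatures, the Row class
and the two tables are assumptions for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE newsgroup (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE article (id INTEGER PRIMARY KEY, newsgroup INTEGER,
                          subject TEXT);
    INSERT INTO newsgroup VALUES (1, 'comp.lang.perl.misc');
    INSERT INTO article VALUES (10, 1, 'DBI question');
    INSERT INTO article VALUES (11, 1, 'Re: DBI question');
""")


class Row:
    """Hypothetical row object: raw column values live in __dict__."""
    table, columns = None, ()

    def __init__(self, *values):
        self.__dict__.update(zip(self.columns, values))

    @classmethod
    def select(cls, where, *args):
        sql = "SELECT %s FROM %s WHERE %s" % (
            ", ".join(cls.columns), cls.table, where)
        return [cls(*row) for row in db.execute(sql, args)]


def has_a(cls, column, target):
    """Make cls.<column> return the target object the foreign key points at."""
    def get(self):
        return target.select("id = ?", self.__dict__[column])[0]
    setattr(cls, column, property(get))


def has_many(cls, name, target, foreign_key):
    """Make cls.<name> return every target row that points back at this one."""
    def get(self):
        return target.select("%s = ? ORDER BY id" % foreign_key, self.id)
    setattr(cls, name, property(get))


class Newsgroup(Row):
    table, columns = "newsgroup", ("id", "name")


class Article(Row):
    table, columns = "article", ("id", "newsgroup", "subject")


has_a(Article, "newsgroup", Newsgroup)
has_many(Newsgroup, "articles", Article, "newsgroup")

article = Article.select("id = ?", 10)[0]
group = article.newsgroup  # a Newsgroup object, not the integer 1
print(group.name)
print([a.subject for a in group.articles])
```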

You can even still just talk plain old SQL to the database and get
objects back if the built-in methods aren't flexible enough for you.
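
The escape hatch looks something like this. Same caveats as before: a
Python/sqlite3 sketch of the idea, with a made-up Article class, a
made-up from_sql method and made-up data, not how Class::DBI spells it.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, newsgroup TEXT,
                          subject TEXT, lines INTEGER);
    INSERT INTO article VALUES (1, 'alt.test', 'short', 12);
    INSERT INTO article VALUES (2, 'alt.test', 'long', 480);
    INSERT INTO article VALUES (3, 'comp.lang.perl.misc', 'longer', 900);
""")


class Article:
    """Hypothetical row class for the article table."""
    columns = ("id", "newsgroup", "subject", "lines")

    def __init__(self, *values):
        self.__dict__.update(zip(self.columns, values))

    @classmethod
    def from_sql(cls, sql, *args):
        """Run hand-written SQL; rows must come back in cls.columns order."""
        return [cls(*row) for row in db.execute(sql, args)]


# A query the generic search methods would not express, still
# returning ordinary Article objects.
big = Article.from_sql(
    "SELECT id, newsgroup, subject, lines FROM article "
    "WHERE lines > ? ORDER BY lines DESC", 100)
for article in big:
    print(article.subject, article.lines)
```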

Things it's not: super-fast, bug-free, wart-free. Even the API isn't
totally stable yet. But it *has* reduced the amount of legwork
required to get pretty basic functionality up and running:

[0]henry@freezer:USENET $ wc suck-from-nntp.pl USENET.pm
     49     151    1293 suck-from-nntp.pl
    151     467    3745 USENET.pm
    200     618    5038 total

(suck-from-nntp talks NNTP to various servers to fetch articles, and
creates the necessary objects; this has the side-effect of storing all
the articles found in the database.)
jason
--
``I may have agreed to something involving a goat.'' -- CJ