[H-GEN] The perennial book thread

Fri Oct 31 02:12:06 EST 2003

[ Humbug *General* list - semi-serious discussions about Humbug and     ]
[ Unix-related topics. Posts from non-subscribed addresses will vanish. ]

On 2003-10-31, Raymond Smith wrote:
> Jason Parker-Burlingham said:
> > Does anyone know of a good source of C coding examples to draw on?  It
> > doesn't necessarily have to be a guide or book built to learn from; a
> > decent collection of medium-sized programs that are written how good
> > programmers write will do the trick.
> 
> What -- GLIBC isn't good enough for you? :-P I'm not sure, but a good
> source of reviews of books is the Association of C and C++ Users' book
> reviews:
>     http://www.accu.org/bookreviews/public/index.htm

Depends on your definition of "good".  Their list of highly
recommended advanced C books includes some gems, but also some utter
rubbish.  And it is missing the seminal book.

Peter van der Linden's book jumped into my list of worst books
on C when I was sent my first review copy.  Although he agreed
to fix some of the errors I pointed out, he stuck to his guns
about most of his misconceptions.  I have never written so many
marginal notes in a book while reviewing it.  And I wish it had
been pulped at birth.

Another one is a book on C and C++ by David Spuler -- if you'd
crossed swords with him on comp.lang.c you'd already know he was
an idiot; seeing a book about C and C++ is further evidence.

Book reviews are only as good as the reviewers -- most people
who review books on C know less about it than the books' authors
and so can't offer useful opinions; and most of the authors know
less about it than they'd need to know to get a job writing it.

The best book reviews are written by people whom you know to be
competent in the subject.  Otherwise, they're little better (and
often worse) than the cover blurbs.

> Other than that, you really can't go past trying different approaches
> and getting them reviewed by a good Mentor.

The good mentor idea is excellent; just choose with care.

It's also useful to read bad code in order to see if you can
articulate what makes it bad.  I can give some examples for
this.

Start with Dan Bernstein's software -- anything will do, but if
you take something like qmail which has a well-known task, it's
probably a good starting point because you'll know that your
confusion is not over what the software does, but is a result of
the way it's written.

(I should add that there's nothing wrong with qmail from the
point of view of robustness or functionality or security, as far
as can be deduced by its success in the field.  My comments are
to do with the coding style.)

Since Bernstein believes that nobody in the world apart from him
can write correct code, he refuses to use the standard libraries
and reinvents not only the wheel but also the chisels to make it
with.  The code is fast and works but is close to unmaintainable
because it's such a shock to the reader.  Anything that you'd
expect to accomplish with a single library call ends up being
coded in maybe a dozen or twenty lines of really obscure stuff.

When I set out to modify qmail so that it would do certain things
the way I wanted instead of in the Bernstein way, I expected it
would take me about an hour, maybe half a day if the code turned
out to be interesting and I just started reading.  But it took
me a couple of days to dig through to the stuff I needed and I
most certainly did not just browse it for interest.  And that's
also why I don't use anybody else's qmail patches -- the ones
I've seen show no evidence of having been written by people who
have managed to understand the Bernstein code and so could
easily introduce bugs that I'd rather not risk.

Another rich lode of really bad code is the source to any Unix
system and its utilities -- be it System V, BSD or Linux.  Much
of the code was written by students who still had a lot to learn
and who were just having fun.  Most of them had no idea just how
much they still didn't understand and the code is a reflection
of the lack of skill they had.  (Not all the Unix/BSD/Linux code
is bad, but a great deal is.)

I've never seen Microsoft source, but it would certainly be a
source of amusement -- I've read the APIs for lots of their
system functions and they could only have been invented by
people without a clue.

I've just gone looking for some good code that's easy to get
hold of.  Perl is interesting -- it's quite readable (and the
comments are entertaining, sometimes even useful), although the
style is not what I'd choose and they do abuse the preprocessor
rather too much and there are a few too many "clever" bits where
there's no real reason.  Some of the clever stuff is fine, such
as the hacks to Henry Spencer's regex code where there are real
performance issues; but a lot of it looks gratuitous.  And, in
the absence of comments telling us otherwise, it is gratuitous.
Although I don't really recommend the Perl code as great code,
reading it might be useful -- it will illuminate some of the
interesting corners of the language (which would be a bonus for
a Perl programmer), and it does show at least one approach to
coding a significant and complex piece of software.

So then I thought I'd have a look at Python.  At first blush,
the code seemed OK, but as I read more files I realised that the
style police needed to be called out.  I would refuse to work on
Python unless they ran the entire code base through indent(1)
with some options that suited me.  And then I found a file with
this comment at the top:

/* A perhaps slow but I hope correct implementation of memmove */

OK, so they provide an implementation of memmove(3) for those
compilers that predate the 1989 standard -- how nice.  Perhaps
nobody uses this code, since any compilers still in use should
not need it; perhaps the fact that it would only be compiled by
non-standard compilers means that the incorrectness of the code,
according to the standard, is moot.  Still, coding a memmove(3)
function that was definitely correct is trivial; getting it
wrong is inexcusable.  Anyway, for a file with fewer than 20
lines of C, it's not much of an advertisement ...  So skip
Python.
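
For illustration only (this is not the Python file, and nothing above
says which defect it had), here is a minimal sketch of a memmove-style
function that stays within what the C standard guarantees.  The name
my_memmove is hypothetical.  One well-known pitfall is deciding the
copy direction with a relational comparison (dst < src) between
pointers that may belong to different objects, which the standard
leaves undefined; the sketch assumes that avoiding it is worth a
linear scan and uses only equality comparisons, which are always
defined.

```c
#include <stddef.h>

/*
 * Hypothetical name; a sketch, not a drop-in libc replacement.
 * Copies n bytes from src to dst, allowing the regions to overlap.
 */
void *my_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t i;
    int dst_starts_in_src = 0;

    /*
     * Equality between arbitrary object pointers is well defined,
     * so look for dst among the addresses src .. src + n - 1.
     */
    for (i = 0; i < n; i++) {
        if (s + i == d) {
            dst_starts_in_src = 1;
            break;
        }
    }

    if (dst_starts_in_src) {
        /* A forward copy could overwrite unread source bytes,
           so copy from the last byte down to the first. */
        for (i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    } else {
        /* Either no overlap, or dst lies before src: a forward
           copy never overwrites a source byte before reading it. */
        for (i = 0; i < n; i++)
            d[i] = s[i];
    }
    return dst;
}
```

The scan costs an extra pass over the region, so this is "perhaps
slow" in its own right; the trade is deliberate, to keep every
pointer operation defined.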

So let's have a browse through some other public source code on
this box ...  Well, 45 minutes just went up in smoke.  It's
depressing, but there really is a lot more bad code than good
code out there.

Perhaps I should define good code:  it's code that is correct
and that is easy for its human readers to understand.  To some
extent, it's also efficient -- but only as far as needed and not
at the expense of correctness (ever) or clarity (except when
performance is inadequate and profiling shows that it's needed,
and only when accompanied by the original clear code and clear
documentation explaining what was done).
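
As a rough sketch of that last condition (the function names and the
bit-counting task are made up for the purpose, not taken from any of
the code discussed above), the clear version is kept next to the
optimised one, and the comment on the fast version says what was done
and why it gives the same answer.

```c
#include <stdint.h>

/*
 * Clear reference version: test each of the 32 bits in turn.
 * Kept so that readers (and tests) have something obvious to
 * compare the fast version against.
 */
unsigned popcount_clear(uint32_t x)
{
    unsigned count = 0;
    unsigned bit;

    for (bit = 0; bit < 32; bit++) {
        if (x & (UINT32_C(1) << bit))
            count++;
    }
    return count;
}

/*
 * Optimised version, assumed here to have been justified by
 * profiling.  x & (x - 1) clears the lowest set bit of x, so the
 * loop runs once per set bit instead of once per bit position.
 * Must always return the same value as popcount_clear().
 */
unsigned popcount_fast(uint32_t x)
{
    unsigned count = 0;

    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}
```

With both versions present, a test can simply assert that they agree
over a range of inputs.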

I could go on with this topic for days, but I suspect that I
should stop here.  I thought we were supposed to do this kind of
thing on dsig, but I also saw at least one claim that the ?sig
lists were to be rolled into general -- perhaps a list mom can
set us straight?

Cheers, Greg
