[H-GEN] Shell programming [was: Re: Encrypting a tar backup]

Greg Black gjb at gbch.net
Fri Oct 25 03:40:09 EDT 2002


[ Humbug *General* list - semi-serious discussions about Humbug and     ]
[ Unix-related topics. Posts from non-subscribed addresses will vanish. ]

Jason Parker-Burlingham wrote:

| Greg Black <gjb at gbch.net> writes:
| 
| > |         $ WRONG=`cat file | grep pattern | wc -l`
| > |         $ RIGHT=$(grep -c pattern file)
| > Here again, I would disagree.  It's not grep's job to count
| > stuff and that's better left to the specialised program, wc.
| 
| This is where I started to think there may be something interesting
| going on.  I think that you and I use two quite different sets of
| idioms when shell programming.  I prefer to use all the options
| available to me, regardless of whether they really match the program's
| stated purpose, while (and I hesitate to try to summarize your point
| of view) you seem to have distinct expectations of the duties of each
| program.  I'm sure there are list members who sort into either camp.

Rather than re-state my position, I'll refer you to that paper
by Kernighan and Pike that I have mentioned previously -- they
cover the fundamentals of my attitude thoroughly and there is
good material in there for anybody who works with Unix systems.
See http://www.cs.bell-labs.com/cm/cs/doc/84/kp.ps.gz for the
full story (including some excellent background on the Unix
philosophy).
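
For anybody who wants to try the two spellings of the counting
example quoted above, here is a minimal sketch.  The scratch file and
its contents are made up for illustration, and mktemp is assumed to
be available.  The tr is there because some wc implementations pad
the number with blanks.

```shell
# Scratch file with three lines, two of which contain "pattern".
tmp=$(mktemp)
printf 'pattern one\nno match\npattern two\n' > "$tmp"

# One process: grep selects the lines and counts them itself.
count_grep=$(grep -c pattern "$tmp")

# Two processes: grep selects, wc counts.  Strip any padding blanks
# so the two results compare equal as strings.
count_wc=$(grep pattern "$tmp" | wc -l | tr -d ' ')

echo "grep -c: $count_grep, grep | wc -l: $count_wc"
rm -f "$tmp"
```

Both forms count matching lines rather than individual matches, so
they should agree on any input.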

In practice, my use of utilities depends on my previous
experience -- if somebody has added some new option to an old
tool that I know how to use, I won't discover the option in
normal circumstances.  If I want to use a little-used but
well-known feature of some tool, I may refer to the man page to
remind myself of the syntax and, if there are new features, I
may read about them.  Whether or not I remember that they exist
is subject to the vagaries of my memory, but also depends on
the suitability (as I see it) of the new feature for the tool.

And I'm not entirely consistent, as befits a hard-core Unix
user.  Although I resisted using the `z' option of GNU tar for
ages, preferring to include g{,un}zip as an element of the
command invocation, I have adopted `z' over the past year or so
because I have convinced myself that I will probably never again
have to use a tar variant that doesn't support it.  Despite
that, if I'm typing at an unfamiliar system (e.g., giving
somebody a hand at a Humbug meeting), I'm almost certain to
revert to using the separate program in the pipeline.
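
The two habits look like this side by side.  This is only a sketch:
the directory and file names are invented, mktemp is assumed to be
available, and the first form assumes a tar that understands `z'.

```shell
# Scratch area so that nothing real is touched.
work=$(mktemp -d)
mkdir "$work/src"
echo "hello" > "$work/src/file.txt"

# Built-in compression: relies on the tar in use supporting z.
(cd "$work" && tar -czf builtin.tar.gz src)

# Separate program in the pipeline: tar writes the archive to
# stdout and gzip compresses it.
(cd "$work" && tar -cf - src | gzip > piped.tar.gz)

# List the members of each archive, again in both styles.
list_builtin=$(tar -tzf "$work/builtin.tar.gz")
list_piped=$(gunzip -c "$work/piped.tar.gz" | tar -tf -)

echo "$list_builtin"
echo "$list_piped"
rm -rf "$work"
```

The pipeline form costs a few extra keystrokes but should work with
any tar that can write to stdout and read from stdin.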

| Perhaps a good discriminator would be whether one would use sort -u?

I've always thought that the capability of -u belongs in a sort
program and I dislike the non-standard syntax of uniq (one of
the programs that needs help from cat if it's to handle multiple
input files safely[1]), so I use sort -u when it can do what I
want.  If I need the extra capabilities of uniq, I sigh quietly,
RTFM, and add it to my pipeline.  On the other hand, there are
some real efficiency benefits from using -u rather than uniq
(provided that -u does what you want, of course).
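
A small sketch of the difference, using invented scratch files (and
assuming mktemp is available).  The last command uses uniq -c as one
example of an extra capability that sort -u does not offer.

```shell
work=$(mktemp -d)
printf 'pear\napple\npear\n' > "$work/foo"
printf 'apple\nfig\n' > "$work/bar"

# One process does both the sorting and the duplicate removal.
with_u=$(sort -u "$work/foo" "$work/bar")

# Same result from two processes with a pipe between them.
with_uniq=$(sort "$work/foo" "$work/bar" | uniq)

echo "$with_u"

# Something sort -u cannot do: report how often each line occurred.
sort "$work/foo" "$work/bar" | uniq -c

rm -rf "$work"
```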

| > Part of learning about shell programming (and command line use in
| > general) is understanding when you might want to use cat and when
| > you're better off not using it.[2]
| 
| And I think this is the point I was trying to make.  Your earlier
| example was a case that I had missed.

I'm sure we're basically in agreement -- this discussion is
mainly intended to give people some food for thought.

| > There are far
| > more important things to weed out of one's repertoire.
| 
| Especially if the poor sap is using C shell for scripting!

For those who don't know, there are several million excellent
reasons for not using csh for scripts; and no good reasons for
using it.

[1] To see what I mean about uniq, you can sort two files and
    strip duplicate lines with either of these:

        $ sort -u foo bar
        $ sort foo bar | uniq

    But if, in a moment of forgetfulness, you try to do it
    with this:

        $ uniq foo bar

    you will be severely punished -- you get no output and bar
    is suddenly overwritten with the contents of foo (minus any
    adjacent duplicate lines).  Oops.
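
If you want to see the damage without risking real files, something
like the following sketch will do.  It works in a throwaway directory
(mktemp is assumed to be available) and the file contents are
invented for the demonstration.

```shell
work=$(mktemp -d)
printf 'a\na\nb\n' > "$work/foo"
printf 'precious data\n' > "$work/bar"

# uniq treats its second operand as the OUTPUT file, so nothing is
# printed here ...
printed=$(uniq "$work/foo" "$work/bar")

# ... and bar now holds foo's lines with adjacent duplicates removed.
bar_now=$(cat "$work/bar")
echo "printed: '$printed'"
echo "bar now:"
echo "$bar_now"

# The safe form merges the inputs with cat (or sort) first.  bar has
# already been clobbered above, so this only shows the spelling.
cat "$work/foo" "$work/bar" | uniq

rm -rf "$work"
```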
    
Greg

--
* This is list (humbug) general handled by majordomo at lists.humbug.org.au .
* Postings to this list are only accepted from subscribed addresses of
* lists 'general' or 'general-post'.  See http://www.humbug.org.au/


