[H-GEN] perl question about sort()

Jason Parker-Burlingham jasonp at uq.net.au
Thu Feb 27 10:28:57 EST 2003


[ Humbug *General* list - semi-serious discussions about Humbug and     ]
[ Unix-related topics. Posts from non-subscribed addresses will vanish. ]

Tony Nugent <tony at linuxworks.com.au> writes:

> For a while now I've been wanting to do some complex sorting (eg, by
> specific fields) with arrays in perl scripts, but never managed to
> do it using sort()...  figuring out exactly how to use perl's sort()
> function has me somewhat confused.

Okay.

C<sort> can accept a code reference as an optional argument before the
list to be sorted:

        sort CODE LIST
        sort &byname @students;

The code reference should use the *global* (not package or "my"
variables!) as arguments.  They represent the thing being sorted, in
turn.  The code reference should not return inconsistent results.  Oh,
and don't modify $a or $b.

Let's assume for a moment that you'd like to sort a list of hashrefs,
which have a "name" and an "age" attribute.

Since the coderef can be an anonymous block, you can do it all
in-line, like so:

        sort { $a->{name} cmp $b->{name} } @students;
                OR
        sub byname {
                $a->{name} cmp $b->{name};
        }
        sort byname @students;

Note that we are sorting references to hashes, so the -> operator
comes into play; and the use of the C<cmp> operator instead of the
numeric "<=>" (or "spaceship") operator.

Similarly to sort by age (a numeric value) you'd use something like:

        sort { $b->{age} <=> $a->{age} } @students

Note that by putting $b and $a on the other side(s) of the operator,
we reverse (most of) the results returned and the list is sorted in
reverse order (run this code if you're not convinced):

        print join " ", sort { $b <=> $a } (1..14,20..27), "\n";

Okay.  What if your students are objects (as they should be).  And
they have a method call which returns their GPA, but is slow?  A naive
approach would be

        my @smarties = sort { $b->gpa <=> $a->gpa } @students;

But that could be *very* slow for millions of students, since the gpa
method gets called over and over for the same student (remember that
it's slow?)  Now we bring out the big guns:

        # This is the basic idea, I have no time to test it.
        my @smarties = map {
                $_->[0]
        } sort {
                $b->[1] <=> $a->[1]
        } map {
                [ $_, $_->gpa ]
        } @students

This is known as a Schwartzian Transform.  Reading back-to-front, we
see that the first C<map> turns the list of student objects into a
list of arrayrefs containing two elements:  the student object, and
its gpa value.  Then, these arrayrefs are sorted by their second
element.  *Then*, as a final step, another map transforms the (now
sorted) student-gpa pairs back into student objects, while preserving
the ordering.

The advantage of this is that the GPA object method is called only
once per object (in the initial C<map> phase).

Note that in at least several years of writing almost exclusively in
Perl, I have never, I think, had to write a Schwartzian Transform
sort.  I simply haven't had to deal with datasets big enough to make
it worthwhile!  I mention it only for completeness, and the
"oooh...pretty" factor.

> It seems to be highly flexable and customisable, but the logic and
> syntax of it hasn't seem to have jelled for me.

If there's something very specific in this example that you don't
understand, reply to the list and point it out.  I'll happily try
again.

> I'm wondering if anyone here is inspired to give some insights and
> examples of how it is used, and what sorts[1] of amazing things you
> can do with it.

Okay, you should also read the C<sort> section of the Perl manual (run
"perldoc -f sort") and the Perl FAQ entry on sorting
(run "perldoc -q sort").

By the way:  if you do not know already, perldoc holds most of the
answers to questions at this level (and beyond).  You already have it
installed, since you're writing Perl and presumably have an
interpreter somewhere.  You should learn how to use it by reading the
perldoc manpage ("perldoc perldoc") and perusing the "perl" manpage,
which is a table of contents for everything else.

jason
-- 
``I may have agreed to something involving a goat.''  -- CJ

--
* This is list (humbug) general handled by majordomo at lists.humbug.org.au .
* Postings to this list are only accepted from subscribed addresses of
* lists 'general' or 'general-post'.  See http://www.humbug.org.au/



More information about the General mailing list