[H-GEN] Spamassassin performance tests

Sat Apr 6 20:25:40 EST 2002

[ Humbug *General* list - semi-serious discussions about Humbug and  ]
[ Unix-related topics.  Please observe the list's charter.           ]
[ Worthwhile understanding: http://www.humbug.org.au/netiquette.html ]

On Sun Apr 07 2002 at 00:35, Nikolai Lusan wrote:

> Robert Brockway wrote:
> 
> > 100 simultaneous instances of spamassassin were having a significant effect
> > on my K6-233 fileserver.

(ouch)

BTW, spamassassin (http://www.spamassassin.org/) is easy to install:

 $ perl -MCPAN -e shell
  ... (you might need to set it up) ...
 cpan> install Mail::SpamAssassin
  ... (it may also need to install dependencies, that's ok) ...
 cpan> quit

> I discovered this the hard way: I did a run which generates a lot of email
> and my p233-MMX desktop slowed to a crawl as its load topped 15 :)

One trick with spamassassin is _not_ to let it process large
messages (greater than ~50-100k); they are usually not spam anyway.
I have this in my ~/.procmailrc file:

=========8<---- cut --------------------
# SpamAssassin checks - only for mail < 100kb
# spamd MUST be running on port 20202 (spamd -d -p 20202)
:0 fw
* < 100000
| spamc -p 20202
# if not using spamd, then do:
# | spamassassin -P
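# if the filter above failed, make procmail exit with its exit code
:0 e
{
  EXITCODE=$?
}
# file anything tagged as spam into the MH folder +spam
:0H :spam/$LOCKEXT
* ^X-Spam-Status:.*Yes
| rcvstore +spam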
=========8<---- cut --------------------
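
For anyone who wants the same size guard outside procmail (say, when
testing by hand from a shell), here is a minimal sketch.  The function
name filter_if_small and its arguments are made up for illustration and
are not part of spamassassin; it simply runs small messages through
whatever command you give it and passes big ones through untouched.

```shell
# filter_if_small LIMIT CMD [ARGS...]
# Read a message on stdin; run it through CMD only if it is smaller
# than LIMIT bytes, otherwise pass it through unchanged.
filter_if_small() {
    limit=$1
    shift
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    # wc may pad its output with spaces on some systems, so strip them
    size=$(wc -c < "$tmp" | tr -d ' ')
    if [ "$size" -lt "$limit" ]; then
        "$@" < "$tmp"
    else
        cat "$tmp"
    fi
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (assumes spamd is listening on port 20202, as in the recipe):
#   filter_if_small 100000 spamc -p 20202 < message.txt
```

The message is spooled to a temporary file first because its size has
to be known before deciding whether to filter it.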

> > Today I conducted some performance tests on Spamassassin vs its
> > spamc/spamd pair, which is supposed to improve performance.  The
> > difference is amazing.

Indeed.  Running spamassassin for each message is demonstrably slow
when you use it with procmail... using spam{c,d} does make a huge
difference.  With plain SA I'd do a fetchmail run and each message
would take 10-15 seconds or more to process with "| spamassassin -P",
but with spamc the pause is down to 1-2s or less.
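
If you want to repeat the comparison on your own box, something like
this rough sketch would do.  time_filter is a made-up helper, and the
./testmail directory (one message per file) is an assumption; the two
commands in the usage note are the ones mentioned in this thread.

```shell
# time_filter DIR CMD [ARGS...]
# Feed every file in DIR to CMD on stdin, discard the output, and
# print the elapsed wall-clock time in whole seconds.
time_filter() {
    dir=$1
    shift
    # date +%s is not strictly POSIX, but GNU and BSD date both have it
    start=$(date +%s)
    for msg in "$dir"/*; do
        [ -f "$msg" ] || continue
        "$@" < "$msg" > /dev/null
    done
    end=$(date +%s)
    echo $((end - start))
}

# Usage (assumes spamd is already running on port 20202):
#   time_filter ./testmail spamassassin -P
#   time_filter ./testmail spamc -p 20202
```

Whole seconds are coarse, so use a few dozen messages at least to get
a meaningful difference.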

BTW, running spamd or spamassassin with the -L flag helps (it stops
it doing things like DNS and other internet checks).
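
Putting the flags mentioned so far together, a spamd start-up wrapper
might look like the sketch below.  The -d, -p and -L flags and port
20202 come from this post; the start_spamd name and the SPAMD_BIN and
DRY_RUN variables are hypothetical conveniences of mine.

```shell
# start_spamd [PORT]
# Start spamd as a daemon (-d) doing local tests only (-L) on PORT.
# With DRY_RUN set, only print the command instead of running it.
start_spamd() {
    port=${1:-20202}
    set -- "${SPAMD_BIN:-spamd}" -d -L -p "$port"
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage:
#   start_spamd          # listens on port 20202
#   start_spamd 783      # or any other port you prefer
```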

> > It took maybe 200 messages running the conventional
> > spamassassin mechanism to drop my server blake to its knees.  As for
> > spamc/spamd well I'm part way through sending 2000+ emails to my test
> > users and so far it is doing it on its ear :)
> 
> This may sound like an odd question, but how were you using
> spamassassin? I have it configured using the spamass-milter plugin for
> sendmail (which uses spamc) and I get large load increases when heaps of
> mail is delivered. The milter also seems to slow sendmail down over time
> (the mail server for a non-profit organisation I do stuff for has
> spamass-milter on it, and while the load doesn't go up, the time it takes
> sendmail to do stuff does). Given that you seem to have found a way to
> remedy this situation I would be interested to know how.

I'm using it in two places... in my own ~/.procmailrc (above), and
on mail servers running versions of sendmail (8.11.x, 8.12.x) that
have had the "milter" API enabled.  There I'm using MIMEDefang (which
is perl-based) to filter emails.  (I have milter-enabled rpms
available for sendmail 8.12.3 if anyone is interested.)

MIMEDefang (http://www.roaringpenguin.com/mimedefang/) allows you to
filter email at the time that mail is being processed (sent or
received)... it gives you complete control, perl-style, over the
fate of any message (accept, reject, tempfail, alter or otherwise
hack as you may desire).

I'm using it to filter emails for (among other things) virus
attachments (there are at least four virus products that will work
with defang on linux), and it does that VERY well.  (I'm using
nai/mcafee's uvscan).

I was also using SpamAssassin with defang for a short while... both
SA and 'defang are perl and they work well (and efficiently)
together -- mimedefang is already daemonised, it uses the SA perl
modules directly and does not need to use spamd.  It is still
necessary not to allow SA to filter large messages (because of the
performance hit).

Until I can customise SA some more, I have it just adding a "spam
rating" header; it does not fail, tempfail or quarantine any
messages at all.

  SA is excellent, but it isn't 100% accurate at identifying spam
  (I'd say that it gives false positives on around 2-3% of my own
  email.  I'd rather see spam get through than non-spam be blocked).

  Very few spams (if any) are missed, but its "threshold" score
  level of "5.0 hits" is too low, and even 7.0 gets a lot of false
  positives -- but raising it then allows more real spam to slip
  through.

  I'll end up making it block again, but set the threshold level to
  something like 10.0 to catch the more "blatant" spammers, and try
  to fine-tune the white/blacklists it is using.
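
For the spamc/procmail setup, the threshold and whitelist can go in
SpamAssassin's per-user preferences.  A minimal sketch, assuming the
usual ~/.spamassassin/user_prefs location and the required_hits option
name used by the 2.x releases (later releases call it required_score).
The add_sa_prefs helper and the whitelisted address are placeholders
of mine, not anything from my real config.

```shell
# add_sa_prefs FILE
# Append a higher spam threshold and a sample whitelist entry to FILE.
add_sa_prefs() {
    prefs=$1
    mkdir -p "$(dirname "$prefs")"
    cat >> "$prefs" <<'EOF'
# Only tag mail scoring 10.0 or more (the stock threshold is 5.0)
required_hits 10.0
# Hypothetical trusted sender; replace with real addresses
whitelist_from friend@example.com
EOF
}

# Usage:
#   add_sa_prefs "$HOME/.spamassassin/user_prefs"
```

The function appends rather than overwrites, so check the file for an
earlier required_hits line before running it twice.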

> Nikolai

Die spam die.  :)

There are other things you can use with MIMEDefang besides SA and
virus scanners, like:
- Anomy::HTMLCleaner (http://mailtools.anomy.net) to sanitise
  potentially malicious html/java, and
- wvWare (mswordview, http://www.wvware.com) to convert .doc
  attachments to "safe" html.
Because the filter is a perl include that you write yourself,
M'defang is very powerful and highly customisable.

It's a real pity that the internet has come to the point where email
needs to be so thoroughly filtered and censored...

Cheers
Tony
---*#*=-=*#*=-=*#*=-=*#*=-=*#*=-=*#*=---
  Tony Nugent <Tony at linuxworks.com.au>
  LinuxWorks - Gold Coast Qld Australia
