[H-GEN] Experiences with SpamAssassin
Greg Black
gjb at gbch.net
Fri Nov 22 22:28:32 EST 2002
[ Humbug *General* list - semi-serious discussions about Humbug and ]
[ Unix-related topics. Posts from non-subscribed addresses will vanish. ]
Joel Michael wrote:
| On Fri, 2002-11-22 at 17:26, Greg Black wrote:
| > Total archive size   84516
| > Actual spam           7918
| > False positives        218  ( 0.28% of good messages)
| > False negatives       1652  (20.86% of actual spam)
| > Time to process       13:58
|
| Actually, I'd be interested to see how the numbers above change over
| time. Say, how much different are the scores from this month than the
| scores from the first month you have email? Or how scores for 1998 are
| different to scores in 2002.
I'm not going to do a month-by-month analysis -- there aren't
enough data points per month to draw meaningful conclusions
(and I'd have to do some actual work to extract the data
anyway). Nor is there any point in analysing the false
positives -- there are so few that this would also be pretty
meaningless.
However, taking the known spam and the failures to identify it,
we can see some trends. The table shows the number of messages
in each category per year, along with each count's share of that
year's total spam.
Year      Correctly Identified    Not Identified    Total
1998           244   68%            116   32%         360
1999           223   44%            283   56%         506
2000           268   60%            180   40%         448
2001           846   74%            290   26%        1136
2002          4471   80%           1134   20%        5605
Totals        6052   75%           2003   25%        8055
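
For anyone who wants to check the percentages, here is a minimal
Python sketch that recomputes them from the raw counts in the table.
The names (`SPAM_BY_YEAR`, `percentages`) are made up for this
illustration and are not part of SpamAssassin. It uses only the
table's figures, whose totals differ from the summary quoted earlier.

```python
# Raw counts from the table: year -> (correctly identified, not identified).
SPAM_BY_YEAR = {
    1998: (244, 116),
    1999: (223, 283),
    2000: (268, 180),
    2001: (846, 290),
    2002: (4471, 1134),
}


def percentages(caught, missed):
    """Return (caught %, missed %) of the total, rounded to whole numbers."""
    total = caught + missed
    return round(100 * caught / total), round(100 * missed / total)


def totals(rows):
    """Sum the caught and missed counts over all years."""
    caught = sum(c for c, _ in rows.values())
    missed = sum(m for _, m in rows.values())
    return caught, missed


if __name__ == "__main__":
    for year, (caught, missed) in sorted(SPAM_BY_YEAR.items()):
        pc, pm = percentages(caught, missed)
        print(f"{year}  {caught:5d} {pc:3d}%  {missed:5d} {pm:3d}%  {caught + missed:5d}")
    all_caught, all_missed = totals(SPAM_BY_YEAR)
    pc, pm = percentages(all_caught, all_missed)
    print(f"Totals {all_caught:5d} {pc:3d}%  {all_missed:5d} {pm:3d}%  {all_caught + all_missed:5d}")
```
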
Treating 1999 as an outlier, there has been a general trend for
SpamAssassin to get better over time at identifying spam. Since
it's a recent tool and was presumably written to cater for the
kind of spam we've been seeing over the past couple of years,
that's not a surprise -- but it's encouraging. However, even at
this year's 80% detection rate, I'd still get to see 3 or 4 spam
messages a day that got through the barrier.
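
The "3 or 4 a day" figure can be checked with a rough Python sketch
like the one below. It assumes the 2002 row covers 1 January 2002 up
to the date of this post (22 November 2002). That start date is an
assumption, not something stated in the data, and `missed_per_day` is
just an illustrative name.

```python
from datetime import date

MISSED_2002 = 1134            # "Not Identified" count for 2002 from the table
START = date(2002, 1, 1)      # assumption: the 2002 data starts on 1 January
END = date(2002, 11, 22)      # date of this post


def missed_per_day(missed, start, end):
    """Average number of unidentified spam messages per day, inclusive of both dates."""
    days = (end - start).days + 1
    return missed / days


if __name__ == "__main__":
    print(f"{missed_per_day(MISSED_2002, START, END):.2f} missed spam messages per day")
```
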
As things stand, I'll keep using SpamAssassin for a while and
see how it goes. Later, I'll have a look at crm114 as it seems
to be an interesting tool. In the end, I suspect that my initial
negative reaction to TMDA when it was first announced will have
waned sufficiently for me to adopt it as the last bastion.
Part of the problem for TMDA is that quite a bit of the spam I
receive comes courtesy of mailing lists that allow anybody to
post -- I think this is stupid in the 21st century, but I don't
get to manage those lists and I'm not willing to give them up.
Greg