Dan Jones's Blog
Fighting Spam with Spamassassin

By Dan Jones in Reader

Posted in Spam, Networking, Email on August 14, 2008 at 8:08 am

Well, after many years with no anti-spam measures at all (and manually deleting ~200 items a day), I decided it was time to move my mail host and set up proper spam filtering.

I already have a home Samba server running Debian, which also acts as a mini desktop. I decided to use it, as my mail volume isn’t huge: I get ~20 valid emails a day and ~200-500 spams, depending on the day of the week.

SpamAssassin looked to be the premier anti-spam solution for Linux, and I chose to integrate it with Exim on Debian. It took a while to learn Exim, but I’m now mostly impressed with its configuration. I’ve used Dovecot as an IMAP server. All of these are the standard Debian stable packages.

My basic procedure: I installed the packages, then followed this guide to get a basic system up and running. I then pointed inbound SMTP for a “test” domain name at the box so I could fully test all the options and tune the spam filtering.

Tricks the above guide missed:

Use CPAN (perl -MCPAN -e shell) to install Net::DNS. Without this vital step SpamAssassin skipped ALL of its DNS tests, which are quite good for scoring.
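A quick way to see whether the Net::DNS step is still needed is to ask Perl to load the module. This is only a minimal sketch: the variable name netdns_status is made up for illustration, and the install line is the non-interactive form of the CPAN shell command (it would normally be run as root).

```shell
# Check whether Perl can load Net::DNS, which SpamAssassin needs for its DNS tests.
if perl -MNet::DNS -e 1 2>/dev/null; then
    netdns_status="installed"
else
    netdns_status="missing"
    # Non-interactive equivalent of installing from the CPAN shell (run as root).
    echo "Install it with: perl -MCPAN -e 'install Net::DNS'"
fi
echo "Net::DNS is $netdns_status"
```

After installing, running spamassassin -D --lint should show in its debug output whether Net::DNS was found. On Debian, the libnet-dns-perl package should provide the same module if you prefer to stay with packages.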

Bayesian filtering.

  • Set this up to use a system-wide database, in a folder you control with world read/write access. The default (a per-user database) isn’t right for this setup.
  • You may wish to increase the maximum size of the Bayes database. I increased mine to 10 times the default.
  • It needs 200 spams and 200 non-spams to be learnt before it is operational (at first I did not realise this). I fed Bayes a folder of 2000 spams, and let it read my archive of personal mails (already filtered of spam) as non-spam (3400 items). This trained the spam filter quite well. I used a variation of this script.
  • Running sa-learn with -D (debug) tends to show up faults in your SA config.
  • Increasing the score of BAYES_99 gives better results, for me at least.
  • I’ve set up a “learn as spam” folder in my mailbox, whose contents are learnt and deleted every 6 hours (i.e. I drag any spam that makes it through SA into this folder).
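The Bayes settings above might look roughly like this in SpamAssassin's local.cf. This is a sketch under assumptions rather than my exact config: the database path and the BAYES_99 score are example values, so pick your own.

```
# /etc/spamassassin/local.cf (fragment)
use_bayes 1

# System-wide database instead of the per-user default.
# Example path only; the directory must be writable by every user that runs SA.
bayes_path /var/lib/spamassassin/bayes/bayes
bayes_file_mode 0777

# Ten times the default of 150000 tokens.
bayes_expiry_max_db_size 1500000

# Defaults, shown for reference: Bayes scores nothing until
# this many spam and non-spam messages have been learnt.
bayes_min_spam_num 200
bayes_min_ham_num 200

# Example value only; raise or lower to taste.
score BAYES_99 4.5
```

Running spamassassin --lint after editing should flag any typos in the file.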

Setting SpamAssassin up is NOT easy, and requires a lot of tinkering to get it running as you want (hence my playing with a test domain). Once complete, however, it’s a brilliant system, in my opinion at least.

Now it’s up and running, only 4 spams have hit my mailbox (though I’m still storing all spam; the aim is to stop storing very high-scoring spams in future and keep only “uncertain” results). Right now, with ~5000 spams not hitting my mailbox, I’m a happy bunny.

I believe SpamAssassin is also available in a Windows version. For Exchange users with nothing else in place, it may be worth a look.
