Spam Filtering with SpamAssassin and procmail

(Updated May 26, 2006)

The VV server has two software packages that are important for detecting and filtering spam: SpamAssassin (man page) and procmail (man page). SpamAssassin detects spam according to user-specified rulesets and parameters and marks it as such, so that procmail can then file it in a separate mail folder (and keep it out of your INBOX).

SpamAssassin

SpamAssassin is configured by creating a .spamassassin directory in your home directory, and editing a file called user_prefs in that directory:

mkdir .spamassassin
cd .spamassassin
pico user_prefs

In there, you can use various commands, but the most popular are:

whitelist_from email
.. where "email" is the e-mail address whose messages should never be marked as spam (basically a "safe list"). The address may contain * as a wildcard, for example *@example.com to cover a whole domain. You can use this command as many times as you wish.

use_auto_whitelist 0
SpamAssassin has something called auto whitelist, which I've personally found completely useless. In fact, it increases spam levels for me. The SpamAssassin home page has more information about the Auto-Whitelist feature.

required_hits 3.0
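SpamAssassin works on a point system: the more "spam rules" a message matches, the more "spam points" it is assigned. The number in this command is the score from which SA considers a message "spam". SpamAssassin's stock default is 5.0 points, although a server's site-wide configuration may set a different value. (Since SpamAssassin 3.0 this setting is also available under the name required_score; required_hits is the older name and is still accepted.)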
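
To see how the threshold plays out on a message you have already received, you can read the score back from the X-Spam-Status header that SpamAssassin adds. Here is a minimal shell sketch, assuming the SpamAssassin 3.x header layout (score=... required=...); the function name spam_score is made up for illustration and is not part of SpamAssassin.

```shell
# Print "<score> <threshold>" from a message's X-Spam-Status header.
# Assumes the SpamAssassin 3.x header layout, e.g.
#   X-Spam-Status: Yes, score=7.2 required=3.0 tests=...
# spam_score is a hypothetical helper name, not part of SpamAssassin.
spam_score() {
    # Stop at the first blank line so that only the header is inspected.
    sed -n -e '/^$/q' \
        -e 's/^X-Spam-Status: .*score=\([-0-9.]*\) required=\([-0-9.]*\).*/\1 \2/p'
}
```

Run it as spam_score < message.txt, where message.txt is a single saved message. If the header is missing or uses a different layout, the function prints nothing.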

score ALL_TRUSTED 0

Warning: This is a Bad Idea™. Don't do this. I was personally too lazy to properly configure the trusted_networks parameter, so I made SpamAssassin effectively ignore any information provided by it. In fact, properly configuring this parameter probably yields better results. For more information, see Bowie Bailey's comments.

The following paragraph simply explains what I did. YMMV.

ALL_TRUSTED is one of the rules that come with SpamAssassin. I've personally found it makes a lot of mistakes, similar to the auto whitelist I was talking about. By assigning it a score of 0, I disable it completely. You can try using the rule by simply omitting this line, or you can see a list and description of all rules, including ALL_TRUSTED.

In the end, the user_prefs file should look something like this:

whitelist_from mom@aol.com
whitelist_from grandma@aol.com
whitelist_from *@vv.carleton.ca

use_auto_whitelist 0
required_hits 3.0
score ALL_TRUSTED 0
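
If you'd rather not type the file into an editor, the same contents can be written from the shell. This is only a sketch: write_user_prefs is a hypothetical helper name, the addresses are the placeholders from the sample above, and running it overwrites any user_prefs file already in the target directory.

```shell
# Write the sample preferences to <dir>/user_prefs without opening an editor.
# write_user_prefs is a hypothetical helper; the addresses are the
# placeholders from the sample above. Any existing file is overwritten.
write_user_prefs() {
    prefs_dir=$1
    mkdir -p "$prefs_dir" || return 1
    cat > "$prefs_dir/user_prefs" <<'EOF'
whitelist_from mom@aol.com
whitelist_from grandma@aol.com
whitelist_from *@vv.carleton.ca

use_auto_whitelist 0
required_hits 3.0
score ALL_TRUSTED 0
EOF
}

# Typical use:
#   write_user_prefs "$HOME/.spamassassin"
# If the spamassassin command is on your PATH, its --lint option reports
# configuration lines it cannot parse:
#   spamassassin --lint
```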

procmail

Once SpamAssassin has determined whether a message is spam or not, we need to somehow sort the messages into our normal INBOX or a separate spam folder (or even delete spam on the spot!). procmail allows us to do that, using a file called .procmailrc (man page) that you create in your home directory.

pico .procmailrc

The .procmailrc file uses a strange syntax, but fortunately once you've created the file, you don't have to modify it again. Here are some of the most common lines:

MAILDIR=$HOME/mail
Makes sure procmail knows where to store your mail once it's been sorted.

:0fw
| spamassassin

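Tells procmail to pass your mail through SpamAssassin as a filter before sorting it any further: the f flag marks the recipe as a filter, and the w flag makes procmail wait for SpamAssassin to finish and check its exit code. This allows SpamAssassin to analyze the message and flag it as spam where appropriate.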
(Note that, strictly speaking, your mail has already passed through SpamAssassin before getting to this point. However, it seems to me that on the first try, SpamAssassin doesn't properly process your user_prefs file. I haven't figured out exactly why that is.)

:0
* ^X-Spam-Flag: YES
spam

Tells procmail to check the result of SpamAssassin's analysis and, if the message is flagged as spam, to deliver it to a separate mail folder called "spam" (you can change the name to whatever you like). This mail folder is accessible from Pine through the Folder List screen, in case some legitimate mail was mistakenly taken for spam and filed there.
If you're absolutely 100% positively sure you want any messages that are flagged as spam (even if some of your real e-mail might be flagged) to be permanently deleted, use /dev/null instead of spam.
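
Before trusting the recipe with /dev/null, it can help to check by hand which saved messages it would catch. The following shell sketch imitates the recipe's condition; flagged_as_spam is a hypothetical helper name, and the sketch assumes procmail's default matching behaviour (header only, letter case ignored).

```shell
# Succeeds (exit status 0) if the message on stdin carries the header the
# recipe above looks for. Like procmail's default matching, this only
# inspects the header (everything before the first blank line) and
# ignores letter case. flagged_as_spam is a hypothetical helper name.
flagged_as_spam() {
    sed '/^$/q' | grep -i '^X-Spam-Flag: YES' > /dev/null
}
```

For example, flagged_as_spam < message.txt && echo "would be filed as spam", where message.txt is a single saved message.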

At this point, since it's the end of your .procmailrc file, procmail will simply deliver the rest of your messages to your normal INBOX. Here's a sample .procmailrc file:

MAILDIR=$HOME/mail

:0fw
| spamassassin

:0
* ^X-Spam-Flag: YES
spam
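
With the sample file above, flagged mail ends up in $HOME/mail/spam. Besides Pine, a quick way to skim that folder for misfiled legitimate mail is to list its senders and subjects from the shell. This is a rough sketch, assuming the folder is a single mbox-format file (which is what procmail writes when the destination name has no trailing slash); list_folder is a hypothetical helper name.

```shell
# List the sender and subject lines found in an mbox-format folder, so
# misfiled legitimate mail is easy to spot. This is a plain text search,
# so a body line that happens to start with "From: " or "Subject: " is
# listed too. list_folder is a hypothetical helper name.
list_folder() {
    grep -E '^(From|Subject): ' "$1"
}

# Typical use with the sample file above:
#   list_folder "$HOME/mail/spam"
```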

Links and Contact

Even though I find the SpamAssassin documentation somewhat incomplete, it is still the authoritative reference for the package. Make sure to check it for a list of the commands and rules that SpamAssassin uses.

For information on procmail, the .procmailrc man page is the ultimate reference. The syntax is a bit awkward at first, but just keep the documentation and your editor open at the same time and you should be fine.

If you have any problems or any comments, feel free to e-mail me at cat@vv.carleton.ca.