Internet Express for Tru64 UNIX Version 6.8 Administration Guide (14233)

Second, the non-spam message group is fed to bogofilter. Again, each message is broken down
into word tokens, scored and recorded in the bogofilter database as non-spam. The following
command is used to register a set of non-spam messages collected in mbox:
$ bogofilter -n -M mbox # non-spam messages
At the end of each training run, bogofilter saves its updated database in a file called
.bogofilter/wordlist.db.
Over the course of time, spam message content will change. Periodic training runs with new
spam and valid message sets are necessary to keep bogofilter's internal database current.
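A periodic training run of this kind could be scripted along the following lines. This is a minimal sketch, not part of the product: the function name retrain and the BOGOFILTER override variable are hypothetical, the mbox file names are placeholders, and the -s option (register as spam, see bogofilter(1)) is assumed to mirror the -n invocation shown above.

```shell
#!/bin/sh
# Retrain bogofilter from two mbox files: one of spam, one of non-spam.
# BOGOFILTER is a hypothetical override so the command can be replaced
# (for example, by a stub) without editing the function.
retrain() {
    spam_mbox=$1
    ham_mbox=$2

    # Refuse to start if either input file is missing or unreadable.
    for f in "$spam_mbox" "$ham_mbox"; do
        if [ ! -r "$f" ]; then
            echo "retrain: cannot read $f" >&2
            return 1
        fi
    done

    # Register the spam set first, then the non-spam set.
    "${BOGOFILTER:-bogofilter}" -s -M "$spam_mbox" || return 1
    "${BOGOFILTER:-bogofilter}" -n -M "$ham_mbox" || return 1
}
```

For example, retrain spam.mbox good.mbox could be run from cron at whatever interval suits the site.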
5.4.2 Filtering with Bogofilter
Once the bogofilter database has been primed, the bogofilter command can be used to filter new
messages. When a mail message is filtered against a trained database, bogofilter returns an exit
status of 0 for spam, 1 for non-spam, 2 for unsure, and 3 for I/O or other errors. Here is an example:
$ bogofilter new-messages
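Because the result is reported through the exit status, a calling script has to inspect that status after the command finishes. The sketch below maps the four values listed above to labels. The function names bogo_label and classify_file and the BOGOFILTER override variable are hypothetical, and the invocation simply mirrors the example above.

```shell
#!/bin/sh
# Translate a bogofilter exit status into a human-readable label.
# The numeric values are the ones listed in the text (0, 1, 2, 3).
bogo_label() {
    case "$1" in
        0) echo "spam" ;;
        1) echo "non-spam" ;;
        2) echo "unsure" ;;
        *) echo "error" ;;
    esac
}

# Classify one file and print "<file>: <label>".
# BOGOFILTER is a hypothetical override so the command can be stubbed.
classify_file() {
    status=0
    # Only the exit status is used here, so any textual output is discarded.
    "${BOGOFILTER:-bogofilter}" "$1" > /dev/null || status=$?
    echo "$1: $(bogo_label "$status")"
}
```

Calling classify_file new-messages would then print the file name followed by one of the four labels.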
You can set many of the options that determine how bogofilter operates on the command line
(see bogofilter(1) for more details). The file /usr/internet/etc/bogofilter.cf
can be used to set additional parameters that affect its operation. Samples of all of the
parameters are provided in the file /usr/internet/etc/bogofilter.cf.example. Status
and logging messages can also be customized.
5.4.3 Filter Integration with Other Tools
The following sections describe how bogofilter can be integrated with other e-mail tools.
5.4.3.1 Using Bogofilter with procmail
The following procmail rule takes mail on stdin and saves it to the file spam if bogofilter
classifies it as spam:
:0HB:
* ? bogofilter
spam
This similar rule will also register the tokens in the mail according to the bogofilter classification:
:0HB:
* ? bogofilter -u
spam
If bogofilter fails (returning 3), the message is treated as non-spam.
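A failure is treated as non-spam because a procmail condition of the form "* ? command" matches only when the command exits with status 0. The shell sketch below mimics that behaviour outside procmail, purely as an illustration. The function name route_message, the folder label "default", and the BOGOFILTER override variable are hypothetical.

```shell
#!/bin/sh
# Mimic a procmail "* ? bogofilter" condition: the recipe matches only when
# the command exits with status 0, so statuses 1, 2, and 3 all fall through.
# route_message and BOGOFILTER are hypothetical names used for illustration.
route_message() {
    # The message is supplied on standard input, as procmail does.
    if "${BOGOFILTER:-bogofilter}" < "$1" > /dev/null; then
        echo "spam"       # status 0: file the message in the spam folder
    else
        echo "default"    # any other status: continue with normal delivery
    fi
}
```

With this logic, an I/O error (status 3) and an unsure result (status 2) both leave the message on the normal delivery path.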
The following recipe does three things:
Spam-bins anything that bogofilter rates as spam
Registers the words in messages rated as spam as spam
Registers the words in messages rated as non-spam as non-spam
With this in place, the user normally needs to intervene (with -Ns or -Sn) only when
bogofilter miscategorizes a message.
# filter mail through bogofilter, tagging it as spam and
# updating the wordlist
:0fw
| bogofilter -u -e -p
# if bogofilter failed, return the mail to the queue, the MTA will
# retry to deliver it later
# 75 is the value for EX_TEMPFAIL in /usr/include/sysexits.h
:0e