Simple Spam Filtering

Version: 2.5 (12/13/2002)
http://yendi.cc.cmu.edu/work/s-spam.html

1.0 Background

Sieve is a mechanism for creating email filters. A Sieve script is stored on the server and applied to all incoming email. For example, a script can automatically file an email message into a given folder.

SpamAssassin examines each incoming email message and, based on a set of rules, generates a score: the larger the score, the greater the chance that the message is spam.

When the score is above 5, the Andrew MX servers insert the following header into the message:

    X-Spam-Warning: 83 (CLICK_BELOW,CLICK_HERE_LINK,WEB_BUGS,CTYPE_JUST_HTML)

The numeric value in the header is the SpamAssassin score multiplied by 10, so the 83 in the example corresponds to a score of 8.3. This makes the value easier for Sieve to handle, since Sieve does not support floating-point numbers. The text in parentheses lists the rules that were matched.
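As an illustration of the header format described above, the following minimal Python sketch splits a header value into the unscaled score and the list of matched rules. The names `SpamWarning` and `parse_spam_warning` are hypothetical, and the sketch assumes the value always has the shape shown in the example (an unsigned integer, optionally followed by a parenthesized, comma-separated rule list).

```python
import re
from typing import List, NamedTuple


class SpamWarning(NamedTuple):
    """Parsed form of an X-Spam-Warning header value (hypothetical type)."""
    score: float       # SpamAssassin score, i.e. the header number divided by 10
    rules: List[str]   # names of the rules that matched


# Assumed shape: "<digits> (<RULE>,<RULE>,...)"; the rule list is optional.
_PATTERN = re.compile(r"^\s*(\d+)\s*(?:\(([^)]*)\))?\s*$")


def parse_spam_warning(value: str) -> SpamWarning:
    """Parse the text that follows 'X-Spam-Warning:' in a message header."""
    match = _PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognized X-Spam-Warning value: {value!r}")
    scaled = int(match.group(1))          # e.g. 83
    rules_text = match.group(2) or ""     # e.g. "CLICK_BELOW,WEB_BUGS"
    rules = [name.strip() for name in rules_text.split(",") if name.strip()]
    return SpamWarning(score=scaled / 10, rules=rules)
```

For the example header, `parse_spam_warning("83 (CLICK_BELOW,CLICK_HERE_LINK,WEB_BUGS,CTYPE_JUST_HTML)")` yields a score of 8.3 and four rule names.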

Sieve can use the information that SpamAssassin provides to automatically file messages into an appropriate folder. While a number of users have already written their own scripts, writing such a script is not a trivial task.

The goal of this project is to create a simple web interface that lets users easily file messages tagged as spam into a folder.
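To make the goal concrete, here is a rough Python sketch of the kind of Sieve script such an interface might generate. It is not the project's actual generator: the function name `build_spam_script` and the default folder name `INBOX.Spam` are assumptions made for illustration. Because the header is only present on messages the MX servers have tagged, the sketch tests for the header's existence rather than comparing the numeric value.

```python
def build_spam_script(folder: str = "INBOX.Spam") -> str:
    """Return a Sieve script that files tagged messages into `folder`.

    The folder name is a hypothetical default; a real interface would take
    it from the user.
    """
    # Sieve quoted strings use backslash escapes for backslash and quote.
    escaped = folder.replace("\\", "\\\\").replace('"', '\\"')
    return (
        'require ["fileinto"];\n'
        "\n"
        "# File anything the MX servers tagged as spam.\n"
        'if exists "X-Spam-Warning" {\n'
        f'    fileinto "{escaped}";\n'
        "}\n"
    )


if __name__ == "__main__":
    print(build_spam_script())
```

Filing on a stricter threshold would require a numeric comparison on the header value, which this sketch deliberately leaves out.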

2.0 Requirements

3.0 Implementation Guidelines

3.1 White/Black List Issues

3.2 Handling of Existing Sieve scripts

In general, if the CGI detects an existing Sieve script that neither it nor the vacation CGI generated, the user should be given the option to deactivate that script and enable the generated script.

It may be difficult to determine whether an existing vacation script was generated by the vacation CGI or written by the user. In this case, the recommendation is to use the existing vacation code to determine whether the script in question is a vacation script. If so, treat it like a generated vacation script. To be paranoid, also make sure it contains only vacation, with no fileinto or any other construct that the current vacation CGI does not generate.
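The paranoid check could be approximated along the following lines. This Python sketch is only a heuristic and does not reflect the existing vacation code: the function name `is_vacation_only` and the set of disallowed actions are assumptions. It ignores quoted strings and comments, then looks at the remaining identifiers. It does not treat multi-line `text:` strings specially, so words in such a message body are seen as identifiers; this can only make the check more conservative.

```python
import re

# Matches, in order of preference: a quoted string, a hash comment,
# a bracket comment, or a bare identifier (captured as "word").
_TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'
    r"|#[^\n]*"
    r"|/\*.*?\*/"
    r"|(?P<word>[A-Za-z_][A-Za-z0-9_]*)",
    re.DOTALL,
)

# Assumed list of actions that a pure vacation script should not contain.
OTHER_ACTIONS = {"fileinto", "redirect", "reject", "discard"}


def is_vacation_only(script: str) -> bool:
    """Heuristically decide whether a Sieve script only does vacation."""
    words = {
        m.group("word").lower()   # Sieve identifiers are case-insensitive
        for m in _TOKEN.finditer(script)
        if m.group("word")
    }
    return "vacation" in words and not (words & OTHER_ACTIONS)
```

A script that this function rejects would then be handled like any other script the CGI did not generate.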

An earlier version of the vacation CGI allowed users to edit a Sieve script that the CGI had not generated. However, when the user tried to save the script back to the server, the vacation CGI would not allow it. The new system should not let users edit a script that they cannot save back.

4.0 Implementation Roles

Changelog

2.5 - 12/12/2002 - Clarified the white/black list precedence
                   rule. Described why we needed to create a folder externally
2.4 - 12/10/2002 - Updated formatting
2.3 - 11/07/2002 - Updates from Lerchey
2.2 - 09/04/2002 - Revisions after initial meeting
0.1 - 08/29/2002 - Initial draft