Bayespam

bayespam 0.9.2 - The qmail spam filter that learns

NEW! Version 0.9.2

Changes include:

  • MIME support and decoding
  • DBM support
  • Incremental add/remove of emails
  • Changed to simple 0/1 exit value
  • More control over .qmail setup
  • Includes filter test script
  • Command-line arguments
  • Directory recursion
  • Ignore files over certain size
  • Ignore all-numeric tokens, HTML comments
  • Ignore duplicate "interesting" tokens
  • Switchable case sensitivity
  • Better error checking
  • Various optimizations

I recently read an article on Slashdot that pointed to a paper by Paul Graham called "A Plan for Spam". In it, Mr. Graham describes a spam filter he is working on that uses Bayesian classification to determine whether a particular piece of email is spam.

I was inspired, so I immediately went out and wrote my own, and here it is. Bayespam is a qmail spam filter written in Perl, using Bayesian classification to filter out unsolicited commercial email. I wrote it with ease of installation and use in mind, and I encourage you to give it a try. Bayespam actually learns as you give it more spam to process, so it should become better and better the longer you use it.
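For readers curious what this kind of classification looks like in practice, here is a minimal sketch of the scoring scheme described in "A Plan for Spam". It is not Bayespam's actual code: it is written in Python rather than Perl, all function names (tokenize, train, token_probability, spam_score) are hypothetical, and the constants (doubled good counts, a five-occurrence minimum, 0.4 for unfamiliar tokens, the 15 most interesting tokens) follow Graham's paper as I understand it, so Bayespam's own choices may differ. It assumes both the good and the bad corpus contain at least one message.

```python
import math
import re
from collections import Counter

# Assumed token definition: runs of letters, digits, dollar signs,
# apostrophes and dashes. Bayespam's real tokenizer may differ.
TOKEN_RE = re.compile(r"[A-Za-z0-9$'-]+")


def tokenize(text, case_sensitive=False):
    """Split text into tokens, skipping all-numeric tokens."""
    if not case_sensitive:
        text = text.lower()
    return [t for t in TOKEN_RE.findall(text) if not t.isdigit()]


def train(messages):
    """Count how often each token occurs across a list of messages."""
    counts = Counter()
    for msg in messages:
        counts.update(tokenize(msg))
    return counts


def token_probability(token, good, bad, ngood, nbad):
    """Estimate the chance that a message containing `token` is spam."""
    g = 2 * good.get(token, 0)  # good counts doubled to reduce false positives
    b = bad.get(token, 0)
    if g + b < 5:
        return 0.4  # rarely seen token: treated as mildly innocent
    bad_freq = min(1.0, b / nbad)
    good_freq = min(1.0, g / ngood)
    p = bad_freq / (good_freq + bad_freq)
    return max(0.01, min(0.99, p))  # never fully certain either way


def spam_score(text, good, bad, ngood, nbad, n_interesting=15):
    """Combine the most interesting token probabilities into one score."""
    # Using a set means a repeated token is only counted once.
    probs = [token_probability(t, good, bad, ngood, nbad)
             for t in set(tokenize(text))]
    # "Interesting" means farthest from the neutral value 0.5.
    top = sorted(probs, key=lambda p: abs(p - 0.5), reverse=True)[:n_interesting]
    prod_spam = math.prod(top)
    prod_ham = math.prod(1 - p for p in top)
    return prod_spam / (prod_spam + prod_ham)


if __name__ == "__main__":
    # Tiny made-up corpora, purely for illustration.
    spam_msgs = ["buy viagra now"] * 5
    ham_msgs = ["meeting notes attached"] * 5
    bad, good = train(spam_msgs), train(ham_msgs)
    for msg in ("buy viagra now", "meeting notes attached"):
        print(msg, "->", spam_score(msg, good, bad, len(ham_msgs), len(spam_msgs)))
```

The "learning" comes entirely from the two token counts: each new message added to the good or bad corpus shifts the per-token probabilities, which is why the filter should improve as it sees more mail. Graham's paper treats a message as spam when the combined score exceeds 0.9.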

BEWARE! Bayespam is really a proof-of-concept piece of software. While I have run bayespam on my own email server, I can't guarantee how well it will work or that it won't crash and burn your machine (though I don't think it will). Please use it at your own risk, but if you do use it, please report back to me on how well it works for you. And please don't complain about how C-looking my Perl is...I'm a C programmer!

Bayespam requires the use of the following modules:

Be sure to download the modules you need before you use Bayespam. If you're not sure, try running the bayestest.pl script -- it will complain plenty if the right modules aren't there!

download current (0.9.2) here

download 0.9 here

Attachment               Size
bayespam-0.9.2.tar.gz    14.41 KB
bayespam-0.9.tar.gz      10.72 KB
