Re: Speak Now or be Unsubscribed!

From: Sean Chittenden <sean(at)chittenden(dot)org>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: sfpug(at)postgresql(dot)org
Subject: Re: Speak Now or be Unsubscribed!
Date: 2003-03-16 20:28:10
Message-ID: 20030316202810.GI66903@perrin.int.nxad.com
Lists: sfpug

> > Wait! This seems like a really dumb idea, no offense. Marc runs the
> > list servers, right? I think it'd be very easy for him to integrate
> > bmf into the mail delivery process and solve the problems for everyone
> > for all of the PostgreSQL lists. One better, Marc uses FreeBSD,
> > which, oddly enough, earlier today I created a port for. As soon as
> > the freeze is lifted I'm going to commit the port, but until then,
> > check out the shar file here:
>
> Um, what's BMF?

For the FreeBSD users:

http://people.FreeBSD.org/~seanc/bmf.shar

For the rest of the world:

http://sourceforge.net/projects/bmf/

It's a really fast, very small Bayesian Mail Filter. In 2002, I
received over 90K pieces of SPAM. 2003 was on track for well over
120K, and that rate had been picking up. I added bmf to the mix (top
of my personal delivery rules), and now I'm filtering out all of the
SPAM, both from the mailing lists that I subscribe to and from mail
sent to me personally. I'm estimating no more than 50 spam emails for
the remainder of the year, hopefully fewer. That's a 99.962% reduction
in SPAM, and it took me more time to type this email out than it did
to set up and train.
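
A quick sanity check of that percentage, as a sketch only. The
function names are made up for illustration, and the 120K baseline is
just the lower bound mentioned above, not a measured figure. Working
backwards, 50 messages being a 99.962% reduction implies a baseline of
about 131.6K, which fits "well over 120K".

```python
def reduction_percent(baseline: int, remaining: int) -> float:
    """Percentage of mail removed when `remaining` of `baseline` messages get through."""
    return (1 - remaining / baseline) * 100


def implied_baseline(remaining: int, percent: float) -> float:
    """Baseline volume at which `remaining` messages would be a `percent` reduction."""
    return remaining / (1 - percent / 100)


if __name__ == "__main__":
    # Reduction if the yearly volume were exactly 120K (assumed lower bound).
    print(round(reduction_percent(120_000, 50), 3))
    # Volume implied by the quoted 99.962% figure.
    print(round(implied_baseline(50, 99.962)))
```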

bmf is an adaptive filter that has to be trained. Training works
through mutt integration, a CLI, or a mail alias. I set up a cron job
that runs over an IMAP folder in each customer's account, so that once
a night I clean out that folder and use those mails to train their
filter, since they don't have CLI access. Forwarding to a special
address works too for the pop3 or list users. The on-disk size of the
binary is 35K (it's tiny) and its running memory footprint is just
over 1MB (1144K, 512 shared though): it's lite. On a dog slow IDE
machine, I processed over 40K emails in about 10min. It's fast.
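
For the curious, the nightly training job could look roughly like the
sketch below. It is not the actual script: it assumes the IMAP folder
is stored on disk as a Maildir, the function name is hypothetical, and
the only bmf option it relies on is the -S flag used elsewhere in this
mail. Messages are removed only after the training command exits
cleanly.

```python
import mailbox
import subprocess
from typing import Sequence


def train_from_folder(folder: str, command: Sequence[str] = ("bmf", "-S")) -> int:
    """Pipe every message in a Maildir folder to `command`, then delete it.

    Returns the number of messages the command accepted (exit status 0).
    Assumes `folder` is an existing Maildir; that layout is an assumption.
    """
    box = mailbox.Maildir(folder, create=False)
    trained = 0
    try:
        for key in list(box.keys()):
            raw = box.get_bytes(key)
            # Feed the raw message to the trainer on stdin, like "|bmf -S".
            result = subprocess.run(list(command), input=raw)
            if result.returncode == 0:
                box.remove(key)  # clean out the folder once trained
                trained += 1
        box.flush()
    finally:
        box.close()
    return trained
```

Run from cron once a night with the path of each customer's training
folder; messages that fail to train stay in place for the next run.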

Anyway, as for a solution for this list, edit the aliases file to
include something like:

sf-pguser: "|/usr/local/bin/bmf -p|(majordomo|mailman)"

Then in the mailman/majordomo config, nuke all email that has the
"X-Spam-Flag: YES" header set. What I like about this is that it's not
a set of hard rules like SpamAssassin (which requires manual
maintenance/tuning). With this, I just take a piece of known SPAM and
pass it to bmf, either via a pipe:

"|bmf -S"

or via file:

"bmf -S -i [filename]"

So, in this case, all Marc would have to do is:

1) Update the aliases file to include bmf in the delivery chain.

2) Add new (obfuscated) aliases that list administrators can forward
email to, to train it as spam or not spam:

sf-pguser-add-spam-61ad905cad2e49ce5cf6b1852998263d: "|bmf -d ~lists/bmf/sf-pguser -S"
sf-pguser-legit-mail-41c846edfb3d75a3adc2b21d379270cf: "|bmf -d ~lists/bmf/sf-pguser -N"
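
One possible way to generate alias lines like these, as a sketch. The
helper name is hypothetical, and hashing random bytes is just one
assumption about how to get an unguessable md5-style suffix; the alias
layout and the -d/-S/-N flags are copied from the two lines above.

```python
import hashlib
import secrets


def training_aliases(list_name: str, db_dir: str) -> list[str]:
    """Return two aliases-file lines with unguessable 32-hex-digit suffixes."""
    lines = []
    for label, flag in (("add-spam", "-S"), ("legit-mail", "-N")):
        # Hash fresh random bytes so the suffix cannot be derived from the list name.
        suffix = hashlib.md5(secrets.token_bytes(32)).hexdigest()
        lines.append(f'{list_name}-{label}-{suffix}: "|bmf -d {db_dir} {flag}"')
    return lines


if __name__ == "__main__":
    for line in training_aliases("sf-pguser", "~lists/bmf/sf-pguser"):
        print(line)
```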

Mail with the "X-Spam-Flag: YES" header set should be held for
moderation in case it was a false positive and needs to be forwarded
to the legit-mail addy above. The md5 at the end of each alias is
there so that only administrators who know it will be able to send
email to bmf.
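
The hold-for-moderation decision described here could be sketched as
follows. This is not mailman or majordomo code: the function names and
the "hold"/"deliver" labels are invented for illustration, and the
only thing taken from this mail is the "X-Spam-Flag: YES" header.

```python
from email import message_from_bytes


def is_flagged_spam(raw_message: bytes) -> bool:
    """True if any X-Spam-Flag header in the message says YES."""
    msg = message_from_bytes(raw_message)
    values = msg.get_all("X-Spam-Flag", [])
    return any(str(value).strip().upper() == "YES" for value in values)


def route(raw_message: bytes) -> str:
    """Return "hold" for flagged mail (moderator review), else "deliver"."""
    return "hold" if is_flagged_spam(raw_message) else "deliver"
```

Only real headers count: a line reading "X-Spam-Flag: YES" inside the
message body does not trigger a hold.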

At some point here in the not too distant future I'm going to write
this up in an article for FreeBSD and publish it with
mailman/majordomo/postfix/qmail/sendmail specific details for the
layman, but I can't even begin to describe how wonderful and eerie it
is to not get any SPAM. Seriously, I used to get tons of it to wade
through... *poof* bmf == near magic bullet. I had a few false
positives when I first set things up, but it was a no-brainer to fix
this and I haven't had problems since.

Here's a chump position on SPAM for the inclined:

http://people.FreeBSD.org/~seanc/#spam

A spam filter like this is a basic necessity for being a participant
on the Internet just like a firewall is.

I'd like to work with the FreeBSD.org admins and PostgreSQL admins to
get this set up so that the two biggest projects that I'm involved with
can filter out SPAM emails for those that aren't fortunate enough to
have bmf installed on their server. -sc

--
Sean Chittenden
