Rant: Spam, SpamAssassin, FuzzyOCR, system admin and stuff

I've been going hardcore on SpamAssassin tuning. I'm sick of the spam I'm getting.

I have installed:
– qpsmtpd – better spam filtering at the SMTP connect level. It should reject a lot of spam before we even accept it. I've been watching it, and it is very good. Apache.org uses it =)
– FuzzyOcr – reads the text inside images that are sent, and adjusts the score if it finds keywords in it. I'm about halfway done on this one.
– pyzor – a community spam digest service with hashes of known spam

I have had to upgrade some things, mostly Perl modules.

I have changed the Postfix config so that qpsmtpd receives the mail on port 25 and then forwards it to another port, where Postfix is listening. Apart from that, I have left the Postfix config alone to minimise the chance of screwing it up. Some tests are duplicated at the moment, and I expect I'll find the right balance between qpsmtpd and Postfix as to where each test should run.

The big thing is that qpsmtpd can pass the mail through SpamAssassin at the connect stage, so that if the mail is spam (we can agree on the score that counts as spam) it is rejected. I feel this is better than simply marking it as spam and having to go through our spam folders to make sure. This way, if someone's mail is rejected as spam, they can take care of it.
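
As a rough sketch of that decision (this is not the real plugin code, just the idea), it boils down to comparing the SpamAssassin score against whatever threshold we agree on and answering with an SMTP reply. The threshold of 5.0 and the function name below are assumptions made up for illustration.

```python
from typing import Tuple

# Assumed value for illustration only; use whatever score we agree on.
REJECT_THRESHOLD = 5.0


def smtp_reply_for_score(score: float, threshold: float = REJECT_THRESHOLD) -> Tuple[int, str]:
    """Return a hypothetical (SMTP code, message) pair for a scored message."""
    if score >= threshold:
        # 550 is a permanent failure, so the sending server tells the sender.
        return 550, "Message rejected as spam (score %.1f >= %.1f)" % (score, threshold)
    # 250 means the message was accepted for delivery.
    return 250, "OK"


if __name__ == "__main__":
    for sample_score in (1.2, 5.0, 12.7):
        print(sample_score, smtp_reply_for_score(sample_score))
```

Rejecting with a 5xx reply during the SMTP conversation means the sending side gets the failure, which is the "they can take care of it" part.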

I still don't think SA is set up right, because I get different scores on the same mail depending on how I run it. If I run it with spamassassin < /tmp/testspam I get one score. If I run it with spamc < /tmp/testspam (which then connects to spamd) I get another score. Also, I'm getting errors saying that the bayes journal is write protected, and lo and behold, my bayes_journal file is owned by root. That is weird, and when I change the ownership back, it reverts to root again. So that is something to look at too. The mail.err log is full of errors, but those should go away soon, and mail is still getting through.

I'd like to centralise the SA config and databases. That way, when you mark something as spam I get the benefit of it being in a shared hash db, and vice versa. The FuzzyOcr plugin even hashes images, so if you get the same image I have already marked as spam, you won't get it. Sounds good, eh? pyzor isn't working yet, probably because it isn't configured properly.

And all this instead of the mountain of work I'm supposed to be doing. Ah well. The countdown is on to the Forrester delivery and our shipment too. The shipment is now due Feb 5, and Forrester maybe by the end of this week, maybe the start of next. Yay!

Also, I'm working on converting my gallery to MySQL (yes with comments! damn you!) and converting my blog to WordPress. I'm having problems with WAMP and Perl and Cygwin; things are busting all over the place. Leaning towards a full-blown Linux install again! Ye gods! Maybe I'll just say fuck it and go Mac! Jake out.
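
For the score mismatch between spamassassin and spamc mentioned above, a small script along these lines could run the same message through both and print the two numbers side by side. It is only a sketch: it assumes both commands are on the PATH, that they write the filtered message to standard output, and that the score shows up in an X-Spam-Status header as score=... (or hits=... on older releases). The function names are made up for the example.

```python
import re
import subprocess
from typing import Dict, Optional

# Assumes the score sits on the first line of the X-Spam-Status header,
# written as "score=12.3" or, on older releases, "hits=12.3".
_SCORE_RE = re.compile(
    r"^X-Spam-Status:.*?\b(?:score|hits)=(-?\d+(?:\.\d+)?)",
    re.IGNORECASE | re.MULTILINE,
)


def extract_score(message_text: str) -> Optional[float]:
    """Pull the spam score out of a filtered message, or None if absent."""
    match = _SCORE_RE.search(message_text)
    return float(match.group(1)) if match else None


def run_filter(command: str, message_path: str) -> Optional[float]:
    """Feed the message to the command on stdin and parse the score it adds."""
    with open(message_path, "rb") as handle:
        result = subprocess.run(
            [command], stdin=handle, stdout=subprocess.PIPE, check=True
        )
    return extract_score(result.stdout.decode("utf-8", errors="replace"))


def compare_scores(message_path: str) -> Dict[str, Optional[float]]:
    """Score the same message with spamassassin and with spamc."""
    return {
        command: run_filter(command, message_path)
        for command in ("spamassassin", "spamc")
    }
```

Calling compare_scores("/tmp/testspam") would return both scores in one dict. If they differ, one thing worth checking is whether spamd runs as a different user from the one running spamassassin by hand, since each may then read its own preferences and Bayes database.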

