Thread: Some more simple tips for cutting spam. . .

  1. #1
    dwmtractor's Avatar
    dwmtractor is offline Moderator
    Join Date
    Jul 2007
    Location
    San Jose, CA
    Posts
    1,027
    Rep Power
    9

    This stuff probably should go on the wiki if other people validate it, but I wanted to share a couple things I have done recently that have significantly improved my spam detection without increasing false positives.

    1) The RBLs, which I know some people (including forum moderators) like a lot, and others (including forum moderators) don't like at all, each assign a score to the message. My own experience has been that hits on three RBLs won't give a sufficient score to push the message over the "tag" threshold, even if you don't have pesky things like the auto-whitelist getting in the way. Lowering the tag percentage from 33 to 29 (admin GUI, Global Settings, AS/AV) lowers the actual point score required from 6.6 to 5.8, and this is enough to make a big difference on multi-RBL spam.
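
    For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes the tag percentage is simply a share of a 20-point SpamAssassin scale, which is what the 29% = 5.8 figure implies (under that assumption 33% works out to 6.6 points). The function name and the 20-point constant are my own illustration, not anything taken from Zimbra itself.

```python
# Assumption: the Zimbra percentage is a share of a 20-point scale (100% = 20 points).
# This matches 29% -> 5.8 points; it is inferred from the numbers, not from Zimbra docs.
FULL_SCALE_POINTS = 20.0


def percent_to_score(percent: float) -> float:
    """Convert a tag/kill percentage into a SpamAssassin point threshold."""
    return round(percent * FULL_SCALE_POINTS / 100.0, 2)


for pct in (33, 29):
    print(f"{pct}% -> {percent_to_score(pct)} points")
```

    If the scale on your install is different, only the constant needs to change.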

    2) The Bayesian filters alone can't get a message tagged as spam either--this one opens a big debate because some feel that the "Spam" designation should be reserved for the real unrepentant bulk emailers and not for other purveyors of junk that (at least in theory) you can unsubscribe from. I find that from my users' perspective this doesn't wash, because I have beaten into their heads that unsubscribe links are poison and shouldn't be used, and because they tell me "If I say it's spam, I want your server to treat it as spam!"

    I fixed this issue by raising the scores for high-hit Bayesian messages. I added the following lines to the bottom of my /opt/zimbra/conf/spamassassin/local.cf file:
    Code:
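    # My tweaks to the Bayes scoring system - DWM
    # Four values per rule, one per score set: (no Bayes, no net) (no Bayes, net) (Bayes, no net) (Bayes, net)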
    score BAYES_00 0.0001 0.0001 -2.312 -2.599
    score BAYES_05 0.0001 0.0001 -1.110 -1.110
    score BAYES_20 0.0001 0.0001 -0.740 -0.740
    score BAYES_40 0.0001 0.0001 -0.185 -0.185
    score BAYES_50 0.0001 0.0001 0.001 0.001
    score BAYES_60 0.0001 0.0001 1.0 1.0
    score BAYES_80 0.0001 0.0001 2.5 2.5
    score BAYES_95 0.0001 0.0001 5.5 5.5
    score BAYES_99 0.0001 0.0001 6.5 6.5
    The scores for bayes negatives or equivocals are unchanged, but this adds an extra point and a half to the extreme high scores and saves me a lot of moaning from users.
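
    In case the four numbers on each line look mysterious: SpamAssassin's score directive takes one value per score set, in the order Bayes off/net tests off, Bayes off/net tests on, Bayes on/net tests off, Bayes on/net tests on. Here is a rough Python sketch of how such a line can be read. The helper names are hypothetical and this is not SpamAssassin's own parser, just an illustration of which column applies when.

```python
# Score-set order used by SpamAssassin's "score" directive:
#   index 0: Bayes off, network tests off
#   index 1: Bayes off, network tests on
#   index 2: Bayes on,  network tests off
#   index 3: Bayes on,  network tests on


def parse_score_line(line: str) -> tuple[str, list[float]]:
    """Split a 'score RULE v0 [v1 v2 v3]' line into the rule name and four values."""
    parts = line.split()
    if len(parts) < 3 or parts[0] != "score":
        raise ValueError(f"not a score line: {line!r}")
    values = [float(v) for v in parts[2:]]
    if len(values) == 1:
        values = values * 4  # a single value is used for every score set
    if len(values) != 4:
        raise ValueError("expected one or four score values")
    return parts[1], values


def active_score(values: list[float], bayes: bool, net: bool) -> float:
    """Pick the value for the score set that is actually in use."""
    return values[(2 if bayes else 0) + (1 if net else 0)]


name, values = parse_score_line("score BAYES_99 0.0001 0.0001 6.5 6.5")
print(name, active_score(values, bayes=True, net=True))
```

    With Bayes and network tests both enabled, only the last column matters, which is why the first two columns can sit at a token 0.0001.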

    3) I had several messages that, despite the high Bayes score, have been getting through due to an evil little entry called the Bonded Sender Program. Now I know there are people who believe in the BSP, and I really don't want to get in a flame war, but essentially this is a program whereby those who subscribe to the program--for a price and agreement to follow certain rules of conduct--get a pass to send unsolicited messages. SpamAssassin gives BSP hits a -4.5 score, which pretty well overrides everything else you've done and lets the message through anyhow (BSP's own website actually advocates a -100 score!).

    Now I and my users don't want somebody else's business client list overriding our own opinions of what's junk, so I added the following to my local.cf file:
    Code:
    # Score to eliminate Bonded Sender Program (BSP) whitelisting
    score RCVD_IN_BSP_TRUSTED 0
    score RCVD_IN_BSP_OTHER 0
    score RCVD_IN_BONDEDSENDER 0
    and that took care of those messages. Interestingly, with a score of zero, the BSP score doesn't even show in the header of messages that were using it to get through before I made the change.
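
    To put numbers on it, here is a back-of-the-envelope sketch in Python. The total is just the sum of the rule hits, so a BAYES_99 hit at my 6.5 plus a BSP hit at -4.5 lands at 2.0, well under a 5.8 tag threshold; zero out the BSP rule and the same message gets tagged. The two-rule "messages" below are made up for illustration (real mail hits more rules), and pinning the -4.5 on RCVD_IN_BSP_TRUSTED specifically is an assumption on my part.

```python
TAG_THRESHOLD = 5.8  # the 29% tag level from tip 1


def total_score(hits: dict[str, float]) -> float:
    """A message's score is the sum of the scores of every rule that hit."""
    return round(sum(hits.values()), 3)


def is_tagged(hits: dict[str, float], threshold: float = TAG_THRESHOLD) -> bool:
    """A message is tagged once its total reaches the threshold."""
    return total_score(hits) >= threshold


# Hypothetical message: strong Bayes hit, sent through a Bonded Sender relay.
before = {"BAYES_99": 6.5, "RCVD_IN_BSP_TRUSTED": -4.5}
after = {"BAYES_99": 6.5, "RCVD_IN_BSP_TRUSTED": 0.0}

print(total_score(before), is_tagged(before))
print(total_score(after), is_tagged(after))
```

    The same arithmetic shows why a -100 would be a free pass: nothing else in the rule set comes close to making up that difference.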

    Anyway, hope this is helpful to some of the rest of you. Now if we could just get that pesky auto-whitelist to behave. . .

    Cheers,

    Dan

  2. #2
    mmorse's Avatar
    mmorse is offline Moderator
    Join Date
    May 2006
    Location
    USA
    Posts
    6,242
    Rep Power
    20

    har har to the 'including forum moderators' bit trying to single us out eh?

    I like your anti bonded senders bit - how about adding that to the wiki with a suggestion of half instead of 0 (call it "reduce impact of bonded senders" or something)
    Improving Anti-spam system - Zimbra :: Wiki

    Instead of tag & move in one motion - I think it would be cool to have three levels by default:
    -It takes a sec to understand because 'tag' is where the x-spam header/flags apply; therefore I won't use the word 'tag' below, but we all know marking is the first step.
    -Move to junk/spam folder @ somewhere between 25% (5pts) to 33% (6.6pts)
    -But change it so the level for applying a ***SPAM*** label is separate @ 50% (10pts)
    -Kill @ 75% (15pts)
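
    Roughly what I have in mind, as a Python sketch. The thresholds are just the example numbers from the list above (I picked 5 points for the move level), and the function and action names are invented for illustration; nothing like this exists as a Zimbra setting.

```python
MOVE_TO_JUNK = 5.0   # 25%; could be anywhere up to 6.6 (33%)
LABEL_SPAM = 10.0    # 50%: add the ***SPAM*** label
KILL = 15.0          # 75%: discard outright


def actions_for(score: float) -> list[str]:
    """Return the hypothetical actions applied to a message with this score."""
    if score >= KILL:
        return ["discard"]
    actions = []
    if score >= MOVE_TO_JUNK:
        actions.append("move_to_junk")
    if score >= LABEL_SPAM:
        actions.append("label_subject")
    return actions or ["deliver"]


for s in (3.0, 7.2, 12.0, 15.0):
    print(s, actions_for(s))
```

    The point is only that the label level is decoupled from the move level, so low-scoring junk lands in the folder without the subject being rewritten.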

    I do like how the DSN cutoff level (delivery status notifications) is now mapped to the kill level for v5. In my view 20 was always pointless: why spend effort telling the spammer you got the message if you're just going to discard it anyway, ya know?

    Just had an interesting thought on the AWL but I'll add it to that thread there in a sec.
    EDIT: /forums/administrators/12126-correcting-poisoned-auto-whitelist-awl-2.html#post64742

    Oh, and for the past year or two I have become a graylisting fanatic, that's for sure. It's so funny because all the spammers would have to do is spend a little more CPU time/bandwidth/adhere to the RFCs, and they would get their message to a lot more people; but their stupidity is my benefit I guess. Those 'wealthy Nigerian princes just looking for a way to offload their money' seem to need a lesson on that - shh, don't tell 'em!
    Last edited by mmorse; 11-01-2007 at 04:05 PM.

  3. #3
    dwmtractor (Moderator)

    Quote Originally Posted by mmorse View Post
    har har to the 'including forum moderators' bit trying to single us out eh?
    Well, I wasn't sure who to single out. You recommend them (I like them), and Bill (phoenix) doesn't.

    Quote Originally Posted by mmorse View Post
    -Move to junk/spam folder @ somewhere between 25% (5pts) to 33% (6.6pts)
    -But change it so the level for applying a ***SPAM*** label is separate @ 50% (10pts)
    -Kill @ 75% (15pts)
    Why? Unless you have a lot of false positives that you don't want to have the SPAM label, it's somewhat immaterial, isn't it? At least from my perspective I don't care WHAT the server calls it as long as it gets it the heck out of my way. At least for me, false positives have so far been a complete non-issue.

    Quote Originally Posted by mmorse View Post
    Oh, and for the past year or two I have become graylisting fanatic that's for sure.
    Maybe it's just the business I'm in--I'll admit that tractor buyers might not be the most bleeding edge I.T. people--but I had too many people getting stuffed that shouldn't when I had greylisting implemented on an older server. It turned out to be a non-acceptable problem. But maybe that's just us. . .

  4. #4
    mmorse (Moderator)

    I don't, but I guess others get false positives all the time - so the levels thing kinda stems from requests I have seen and stuff I've toyed with occasionally.

    Of course we all have our personal sorting methods for getting the critical emails answered first. I've seen some thick-client users who want their junk folder ranked, but if the thick-client itself doesn't have a x-spam-score sort method, they want to avoid having to set up filters for x-spam-level stars or something.

    These people want a fast low-hi kinda thing that will persist across whatever client they may be in; and so they can tell at a glance on looking through a junk/spam folder. So you would have:
    Low junk/spam folder
    High junk/spam folder + label

    At the same time it boggles my mind how they just don't want the score prepend/append to the subject if it's spam.


    On the web-client side it would be really nice to expose the x-spam-score as a sortable column in junk.


    Extrapolating that a bit - zimbra is all about reducing time needed right? In the junk/spam folder, rather than manually running a search by date then score; have it auto arrange into days with the lowest scores first/at the top.
    I think days would be a short enough span to work in, because there's always the "hey did you get this?" "Don't see it in my inbox let me check my junk." Thus it will probably be near the top as far as scores go, & you won't have to revert to sort by timeline.
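
    As a sketch of that ordering in Python: group by day, then lowest score first within each day. The JunkMessage type and its field names are hypothetical, and putting the newest day at the top is my assumption about what people would want.

```python
from datetime import datetime
from typing import NamedTuple


class JunkMessage(NamedTuple):
    subject: str
    received: datetime
    spam_score: float


def arrange_junk(messages: list[JunkMessage]) -> list[JunkMessage]:
    """Newest day first; within a day, lowest spam score first."""
    return sorted(
        messages,
        key=lambda m: (-m.received.date().toordinal(), m.spam_score),
    )


sample = [
    JunkMessage("cheap pills", datetime(2007, 11, 1, 9, 0), 12.0),
    JunkMessage("newsletter", datetime(2007, 11, 2, 8, 0), 9.5),
    JunkMessage("did you get this?", datetime(2007, 11, 2, 17, 0), 6.1),
]
for msg in arrange_junk(sample):
    print(msg.received.date(), msg.spam_score, msg.subject)
```

    The borderline message from today ends up first, which is exactly the one you'd be hunting for.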

    This would help a lot in picking your values: Bug 19152 - RFE: Spam statistics
    It would be helpful when adjusting spam tag and kill levels to have statistics. I'd like to see some kind of distribution of scores for all messages, not just spam.

    In addition statistics about the scores of messages being classified as "Junk" or "Not Junk" by users. With this information, admins can adjust tag and kill levels accordingly. If one sees a high number of "Not Junk" submissions where the spam score is close to the tag level, it might be appropriate to adjust the tag level.
    Re-sending large emails sucks for the concept of graylisting. Things like that always make me wish for more configurable options, but that means someone's gotta code the features into them in the first place...
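
    Going back to the spam statistics RFE for a second, this is the kind of thing I'd picture, sketched in Python. Everything here is hypothetical (function names, the 1-point buckets, the 1-point margin above the tag level); it only shows the two numbers the RFE asks for: a score distribution, and how many "Not Junk" reports sit just over the tag level.

```python
from collections import Counter


def score_histogram(scores: list[float], bucket_width: float = 1.0) -> dict[float, int]:
    """Count messages per score bucket; keys are the lower edge of each bucket."""
    counts = Counter(int(s // bucket_width) for s in scores)  # floor, so -0.5 -> bucket -1
    return {b * bucket_width: counts[b] for b in sorted(counts)}


def not_junk_near_tag(not_junk_scores: list[float], tag_level: float, margin: float = 1.0) -> int:
    """Count 'Not Junk' reports whose score was within `margin` points above the tag level."""
    return sum(1 for s in not_junk_scores if tag_level <= s < tag_level + margin)


all_scores = [0.2, 0.9, 5.9, 6.1, -0.5]
print(score_histogram(all_scores))
print(not_junk_near_tag([5.9, 6.0, 7.5, 3.0], tag_level=5.8))
```

    A high count from the second function would be the hint that the tag level is set a little too low.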
    Last edited by mmorse; 11-01-2007 at 04:58 PM.

  5. #5
    dwmtractor (Moderator)

    Quote Originally Posted by mmorse View Post
    I like your anti bonded senders bit - how about adding that to the wiki with a suggestion of half instead of 0 (call it "reduce impact of bonded senders" or something)
    Improving Anti-spam system - Zimbra :: Wiki
    Done. I added it as a sub-bullet under SpamAssassin config. I also added a phrase about upping the numbers in the other SpamAssassin scores to increase the impact of desired filters.

  6. #6
    mmorse (Moderator)

    Did you mean to make it negative -0.5 instead of positive 0.500? (original being -4.5 or -4.2)
    Thanks, guess we were getting sidetracked - as this was supposed to be tips after all.
    "Some more simple tips for cutting spam" > "Dan & Mike's extrapolated ramblings on anti-spam ideas"

    ----

    Hey, were you able to find an RFE on "x-spam-scores column in the junk/spam folder"?

    ----

    For those still reading along, the other RFE's that will really make a difference to the overall experience when finally finished are Bug 6953 - Per user spam white lists in the UI & Bug 3870 - per user Spam Assassin score

    This is also interesting: AboutMaia - Maia Mailguard - Trac
    Individual & system wide spamassassin bayes training, (domain rules can override any individual's rules), & amavisd-new.
    Takes either two SMTP-based mail servers in a dual-MTA arrangement OR an SMTP server with re-injection capability (e.g. Postfix)
    It does use an older/custom amavisd-new 2.2.1 even though 2.5.2 came out this last June - reasons are here: AmavisVersion - Maia Mailguard - Trac
    Someone asked about integrating it in Bug 14167 - Improved spam filtering and reporting
    Last edited by mmorse; 11-01-2007 at 07:00 PM.

  7. #7
    dwmtractor (Moderator)

    Quote Originally Posted by mmorse View Post
    Did you mean to make it negative -0.5 instead of positive 0.500? (original being -4.5 or -4.2)
    Thanks, guess we were getting sidetracked - as this was supposed to be tips after all.
    "Some more simple tips for cutting spam" > "Dan & Mike's extrapolated ramblings on anti-spam ideas"
    Hee hee! Yes, I did. I'll correct that. Come to think of it, giving those guys an anti-hit for having the nerve to think we'd fall for their slimy scheme may be poetic justice, but it wasn't actually what I had in mind. . .

    As for the RFE, I didn't look too far. Suppose I should for the greater good but in point of fact my junk folder is for one thing--junk. I have seen nothing in it that needed sorting or evaluation; I only go there to read the headers and see what of my filters are actually doing their job.

  8. #8
    dwmtractor (Moderator)

    Quote Originally Posted by mmorse View Post
    Just had an interesting thought on the AWL but I'll add it to that thread there in a sec.
    EDIT: /forums/administrators/12126-correcting-poisoned-auto-whitelist-awl-2.html#post64742
    Mike, the suspense is killing me. . .what was your interesting thought?

  9. #9
    mmorse (Moderator)

    You already replied didn't you? -that link on automatically changing the factor as time goes on. -or were you just joking cuz that edit has a direct link to the post

    HA! to the "an anti-hit for having the nerve to think we'd fall for their slimy scheme" -about the best logic I've heard all day! (It's already a long one and it isn't yet noon) To be fair, I'm sure there's tons of arguments for it. After all, they convinced the SA people to use a -4.2 and even got it upped to -4.5...and they wanted -100 sheesh!

    Ya, haven't had a 'false positive' in a long time. Which is a catch-22 in-and-of itself because:
    a) I don't 'seriously' go through my junk - though if it sorted least score first I might glance at it more.
    b) You can only call it a 'false positive' if you know about it! -lol
    c) But unless someone complains that I didn't reply, who really does care after all, right?
    Last edited by mmorse; 11-02-2007 at 08:26 AM.

  10. #10
    dwmtractor (Moderator)

    Quote Originally Posted by mmorse View Post
    You already replied didn't you? -that link on automatically changing the factor as time goes on. -or were you just joking cuz that edit has a direct link to the post
    OK, now you're messing with my head. I specifically went to that link and your post was at around 1 pm, and the note I was quoting didn't get posted till after 3, so I thought you had ANOTHER great pearl of wisdom for us! Guess you had a long day yesterday too, eh?

    Quote Originally Posted by mmorse View Post
    HA! to the "an anti-hit for having the nerve to think we'd fall for their slimy scheme" -about the best logic I've heard all day! (It's already a long one and it isn't yet noon) To be fair, I'm sure there's tons of arguments for it. After all, they convinced the SA people to use a -4.2 and even got it upped to -4.5...and they wanted -100 sheesh!
    Had a hunch I wasn't the only one who thought that showed a bit of cojones. . .
    Edit: Lest anyone think I was engaging in a bit of hyperbole about that -100 score, it's on their website:
    SpamAssassin 2.2x/2.3x: For versions 2.2x and 2.3x, configuring SpamAssassin to use Bonded Sender requires you to add the following lines to your local SpamAssassin configuration file (such as /etc/mail/spamassassin/local.cf):
    Code:
    header RCVD_IN_BONDEDSENDER eval:check_rbl('relay', 'sa.bondedsender.org.')
    describe RCVD_IN_BONDEDSENDER Received via a whitelisted Bonded Sender address
    score RCVD_IN_BONDEDSENDER -100.000
    The large negative value informs SpamAssassin that the message is less likely to be spam.

    Quote Originally Posted by mmorse View Post
    Ya, haven't had a 'false positive' in a long time. Which is a catch-22 in-and-of itself because:
    a) I don't 'seriously' go through my junk - though if it sorted least score first I might glance at it more.
    b) You can only call it a 'false positive' if you know about it! -lol
    c) But unless someone complains that I didn't reply, who really does care after all, right?
    My thought, and what I teach my users, is you glance at the junk folder once in a while to look for stuff you should have gotten, or you go there when somebody swears they sent you a message you haven't gotten. Other than that, I look at it with a bit more care now while I'm tuning the system, but I fully expect to ignore it after that. I have already set my global defaults to clean out the junk messages after 5 days to keep mailbox size under control.

    We're a small shop; total of only 500-600 messages a day for 30 users, and roughly 40% spam (the bulk of that to only two users), so it's pretty easy to keep an eye on things.
    Last edited by dwmtractor; 11-02-2007 at 09:16 AM.
