Thread: [SOLVED] I don't think RBLs or Bayes are working for me

  1. #1
    dwmtractor is offline Moderator
    Join Date
    Jul 2007
    Location
    San Jose, CA
    Posts
    1,027
    Rep Power
    10


    Following the instructions in the user manual, I enabled several of the RBLs to increase the reliability of my spam filtering. Running zmprov to check my setup gives me the following:

    Code:
    zimbra@mail:~$ zmprov gacf | grep zimbraMtaRestriction
    zimbraMtaRestriction: reject_invalid_hostname
    zimbraMtaRestriction: reject_non-fqdn_hostname
    zimbraMtaRestriction: reject_non_fqdn_sender
    zimbraMtaRestriction: reject_rbl_client bl.spamcop.net
    zimbraMtaRestriction: reject_rbl_client sbl.spamhaus.org
    zimbraMtaRestriction: reject_rbl_client relays.mail-abuse.org
    However, here are the X-Spam headers from two messages I just sent myself from offsite:

    Code:
    X-Spam-Score: 3.159
    X-Spam-Level: ***
    X-Spam-Status: No, score=3.159 tagged_above=-10 required=6.6 tests=[AWL=1.124,
    	BAYES_00=-2.599, DNS_FROM_RFC_POST=1.708, HTML_10_20=1.351,
    	HTML_MESSAGE=0.001, HTML_SHORT_LENGTH=1.574]
    
    
    
    X-Spam-Score: 2.784
    X-Spam-Level: **
    X-Spam-Status: No, score=2.784 tagged_above=-10 required=6.6 tests=[AWL=0.749,
    	BAYES_00=-2.599, DNS_FROM_RFC_POST=1.708, HTML_10_20=1.351,
    	HTML_MESSAGE=0.001, HTML_SHORT_LENGTH=1.574]
    If I'm reading this right, I'm getting spam checks on Bayesian filters, DNS, HTML messages, and whatever AWL is (which I don't know), but there is no evidence of RBL checking. Given some of the junk that's getting through my system, I doubt that RBL checking is happening. Witness this example of spam that also came through:

    Code:
    X-Virus-Scanned: amavisd-new at 
    X-Spam-Score: 6.513
    X-Spam-Level: ******
    X-Spam-Status: No, score=6.513 tagged_above=-10 required=6.6
    	tests=[BAYES_99=3.5, DNS_FROM_RFC_ABUSE=0.2, DNS_FROM_RFC_WHOIS=1.447,
    	HTML_MESSAGE=0.001, SUBJ_ALL_CAPS=0.997, UPPERCASE_50_75=0.368]
    This last message was an ATTENTION BENEFICIARY notice with about 3/4 caps and obviously one of those Nigerian 419-style scams.
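    The X-Spam-Status headers above can be broken down mechanically. The following is a minimal Python sketch, not anything shipped with Zimbra: the function name parse_spam_status is made up for illustration, and the RCVD_IN_ prefix check assumes the usual SpamAssassin naming convention for DNS blacklist rules.

```python
import re


def parse_spam_status(header: str) -> dict:
    """Return {test_name: score} from an X-Spam-Status header value."""
    # Unfold continuation lines so the tests=[...] list is on one line.
    flat = " ".join(header.split())
    match = re.search(r"tests=\[(.*?)\]", flat)
    if match is None:
        return {}
    scores = {}
    for item in match.group(1).split(","):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        scores[name] = float(value)
    return scores


# Header value copied from the 419-style message quoted above.
header = """No, score=6.513 tagged_above=-10 required=6.6
	tests=[BAYES_99=3.5, DNS_FROM_RFC_ABUSE=0.2, DNS_FROM_RFC_WHOIS=1.447,
	HTML_MESSAGE=0.001, SUBJ_ALL_CAPS=0.997, UPPERCASE_50_75=0.368]"""

scores = parse_spam_status(header)
total = round(sum(scores.values()), 3)

# Assumption: SpamAssassin's DNS blacklist rules are named RCVD_IN_*.
rbl_hits = [name for name in scores if name.startswith("RCVD_IN_")]

print("total:", total)
print("RBL-style hits:", rbl_hits)
for name, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {value:+.3f}")
```

    The per-test scores add up to the reported 6.513, just short of the required 6.6, and no RCVD_IN_ test appears in the list.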

    Anyway, the question is this: Should RBL scores show in my headers whether or not the source IP has a hit in one of the databases, or will it only show in the case of a hit?

    As a second point I think maybe I should weight my Bayesian filter higher than 3.5, but I don't see in the documentation how one goes about changing the absolute score a feature can give--in my case Bayes seems to top out at 3.5 which seems to let an awful lot through, because with a required score of 6.6 for spam, a 100% hit on Bayes is only 53% of the way to scoring as spam. So far I'd say that's insufficient. How do I adjust that range?

    On that note, there are references all over the forum to changing the kill and tag percentages on the admin UI, but nowhere have I been able to find documentation of just how those percentage numbers relate mathematically to the various scores I see in the headers of emails. Could someone clarify this for me please?

    Finally, I don't see DSPAM referenced at all in my headers, though there is a DSPAM directory in /opt/zimbra. Am I only getting amavisd and not dspam? If so, what do I do about it?
    Last edited by dwmtractor; 08-29-2007 at 12:01 PM.

  2. #2
    Jimerson is offline Intermediate Member
    Join Date
    Aug 2007
    Posts
    16
    Rep Power
    8


    Mail from RBL-listed senders should be stopped altogether, so you will never see it. An RBL is a blacklist: if a message comes in from a source that is on the list, the server will reject it, provided the RBLs are configured correctly.

    You should be able to take some of the messages that you have received and check the RBLs to see whether the sending IPs are listed there.
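    A manual check like that usually comes down to one DNS lookup: reverse the octets of the sending IP, append the list's zone, and see whether the name resolves. Here is a rough Python sketch; the helper names are hypothetical, 192.0.2.25 is a documentation address, and treating any answer as "listed" is a simplification because some lists return special codes for blocked or over-quota queries.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name a blacklist lookup would query for an IPv4 address."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Rough check: the IP counts as listed if the query name resolves at all."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


# Only the name is built here; is_listed() needs working DNS to be useful.
print(dnsbl_query_name("192.0.2.25", "bl.spamcop.net"))
```

    The sending IP to test is normally the one in the topmost Received: header added by your own server.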

    As for the spam, according to the manual, you need to drop at least 200 messages in the spam email account and 200 in the non spam email account for it to start scoring spam properly.

    Quote from admin manual:

    "In order to get accurate scores to determine whether to mark
    messages as spam at least 200 known spams and 200 known hams must be
    identified."

    -Jim

  3. #3
    dwmtractor is offline Moderator

    I have trained on at least 200 spam and over 900 ham messages.

    I still would appreciate more detail on the way the settings relate to spam scoring, and on how I might increase relative scoring of the Bayesian filter. Otherwise known junk will never get blown out if it doesn't meet the other criteria since, as I said, the Bayesian score is topping out at 3.5 on even the worst spam.

  4. #4
    Jimerson is offline Intermediate Member

    I understand where you are coming from. We are in much the same situation here, so hopefully someone might have some decent input.

  5. #5
    dwmtractor is offline Moderator

    Spam Scoring

    I have one particular spammer who sends me three to six messages a day. None of the "biggies" seem to catch him because he's specifically spamming heavy equipment dealers like ourselves. My Bayes gives him the full 3.5, the "Dear something" rule gives him 2.1, and the other scores are so low that they're essentially meaningless, so his messages always get through.

    If ANY of the following criteria were also included it'd probably push these messages over the threshold, but I don't see that they're options:

    1) BCC - if the recipient (me) is only Bcc'ed and not in the "To:" or "Cc:" lines, the message ought to get a small score, maybe around 1 or 1.2. This wouldn't be enough to exclude legitimate mailing lists, but it would add to the aggregate score for spam messages.

    2) I know some spam filters add a small score for any out-of-country messages; again, not enough to kill them outright but enough to add to the score. The messages in question are coming from Singapore.

    3) If I could just increase the Bayesian weighting by about .75 to 1 point, it'd push him over the edge. This is more desirable than lowering the point threshold on ALL messages since it would just give more weight to my Bayesian scores for stuff I have classified as junk, rather than allowing the random combination of all the other scores to increase false positives.

    Help anyone???

  6. #6
    mmorse is offline Moderator
    Join Date
    May 2006
    Location
    USA
    Posts
    6,242
    Rep Power
    21


    Quote Originally Posted by dwmtractor
    zimbraMtaRestriction: reject_non-fqdn_hostname
    This is incorrect: the - (dash) should be an _ (underscore), i.e. reject_non_fqdn_hostname.

    Quote Originally Posted by dwmtractor
    zimbraMtaRestriction: reject_rbl_client relays.mail-abuse.org
    Turn this off. It's now part of a paid Trend Micro service, so you are just wasting bandwidth on 'no licence' return values.

    DNS checks:
    reject_unknown_client - I leave this off because every client needs a valid A record or its mail won't be delivered
    reject_unknown_hostname - I leave this off because every server that sends you mail needs an A & MX record (and I definitely want alerts from some of my servers that don't have MX records)
    reject_unknown_sender_domain - I leave this on; the @domain.com part of the sender's email address must resolve to a proper A/MX record

    I also use:
    Host checks, to conform to the industry standards:
    reject_invalid_hostname
    reject_non_fqdn_hostname
    reject_non_fqdn_sender
    RBLs (Real-time Blackhole Lists):
    Code:
    reject_rbl_client dnsbl.njabl.org
    reject_rbl_client cbl.abuseat.org
    reject_rbl_client bl.spamcop.net
    reject_rbl_client dnsbl.sorbs.net
    reject_rbl_client zen.spamhaus.org
    zen combines Spamhaus' SBL, XBL and PBL. (While I don't always agree with the PBL, zen resolves much faster and has more mirrors out there, so I've given in to their policies.)

    I usually get the most spam hits from SpamCop and Spamhaus. Keep in mind that the lists all tend to share info with each other and score it differently; check their websites for details.

    Enter them all in one command (or use the + prefix to add to the existing values); otherwise you'll erase the currently set values:
    Code:
    zmprov mcf zimbraMtaRestriction reject_invalid_hostname zimbraMtaRestriction reject_non_fqdn_hostname zimbraMtaRestriction reject_non_fqdn_sender zimbraMtaRestriction reject_unknown_sender_domain zimbraMtaRestriction "reject_rbl_client dnsbl.njabl.org" zimbraMtaRestriction "reject_rbl_client cbl.abuseat.org" zimbraMtaRestriction "reject_rbl_client bl.spamcop.net" zimbraMtaRestriction "reject_rbl_client dnsbl.sorbs.net" zimbraMtaRestriction "reject_rbl_client zen.spamhaus.org"
    To check:
    Code:
     zmprov gacf | grep zimbraMtaRestriction
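    Because the zmprov mcf command is long and the multi-word values must be quoted, it can help to generate it rather than type it. Below is a small Python sketch that only builds the command string and does not run anything. The helper name build_mcf_command is invented for this example, the dash check encodes the assumption that Postfix restriction names contain underscores only (the mistake pointed out earlier in this post), and append=True assumes the usual zmprov "+attribute" form for adding a single value.

```python
import shlex

# Values taken from the command in this post.
RESTRICTIONS = [
    "reject_invalid_hostname",
    "reject_non_fqdn_hostname",
    "reject_non_fqdn_sender",
    "reject_unknown_sender_domain",
    "reject_rbl_client dnsbl.njabl.org",
    "reject_rbl_client cbl.abuseat.org",
    "reject_rbl_client bl.spamcop.net",
    "reject_rbl_client dnsbl.sorbs.net",
    "reject_rbl_client zen.spamhaus.org",
]


def build_mcf_command(values, append=False):
    """Return a shell-safe 'zmprov mcf' command line for the given values."""
    attr = "+zimbraMtaRestriction" if append else "zimbraMtaRestriction"
    args = ["zmprov", "mcf"]
    for value in values:
        name = value.split()[0]
        # Assumption: restriction names never contain a dash.
        if "-" in name:
            raise ValueError(f"suspicious restriction name: {name!r}")
        args += [attr, value]
    # shlex.join quotes any value that contains a space.
    return shlex.join(args)


print(build_mcf_command(RESTRICTIONS))
```

    Paste the printed line into a shell as the zimbra user, then run the check command above to confirm the result.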
    To reduce email to accounts that you don't even have:
    Change the entry in zmmta.cf for smtpd_reject_unlisted_recipient to 'yes', save the file and reload postfix (postfix reload).

    Add your IPs to the trusted area of local.cf. You don't want a user marking an email from a coworker in your own organization as junk and having that affect the Bayes score. (This is not to be confused with mtamynetworks, which is for submitting mail from remote networks.)

    Quote Originally Posted by dwmtractor
    Anyway, the question is this: Should RBL scores show in my headers whether or not the source IP has a hit in one of the databases, or will it only show in the case of a hit?
    They are only added when you get a 'hit'. Check the /opt/zimbra/conf/spamassassin folder for the points added.

    Quote Originally Posted by dwmtractor
    As a second point I think maybe I should weight my Bayesian filter higher than 3.5, but I don't see in the documentation how one goes about changing the absolute score a feature can give--in my case Bayes seems to top out at 3.5 which seems to let an awful lot through, because with a required score of 6.6 for spam, a 100% hit on Bayes is only 53% of the way to scoring as spam. So far I'd say that's insufficient. How do I adjust that range?
    Quote Originally Posted by dwmtractor
    3) If I could just increase the Bayesian weighting by about .75 to 1 point, it'd push him over the edge. This is more desirable than lowering the point threshold on ALL messages since it would just give more weight to my Bayesian scores for stuff I have classified as junk, rather than allowing the random combination of all the other scores to increase false positives.
    You can manually edit values in your /opt/zimbra/conf/spamassassin folder (there are a bunch of files in there defining rules).
    Also see the all-important /opt/zimbra/conf/amavisd.conf.in (only edit the .in, not the live copy; it gets copied to the live file on restart).
    While you're browsing through that file, you can fix the sender 'weight listing', which applies a positive or negative score to mail from a certain address/domain (a few defaults are provided). Negative scores mean the sender is legit; the higher the positive score, the worse.
    Also, while you're there, change 'sa_dsn_cutoff_level' (near the top of the file) to something more realistic. You don't want to send delivery status notifications ("I got your mail") to the spammers.

    (I suggest you change the kill/tag levels through zmprov or the admin console, though, not in amavisd.conf.in, so they'll persist through upgrades.)

    Quote Originally Posted by dwmtractor
    On that note, there are references all over the forum to changing the kill and tag percentages on the admin UI, but nowhere have I been able to find documentation of just how those percentage numbers relate mathematically to the various scores I see in the headers of emails. Could someone clarify this for me please?
    No sweat, it's the standard 20-point system:
    20 points in spamassassin/amavisd.conf.in = 100% in the admin console
    10 = 50%
    5 = 25%
    etc.

    zmprov mcf zimbraSpamKillPercent 50
    (It's given in percentages-so that would kill anything with 10pts on the 20pt scale)

    100% = 20pts
    33% = 6.6pts
    75% = 15pts
    etc
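
    The percentage-to-points arithmetic above is simple enough to check in a few lines. This Python sketch assumes only the 20-point scale described in this post; the function names are made up for illustration.

```python
SCALE_POINTS = 20.0  # 100% in the admin console, per the post above


def percent_to_points(percent: float) -> float:
    """Convert an admin-console percentage to a SpamAssassin-style score."""
    return percent * SCALE_POINTS / 100.0


def points_to_percent(points: float) -> float:
    """Convert a score back to the percentage the admin console would show."""
    return points * 100.0 / SCALE_POINTS


for pct in (25, 30, 33, 50, 75, 100):
    print(f"{pct:>3}% -> {percent_to_points(pct):.1f} points")

# Share of the 6.6-point threshold covered by a BAYES_99 hit of 3.5 points.
bayes_share = 3.5 / percent_to_points(33)
print(f"BAYES_99 alone covers {bayes_share:.0%} of the threshold")
```

    The last line reproduces the 53% figure from the original question: a 33% tag level is 6.6 points, and 3.5 / 6.6 is roughly 0.53.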

    You can change the action (discard vs bounce etc) in amavisd.conf.in (don't edit amavisd.conf directly, edit the .in and restart)
    $final_spam_destiny=D_DISCARD;

    You can also play with the dsn (delivery status notification) setting; so over a certain level you won't be responding 'I got your mail' to the spammers.
    $sa_dsn_cutoff_level = 50;

    To delete/not bother quarantining high scoring spam (therefore reducing the number of items in the quarantine) this setting allows you to discard quarantined spam above this level:
    $sa_quarantine_cutoff_level = 90;
    It is cleaned up every day though:
    0 1 * * * find /opt/zimbra/amavisd/quarantine -type f -mtime +7 -exec rm -f {} \; > /dev/null 2>&1

    Note: In that amavisd.conf.in file, wherever possible it's better to set the values with zmprov/admin console (ie: the tag & kill levels) so that it stays consistent across upgrades.

    zmprov mcf zimbraSpamTagPercent 30
    -would put everything above 6pts in the junk folder (and label with a custom **SPAM** in the subject line if you have that enabled) - I personally don't because it's already in the junk folder.

    tag level - Mail goes to the junk folder. (If a user has their own filter that moves the mail elsewhere, I suggest they or you add an accompanying X-Spam-Level header filter: say, if the header contains at least ****, move the message back to junk.)

    kill level - The mail does not get delivered to the users (unless you set final_spam_destiny to D_PASS). The possible values are D_PASS, D_BOUNCE, D_REJECT and D_DISCARD; search the amavisd-new documentation for descriptions.
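
    Put together, the tag and kill levels split mail into three outcomes. The Python sketch below illustrates that decision; it is not how amavisd is implemented. The function name and outcome labels are invented, the 33% and 75% values are simply the examples used in this thread, and the "at or above" comparison is an assumption about how the thresholds are applied.

```python
SCALE_POINTS = 20.0  # 100% in the admin console


def classify(score: float, tag_percent: float, kill_percent: float) -> str:
    """Return 'inbox', 'junk' or 'blocked' for a given spam score."""
    tag_points = tag_percent * SCALE_POINTS / 100.0
    kill_points = kill_percent * SCALE_POINTS / 100.0
    if score >= kill_points:
        return "blocked"  # handled according to final_spam_destiny
    if score >= tag_points:
        return "junk"     # delivered to the junk folder
    return "inbox"


# The 419-style message from the first post scored 6.513.
print(classify(6.513, tag_percent=33, kill_percent=75))  # inbox
# The same message with one more point on its score.
print(classify(7.513, tag_percent=33, kill_percent=75))  # junk
print(classify(15.2, tag_percent=33, kill_percent=75))   # blocked
```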

    For more ideas see Improving Anti-spam system - ZimbraWiki
    I especially like greylisting. Instead of accepting the mail on the first attempt, you send back a temporary error so that the sending server has to try delivery again. When a legit server retries, the mail goes through; spammers tend to just move on and not bother. The preferred method: if no retry is made within, say, 1 hr, you add x points to its score and still deliver it.
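
    The basic greylisting idea can be sketched in a few lines of Python. This is a toy in-memory model, not the policy server you would actually install: the class name, the five-minute minimum delay and the (client IP, sender, recipient) key are all assumptions chosen for illustration, and it implements only the plain "temporary error, then accept on retry" variant, not the scoring variant.

```python
class Greylist:
    """Toy greylist keyed on (client IP, envelope sender, recipient)."""

    def __init__(self, min_delay: float = 300.0):
        self.min_delay = min_delay  # seconds a sender must wait before retrying
        self.first_seen = {}        # key -> time of the first delivery attempt

    def check(self, client_ip: str, sender: str, recipient: str, now: float) -> str:
        key = (client_ip, sender, recipient)
        # Remember the first attempt; later attempts reuse the stored time.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"
        return "tempfail"  # the MTA would answer with a temporary 4xx error


greylist = Greylist(min_delay=300.0)
triplet = ("192.0.2.25", "sender@example.com", "user@example.org")
print(greylist.check(*triplet, now=0.0))    # first attempt: tempfail
print(greylist.check(*triplet, now=10.0))   # retried too soon: tempfail
print(greylist.check(*triplet, now=600.0))  # proper retry: accept
```

    Time is passed in explicitly so the behaviour is easy to follow; a real service would use the clock and expire old entries.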
    Razor is also very good.
    Last edited by mmorse; 10-10-2007 at 05:08 AM. Reason: turning this into a general article...

  7. #7
    dwmtractor is offline Moderator

    Bayes Criteria adjustment

    Thanks for all the recommendations mmorse. I will be implementing a number of them right away.

    Some may say RTFM on this, but I'm not sure what FM to R, so please forgive me. I looked at a whole bunch of the config files in /opt/zimbra/conf/spamassassin and I can't figure out which of those files influences the total points assigned by the Bayesian filter to each message.

    When I look at my junk messages, I can see that the maximum score is 3.5 points (for what I'm guessing is a 100% or 95%+ hit or something like that). If my theory is correct (and I'm far from certain it is), some configuration file somewhere says that the Bayesian filter has a range to play with, from -2.599 for known ham to +3.5 for known spam, and then it assigns a number based on a calculated probability that a message is ham or spam. This would mean, for example, that if a message is a 60% match to the Bayesian spam database and a 10% match to the ham database, it would get a spam score of 3.5 * 0.6 = 2.1 and a ham score of -2.6 * 0.1 = -0.26, which we would add together as 2.1 - 0.26 = 1.84 for the aggregated Bayes score. Am I anywhere close to correct here?

    If I am, then what I want to do is change none of the calculations, but only to increase the number 3.5 to 4.5 or 5.0 to affect the total points awarded for a hit on the Bayes spam database. But I don't see those point ranges specified in the files, so either I'm reading the wrong files or I can't read the syntax.

    I hope this makes my question more clear. . .

  8. #8
    mmorse is offline Moderator

    Not in front of a machine right now, but I believe the file name starts with 50_ and the scores are down at the very end of that file.

  9. #9
    dwmtractor is offline Moderator

    OK, I see where you mean

    That file is 50_scores.cf. I did find that very score list way down the file. I had not read that far down because the stuff at the top was all the indicators of hot stocks, hot chicks, etc. The relevant section:
    Code:
    # make the Bayes scores unmutable (as discussed in bug 4505)
    score BAYES_00 0.0001 0.0001 -2.312 -2.599
    score BAYES_05 0.0001 0.0001 -1.110 -1.110
    score BAYES_20 0.0001 0.0001 -0.740 -0.740
    score BAYES_40 0.0001 0.0001 -0.185 -0.185
    score BAYES_50 0.0001 0.0001 0.001 0.001
    score BAYES_60 0.0001 0.0001 1.0 1.0
    score BAYES_80 0.0001 0.0001 2.0 2.0
    score BAYES_95 0.0001 0.0001 3.0 3.0
    score BAYES_99 0.0001 0.0001 3.5 3.5
    However, the top of the file ALSO says
    # Please don't modify this file as your changes will be overwritten with
    # the next update. Use @@LOCAL_RULES_DIR@@/local.cf instead.
    Does this mean that if I put the exact same syntax as above in local.cf (which is mostly commented out on my default install) it'll override the settings in 50_scores.cf?
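
    The score list above suggests that Bayes is not a sliding scale: each message's Bayes probability falls into one bucket, and the bucket's rule contributes a fixed score. In stock SpamAssassin the four numbers on each score line belong to the four score sets (Bayes and network tests off or on); the headers earlier in this thread show BAYES_00=-2.599, so the last column appears to be the one in use here. The Python sketch below models that lookup. The bucket boundaries are an assumption based on the rule names and the stock rule definitions, the function name is invented, and the overrides argument stands in for a local.cf line such as "score BAYES_99 4.5".

```python
# (upper probability bound, rule name, score from the last column above)
BAYES_BUCKETS = [
    (0.01, "BAYES_00", -2.599),
    (0.05, "BAYES_05", -1.110),
    (0.20, "BAYES_20", -0.740),
    (0.40, "BAYES_40", -0.185),
    (0.60, "BAYES_50", 0.001),
    (0.80, "BAYES_60", 1.0),
    (0.95, "BAYES_80", 2.0),
    (0.99, "BAYES_95", 3.0),
    (1.00, "BAYES_99", 3.5),
]


def bayes_rule(probability, overrides=None):
    """Return (rule name, score) for a Bayes spam probability in [0, 1]."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    overrides = overrides or {}
    for upper, name, score in BAYES_BUCKETS:
        if probability <= upper:
            # A local override replaces the stock score for that rule only.
            return name, overrides.get(name, score)
    raise AssertionError("unreachable: the last bucket covers 1.0")


print(bayes_rule(0.70))  # a 70% message gets a fixed bucket score, not 3.5 * 0.7

name, stock = bayes_rule(0.999)
_, raised = bayes_rule(0.999, overrides={"BAYES_99": 4.5})
# Effect on the 6.513-point message from the first post.
new_total = round(6.513 - stock + raised, 3)
print(name, stock, raised, new_total)
```

    With BAYES_99 raised to 4.5, that message would score 7.513 and clear the 6.6 threshold, while the scores of all other rules stay untouched.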

    And secondly, you mentioned turning on Razor as a good idea, but I don't see where I can turn it on. . .

    Thanks again for all your help on this and many other threads!

  10. #10
    mmorse is offline Moderator

    It always amazes me how I remember where stuff like that is.

    That's the idea, but there's always plenty of stuff to recheck after upgrades, so I don't bother personally. (Plus I like knowing what the defaults are every time; that way I can make suggestions to help others.)

    For others reading this article, remember you need 200 spam & 200 not-spam messages to even start Bayes filtering. See: CLI zmtrainsa - ZimbraWiki
    (Add your own networks to the trusted_networks section in /opt/zimbra/conf/spamassassin/local.cf)

    Note-the below isn't supported by zimbra:
    I put greylisting on my current build-but I'm happy enough with my spam levels that I didn't do razor/pyzor this time around.
    Improving Anti-spam system - Razor2 - ZimbraWiki

    I would tweak all the other settings as best you can first, then give yourself a month to assess and refine. Then see if you get any user complaints, and if you're still getting too much spam, go through that wiki for ideas. ('Too much spam' means different things to different people. To me, as long as it goes to junk I leave things alone, but if stuff goes to people's inboxes instead, then I tweak.)
    Last edited by mmorse; 09-05-2007 at 03:23 PM.
