  #1 (permalink)  
Old 04-23-2007, 02:12 PM
Active Member
 
Posts: 26
Spam being scored with BAYES_00

I have been fighting poor spam filtering performance on my Zimbra server for months now, and it seems the most frustrating aspect of it is that my bayes database gets poisoned with spam within a couple of weeks of being reset.

I last reset it about two weeks ago, and today I noticed a lot of obvious spam, following the same pattern as the spam I have been marking as Junk for the past two weeks, ending up with a BAYES_00 test hit. Is there anything I can do to make the bayesian filter not recognize these as ham? I'm not exactly clear as to why they are hitting on this test, actually. Are all messages trained as ham until such time as they are marked as spam? I did an initial import of about 3,000 ham messages, but since then I haven't trained anything as ham. Yet, clearly it thinks these are ham.

What about giving a BAYES_00 test a score of -0.001 instead of -2.599? That would catch a TON of our spam, and I'm not sure that it would allow many ham to be recognized as spam - the BAYES_00 test seems to catch more junk than real ham anyway.

Any other good strategies out there? I am also thinking of upping the score for SPAMCOP_BL to like 3.5. I don't see too many false positives coming from that test.
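To make the effect of those two score changes concrete, here is a rough sketch of the arithmetic. Only the BAYES_00 values (-2.599 and -0.001) and the 3.5 for SPAMCOP_BL come from this post; the current SPAMCOP_BL score, the "OTHER_RULES" total, and the 5.0 threshold are made-up numbers for illustration and may not match a real Zimbra install.

```python
# Illustration only. Everything marked "assumed" is a hypothetical value,
# not something taken from a real SpamAssassin or Zimbra configuration.

REQUIRED = 5.0  # assumed spam threshold


def total_score(hits, scores):
    """Sum the scores of the rules a message hit."""
    return sum(scores[rule] for rule in hits)


# Scores before the change. BAYES_00 is the value quoted in the post.
current = {
    "BAYES_00": -2.599,
    "SPAMCOP_BL": 1.5,   # assumed current score
    "OTHER_RULES": 4.0,  # assumed total of every other rule that fired
}

# Scores after the proposed change (both values come from the post).
proposed = {**current, "BAYES_00": -0.001, "SPAMCOP_BL": 3.5}

# A spam message that hits BAYES_00, the SpamCop list, and some other rules.
hits = ["BAYES_00", "SPAMCOP_BL", "OTHER_RULES"]

for label, scores in (("current", current), ("proposed", proposed)):
    score = total_score(hits, scores)
    verdict = "spam" if score >= REQUIRED else "not spam"
    print(f"{label}: {score:.3f} -> {verdict}")
```

With these assumed numbers the same message moves from below the threshold to above it, which is the effect I am hoping for. The flip side is that real ham hitting BAYES_00 would lose almost all of its 2.6-point cushion.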

Overall, our filter performs at about 74% efficiency on my inbox and lets through a lot of obvious stock and drug-sale spam (not just the GIF-based ones, but the text-based ones too). One user had 800 spam to 13 ham waiting for her this morning from the weekend. Users are starting to revolt. :-/

All ideas appreciated!
  #2 (permalink)  
Old 04-23-2007, 02:44 PM
Former Zimbran
 
Posts: 5,606

What version are you using?

Early versions of 4.5 had a bug with BAYES.
  #3 (permalink)  
Old 04-23-2007, 02:47 PM
Active Member
 
Posts: 26

I am using 4.5.3. I do recall the bug, but I think that was prior to 4.5.3. I think the bug had something to do with not preserving the bayes database, right? I have reset it multiple times since then anyway. Or was it some other bug?
  #4 (permalink)  
Old 04-23-2007, 03:40 PM
Former Zimbran
 
Posts: 5,606

Yeah, it was 4.5.1 I think.

Try upgrading to 4.5.4 (4.5.5 is coming soon) and try again.
  #5 (permalink)  
Old 04-23-2007, 03:55 PM
Active Member
 
Posts: 26

According to the release notes there were no changes to the spam subsystem for 4.5.3 -> 4.5.4. I am trying to avoid upgrading for the sake of upgrading. I don't believe this would affect what I am seeing.

Does anyone know when messages are identified as ham? Is there any other way a message would be able to get a BAYES_00 hit unless it or messages very similar to it had been previously identified as ham?
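To show why I am asking, here is a simplified sketch of how a Bayesian filter could arrive at a near-zero spam probability. It uses a plain naive Bayes combination, which is an assumption made for illustration and not necessarily how SpamAssassin combines tokens internally. The function names and all token counts are hypothetical; only the 3,000 ham figure comes from my initial import. BAYES_00 is treated here as the bucket for an estimated spam probability below about 1%.

```python
from math import prod


def token_spam_prob(spam_count, ham_count, n_spam, n_ham):
    """Estimate how likely a message containing this token is to be spam.

    spam_count / ham_count: learned messages of each kind containing the token.
    n_spam / n_ham: total learned messages of each kind.
    """
    spam_freq = spam_count / n_spam
    ham_freq = ham_count / n_ham
    if spam_freq + ham_freq == 0:
        return 0.5  # token never seen before: neutral
    p = spam_freq / (spam_freq + ham_freq)
    # Clamp so a single token can never force an absolute 0 or 1.
    return min(max(p, 0.01), 0.99)


def combine(probs):
    """Naive Bayes combination of per-token spam probabilities."""
    p_spam = prod(probs)
    p_ham = prod(1 - p for p in probs)
    return p_spam / (p_spam + p_ham)


# Hypothetical database: 3,000 learned ham and 100 learned spam messages.
N_HAM, N_SPAM = 3000, 100

# Three tokens from a stock spam, each seen far more often in learned ham
# (300 messages) than in learned spam (2 messages).
tokens = [token_spam_prob(2, 300, N_SPAM, N_HAM) for _ in range(3)]

# The combined probability lands under 0.01, i.e. the BAYES_00 range.
print(f"combined spam probability: {combine(tokens):.4f}")
```

In other words, the result only comes out this low if the tokens in these spam messages have somehow built up large ham counts. One possible route, if auto-learning is enabled on the server (an assumption; I have not confirmed it), is that spam scoring low enough gets learned as ham automatically.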
  #6 (permalink)  
Old 04-23-2007, 04:33 PM
Former Zimbran
 
Posts: 5,606

No, but there is a script that repairs the BAYES DB in case it got moved out of the way.
  #7 (permalink)  
Old 04-24-2007, 12:07 PM
Active Member
 
Posts: 26

Since I've reset my bayes db several times since then, I don't have any use for that script any more. When the bug first happened, I just moved the db back myself. Anyway, that's not related to the problem I am seeing, which is that within 1-2 weeks of resetting the database, obvious spam is getting hits on BAYES_00. My question is: how is the bayes database learning that these are to be scored as very strong ham?