Nuclear Elephant: DSPAM offers the two-level Bayesian database suggested by dwmtractor. It ships with Zimbra, but has been disabled since 4.5 because it appeared unstable under load. Barracuda claims to have made it work somewhat reliably. They have some really old patches posted at
Spam Filter / Spam Firewall / Web Filter / Spam Appliance / Load Balancer / Content Filter / Email Archiver
Hmm, after a quiet period, DSPAM 3.8 was released in March, though the changelog shows no activity since December 2006. Not sure if anyone at Zimbra has been tracking it, or if they just gave up on it.
I like the ideas above. I can post an RFE if no one else is interested in doing so. (I recently got really busy, but that could change again.)