Well, this is my personal opinion. I'm still running 3.1.0, but I suppose that makes no difference.
1. Yes
2. DSPAM is statistical analysis. It will not catch any spam until it gets trained. Moreover, I do not think that DSPAM alone will move spam to the junk folder. It just adds some points (only 0.5 by Zimbra default) to the SA total score.
3. See 2.
4. Good question. Sure, you can use the built-in web-based training, but I find it quite exhausting. Especially if your Zimbra is a production server, you will depend on how well your users do the training, and that is certainly not good. Zimbra alone does some kind of auto-training: messages with a sufficient total score end up in junk, and junk is used in the nightly training job. This is good but not quite sufficient if your users do not do their part of the training with spam that got into the inbox. BTW, how will you force users to train spam/ham if they use POP3? It would be great if Zimbra had shared folders and users could easily move spam/ham to special shared folders used for training, or if training could be based on message IDs sent to a special location. What do you think of it, Zimbra folks?
5. Another question for the Zimbra folks: does Zimbra use SA auto-training? Is the bayes_auto_learn switch working? And you are right, cdyer, Bayes needs to be trained on both spam and ham to be effective. See
http://wiki.apache.org/spamassassin/BasicConfiguration, bayes_auto_learn.
6. Phoenix is right.
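To illustrate point 2, here is a rough Python sketch of how the SA total score is built up from individual rule hits. Only the 0.5 for DSPAM comes from what I wrote above; the rule names, the other scores and the 5.0 threshold (the stock SA required_score) are my assumptions for illustration, not taken from Zimbra's actual config.

```python
from typing import Dict, Iterable

# Illustrative rule weights. Only the DSPAM value (0.5) is from the post;
# the rule names and the other numbers are hypothetical.
RULE_SCORES: Dict[str, float] = {
    "DSPAM_SPAM": 0.5,             # Zimbra default weight mentioned above
    "BAYES_99": 3.5,               # hypothetical weight
    "HYPOTHETICAL_NET_TEST": 1.5,  # hypothetical network test
}


def total_score(hits: Iterable[str], scores: Dict[str, float] = RULE_SCORES) -> float:
    """Sum the weights of all rules that fired on a message."""
    return sum(scores.get(name, 0.0) for name in hits)


def is_spam(hits: Iterable[str], threshold: float = 5.0) -> bool:
    """A message is tagged as spam once the total reaches the threshold."""
    return total_score(hits) >= threshold


if __name__ == "__main__":
    # DSPAM alone stays far below the threshold.
    print(total_score(["DSPAM_SPAM"]), is_spam(["DSPAM_SPAM"]))
    # Several rules together push the message over it.
    hits = ["DSPAM_SPAM", "BAYES_99", "HYPOTHETICAL_NET_TEST"]
    print(total_score(hits), is_spam(hits))
```

So with the default weight, a DSPAM hit only matters when other rules have already brought the message close to the threshold.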
I wrote myself a small Windows/Samba app that uses my mail archive (see postfix always_bcc). It reads header info from all messages in the archive and stores it in a database (including SA and DSPAM scores, among others). Then, in a table, I can clearly see the messages and their properties, open them in Notepad to check them, and copy messages to special disk folders which are then used for automatic SA/DSPAM training. This training is based on cron, zmtrainsa and sa-learn. I can sort and filter messages by SA score, so I can (for example) easily check every day the messages with a score from 2 to 5.6, which are somewhat suspicious, and train on them appropriately. I think there is room for the Zimbra team to make a similar feature; I'm not able to do it in Linux.
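The filtering part can be sketched roughly like this in Python. This is not my actual app, just a minimal illustration; I am assuming the score sits in an X-Spam-Status header in the form "score=3.2" (older SA versions wrote "hits="), and the function names are made up for the example.

```python
import re
from email import message_from_string
from typing import Iterable, List, Optional, Tuple

# Assumed header format: "X-Spam-Status: No, score=3.2 required=5.6 ..."
_SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")


def spam_score(raw_message: str) -> Optional[float]:
    """Return the SA score found in the headers, or None if there is none."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status")
    if status is None:
        return None
    match = _SCORE_RE.search(str(status))
    return float(match.group(1)) if match else None


def suspicious(
    raw_messages: Iterable[str], low: float = 2.0, high: float = 5.6
) -> List[Tuple[float, str]]:
    """Pick messages whose score lies in [low, high], highest score first."""
    picked = []
    for raw in raw_messages:
        score = spam_score(raw)
        if score is not None and low <= score <= high:
            picked.append((score, raw))
    return sorted(picked, key=lambda item: item[0], reverse=True)
```

Messages picked this way still need a human look before they are copied to the spam or ham training folder; the score range only narrows down what is worth checking.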
Personally, I use an SA threshold of 5.6 (28 in Zimbra). After some training, I set both the BAYES_99 and DSPAM spam scores to 4.4. This is really better than setting the total threshold too low, because of false positives. It also seems to me that DSPAM almost never produces false positives, so it is really worth increasing its weight. Of course, the SA network tests are crucial too, as are additional SA rulesets from
www.rulesemporium.com. Now I see almost no spam. I'm still quite new to Linux, but there is a lot of documentation about SA, so I think anyone can tailor it to their needs.
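For anyone wondering how 28 in Zimbra becomes 5.6 in SA, here is the arithmetic as a small Python sketch. The factor of 20 is my assumption, derived from the 28 and 5.6 pair above (Zimbra takes a percentage); the function name is made up.

```python
# Assumed scale: the Zimbra percentage maps onto an SA score range of 0..20,
# which is consistent with 28 -> 5.6 from the post.
SA_SCALE = 20.0


def percent_to_sa_threshold(percent: float) -> float:
    """Convert a Zimbra spam percentage to the equivalent SA threshold."""
    return percent * SA_SCALE / 100.0


threshold = percent_to_sa_threshold(28)

BAYES_99_SCORE = 4.4  # weight from the post
DSPAM_SCORE = 4.4     # weight from the post

# Neither classifier alone reaches the threshold...
single_hit_is_spam = BAYES_99_SCORE >= threshold
# ...but when both agree, the message is tagged without any other rule.
both_hits_are_spam = BAYES_99_SCORE + DSPAM_SCORE >= threshold

print(threshold, single_hit_is_spam, both_hits_are_spam)
```

That is the idea behind 4.4: a single classifier still needs a little help from other rules (1.2 points), while both together are enough on their own.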
And to the Zimbra team, I really appreciate your product, keep up the good work, waiting eagerly for new features!