Actually, most of the time DSPAM is getting it right even when the mail is classified incorrectly.
E.g. yesterday DSPAM scored 60% of the submitted spam correctly. However, even though DSPAM presumably added 8.7 to the scores of those emails when they came in, SpamAssassin must have subtracted more than 2.1 (the gap between 8.7 and the 6.6 spam threshold) due to Bayes and other elements (such as DNSWL).
If you increased the DSPAM score from 8.7, lowered the spam threshold from 6.6, or reduced/eliminated some of the negative SpamAssassin scores, then those three true positives would have gone straight into people's Junk folders. They wouldn't have been submitted by your users. The false negatives might still have been submitted, which would make DSPAM's accuracy look worse, even though the overall accuracy of your antispam system would be better.
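To make the arithmetic concrete, here is a rough sketch of how the combined score plays out. The 8.7 and 6.6 figures are the ones mentioned above; the function name, the "other rules" totals, and the assumption that mail is junked when its total reaches the threshold are mine, purely for illustration, and are not taken from any actual Zimbra or SpamAssassin config.

```python
DSPAM_SCORE = 8.7      # what DSPAM presumably adds when it calls a message spam
SPAM_THRESHOLD = 6.6   # total score at which mail is filed into Junk

def is_junk(dspam_says_spam, other_rules_total,
            dspam_score=DSPAM_SCORE, threshold=SPAM_THRESHOLD):
    """Hypothetical helper: combine the DSPAM score with the sum of all
    other rule scores (Bayes, DNSWL, ...) and compare to the threshold."""
    total = (dspam_score if dspam_says_spam else 0.0) + other_rules_total
    return total >= threshold

# How much the other rules can subtract before a DSPAM hit is overruled.
margin = round(DSPAM_SCORE - SPAM_THRESHOLD, 1)

# DSPAM is right, but the other rules subtract 2.5, so the mail reaches the Inbox.
overruled = is_junk(True, -2.5)

# Same message with a higher DSPAM score, or with a lower threshold.
fixed_by_score = is_junk(True, -2.5, dspam_score=10.0)
fixed_by_threshold = is_junk(True, -2.5, threshold=6.0)
```

With these made-up numbers the message slips through in the first case and lands in Junk in the other two, which is the situation the three submitted true positives were presumably in.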
However, all of this ignores the impact of retraining over time. Regardless of whether you use SA + DSPAM or just DSPAM, it might be a little strange that retraining is only happening on email that gets misclassified. It seems to me that proper retraining should be done using representative samples of actual ham/spam, not just the subset that gets sorted incorrectly. It strikes me as especially distorting with respect to the very small and unrepresentative proportion of ham that actually gets used for retraining.
I am not an expert in this area but it seems to me that it would be more valid to retrain using a combination of reported spam/ham, and weighted samples of the spam/ham which were (presumably) classified correctly, and therefore weren't reported.
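As a rough illustration of that idea, the sketch below mixes all user-reported mail with a random sample of the unreported mail, using a separate sampling fraction per class. Everything here is hypothetical: the function name, the (message_id, label) tuples, and the fractions are assumptions for the example, and it only selects messages. Feeding them to SA or DSPAM would be a separate step.

```python
import random

def build_retraining_set(reported, unreported, sample_fraction, seed=0):
    """Return all reported messages plus a per-class random sample of
    the unreported (presumably correctly classified) messages.

    reported, unreported: lists of (message_id, label) tuples, where
    label is "ham" or "spam".
    sample_fraction: dict mapping each label to the fraction to sample.
    """
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    retraining = list(reported)
    for label in ("ham", "spam"):
        pool = [m for m in unreported if m[1] == label]
        k = round(len(pool) * sample_fraction[label])
        retraining.extend(rng.sample(pool, k))
    return retraining

# Made-up data: 3 reported messages, 80 unreported ham, 20 unreported spam.
reported = [("r1", "spam"), ("r2", "spam"), ("r3", "ham")]
unreported = [(f"h{i}", "ham") for i in range(80)] + \
             [(f"s{i}", "spam") for i in range(20)]

training = build_retraining_set(reported, unreported,
                                {"ham": 0.1, "spam": 0.5})
ham_count = sum(1 for _, label in training if label == "ham")
spam_count = len(training) - ham_count
```

Here the retraining set ends up with 9 ham and 12 spam instead of the 1 ham and 2 spam that were reported, so the ham side is no longer represented by a single misfiled message.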
To overcome the potential distortion inherent in the current scheme, you could periodically collect your own corpora of representative ham/spam at your site and then use them to train SA and DSPAM.
However, in practice, the overall accuracy of the antispam system is so high that I don't worry about it too much, and if I were to try to tweak it further, I'd probably look into using DCC, Pyzor, and/or Razor as described in
Improving Anti-spam system - Zimbra :: Wiki