When I spellcheck a text with accents (as in French or Spanish), the accented characters are treated as word separators.
The problem is twofold: the regexp in aspell.php, and the text extracted from the web client.
The first part is solved as described in the wiki, that is, the regexp used to convert the text to a list of words becomes:
$words = preg_split('/[^\w\'\xc0-\xfd-]+/', $text);
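To illustrate what the extra \xc0-\xfd range changes, here is a minimal sketch that ports the pattern to a JavaScript regular expression. It is not the aspell.php code itself. The sample sentence is an assumption (the word is guessed from the "ex" and "cuter" fragments mentioned in this post), and the sketch assumes the accented letters arrive as single characters in the U+00C0 to U+00FD range, as they would in Latin-1.

```javascript
// Port of the PHP pattern to JavaScript, for illustration only.
// Assumption: accented letters are single code points in U+00C0-U+00FD.
const plainSplit = /[^\w'-]+/;            // without the range: accents act as separators
const accentSplit = /[^\w'\xc0-\xfd-]+/;  // with the range: accents stay inside words

function splitWords(text, pattern) {
  // Drop empty strings produced by leading or trailing separators.
  return text.split(pattern).filter((w) => w.length > 0);
}

// Hypothetical sample text; \u00e9 is "e" with an acute accent.
const sample = "il faut ex\u00e9cuter le programme";

console.log(splitWords(sample, plainSplit));
// [ 'il', 'faut', 'ex', 'cuter', 'le', 'programme' ]

console.log(splitWords(sample, accentSplit));
// [ 'il', 'faut', 'exécuter', 'le', 'programme' ]

// Once the accent has already been replaced by "?", no pattern can recover the word.
console.log(splitWords("il faut ex?cuter", accentSplit));
// [ 'il', 'faut', 'ex', 'cuter' ]
```

The last call shows why fixing the regexp alone is not enough: if the text reaches the server with "?" in place of the accent, the split is correct and the word is still broken.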
To test it, I can access the test page directly as:
and enter a text with accents. This works fine.
The remaining problem is the text passed as an argument from Zimbra. The web client replaces the accented characters with "?", so, of course, a word like
which the PHP code correctly splits into the words "ex" and "cuter".
I traced the problem down to AjxStringUtil.convertHtml2Text(), but I don't see anything wrong with the code. Could it be that Element.toString() is behaving in an unexpected way?
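One way to narrow this down might be to log the code points of the string before and after each conversion step. The helper below is a hypothetical sketch, not part of Zimbra or of AjxStringUtil, and the sample strings are assumptions. If the output already shows U+003F (a literal "?") inside the browser, the accent is lost on the client side; if it still shows U+00E9, the loss more likely happens later, for example when the request is encoded or decoded.

```javascript
// Hypothetical debugging helper (not a Zimbra API): list each character
// together with its Unicode code point in U+XXXX form.
function describeChars(text) {
  return Array.from(text).map(
    (ch) => ch + "=U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// Accent intact: the third character is U+00E9.
console.log(describeChars("ex\u00e9"));
// [ 'e=U+0065', 'x=U+0078', 'é=U+00E9' ]

// Accent already replaced: the third character is U+003F, a plain question mark.
console.log(describeChars("ex?"));
// [ 'e=U+0065', 'x=U+0078', '?=U+003F' ]
```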