Zimbra Collaboration Suite 7.0
Administrator's Guide
Open Source Edition

Index Store
Indexing and search are provided by Apache Lucene. Each message is automatically indexed as it enters the system. Each mailbox has its own index file.
The tokenizing and indexing process is not configurable by administrators or users.
Message tokenization
The process is as follows:
The mailbox server parses the message, including the header, the body, and all readable file attachments such as PDF files or Microsoft Word documents, in order to tokenize the words.
Tokenization breaks the text into individual words so that each word can be indexed. Certain common patterns, such as phone numbers, email addresses, and domain names, are tokenized as shown in Figure .
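The idea of tokenizing by word, with special handling for patterns such as email addresses, can be illustrated with a small sketch. This is not the Zimbra or Lucene implementation. The regular expressions, the function names tokenize and build_index, and the choice to index an address as a whole plus its local part and domain are all assumptions made for illustration only.

```python
import re

# Hypothetical patterns chosen for illustration; not Zimbra's actual analyzer.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
WORD = re.compile(r"[A-Za-z0-9]+")


def tokenize(text):
    """Return lowercase index tokens for text, expanding email addresses."""
    tokens = []
    pos = 0
    for match in EMAIL.finditer(text):
        # Plain words that appear before this address.
        tokens.extend(w.lower() for w in WORD.findall(text[pos:match.start()]))
        address = match.group(0).lower()
        local, domain = address.split("@", 1)
        # Assumption: index the address whole and in parts,
        # so that a search for either form can match.
        tokens.extend([address, local, domain])
        pos = match.end()
    # Plain words after the last address (or the whole text if none matched).
    tokens.extend(w.lower() for w in WORD.findall(text[pos:]))
    return tokens


def build_index(messages):
    """Map each token to the set of message ids that contain it."""
    index = {}
    for msg_id, text in messages.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(msg_id)
    return index
```

In this sketch, a search for a bare name would match both a message body containing that word and a message that mentions an address whose local part is that name, because both produce the same token. The real tokenization rules are fixed by the product and, as noted above, cannot be configured.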