Zimbra on Hadoop
We are a small company trying to use ZCS, mainly for email. Internally we use Hadoop as our distributed file system, and we would like to keep using it for ZCS. Has anyone integrated Zimbra with Hadoop? Specifically, we want all data used by Zimbra to live on the Hadoop Distributed File System (HDFS) instead of the local filesystem, so the message store and everything else should reside on Hadoop. Any help in this regard would be greatly appreciated.
Welcome to the forums :)
Well, if you want the datastore on an alternative file system, why not just present the storage and mount it under /opt/zimbra/store?
Thanks for the quick reply.
I am very new to Zimbra, so please excuse me if these are naive questions. I looked at the option of mounting Hadoop as a regular file system using FUSE. There are a couple of issues that I can't find answers for:
1. Has anyone done any performance testing of Zimbra on a FUSE-mounted Hadoop file system? Our email traffic is fairly significant (and includes lots of big messages), so this is a key concern.
2. Does all data that Zimbra creates or uses reside under /opt/zimbra/store, including any temporary scratch data? Basically, if a particular machine running Zimbra goes down for whatever reason, can we simply bring up another machine in its place?
Thanks in advance!
I haven't worked with Hadoop specifically, so I can't answer the question about performance, but I would imagine you would set it up the same way you would for a ZCS cluster on a SAN, only with a different filesystem.
Cluster Install for Single-Node Configuration For Red Hat Cluster Suite Integration
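On the performance question, one rough way to compare a FUSE-mounted store against local disk is to time writes and reads of message-sized files in each location. The following is only a sketch under stated assumptions: the function name `time_blob_io`, the file count, and the 1 MB file size are illustrative choices, not anything Zimbra or Hadoop prescribes, and the temporary directory should be swapped for the actual mount point being tested.

```python
import os
import tempfile
import time


def time_blob_io(directory, count=20, size_bytes=1024 * 1024):
    """Write, then read back, `count` files of `size_bytes` each.

    Returns rough throughput figures in MB/s. Hypothetical helper for
    illustration; it does not mimic Zimbra's real I/O pattern.
    """
    payload = os.urandom(size_bytes)
    paths = [os.path.join(directory, f"bench_{i}.msg") for i in range(count)]

    start = time.perf_counter()
    for path in paths:
        with open(path, "wb") as fh:
            fh.write(payload)
            fh.flush()
            # Ask the OS to flush to the backing store so the timing is not
            # just a page-cache write. Some FUSE file systems may not
            # support this; remove the call if it raises an error.
            os.fsync(fh.fileno())
    write_secs = max(time.perf_counter() - start, 1e-9)

    start = time.perf_counter()
    bytes_read = 0
    for path in paths:
        with open(path, "rb") as fh:
            bytes_read += len(fh.read())
    read_secs = max(time.perf_counter() - start, 1e-9)

    for path in paths:
        os.remove(path)

    total_mb = count * size_bytes / (1024 * 1024)
    return {
        "write_mb_s": total_mb / write_secs,
        "read_mb_s": total_mb / read_secs,
        "bytes_read": bytes_read,
    }


if __name__ == "__main__":
    # Assumption: replace the temporary directory with the mount under test.
    with tempfile.TemporaryDirectory() as scratch:
        print(time_blob_io(scratch))
```

Running it once against local disk and once against the mount gives a first-order comparison. The read figure is likely optimistic, because the files were just written and may still be cached, and a mail store also does many small metadata operations that this sketch does not exercise.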
File lock issues?
I was just thinking about what could go wrong with "simply switching" over to another machine...
Is it because Zimbra could be holding locks on certain files on the mount, which would then be an issue when another machine tries to access them?
Maybe someone with more insight would be able to confirm this.
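Whether Zimbra actually depends on file locks in the store is not established in this thread, but it is easy to probe whether a given mount honours advisory locks at all. A minimal sketch, assuming a Unix host and using Python's standard `fcntl.flock`; the function name `supports_exclusive_lock` and the probe file name are made up for illustration.

```python
import fcntl
import os
import tempfile


def supports_exclusive_lock(directory):
    """Return True if a second exclusive flock on the same file is refused.

    Hypothetical probe: opens the same file twice and tries to take a
    non-blocking exclusive lock through each handle.
    """
    path = os.path.join(directory, "lock_probe.tmp")
    first = open(path, "w")
    second = open(path, "w")
    try:
        fcntl.flock(first, fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return True   # second lock refused: locks are enforced here
        return False      # both locks granted: locks are not honoured
    except OSError:
        return False      # the file system rejected the first lock request
    finally:
        first.close()
        second.close()
        os.remove(path)


if __name__ == "__main__":
    # Assumption: replace the temporary directory with the mount under test.
    with tempfile.TemporaryDirectory() as scratch:
        print(supports_exclusive_lock(scratch))
```

This only checks locking from a single host. The failover concern raised above is about two machines, so a fuller test would hold the lock on one machine and attempt it from another against the same shared mount.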
I like the idea of using one of the "slacker databases" like CouchDB or Hadoop DFS to hold the store filesystems, but I don't think it would be easy to implement in a way that is transparent to Zimbra. However, there is a closed RFE that might make this possible:
Originally Posted by tagman
Bug 30550 – Support for third party blob stores
There are some interesting ideas there. If you do LDAP replication and MySQL replication, and keep the blob store in a replicated file store, this could be a very flexible and powerful system.
Check it out.