One of the useful features of Zimbra is that duplicate emails only get stored in the mailstore once. Tastes great, less filling.
Now, suppose a bunch of accounts are moved to a new server/mailstore via zmmailboxmove and those accounts received much the same emails. Oops, mailstore size skyrockets! :eek:
Is there a utility or process to de-dup a given mailstore?
De-dupe is per mail store; users moved to a separate mail store get a new SQL database there.
Are you saying that if users A and B (whose mailboxes start out on server C) have been emailing each other big PowerPoint files and then have their mailboxes moved to server D, the hard links for those PowerPoint files no longer exist on server D (and so the store size goes up)?
We've never tested that on our end.
All the best,
My scenario is the following:
Users A,B,C,D on server S1 are on a mailing list and keep all their messages, so all the messages they receive are the same for each account.
Say User D is moved to server S2 via zmmailboxmove, where S2 uses a different mailstore. Then User C is moved to S2. The mailstore on S2 will now be twice the size of the one on S1, because S2 holds a separate copy of every message for each of C and D.
I'm seeing this in practice: I've moved roughly half of my users (in terms of number AND size) to a new server, yet the mailstore on the new server is about twice the size of the one on the old server. I have users that get CC'd on a large number of emails.
I guess I would say I am not surprised at that.
Preserving the single instance store during a mailbox move would require the move script to compare the blobs in the mailbox being moved to every blob in the store on the target server in order to decide whether to create a new hard link or a new blob.
That sounds non-trivial in terms of programming complexity and very, very demanding of compute resources.
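For what it's worth, the comparison does not have to be blob-against-every-blob if each blob is hashed once and looked up in an index. The following is only a rough sketch of that idea on plain files, not anything zmmailboxmove does: `sha256_of`, `build_index` and `place_blob` are hypothetical names, and it assumes duplicate messages are byte-identical files on one filesystem.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_index(store_dir):
    """Map digest -> one existing path for every file under store_dir."""
    index = {}
    for root, _dirs, files in os.walk(store_dir):
        for name in files:
            path = os.path.join(root, name)
            index.setdefault(sha256_of(path), path)
    return index


def place_blob(incoming, dest, index):
    """Hard-link dest to a matching blob if one exists, else copy it in.

    Hypothetical helper: 'incoming' is a file arriving with a moved
    mailbox, 'dest' is where it should live in the target store.
    """
    digest = sha256_of(incoming)
    existing = index.get(digest)
    if existing is not None:
        os.link(existing, dest)      # same content already stored: add a link
        return "linked"
    shutil.copyfile(incoming, dest)  # new content: store a new blob
    index[digest] = dest
    return "copied"
```

The index is built once per target store, so each incoming blob costs one hash plus one dictionary lookup. Hashing the whole target store is still a lot of I/O on a big server, so the resource concern does not go away entirely.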
Veronica has already pointed out that the single-instance store is a creature of each mailbox server, not of a Zimbra multi-server farm, so this again seems "WAD" to me. ("Working As Designed" in old IBM mainframe-speak).
Wouldn't hurt to fill out an RFE though; I'd vote for it. :rolleyes:
But the takeaway for me here is to be careful about correctly sizing a Zimbra mailbox server up front for the expected life of the server, so as to avoid the need to move mailboxes unless absolutely necessary. Or alternatively, to use 64-bit Xen deployments to move the Zimbra virtual server to new hardware when needed to avoid having to move mailboxes.
Hope that helps,
I agree, overly complex. Makes more sense to do a batch utility that combs through the mail store and de-dupes. Which is what I was hoping had already been written. :D
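To make the batch idea concrete, here is a minimal sketch of what such a utility could look like on a plain directory tree. It is not an existing Zimbra tool: `find_duplicates` and `relink` are made-up names, it assumes duplicates are byte-identical files on the same filesystem, and it knows nothing about Zimbra's database, so at most try it on a copy of a store.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(store_dir):
    """Return groups of paths with identical content on different inodes."""
    by_size = defaultdict(list)
    for root, _dirs, files in os.walk(store_dir):
        for name in files:
            path = os.path.join(root, name)
            if not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_digest = defaultdict(dict)
        for path in paths:
            st = os.stat(path)
            # Key by inode so names that are already hard-linked count once.
            by_digest[file_digest(path)].setdefault((st.st_dev, st.st_ino), path)
        for inodes in by_digest.values():
            if len(inodes) > 1:
                groups.append(sorted(inodes.values()))
    return groups


def relink(group, dry_run=True):
    """Replace every path after the first with a hard link to the first."""
    keeper, changed = group[0], []
    for path in group[1:]:
        if not dry_run:
            temp = path + ".dedup-tmp"
            os.link(keeper, temp)   # create the new link under a temp name
            os.replace(temp, path)  # then swap it over the duplicate
        changed.append(path)
    return changed
```

Grouping by size first avoids hashing files that cannot possibly match. `relink` defaults to a dry run and only reports what it would change, which seems the safer default for something pointed at a mail store.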
Originally Posted by LMStone
In that same vein of thought: do emails get duped when doing an IMAP migration to Zimbra? A batch script that de-duped would be very helpful for reducing the message store size in that scenario too.
Yes, and in PST migration too.
Originally Posted by cayaraa
However, if you use "manual hardlinks" instead of integrated SIS, what happens if one user deletes the mail the hardlink points to?
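At the filesystem level, at least, a hard link does not "point to" another file: every link is an equal name for the same data, and the data is freed only when the last name is removed. A small sketch on a POSIX filesystem, using made-up file names, shows this. It says nothing about how Zimbra's own bookkeeping would react to links created behind its back.

```python
import os
import tempfile


def delete_one_link_demo():
    """Create two hard links to one file, delete one, read the other."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "user_a.msg")   # hypothetical blob names
        second = os.path.join(tmp, "user_b.msg")
        with open(first, "wb") as handle:
            handle.write(b"shared message body")
        os.link(first, second)                    # second name, same inode
        links_before = os.stat(second).st_nlink
        os.remove(first)                          # "user A deletes the mail"
        links_after = os.stat(second).st_nlink
        with open(second, "rb") as handle:
            data = handle.read()                  # still readable via user B
        return links_before, links_after, data
```

The link count drops from 2 to 1 and the remaining name still reads back the full message. The thing to watch for is the reverse case: anything that modifies the file in place changes it for every linked name at once.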
Did anyone create an RFE for this? I couldn't find one and would gladly fill it out as this is a feature that would be incredibly useful for us.