My client had the problem that they used Thunderbird to transfer some mail folders to Zimbra and so the "internal date" for all the emails was set to the date of the transfer (Dec 4, 2006) - see
IMAP Sync and "received" date.
So I wrote a script which:
- identifies the database for a user
- dumps all the rows from the mail_item table (or just the rows for a specific folder)
- reads the dumped data and locates the raw in email under /opt/zimbra/store
- extracts the date from the raw email and converts it to yyyy-mm-dd hh:mm:ss
- creates an SQL update statement (converting yyyy-mm-dd hh:mm:ss to seconds)
- saves the SQL update statements to a file
- shows the commands to execute the file of SQL statements
Caveats:
- the script skips messages where the 'internal date' is not 04-12-2006 (dd-mm-yyyy format)
- assumes /opt/zimbra/store/0
- it worked for me - it may not for you!
- uses the Korn shell (some minor changes required for sh or bash)
Nothing like SQL and a bit of shell scripting...
Rgds - Angus
Here's the script:
Code:
#!/bin/ksh
SCRIPT=`basename $0`
USAGE="usage: [FOLDER_ID=nnnn] $SCRIPT username"
: ${DOMAIN:=yourdomain.com}
STORE_BASE=/opt/zimbra/store/0
if [ $# -ne 1 ]; then
echo $USAGE >&2
echo "(to list the folders for a user: echo -e \"sm username@${DOMAIN}\\\\ngaf\" | su - zimbra -c zmprov)"
exit 1
fi
USER=$1
USERNAME=${USER}@${DOMAIN}
# Define the output files.
MB_DUMP_FILE=/tmp/mb_${USER}.list
UPD_SQL_STATS=/tmp/sql_${USER}.list
[ "$FOLDER_ID" ] && UPD_SQL_STATS=/tmp/sql_${USER}_${FOLDER_ID}.list
TBL_DUMP_FILE=/tmp/mailitems_${USER}.dump
# Delete all the output files.
for file in $MB_DUMP_FILE $UPD_SQL_STATS $TBL_DUMP_FILE; do
[ -f $file ] && rm -f $file
done
ID=`echo "select id from mailbox where comment='${USERNAME}'" | su - zimbra -c "mysql -N zimbra"`
if [ -z "$ID" ]; then
echo "\nerror: can't find mailbox ID for user $USERNAME!!!"
exit 1
fi
MAILBOX_DB=mboxgroup${ID}
STORE_PATH=${STORE_BASE}/${ID}/msg/
echo "
Username : $USERNAME
Database : $MAILBOX_DB
Store path: $STORE_PATH"
# Columns in mailitem table:
# mailbox_id
# id
# type
# parent_id
# folder_id
# index_id
# imap_id
# date
# size
# volume_id
# blob_digest
# unread
# flags
# tags
# sender
# subject
# name
# metadata
# mod_metadata
# change_date
# mod_content
# mailbox_id id type parent_id folder_id index_id imap_id date size volume_id...
# 23 7223 5 NULL 7221 7223 7223 1165200534 4842 1...
col_list="id, from_unixtime(date,'%d-%m-%Y %l:%i:%s%p'), sender, subject"
where_clause="type=5"
[ "$FOLDER_ID" ] && where_clause="$where_clause and folder_id=$FOLDER_ID"
echo "\nDumping all emails for '$USERNAME' from $MAILBOX_DB to $MB_DUMP_FILE..."
echo "(where clause: $where_clause)"
echo "select $col_list from mail_item where $where_clause" |
su - zimbra -c "mysql --skip-column-names $MAILBOX_DB" > $MB_DUMP_FILE
email_cnt=`cat $MB_DUMP_FILE | wc -l | sed 's/ *//'`
echo "\
SET AUTOCOMMIT=0;
START TRANSACTION; " > $UPD_SQL_STATS
echo "\nReading $email_cnt emails $MB_DUMP_FILE - saving SQL update statements to $UPD_SQL_STATS..."
cat $MB_DUMP_FILE |
while read id date time cols; do
# echo "ID=$id; date=$date; time=$time; other cols=$cols"
# Check that there is only one file for the current ID.
file_cnt=`ls -l ${STORE_PATH}/*/${id}-*.msg 2> /dev/null | wc -l`
if [ "$file_cnt" = 0 ]; then
echo "error: no files found for ID $id!!!"
continue
fi
if [ "$file_cnt" -gt 1 ]; then
echo "error: more than one file found for ID $id!!!"
ls -l ${STORE_PATH}/*/${id}-*.msg
continue
fi
if [ "$date" != "04-12-2006" ]; then
echo "skip: date <> 04-12-2006"
continue
fi
# Check that only one row matches the current ID.
row_cnt=`echo "select count(*) as count from mail_item where id=$id" | su - zimbra -c "mysql --skip-column-names $MAILBOX_DB"`
# echo "row_cnt=$row_cnt"
if [ "$row_cnt" = 0 ]; then
echo "error: can't locate row for ID $id!!!"
continue
fi
if [ "$row_cnt" -gt 1 ]; then
echo "error: more than one row found for ID $id!!!"
continue
fi
# Extract the date from the raw email.
email_date=`sed -n "s/\r$//;s/^Date: //p" ${STORE_PATH}/*/${id}-*.msg | head -1`
if [ -z "$email_date" ]; then
echo "error: can't extract a date from the raw email for id $id!!!"
echo "file: `ls ${STORE_PATH}/*/${id}-*.msg`"
continue
fi
# Convert the date from the email to yyyy-mm-dd hh:mm:ss format.
email_yymmdd_hhmmss=`echo $email_date | sed -f /root/scripts/convert_date.sed`
if [ -z "$email_yymmdd_hhmmss" ]; then
echo "error: can't convert date ($email_date) for id $id!!!"
continue
fi
# Display SQL statement to update the mailbox.
echo "ID=$id: $email_date ==> $email_yymmdd_hhmmss"
upd_sql="update mail_item set date=unix_timestamp('${email_yymmdd_hhmmss}') where id=$id"
# echo "update SQL: $upd_sql"
echo "${upd_sql};" >> $UPD_SQL_STATS
# echo "$upd_sql" | su - zimbra -c "mysql $MAILBOX_DB"
done
echo "COMMIT;" >> $UPD_SQL_STATS
echo "
Command to backup the 'mail_item' table:
echo \"select * into OUTFILE '$TBL_DUMP_FILE' from mail_item\" | su - zimbra -c \"mysql $MAILBOX_DB\"
Command to execute the update statements:
su - zimbra -c \"mysql $MAILBOX_DB\" < $UPD_SQL_STATS
" And here is the sed script to re-format the date (save as /root/scripts/convert_date.sed):
Code:
# Fri, 16 Sep 2005 08:30:39 +1000
s/^[A-Z][a-z][a-z], \([0-9][0-9]\) \([A-Z][a-z][a-z]\) \([0-9][0-9][0-9][0-9]\) \([0-9][0-9]:[0-9][0-9]:[0-9][0-9]\) +\([0-9]*\)$/\3-\2-\1 \4/
s/Jan/01/
s/Feb/02/
s/Mar/03/
s/Apr/04/
s/May/05/
s/Jun/06/
s/Jul/07/
s/Aug/08/
s/Sep/09/
s/Oct/10/
s/Nov/11/
s/Dec/12/
And here is some sample output:
Code:
FOLDER_ID=7221 ~/scripts/dump_zimbra_mailbox.ksh gray
Username : gray@yourdomain.com
Database : mboxgroup23
Store path: /opt/zimbra/store/0/23/msg/
Dumping all emails for 'gray@yourdomain.com' from mboxgroup23 to /tmp/mb_gray.list...
(where clause: type=5 and folder_id=7221)
Reading 52 emails /tmp/mb_gray.list - saving SQL update statements to /tmp/sql_gray_7221.list...
ID=7223: Thu, 12 Oct 2006 11:03:07 +1000 ==> 2006-10-12 11:03:07
ID=7225: Thu, 27 Jul 2006 09:31:22 +1000 ==> 2006-07-27 09:31:22
...
ID=7313: Fri, 16 Sep 2005 08:30:39 +1000 ==> 2005-09-16 08:30:39
ID=7314: Fri, 16 Sep 2005 08:41:51 +1000 ==> 2005-09-16 08:41:51
Command to backup the 'mail_item' table:
echo "select * into OUTFILE '/tmp/mailitems_gray.dump' from mail_item" | su - zimbra -c "mysql mboxgroup23"
Command to execute the update statements:
su - zimbra -c "mysql mboxgroup23" < /tmp/sql_gray_7221.list