View Single Post
  #2 (permalink)  
Old 01-29-2007, 04:22 AM
area area is offline
Active Member
 
Posts: 47
Default I've just written such a script...

My client had the problem that they used Thunderbird to transfer some mail folders to Zimbra and so the "internal date" for all the emails was set to the date of the transfer (Dec 4, 2006) - see IMAP Sync and "received" date.

So I wrote a script which:
- identifies the database for a user
- dumps all the rows from the mail_item table (or just the rows for a specific folder)
- reads the dumped data and locates the raw in email under /opt/zimbra/store
- extracts the date from the raw email and converts it to yyyy-mm-dd hh:mm:ss
- creates an SQL update statement (converting yyyy-mm-dd hh:mm:ss to seconds)
- saves the SQL update statements to a file
- shows the commands to execute the file of SQL statements

Caveats:
- the script skips messages where the 'internal date' is not 04-12-2006 (dd-mm-yyyy format)
- assumes /opt/zimbra/store/0
- it worked for me - it may not for you!
- uses the Korn shell (some minor changes required for sh or bash)

Nothing like SQL and a bit of shell scripting...

Rgds - Angus

Here's the script:
Code:
#!/bin/ksh

SCRIPT=`basename $0`
USAGE="usage: [FOLDER_ID=nnnn] $SCRIPT username"

: ${DOMAIN:=yourdomain.com}
STORE_BASE=/opt/zimbra/store/0

if [ $# -ne 1 ]; then
   echo $USAGE >&2
   echo "(to list the folders for a user: echo -e \"sm username@${DOMAIN}\\\\ngaf\" | su - zimbra -c zmprov)"
   exit 1
fi

USER=$1
USERNAME=${USER}@${DOMAIN}

# Define the output files.
MB_DUMP_FILE=/tmp/mb_${USER}.list
UPD_SQL_STATS=/tmp/sql_${USER}.list
[ "$FOLDER_ID" ] && UPD_SQL_STATS=/tmp/sql_${USER}_${FOLDER_ID}.list
TBL_DUMP_FILE=/tmp/mailitems_${USER}.dump

# Delete all the output files.
for file in $MB_DUMP_FILE $UPD_SQL_STATS $TBL_DUMP_FILE; do
   [ -f $file ] && rm -f $file
done

ID=`echo "select id from mailbox where comment='${USERNAME}'" | su - zimbra -c "mysql -N zimbra"`
if [ -z "$ID" ]; then
   echo "\nerror: can't find mailbox ID for user $USERNAME!!!"
   exit 1
fi

MAILBOX_DB=mboxgroup${ID}
STORE_PATH=${STORE_BASE}/${ID}/msg/
echo "
Username  : $USERNAME
Database  : $MAILBOX_DB
Store path: $STORE_PATH"

# Columns in mailitem table:
#    mailbox_id
#    id
#    type
#    parent_id
#    folder_id
#    index_id
#    imap_id
#    date
#    size
#    volume_id
#    blob_digest
#    unread
#    flags
#    tags
#    sender
#    subject
#    name
#    metadata
#    mod_metadata
#    change_date
#    mod_content
# mailbox_id   id type parent_id folder_id index_id imap_id       date size volume_id...
# 23         7223    5      NULL      7221     7223    7223 1165200534 4842         1...

col_list="id, from_unixtime(date,'%d-%m-%Y %l:%i:%s%p'), sender, subject"
where_clause="type=5"
[ "$FOLDER_ID" ] && where_clause="$where_clause and folder_id=$FOLDER_ID"
echo "\nDumping all emails for '$USERNAME' from $MAILBOX_DB to $MB_DUMP_FILE..."
echo "(where clause: $where_clause)"
echo "select $col_list from mail_item where $where_clause" |
     su - zimbra -c "mysql --skip-column-names $MAILBOX_DB" > $MB_DUMP_FILE
email_cnt=`cat $MB_DUMP_FILE | wc -l | sed 's/  *//'`

echo "\
SET AUTOCOMMIT=0;
START TRANSACTION; " > $UPD_SQL_STATS

echo "\nReading $email_cnt emails $MB_DUMP_FILE - saving SQL update statements to $UPD_SQL_STATS..."
cat $MB_DUMP_FILE |
while read id date time cols; do
   # echo "ID=$id; date=$date; time=$time; other cols=$cols"

   # Check that there is only one file for the current ID.
   file_cnt=`ls -l ${STORE_PATH}/*/${id}-*.msg 2> /dev/null | wc -l`
   if [ "$file_cnt" = 0 ]; then
      echo "error: no files found for ID $id!!!"
      continue
   fi
   if [ "$file_cnt" -gt 1 ]; then
      echo "error: more than one file found for ID $id!!!"
      ls -l ${STORE_PATH}/*/${id}-*.msg
      continue
   fi
   if [ "$date" != "04-12-2006" ]; then
      echo "skip: date <> 04-12-2006"
      continue
   fi

   # Check that only one row matches the current ID.
   row_cnt=`echo "select count(*) as count from mail_item where id=$id" | su - zimbra -c "mysql --skip-column-names $MAILBOX_DB"`
   # echo "row_cnt=$row_cnt"
   if [ "$row_cnt" = 0 ]; then
      echo "error: can't locate row for ID $id!!!"
      continue
   fi
   if [ "$row_cnt" -gt 1 ]; then
      echo "error: more than one row found for ID $id!!!"
      continue
   fi

   # Extract the date from the raw email.
   email_date=`sed -n "s/\r$//;s/^Date: //p" ${STORE_PATH}/*/${id}-*.msg | head -1`
   if [ -z "$email_date" ]; then
      echo "error: can't extract a date from the raw email for id $id!!!"
      echo "file: `ls ${STORE_PATH}/*/${id}-*.msg`"
      continue
   fi

   # Convert the date from the email to yyyy-mm-dd hh:mm:ss format.
   email_yymmdd_hhmmss=`echo $email_date | sed -f /root/scripts/convert_date.sed`
   if [ -z "$email_yymmdd_hhmmss" ]; then
      echo "error: can't convert date ($email_date) for id $id!!!"
      continue
   fi

   # Display SQL statement to update the mailbox.
   echo "ID=$id: $email_date ==> $email_yymmdd_hhmmss"
   upd_sql="update mail_item set date=unix_timestamp('${email_yymmdd_hhmmss}') where id=$id"
   # echo "update SQL: $upd_sql"
   echo "${upd_sql};" >> $UPD_SQL_STATS
   # echo "$upd_sql" | su - zimbra -c "mysql $MAILBOX_DB"
done
echo "COMMIT;" >> $UPD_SQL_STATS

echo "
Command to backup the 'mail_item' table:

    echo \"select * into OUTFILE '$TBL_DUMP_FILE' from mail_item\" | su - zimbra -c \"mysql $MAILBOX_DB\"

Command to execute the update statements:

    su - zimbra -c \"mysql $MAILBOX_DB\" < $UPD_SQL_STATS
"
And here is the sed script to re-format the date (save as /root/scripts/convert_date.sed):
Code:
# Fri, 16 Sep 2005 08:30:39 +1000
s/^[A-Z][a-z][a-z], \([0-9][0-9]\) \([A-Z][a-z][a-z]\) \([0-9][0-9][0-9][0-9]\) \([0-9][0-9]:[0-9][0-9]:[0-9][0-9]\) +\([0-9]*\)$/\3-\2-\1 \4/
s/Jan/01/
s/Feb/02/
s/Mar/03/
s/Apr/04/
s/May/05/
s/Jun/06/
s/Jul/07/
s/Aug/08/
s/Sep/09/
s/Oct/10/
s/Nov/11/
s/Dec/12/
And here is some sample output:
Code:
FOLDER_ID=7221 ~/scripts/dump_zimbra_mailbox.ksh gray

Username  : gray@yourdomain.com
Database  : mboxgroup23
Store path: /opt/zimbra/store/0/23/msg/

Dumping all emails for 'gray@yourdomain.com' from mboxgroup23 to /tmp/mb_gray.list...
(where clause: type=5 and folder_id=7221)

Reading 52 emails /tmp/mb_gray.list - saving SQL update statements to /tmp/sql_gray_7221.list...
ID=7223: Thu, 12 Oct 2006 11:03:07 +1000 ==> 2006-10-12 11:03:07
ID=7225: Thu, 27 Jul 2006 09:31:22 +1000 ==> 2006-07-27 09:31:22
...
ID=7313: Fri, 16 Sep 2005 08:30:39 +1000 ==> 2005-09-16 08:30:39
ID=7314: Fri, 16 Sep 2005 08:41:51 +1000 ==> 2005-09-16 08:41:51

Command to backup the 'mail_item' table:

    echo "select * into OUTFILE '/tmp/mailitems_gray.dump' from mail_item" | su - zimbra -c "mysql mboxgroup23"

Command to execute the update statements:

    su - zimbra -c "mysql mboxgroup23" < /tmp/sql_gray_7221.list
Reply With Quote