OK - here are some updated sed commands to handle the other date formats found in your raw email files (which, as far as I understand the email standards, are non-standard: RFC 2822 expects an abbreviated month name such as Jul, not a month number).
Please note that these commands assume that the dates in your raw emails are in d[d] mm yyyy format.
# Mon, 4 07 2005 09:59:05 +0200 (CEST)
# Wed, 29 06 2005 10:12:27 GMT
# Fri, 1 04 2005 09:40:51 GMT
# 4 10 2006 00:30:29 -0700
# 22 03 2006 06:28:02 -0800
s/^[A-Z][a-z][a-z], //
s/\([0-9][0-9]*\) \([0-9][0-9]\) \([0-9][0-9][0-9][0-9]\) \([0-9][0-9]:[0-9][0-9]:[0-9][0-9]\).*$/\3-\2-\1 \4/
s/-\([1-9]\) /-0\1 /
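To try the three substitutions on a single date string before running them over real files, something like the following sketch may help. The function name normalize_date is made up for illustration and is not part of the shell script posted earlier; it also assumes the input is just the date text, without the "Date: " header prefix.

```shell
#!/bin/sh
# normalize_date: hypothetical helper (not from the earlier shell script).
# Feeds one date string through the same three sed substitutions.
normalize_date() {
    printf '%s\n' "$1" | sed \
        -e 's/^[A-Z][a-z][a-z], //' \
        -e 's/\([0-9][0-9]*\) \([0-9][0-9]\) \([0-9][0-9][0-9][0-9]\) \([0-9][0-9]:[0-9][0-9]:[0-9][0-9]\).*$/\3-\2-\1 \4/' \
        -e 's/-\([1-9]\) /-0\1 /'
}

normalize_date 'Mon, 4 07 2005 09:59:05 +0200 (CEST)'   # prints 2005-07-04 09:59:05
normalize_date '22 03 2006 06:28:02 -0800'              # prints 2006-03-22 06:28:02
```

Note that the second substitution throws away everything after the time, so the timezone offset is dropped and the result is the sender's local time as written, not UTC.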
Please use the sed script, and the shell script posted earlier, at your own risk.