How to rename large numbers of .eml files to useful names (e.g. stored with Thunderbird add-on SmartSave)

I have written a UNIX bash shell script that renames saved e-mail messages to names of the form "sender to recipient, subject.eml" and sets each file's date to the date of the message. If two e-mails would get the same name but differ (full-text comparison), it adds a version number:

file cleansen_eml:

#!/bin/bash
# cleanup .eml file name and date

for THEFILE in *.eml ; do
    # echo >&2 "processing $THEFILE" ;
    if test "${THEFILE%.eml}" = "$THEFILE" ; then
        echo >&2 ;
        echo >&2 "**** $PWD" ;
        echo >&2 "File $THEFILE ignored since no .eml" ;
        continue ;
      fi ;
    THEFROM=$(grep -m 1 "^From: \|^Von: " "$THEFILE") ;
    THETO=$(grep -m 1 "^To: \|^An: " "$THEFILE") ;
    THEDATE=$(grep -m 1 "^Date: \|^Datum: " "$THEFILE") ;
    THESUBJECT=$(grep -m 1 "^Subject: \|^Betreff: " "$THEFILE") ;
    if test -z "$THEFROM" -o -z "$THEDATE" -o -z "$THESUBJECT" ; then
        echo >&2 ;
        echo >&2 "**** $PWD" ;
        echo >&2 "From:, Date:, or Subject: not found in $THEFILE. Ignore, maybe no e-mail." ;
        continue ;
      fi ;

    # echo >&2 "$THEFROM" ;
    # Quote the expansions so that wildcards in a header are not expanded by the shell
    CLEANFROM="$(printf '%s\n' "${THEFROM#*: }" | sed -f "$PSROOT/Arbeit/shell_scripts/clean_mailer.sed" | cut -c 1-20)" ;
    CLEANTO="$(printf '%s\n' "${THETO#*: }" | sed -f "$PSROOT/Arbeit/shell_scripts/clean_mailer.sed" | cut -c 1-20)" ;
    CLEANSUBJECT="$(printf '%s\n' "${THESUBJECT#*: }" | sed -f "$PSROOT/Arbeit/shell_scripts/clean_subject.sed" )" ;
    CLEANFILENAME="$(echo -n "$CLEANFROM to $CLEANTO, $CLEANSUBJECT" | tr -c "A-Za-z0-9.,@ " "_" | sed -e "s/__/_/g" | cut -c 1-100).eml" ;
    # echo >&2 "-> $CLEANFILENAME" ;
    VERCOUNT=1 ;
    while test -e "$CLEANFILENAME" ; do
        echo >&2 ;
        echo >&2 "**** $PWD" ;
        echo >&2 "FILE $CLEANFILENAME exists." ;
        if cmp "$THEFILE" "$CLEANFILENAME" ; then
            echo >&2 "The files are the same:" "$THEFILE", "$CLEANFILENAME" ;
            break ;
          fi ;
        echo >&2 "I add a version number to the filename: " ;
        CLEANFILENAME="${CLEANFILENAME%.eml}" ;
        CLEANFILENAME="${CLEANFILENAME%.[0-9]}" ;
        CLEANFILENAME="${CLEANFILENAME%.[0-9][0-9]}" ;
        CLEANFILENAME="${CLEANFILENAME}.$VERCOUNT.eml" ;
        VERCOUNT=$(($VERCOUNT+1)) ;
      done ;
    # echo mv "$THEFILE" "$CLEANFILENAME" ;
    mv "$THEFILE" "$CLEANFILENAME" ;
    touch -d "${THEDATE#*: }" "$CLEANFILENAME" ;
    echo -n . ;
  done ;

exit 0 ;
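
To try out the version-number logic on its own, here is a minimal sketch of the same idea as a standalone function. The name pick_target and the scratch files are only made up for this illustration; they are not part of the script above, and the sketch assumes bash, mktemp and cmp are available.

```shell
#!/bin/bash
# pick_target SRC CANDIDATE
# Hypothetical helper (not part of cleansen_eml): prints the name SRC
# should be moved to. If CANDIDATE exists with different content,
# .1, .2, ... is inserted before the .eml extension.
pick_target() {
    local src="$1" candidate="$2"
    local base="${candidate%.eml}"
    local n=1
    while test -e "$candidate" ; do
        # Identical content: reuse the existing name, no new version.
        if cmp -s "$src" "$candidate" ; then
            break
        fi
        candidate="$base.$n.eml"
        n=$((n+1))
    done
    printf '%s\n' "$candidate"
}

# Demonstration in a scratch directory.
demo=$(mktemp -d)
printf 'first message\n'  > "$demo/Hello.eml"
printf 'second message\n' > "$demo/saved1.eml"
pick_target "$demo/saved1.eml" "$demo/Hello.eml"   # prints .../Hello.1.eml
rm -r "$demo"
```

Unlike the loop in the script, the sketch keeps the base name in a separate variable, so it does not have to strip an old version number again on every pass.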

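The script needs two helper files, which it looks for in $PSROOT/Arbeit/shell_scripts/ (set PSROOT or adjust the path to match your setup):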

clean_mailer.sed:

s/=[?][A-Za-z]*-[-0-9]*[?]Q[?]//g
s/[?]=//g
s/^"\([^"]*\)" <\([^>]*\)>.*/\1 <\2>/g
s/^\([A-Z]\)\([a-z]*\) \([A-Za-z]*\) <\([-a-zA-Z0-9._]*@[-a-zA-Z0-9._]*\)>.*/\1.\3/
s/^\([A-Za-z]*\), \([A-Z]\)\([a-z]*\) <\([-a-zA-Z0-9._]*@[-a-zA-Z0-9._]*\)>.*/\2.\1/
s/^\([A-Z]\)\([a-z]*\) \([A-Za-z]\)\([A-Za-z]*\) \([A-Za-z]*\) <\([-a-zA-Z0-9._]*@[-a-zA-Z0-9._]*\)>.*/\1.\3.\5/
s/^<\([a-zA-Z]\)\([a-zA-Z]*\)[.]\([a-zA-Z]*\)@\([-a-zA-Z0-9._]*\)>/\1.\3/
s/^\([a-zA-Z]\)\([a-zA-Z]*\)[.]\([a-zA-Z]*\)@\([-a-zA-Z0-9._]*\)/\1.\3/
s/^\([-A-Za-z0-9_][-A-Za-z0-9_ ]*\)<\([^>]*\)>/\1/
s/^<\([-a-zA-Z0-9_][-a-zA-Z0-9]*\)@\([-a-zA-Z0-9._]*\)>/\1@\2/
s/  / /g
s/^ //g
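
To get a feeling for what these rules do, here is a small sketch that applies just two of them (the "quoted name" rule and the "Firstname Lastname <address>" rule) to sample headers. The function name clean_sender and the example addresses are invented for this illustration; the full file handles more cases.

```shell
#!/bin/bash
# Hypothetical demo function: applies two rules copied from
# clean_mailer.sed to a single sender string.
clean_sender() {
    printf '%s\n' "$1" | sed \
        -e 's/^"\([^"]*\)" <\([^>]*\)>.*/\1 <\2>/' \
        -e 's/^\([A-Z]\)\([a-z]*\) \([A-Za-z]*\) <\([-a-zA-Z0-9._]*@[-a-zA-Z0-9._]*\)>.*/\1.\3/'
}

clean_sender '"John Smith" <john.smith@example.com>'   # prints J.Smith
clean_sender 'Jane Doe <jane@example.org>'             # prints J.Doe
```

The first rule only removes the quotes around the display name; the second one then shortens the result to the initial of the first name plus the last name.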

clean_subject.sed:

s/=[?][A-Za-z]*-[-0-9]*[?]Q[?]//g
s/[?]=//g
s/=E4/ae/g
s/=FC/ue/g
s/=F6/oe/g
s/=DF/ss/g
s/=2E/./g
s/  / /g
s/^ //g
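
As an illustration, the following sketch runs the header-stripping and umlaut rules from this file on an ISO-8859-1 quoted-printable subject. The function name clean_subject and the sample subject are made up for the example. Note that these rules only cover the Latin-1 codes; subjects encoded as UTF-8 or Base64 would pass through undecoded.

```shell
#!/bin/bash
# Hypothetical demo function: applies rules copied from
# clean_subject.sed to a single subject string.
clean_subject() {
    printf '%s\n' "$1" | sed \
        -e 's/=[?][A-Za-z]*-[-0-9]*[?]Q[?]//g' \
        -e 's/[?]=//g' \
        -e 's/=E4/ae/g' \
        -e 's/=FC/ue/g' \
        -e 's/=F6/oe/g' \
        -e 's/=DF/ss/g'
}

clean_subject '=?ISO-8859-1?Q?Gr=FC=DFe_aus_K=F6ln?='   # prints Gruesse_aus_Koeln
```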

Unfortunately, I do not know much about Windows batch programming, although I am sure the same could be done natively in Windows. Instead, I have installed Cygwin and call the script from a batch file:

cleansen_eml.bat:

cd /d "%~1"
echo "%~1"
C:\cygwin\bin\bash -ic "cleansen_eml"

I have added the batch file to the context menu of folders in Vista. Unfortunately, I have lost my notes on how I did it, but I found the instructions on the web…

(2012-02-14) OK, after reinstalling Vista I had to look into this again. It can be done reasonably well with FileMenu Tools from LopeSoft.
