Shell script filename pattern matching help
September 26, 2011 8:01 AM

Another request for shell script help

I previously posted a question (answered quickly and successfully - thanks!) about filtering out TIFF files from a directory tree that have no matching PDF associated with them. (See previous MeFi question)

Now I've been trying to modify that script without success to do the following:

Find all PDF files in a directory tree that do not have a matching file, that is, one with the same filename plus an additional set of characters. The characters are: _Aug11

So, if I have a set of files in a directory:
one.pdf
one_Aug11.pdf
two.pdf
three.pdf
three_Aug11.pdf

then the file "two.pdf" has no corresponding "two_Aug11.pdf". I need the script to echo this filename out (or to a file, whatever).

I tried using the original script but obviously when you find all .pdf files, you match the _Aug11 ones also. So the script would report that one_Aug11.pdf has no matching one_Aug11_Aug11.pdf which is not what I want.

Am I looking at using awk to solve this? I am not clear on how to use awk within a bash script. Help!
posted by dukes909 to Computers & Internet (8 answers total) 2 users marked this as a favorite
 
Best answer:
for fn in *.pdf; do if [[ $fn != *_Aug11.pdf && ! -e ${fn//.pdf/_Aug11.pdf} ]]; then echo $fn; fi; done

posted by Rat Spatula at 8:14 AM on September 26, 2011 [1 favorite]


You can use grep to filter out the "_Aug11" files. Here's a basic example, which will work as long as none of the files have spaces in their filenames:

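directory="."    # assumed: the directory to scan (not set in the original post)
date_pattern="_Aug11"

ls "$directory" | grep -v "$date_pattern" | while read -r filename
do
    base=${filename%.pdf}
    # look for the partner file in the same directory that was listed
    if [ ! -f "$directory/${base}${date_pattern}.pdf" ]
    then
        echo "$filename"
    fi
done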
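
The loop above only looks at a single directory. Since the question mentions a whole directory tree, and names with spaces, one option is to let find walk the tree and hand the names over null-delimited. This is only a minimal sketch, assuming bash and a find that supports -print0 (GNU and BSD find both do); find_unmatched is a made-up helper name, not something from the original script.

```shell
#!/usr/bin/env bash
# Sketch: list PDFs anywhere under a directory that lack a *_Aug11.pdf sibling.
# find_unmatched is a hypothetical helper name used only for this example.
find_unmatched() {
    local dir=${1:-.} suffix="_Aug11" path
    # -print0 with read -d '' keeps names containing spaces intact
    find "$dir" -type f -name '*.pdf' ! -name "*${suffix}.pdf" -print0 |
    while IFS= read -r -d '' path; do
        # strip the trailing .pdf, then look for the suffixed partner
        if [[ ! -e "${path%.pdf}${suffix}.pdf" ]]; then
            printf '%s\n' "$path"
        fi
    done
}

# Example call: find_unmatched /path/to/pdfs > unmatched.txt
```

Excluding the _Aug11 files in find itself avoids the "one_Aug11_Aug11.pdf" problem described in the question.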
posted by 1970s Antihero at 8:17 AM on September 26, 2011


Wrote a quick Perl script here that might be a bit more flexible.
posted by Blazecock Pileon at 8:18 AM on September 26, 2011


Response by poster: Wow! Again super fast responses. The one liner by Rat Spatula worked great, will try others. Note: some will have spaces in their names.
posted by dukes909 at 8:24 AM on September 26, 2011


Response by poster: Ok, I'm missing it: what does this part of the script do?
-e ${fn//.pdf/_Aug11.pdf}
posted by dukes909 at 8:25 AM on September 26, 2011


For names with spaces, the variable expansions will need surrounding doublequotes in places where it's important:

for fn in *.pdf; do if [[ "$fn" != *_Aug11.pdf && ! -e "${fn//.pdf/_Aug11.pdf}" ]]; then echo "$fn"; fi; done


-e tests for a file's existence.

The // modifier replaces every occurrence of one chunk of a variable's expanded value (here .pdf) with another (here _Aug11.pdf).
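
To make that concrete, here is a small sketch comparing the related bash expansions. The filename is a made-up example chosen because it contains ".pdf" twice, which is where the difference shows up.

```shell
#!/usr/bin/env bash
# Illustration only: the filename below is a made-up example.
fn="my.pdf.notes.pdf"

all=${fn//.pdf/_Aug11.pdf}     # // replaces every ".pdf" in the value
first=${fn/.pdf/_Aug11.pdf}    # a single / replaces only the first match
suffix=${fn%.pdf}_Aug11.pdf    # % strips ".pdf" from the end only

echo "$all"       # my_Aug11.pdf.notes_Aug11.pdf
echo "$first"     # my_Aug11.pdf.notes.pdf
echo "$suffix"    # my.pdf.notes_Aug11.pdf

# -e is true when the path exists (regular file, directory, or anything else)
if [[ -e "$suffix" ]]; then
    echo "partner file exists"
else
    echo "partner file is missing"
fi
```

For ordinary names like one.pdf all three give the same result; the % form is probably the safer habit when the goal is only to swap the extension.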
posted by Rat Spatula at 8:35 AM on September 26, 2011


Response by poster: This brings up another question. What if I wanted to assign each of the files (one.pdf and one_Aug11.pdf) a variable name? Is this possible with the // modifier?

Ultimately what I'm trying to do is this: if there is a one.pdf and a one_Aug11.pdf, then I want to remove the one.pdf and rename the one_Aug11.pdf to just one.pdf (or just mv one_Aug11.pdf one.pdf) ...
posted by dukes909 at 12:57 PM on September 26, 2011


Spend some time playing around with the various operators described at my "replaces" link above. This kind of string-chopping is fairly easy with BASH.
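
As a starting point, here is a rough sketch of the rename described above, assuming bash and a single directory. promote_aug11 is a made-up helper name, and mv -f overwrites the old file without asking, so try it on a copy of the data first.

```shell
#!/usr/bin/env bash
# Sketch: replace each X.pdf with its X_Aug11.pdf counterpart when both exist.
# promote_aug11 is a hypothetical helper name used only for this example.
promote_aug11() {
    local dir=${1:-.} suffix="_Aug11" new old
    for new in "$dir"/*"$suffix".pdf; do
        [[ -e "$new" ]] || continue        # the glob matched nothing
        old="${new%"$suffix".pdf}.pdf"     # one_Aug11.pdf -> one.pdf
        if [[ -e "$old" ]]; then
            mv -f -- "$new" "$old"         # one.pdf is overwritten by the _Aug11 file
        fi
    done
}

# Example call: promote_aug11 /path/to/pdfs
```

Looping over the _Aug11 files, rather than over every PDF, means each pair is handled once and unpaired files are left alone.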
posted by Rat Spatula at 1:58 PM on September 26, 2011

