I Wish I Had a Diffy
January 11, 2008 7:46 AM

I'm searching for the ultimate in hand-holding, peel-my-grape recursive directory diff tools for the lazy.

I'm looking for something that does (and here's where my vocabulary fails me) a "diff" between two directory structures. Like many things for which I've searched and not found, it's probably either a result of me not knowing the correct terminology or just being insanely specific.

Beyond Compare, from a similar thread, looked nifty, but I don't need comparisons within files, just between them, that is, either the files are the same, or they aren't. And they aren't all text, either, which the product page indicates is an issue. ExamDiff Pro doesn't seem to do subdirectories.

Supposing I have three directories. The third directory is empty and called "duplicates." The first two directories are filled with files, subdirectories, and files in those subdirectories. They're mostly similar. And that's the problem, they're mostly just duplicates. Text files, compressed video, Word documents, all kinds of stuff.

I'm looking for a utility that recursively compares the first directory to the second. Should it find a file in the first directory that has the same filename as one in the second directory and is, upon a binary comparison, the same file, the software moves the first directory's copy to the third "duplicates" directory, preserving the overall directory structure.

This would leave the first directory a mostly-empty skeleton with just a few files in it, the second directory untouched, and the third directory a bunch of redundant files in a similar directory structure to the first and filled with yummy deletable files.
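
The behavior described above could be sketched roughly like this in Python. This is only an illustration under a few assumptions: the function name `move_duplicates` and its three arguments are made up for the example, the "duplicates" directory lives outside the first directory, and "same file" means same relative path plus identical bytes.

```python
import filecmp
import os
import shutil


def move_duplicates(first, second, duplicates):
    """Move files out of `first` into `duplicates` when `second` holds a
    byte-identical file at the same relative path. `second` is never touched.
    (Hypothetical helper; the name and arguments are illustrative.)"""
    moved = []
    for root, _dirs, files in os.walk(first):
        rel_dir = os.path.relpath(root, first)
        for name in files:
            src = os.path.join(root, name)
            other = os.path.join(second, rel_dir, name)
            # shallow=False forces a comparison of file contents,
            # not just size and timestamp.
            if os.path.isfile(other) and filecmp.cmp(src, other, shallow=False):
                dest = os.path.join(duplicates, rel_dir, name)
                # Recreate the subdirectory structure under `duplicates`.
                os.makedirs(os.path.dirname(dest), exist_ok=True)
                shutil.move(src, dest)
                moved.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(moved)
```

Calling `move_duplicates("first", "second", "duplicates")` returns the relative paths it moved, so trying it on a scratch copy first is an easy way to check the result before deleting anything.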

I've written non-recursive scripts that do this on a very small scale, but I don't quite have the confidence, time, or inclination to really expand it. Alternatively, I could handle a readout that just displayed the files that were "new" or "different," while ignoring files that are "same" or "missing." It wouldn't be as convenient, but I'm willing to settle.
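
The "readout" alternative might look something like the following sketch. It assumes that "new" means a file exists in the first directory but not in the second, and "different" means both exist but their bytes differ. The function name `report_changes` is hypothetical.

```python
import filecmp
import os


def report_changes(first, second):
    """Return sorted (status, relative_path) pairs for files in `first` that
    are "new" (absent from `second`) or "different" (not byte-identical).
    Identical files, and files that exist only in `second`, are ignored.
    (Hypothetical helper; the name and statuses are illustrative.)"""
    results = []
    for root, _dirs, files in os.walk(first):
        rel_dir = os.path.relpath(root, first)
        for name in files:
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            other = os.path.join(second, rel_path)
            if not os.path.isfile(other):
                results.append(("new", rel_path))
            elif not filecmp.cmp(os.path.join(root, name), other, shallow=False):
                results.append(("different", rel_path))
    return sorted(results)
```

Printing each pair from `report_changes("first", "second")` gives a plain list to work through by hand; nothing is moved or deleted.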

My two requirements for it are:

1) Runs on Windows, preferably NTFS-friendly
2) Either freeware, shareware, or something relatively cheap. I'm not looking to spend $200 on it.

What on Earth do you call such software, aside from automagically delicious? Which do you use, and why do you like it?

No, CVS/Subversion isn't a solution at the moment.
posted by adipocere to Computers & Internet (15 answers total)
Don't fear the command line! rsync is the best solution in these scenarios, always. There are a million tutorials for how to use it online, it's cross platform, and can do those comparisons and transfers between different machines using ssh.

You'll need to do a little bit of reading and experimentation to get rsync doing exactly what you want, but the chances are *very* good that rsync can do it. The '-n' option will be your friend. It provides a 'dry run' mode, so you can see if you got all the switches and syntax correct without risking anything.
posted by dehowell at 8:07 AM on January 11, 2008

Robocopy is another commandline tool that can do this with one or two calls and some creative switch usage.

For example "ROBOCOPY diff\ standard\ /MOVE /E /XX /XC /XN /XO /XL /IS" would leave all of the differences between the two folders in the diff folder while keeping the standard folder the same.
posted by burnmp3s at 8:13 AM on January 11, 2008

Are you sure you just can't reboot and give the Ubuntu LiveCD a shot? It sounds like Meld is right up your alley. I don't see a Windows port, though...
posted by cdmwebs at 8:48 AM on January 11, 2008

I think dehowell's on the right track. 'rsync -r -n -v' will recursively show you what files would have been copied without actually doing it (without '-v', the dry run prints no file list).

The trick is in copying those files to a separate directory. With some scripting, you can pipe that output to a separate copy command, but you have to be comfortable in that environment.

Is \Path\to\FILE considered a copy of \Different\Path\to\FILE? That would make things much more complicated.
posted by mkultra at 8:58 AM on January 11, 2008

Beyond Compare can easily do the folder comparison, and you can tell it to compare by file size only, by size and CRC, by binary comparison, or by rules (the ignore-whitespace kind of thing).

You can filter to show the diffs and then copy to a third directory.
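
To make the difference between those comparison levels concrete, here is a rough Python sketch. It is not how Beyond Compare implements them; the function names are invented for the illustration, and CRC-32 is assumed as the checksum.

```python
import filecmp
import os
import zlib


def same_size(path_a, path_b):
    """Cheapest check: equal sizes only suggest the files might match."""
    return os.path.getsize(path_a) == os.path.getsize(path_b)


def crc32_of(path, chunk_size=1 << 20):
    """CRC-32 of a file, read in chunks so large videos fit in memory."""
    crc = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def same_size_and_crc(path_a, path_b):
    """Stronger check: a matching size and checksum make a mismatch unlikely."""
    return same_size(path_a, path_b) and crc32_of(path_a) == crc32_of(path_b)


def same_bytes(path_a, path_b):
    """Strictest check: compare the actual contents byte for byte."""
    return filecmp.cmp(path_a, path_b, shallow=False)
```

Size alone is fast but can report false matches; the byte-for-byte check is the only one of the three that cannot, at the cost of reading both files in full.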
posted by zeoslap at 9:01 AM on January 11, 2008

GNU "diff" will perhaps get you part way there.

$ diff -r -q -s dirA dirB
Only in dirA: foo
Only in dirB: bar
Files dirA/all and dirB/all are identical
Files dirA/x/baz and dirB/x/baz differ

Save that to a file, and go through it, moving and deleting on lines matching "are identical". Ten lines of Python would do it.

r = recurse
q = quiet (no patch file)
s = report same files
posted by cmiller at 9:03 AM on January 11, 2008

Did you end up trying Beyond Compare? It has a full-featured 30-day trial period before you have to buy it. It does do binary comparisons, showing only the files that are different. (The program basically has two views: a directory view that shows which files are different, and a file view that shows the specific differences. You would only want the directory view.) It also has a copy-to-folder feature that seems like it would do what you are looking for: it would let you see the different files, highlight them, then copy them to a separate area.

Beyond Compare is one of the few pieces of shareware I have been impressed with enough to purchase. It constantly amazes me.

(on preview, what zeoslap said)
posted by dforemsky at 9:03 AM on January 11, 2008

I use Ultra Compare, works fine and can handle what you're specifying.
posted by iamabot at 9:06 AM on January 11, 2008

# even fewer than 10.
# seriously, how do people live without Python? :)

import re
import os

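for line in open("diffoutput"):
    filematch = re.search(r"^Files (.*) and (.*) are identical$", line)
    if filematch:
        # Create the matching subdirectory under "common" before moving.
        target = os.path.join("common", filematch.group(1))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        os.rename(filematch.group(1), target)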
posted by cmiller at 9:10 AM on January 11, 2008

(I curse you, MeFi HTML entity stripper! "for" and "if" should indent all successive lines once each.)
posted by cmiller at 9:11 AM on January 11, 2008

Beyond Compare can definitely do what you want and it's free for 30 days. It sounds like you just might want to tweak the session settings to make it less sensitive about what qualifies as "different". You can then sort the second directory by status and do what you need to with those files.
posted by yerfatma at 9:14 AM on January 11, 2008

This is more about finding duplicate files. Over the holidays, I used something that let me move files rather than delete them. I think I found it from Lifehacker. Let me dig a bit...

Oh, yeah: Duplicate File Finder.
posted by Pronoiac at 9:34 AM on January 11, 2008

CloneSpy worked for me. (Sorry, at work & don't have time to find a link.)
posted by desjardins at 2:37 PM on January 11, 2008

Directory Opus has a Synchronize option as well as a Duplicate Files option. It can compare at byte level, by several different date comparisons, or by size. To do what you want, you could have it mark the duplicates in the file lister windows, inspect to doublecheck that it has found the appropriate dupes, and then just manually delete those files - no need to have a third directory with deletable files, blow them away from the get-go! Trial version available - it's shareware.

Or, try SyncBack. There's a free version and a shareware version. Both have some very nice duplicate file finding features, but again, not sure that the third target directory concept is an 'automagically delicious' option (and still not sure whether there's even a point to that).
posted by tra at 4:20 PM on January 11, 2008

I'm not quite sure what you're trying to do from reading the description, but WinMerge does excellent directory-level (and file-level, if you like) comparisons. It doesn't move files around as your extended description asks for, but it excels at recursive directory diffs.
posted by RikiTikiTavi at 3:51 PM on January 14, 2008
