File management and getting two directories in sync.
June 30, 2014 9:07 AM

I'm in the process of cleaning up a hacked Wordpress install. How can I use the file list from a known-good directory as the basis for deleting all other files in the hacked one? e.g.: If a file exists in .../goodtheme/ leave it in .../badtheme. Otherwise, delete everything.

Complication: the known-good directory doesn't work correctly, otherwise I'd just restore it completely. This is just one step of many in the cleanup.
posted by piro to Technology (13 answers total) 4 users marked this as a favorite
Download WinMerge. It allows you to compare two directories side-by-side and see what's extra or missing. It'll even delete the unwanted files for you.
posted by pipeski at 9:11 AM on June 30, 2014

rsync. Take a look at --delete and similar options. Do a few dry runs first, you can even use the results of one to do the pruning by hand.
posted by kcm at 9:40 AM on June 30, 2014

ls -R gooddir/ baddir/ | sort | uniq | rm
posted by Setec Astronomy at 9:47 AM on June 30, 2014

If you're working in a shell or command line then something like
diff -r bad-directory good-directory | grep bad-directory | less
should give you a list of altered files.
posted by quinndexter at 9:48 AM on June 30, 2014

I loves me some BeyondCompare for these kinds of jobs.
posted by pyro979 at 10:24 AM on June 30, 2014 [1 favorite]

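Setec Astronomy's command above gives me a "Missing Operand" error, but following on from kcm's rsync suggestion:
rsync -r -p -o -g --delete -c /gooddir/ /baddir/
-r: traverse folders recursively
-p, -o, -g: preserve permissions, owner, and group
--delete: delete files in baddir that do not exist in gooddir
-c: replace files in baddir if the checksum between two 'same' files differs
Note the trailing slash on /gooddir/: without it, rsync copies the gooddir directory itself into /baddir rather than syncing its contents. This command will also copy over any files that exist only in gooddir.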

I think. I'd want to test it on a copy of the gooddir a time or two before doing it for reals.
posted by quinndexter at 10:45 AM on June 30, 2014

Please do not try Setec Astronomy's suggestion. It seems to me to not be the answer, and more importantly to be dangerous. I'll give details in my next comment; I just wanted to say "please don't do that" as fast as possible.
posted by Flunkie at 11:06 AM on June 30, 2014 [1 favorite]

ls -R gooddir/ baddir/ | sort | uniq | rm
Whoa, whoa, wait a minute here. Maybe I'm misunderstanding this, but I think this has got all sorts of things wrong with it, and absolutely shouldn't be used to try to accomplish what the questioner wants to accomplish.

Here's what this seems to be attempting to say to me:

(1) List all files that are in the gooddir directory (including subdirectories), and all files that are in the baddir directory (including subdirectories).

(2) Sort the output alphabetically. Note at this point that "the output" will likely include things that are not filenames, e.g. "gooddir/:".

(3) Get rid of duplicates from that sorted list. So for example if there's a file named "x" in gooddir, and a file named "x" in baddir, then the sorted list will be culled to include only one "x" line instead of two. Note at this point -- and this seems very important to me -- that your list still includes all filenames in the directories.

(4) Delete all files in that list. There are at least four problems here, and at least two of them are extremely important:

(4.1) Your list includes all files. You're saying delete all files. You're not saying delete some files. You're saying delete all files.

(4.2) You're saying to delete them from the current directory. So if there's a file called "gooddir/x", you're saying to delete the file "x". Not the file "gooddir/x".

(4.3) You're also likely saying to delete things that aren't even files in the first place (due to what's noted in (2), i.e. your list might include things like "gooddir/:").

(4.4) Even ignoring all of the above -- and it's not a good idea to ignore all of the above -- I believe you'd want "xargs rm", not "rm", right?

Am I misunderstanding all of this? Because it seems like this suggestion (1) Won't work, and (2) Even if it would "work", would delete the wrong files.
posted by Flunkie at 11:08 AM on June 30, 2014
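
To make points (2) through (4.3) concrete, here is a minimal sketch that runs the same pipeline in a scratch directory with the final "| rm" removed, so nothing is deleted. The file names (x, evil.php) and the scratch directory are made up for illustration.

```shell
#!/bin/sh
# Harmless demo in a throwaway directory: nothing is piped to rm.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir gooddir baddir
touch gooddir/x baddir/x baddir/evil.php   # hypothetical file names

# Same pipeline as the suggestion, minus the final "| rm".
listing=$(ls -R gooddir/ baddir/ | sort | uniq)
printf '%s\n' "$listing"
```

The listing should contain header lines such as "gooddir/:" alongside bare names like "x" with no directory prefix, and it still names every file in both trees, which is why feeding it to rm (or xargs rm) cannot do what the question asks.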

To clarify, I'm looking to delete -- from baddir -- all files not in gooddir, regardless of whether the file contents are the same.
posted by piro at 1:25 PM on June 30, 2014
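
For that narrower goal, one possible rsync sketch combines --delete with --existing and --ignore-existing. The rsync man page describes that pairing as a way to update no files at all and only delete extraneous ones. The scratch directory and file names below are hypothetical stand-ins, and the dry run (-n) comes first.

```shell
#!/bin/sh
# Scratch trees only; gooddir/baddir here are throwaway stand-ins.
if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync is not installed; skipping" >&2
    exit 0
fi

demo=$(mktemp -d)
mkdir "$demo/gooddir" "$demo/baddir"
echo clean   > "$demo/gooddir/footer.php"
echo new     > "$demo/gooddir/only-in-good.php"
echo altered > "$demo/baddir/footer.php"
echo evil    > "$demo/baddir/evil.php"

# Dry run (-n): report what would be deleted, change nothing.
dry_run=$(rsync -r -n -v --delete --existing --ignore-existing \
    "$demo/gooddir/" "$demo/baddir/")
printf '%s\n' "$dry_run"

# Real run: --existing stops new files being created in baddir,
# --ignore-existing stops files already there being overwritten,
# so the only effect left is --delete removing the extras.
rsync -r --delete --existing --ignore-existing \
    "$demo/gooddir/" "$demo/baddir/"
```

Note the trailing slashes on both paths. Files that exist in both trees are left untouched even when their contents differ, so anything the attacker modified in place survives this step.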

Maybe you could specify whether you want to do this on a PC, a Mac, or via a Linux command line.
posted by pipeski at 2:58 PM on June 30, 2014

D'oh. Somehow dropped "via a linux command line" from the Q.
posted by piro at 7:41 PM on June 30, 2014

...regardless of whether the file contents are the same.

I included the checksum check above so that any files the attacker may have changed would also be replaced (not just new, added files removed), but I see that may not be desirable if you're replacing them with the files that are making the 'good' directory fail.

To put my concern another way, just adding files to Wordpress won't do any harm, unless the attacker changes something else somewhere to call/execute the new files, say a new line in footer.php calling the malicious script. Without replacing the existing footer, this call will still be in the 'new, good' directory unless replaced.

I guess the tradeoff here is:
a) Ignore any changed files, hope simply removing the new files is the end of it, but never be 100% sure.

b) Replace the changed files with versions from the 'broken' install, and do normal troubleshooting to resolve afterwards.

(Of course, all this may be considered in your further steps in the cleanup, in which case apologies, good luck, and carry on.)

A bit more info on the nature of the hack might help here - was it just a dodgy theme someone installed? Or did the attacker have full access to the backend/database at some point? Were they hoovering up login details, or just defacing the site with "lol u got haxxored" crap or whatever?
posted by quinndexter at 10:16 PM on June 30, 2014
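
Whichever side of that tradeoff is chosen, it may help to see which files exist in both trees but differ in content, since those are the candidates for injected lines. A minimal sketch follows, using find and cmp. The function name list_changed and the demo files are made up for illustration, and it assumes no file names contain newlines.

```shell
#!/bin/sh
# list_changed GOOD BAD
# Print files present in both trees whose contents differ.
list_changed() {
    lc_good=$1 lc_bad=$2
    (cd "$lc_good" && find . -type f) | while IFS= read -r rel; do
        # Only compare when the bad tree has a file of the same name.
        if [ -f "$lc_bad/$rel" ] && ! cmp -s "$lc_good/$rel" "$lc_bad/$rel"; then
            printf '%s\n' "${rel#./}"
        fi
    done
}

# Throwaway demo data (hypothetical file names and contents).
demo=$(mktemp -d)
mkdir "$demo/good" "$demo/bad"
echo 'clean footer'                    > "$demo/good/footer.php"
echo 'clean footer; include("x.php");' > "$demo/bad/footer.php"
echo same > "$demo/good/index.php"
echo same > "$demo/bad/index.php"

changed=$(list_changed "$demo/good" "$demo/bad")
printf '%s\n' "$changed"
```

This only reports; it replaces nothing. Each listed file can then be inspected by hand with diff before deciding whether to restore it.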

I always forget all the flags for rsync, and even with -n -v -v (to make it do nothing and be verbose about what it's not doing) the output isn't very helpful.

I tend to use the diff -q -r solution and then some munging.

$ diff -q -r d/good d/bad
Files d/good/foo and d/bad/foo differ
Only in d/bad: quux
Only in d/good: zaff

$ diff -q -r d/good d/bad | perl -lne 's{^Only in d/bad: }{}&&print'

$ diff -q -r d/good d/bad | perl -lne 's{^Only in d/bad: }{}&&print"d/bad/$_"'

$ diff -q -r d/good d/bad | perl -lne 's{^Only in d/bad: }{}&&unlink"d/bad/$_"'

$ diff -q -r d/good d/bad
Files d/good/foo and d/bad/foo differ
Only in d/good: zaff

Because enough is never enough, you could also do something with find, sort, and comm.

# 3 columns: 1) only in first list; 2) only in second list; 3) in both lists
$ comm <(cd d/good; find . -type f | sort) <(cd d/bad; find . -type f | sort)

# remove first and third columns leaving only things uniq to second list
$ comm -13 <(cd d/good; find . -type f | sort) <(cd d/bad; find . -type f | sort)

$ comm -13 <(cd d/good; find . -type f | sort) <(cd d/bad; find . -type f | sort) | (cd d/bad; xargs rm)

posted by zengargoyle at 5:34 AM on July 1, 2014
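
One caveat with piping names into plain xargs rm is that xargs splits on whitespace, so a file name containing a space would be mangled. A rough alternative sketch walks the bad tree and checks each path against the good tree directly. The function name prune_extras and its --delete switch are invented for this example, it prints by default rather than deleting, and it assumes no file names contain newlines.

```shell
#!/bin/sh
# prune_extras GOOD BAD [--delete]
# Without --delete this only prints what would be removed.
prune_extras() {
    pe_good=$1 pe_bad=$2 pe_mode=${3:-}
    (cd "$pe_bad" && find . -type f) | while IFS= read -r rel; do
        [ -e "$pe_good/$rel" ] && continue   # counterpart exists: keep it
        if [ "$pe_mode" = "--delete" ]; then
            rm -- "$pe_bad/$rel"
        else
            printf 'would delete: %s\n' "$pe_bad/${rel#./}"
        fi
    done
}

# Throwaway demo data (hypothetical file names).
demo=$(mktemp -d)
mkdir -p "$demo/good/inc" "$demo/bad/inc"
touch "$demo/good/style.css" "$demo/bad/style.css"
touch "$demo/bad/inc/back door.php"          # name with a space

preview=$(prune_extras "$demo/good" "$demo/bad")   # dry run first
printf '%s\n' "$preview"
prune_extras "$demo/good" "$demo/bad" --delete
```

Like the comm version, this compares names only and ignores contents. Directories left empty by the deletions are not removed.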
