Comparing Excel Sheets for Sanity and Such Things
February 1, 2012 11:10 AM   Subscribe

ExcelFilter: Compare rows for data changes? I want to find out whether row values are different between two sheets on records that match.

MetaFilter, please help me achieve the "Hmm, it should be possible". I want to find a way to determine whether values in a row change between two different sheets. However, the sheets are not fixed -- the rows change order and add new records. The two sheets have the same column values, but possibly have different rows.

I have two excel files that I receive weekly from two of our systems. It's a roster that contains people and attributes that may change. For example:

UID | First | Last | Status | Location | Room
222 | John | Smith | Pending | EBW | 2

(Fictional data to protect the innocent!)

I generally want to figure out:
1. Who is missing from one sheet against the other.
2. For the records that match, if there's any changes in the field values.

Number 2 is what I need help with, number 1 is already done. I can determine records that are missing from either of the sheets, by using a formula to search for the UIDs. In case anyone is interested, this is what I'm doing in the first sheet: =IF(ISERROR(MATCH($C2,$Sheet2!$C$2:$C$2997,0)),"No","Yes (" & MATCH($C2,$Sheet2!$C$2:$C$2997,0) & ")") -- that returns Yes if it's found in the second sheet, and no if it isn't.

Now I'm trying to discover the differences in data, due to errors in our system or syncing problems. So, one sheet may say:

UID | First | Last | Status | Location | Room
222 | John | Smith | Pending | EBW | 2

and the other
UID | First | Last | Status | Location | Room
222 | John | Smith | Closed | CDL | 1

Any ideas, Mefites? Perhaps some macro magic? I've been running against a brick wall on searches -- I haven't found anything that's trying to do this with two sheets that have the same columns, but different rows.
posted by kensch to Computers & Internet (6 answers total) 1 user marked this as a favorite
I might try vlookup, if the UID is consistently correct.

So, with simpler columns -- A has UID, B has first, C has last, I might put the following in D:

=if(isna(vlookup(A2,[other sheet],2,false)),"N/A",if(B2=vlookup(A2,[other sheet],2,false),"okay","check"))

You could also just put the vlookup data in columns D and E, then in F and G put TRUE or FALSE depending on B=D, C=E, respectively. Then you can just autosort for false. (You can even put =AND(F2,G2) in H2, and then autosort there.)
posted by jeather at 11:20 AM on February 1, 2012

I'm an infrequent Excel user and so my inclination would be to export the two sheets to something like a .csv file and then use a text comparison tool like WinMerge (for Windows but free tools like this will exist for every OS because programmers use them.)

You might have to do some work to strip the files down so that the only differences are exactly what you're looking for. (Otherwise, if for example there were an extra column in one file, the tool would correctly show that every single line differs. Also, they'll need to be sorted in the same order to properly show which rows are missing.)
posted by XMLicious at 11:43 AM on February 1, 2012

Using VLOOKUP is the canonical way to do this.

In the interest of perverting excel to do things it wasn't meant to, try this:

* Add a column to both worksheets to indicate the data source (values e.g. "SYSTEM A", "SYSTEM B")

* Copy both datasets to a new sheet.

* Sort by UID, then by data source

* Compare each row with the next one. Let's say UID is in column A, data source in column B, attribute value in column C:
posted by Dr Dracator at 1:53 PM on February 1, 2012

Gah, comment cut short - here's what column D should look like:
=IF(A1<>A2;"Missing Row";IF(C1=C2;"Values Match";"Value Change")
posted by Dr Dracator at 1:56 PM on February 1, 2012

You have two files, not two sheets. Most of these formulas are assuming you have both sheets in the same file. I'd move them to the same file if possible.

Who is missing from one sheet against the other
This really simple to do with a COUNTIF formula. Let's say your UIDs are in column A on both sheets. This formula =COUNTIF(Sheet2!A:A,A2) in any cell in row 2 (except A2) will give you a 1 if the UID in row 2 exists on Sheet2 and a 0 if it does not. Paste that formula down the whole column.
The same formula will work on Sheet2 if you replace "Sheet2" in the formula with "Sheet1".

For the records that match, if there's any changes in the field values.
How many columns are you comparing? Again assuming your UID's are in column A, this formula =B2=VLOOKUP(A2,Sheet2!A:B,2,FALSE) will tell you if the value in B2 is the same as the value in column B in the row that has the same UID as A2. You get TRUE if they match and FALSE if they don't. For column C, =C2=VLOOKUP(A2,Sheet2!A:C,3,FALSE) is the formula. You have to change the bold parts of the formula to reference the new column each time, but keep the A2 and A: the same as long as you are working in row 2. You can paste these down the column as well.
posted by soelo at 3:37 PM on February 1, 2012

Thanks all! I came up with a "quick and dirty" way to meet my deadline, and I'm going to try to incorporate your answers into the next version.

This is my solution: I concatenated UID, Building and Room into a unique string, and used the IFERROR function that I posted earlier to match against that field. If an identical string wasn't found in the other file, I'd know there's some incongruity with the data.
posted by kensch at 10:06 AM on February 5, 2012

« Older lost iphone, how to buy unlocked?   |   Help me quantify my physical life. Newer »
This thread is closed to new comments.