How to reconcile two lists in Excel?
December 5, 2018 12:48 PM   Subscribe

I have two long lists that I need to manually reconcile. The lists contain many of the same items, but List B has lots of inconsistency in formatting. I need to rename the second list to have the standard names from List A. Also, each list has some items that are not in the other list, and one is about twice as long as the other.

I'm looking for a way to spread them out, putting items with similar names side-by-side, to make the manual work quicker. I imagine the most straightforward way to define "similar" is alphabetical order. Example:

List A:
Apple Granny Smith
Apple Yellow Delicious
Banana
Pear

List B:
Apple-GrannySmith
Banana-yellow
Pear-
Watermelon

Matched (two columns):
Apple Granny Smith, Apple-GrannySmith
Apple Yellow Delicious, [blank]
Banana, Banana-yellow
[blank], Watermelon
Pear, Pear-
posted by gray17 to Computers & Internet (9 answers total) 4 users marked this as a favorite
 
Excel isn't the greatest tool for this, but you could make some headway by creating new columns next to your lists that have more uniform versions of the entries. Here's an example formula that will remove dashes and spaces and convert all letters to uppercase:
=UPPER(SUBSTITUTE(SUBSTITUTE(A1," ",""),"-",""))
Then you use the MATCH() function to look up uniform versions of each entry in the opposing list. If nothing else, this would flag a lot of the matches.
posted by paper chromatographologist at 12:56 PM on December 5, 2018


Or, depending on your data, and the nature of your variations you might be able to just check first n characters, using a formula like this instead:
=LEFT(A1,5)
Or you could combine this with the formula I gave above.
posted by paper chromatographologist at 12:59 PM on December 5, 2018


Best answer: I would import the lists into OpenRefine and use the clustering function. It detects similar entries and can rename them to make everything consistent. You can then export back out to Excel.
posted by mariokrat at 1:27 PM on December 5, 2018 [8 favorites]


Best answer: Yes, this is a great use case for openRefine.
posted by rockindata at 1:30 PM on December 5, 2018 [1 favorite]


Best answer: nthing OpenRefine
posted by kbuxton at 1:31 PM on December 5, 2018


If list B always has the hyphen in between, you could try text to columns, and use the hyphen as a delimiter. Then you'd get a List C that was just the items after the - in List B.

You could use CONCAT to combine A and B then, though I'm not sure that's what you're going for?
posted by nakedmolerats at 2:13 PM on December 5, 2018


This isn't quite a direct answer, but I had a somewhat similar Excel challenge that I just could not figure out, and I ended up hiring someone on Fiverr to assist me. Not only did that get me out of the fire for what I needed in the immediate sense, but I also now have the Excel formulae that I need to adapt it for future use. Just food for thought!
posted by kryptondog at 8:40 AM on December 6, 2018


Best answer: If you really want to stay in excel to do it, and you're on windows, the fuzzy lookup add-in (it's from microsoft, not a random thing) should be just what you need.
posted by mosst at 10:48 AM on December 6, 2018 [2 favorites]


Response by poster: I ended up using this tool with OpenRefine. Lots of trial and error! http://okfnlabs.org/reconcile-csv/
posted by gray17 at 3:09 PM on January 6, 2019


« Older I want a new gmail email address. Can you help me?   |   SpiderFilter: Yet another spider ID question Newer »
This thread is closed to new comments.