How to reconcile two lists in Excel?
December 5, 2018 12:48 PM Subscribe
I have two long lists that I need to manually reconcile. The lists contain many of the same items, but List B has lots of inconsistency in formatting. I need to rename the second list to have the standard names from List A. Also, each list has some items that are not in the other list, and one is about twice as long as the other.
I'm looking for a way to spread them out, putting items with similar names side-by-side, to make the manual work quicker. I imagine the most straightforward way to define "similar" is alphabetical order. Example:
List A:
Apple Granny Smith
Apple Yellow Delicious
Banana
Pear
List B:
Apple-GrannySmith
Banana-yellow
Pear-
Watermelon
Matched (two columns):
Apple Granny Smith, Apple-GrannySmith
Apple Yellow Delicious, [blank]
Banana, Banana-yellow
[blank], Watermelon
Pear, Pear-
I'm looking for a way to spread them out, putting items with similar names side-by-side, to make the manual work quicker. I imagine the most straightforward way to define "similar" is alphabetical order. Example:
List A:
Apple Granny Smith
Apple Yellow Delicious
Banana
Pear
List B:
Apple-GrannySmith
Banana-yellow
Pear-
Watermelon
Matched (two columns):
Apple Granny Smith, Apple-GrannySmith
Apple Yellow Delicious, [blank]
Banana, Banana-yellow
[blank], Watermelon
Pear, Pear-
Or, depending on your data, and the nature of your variations you might be able to just check first n characters, using a formula like this instead:
posted by paper chromatographologist at 12:59 PM on December 5, 2018
=LEFT(A1,5)Or you could combine this with the formula I gave above.
posted by paper chromatographologist at 12:59 PM on December 5, 2018
Best answer: I would import the lists into OpenRefine and use the clustering function. It detects similar entries and can rename them to make everything consistent. You can then export back out to Excel.
posted by mariokrat at 1:27 PM on December 5, 2018 [8 favorites]
posted by mariokrat at 1:27 PM on December 5, 2018 [8 favorites]
Best answer: Yes, this is a great use case for openRefine.
posted by rockindata at 1:30 PM on December 5, 2018 [1 favorite]
posted by rockindata at 1:30 PM on December 5, 2018 [1 favorite]
If list B always has the hyphen in between, you could try text to columns, and use the hyphen as a delimiter. Then you'd get a List C that was just the items after the - in List B.
You could use CONCAT to combine A and B then, though I'm not sure that's what you're going for?
posted by nakedmolerats at 2:13 PM on December 5, 2018
You could use CONCAT to combine A and B then, though I'm not sure that's what you're going for?
posted by nakedmolerats at 2:13 PM on December 5, 2018
This isn't quite a direct answer, but I had a somewhat similar Excel challenge that I just could not figure out, and I ended up hiring someone on Fiverr to assist me. Not only did that get me out of the fire for what I needed in the immediate sense, but I also now have the Excel formulae that I need to adapt it for future use. Just food for thought!
posted by kryptondog at 8:40 AM on December 6, 2018
posted by kryptondog at 8:40 AM on December 6, 2018
Best answer: If you really want to stay in excel to do it, and you're on windows, the fuzzy lookup add-in (it's from microsoft, not a random thing) should be just what you need.
posted by mosst at 10:48 AM on December 6, 2018 [2 favorites]
posted by mosst at 10:48 AM on December 6, 2018 [2 favorites]
Response by poster: I ended up using this tool with OpenRefine. Lots of trial and error! http://okfnlabs.org/reconcile-csv/
posted by gray17 at 3:09 PM on January 6, 2019
posted by gray17 at 3:09 PM on January 6, 2019
« Older I want a new gmail email address. Can you help me? | SpiderFilter: Yet another spider ID question Newer »
This thread is closed to new comments.
posted by paper chromatographologist at 12:56 PM on December 5, 2018