Using Excel conditional formatting to do a name/ID data check?
February 25, 2013 9:21 AM
I'm hoping there's an easier way to do this than my current manual process: I want to highlight text cells in Excel 2010 that are a duplicate of ONLY the cell below. Specifics inside.
I have a spreadsheet exported from Access. Column A = Last Name, Column B = First Name, Column C = ID. Each row is an individual appointment with a person, so a person who comes in multiple times will have multiple rows. The ID is unique to the person. Sometimes the ID gets entered incorrectly the second or third time a person comes in for an appointment and I want to update the original Access database with the correct ID.
I am currently checking this in Excel by removing duplicate ID numbers, which leaves me with everyone who came in only once, plus the appointments where the person's ID was entered incorrectly on one of their visits. I then sort this list (currently 6,000+ rows) by last name, then first name, and scan down the first name column until I come across the same first name in two adjacent rows. If the IDs are really close together in number, it's probably a user-entry error, so I verify the correct ID number and update it back in the Access database.
I want to use conditional formatting to highlight every time two first names are the same right above & below each other. Is there a formula that will check Row 1 against Row 2, Row 2 against Row 3 and so on down the list, or am I stuck doing this manually?
(And yes, I know this isn't catching every instance where an ID is entered wrong, but I'm not about to individually check each appointment against the main system. This is just to get the data we have slightly more correct.)
Found it (I think): Google Refine.
https://github.com/OpenRefine/OpenRefine/wiki
IIRC, this is exactly the kind of problem this software is for.
posted by etc. at 9:32 AM on February 25, 2013
I think you're thinking of Google Refine, which is their tool for cleaning up data.
I've heard very positive things about it but have so far not got round to trying it out.
Also, if I am reading your question right, then I would absolutely be doing it the way etc. suggests. It doesn't even need an IF. Just =C4=C3, which gives a nice TRUE/FALSE result.
Then you can autofilter on True.
posted by Just this guy, y'know at 9:34 AM on February 25, 2013
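For reference, the row-against-next-row check discussed above could be sketched as Excel formulas along these lines. This assumes the layout from the question (last name in column A, first name in B, ID in C), a header in row 1, data already sorted by last name and then first name, and numeric IDs. The helper columns D and E, the B2:B6000 range, and the "within 10" threshold are hypothetical choices for illustration, not something stated in the thread.

```
Helper column: enter in D2 and fill down, then autofilter on TRUE.
=B2=B3

Stricter variant: enter in E2 and fill down. Same last and first name,
and IDs within 10 of each other (assumes IDs are stored as numbers).
=AND(A2=A3, B2=B3, ABS(C2-C3)<=10)

Conditional formatting: select B2:B6000 with B2 as the active cell, then
Home > Conditional Formatting > New Rule >
"Use a formula to determine which cells to format", and enter:
=OR($B2=$B1, $B2=$B3)
```

The conditional formatting rule highlights both cells of a matching pair, because each first name is compared with the one above and the one below it. Excel's = comparison ignores case for text, and two adjacent blank cells also count as equal, so blank rows may need to be filtered out first.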
YES! It works!! This just saved at least an hour of work for me every time I do this. Thank you for helping me think out of the box there. :-)
posted by bibbit at 9:35 AM on February 25, 2013
Also, didn't Google have a spreadsheet cleanup program for things just like this? Anyone?
posted by etc. at 9:27 AM on February 25, 2013