Data sort help
July 4, 2012 7:48 AM Subscribe
Is there an off-the-shelf MSExcel function or 2.0 tool that will allow me to "return" a US City/State based on 5-digit US postal codes. I have 160 postal code queries to perform. The postal code data is in MSExcel. Something like this. I do not have MSAccess and I am a Mac user.
You could also import this file to a workbook, and use a function like vlookup to pull the desired fields.
posted by circleofconfusion at 9:19 AM on July 4, 2012
posted by circleofconfusion at 9:19 AM on July 4, 2012
vlookup command is what you're looking for, as long as you have the data set in another sheet
posted by zombieApoc at 9:42 AM on July 4, 2012
posted by zombieApoc at 9:42 AM on July 4, 2012
Response by poster: Yes, vlookup seems to be the correct formula but I haven been unsuccessful thus far. I grabbed the data from excel here and pasted it on another sheet. I can't seem to get the correct syntax routine.
posted by jeffmac at 9:49 AM on July 4, 2012
posted by jeffmac at 9:49 AM on July 4, 2012
I'm no Excel fundi, but I love me a good vlookup... apologies for all the bits of this you already know.
You've got Sheet 1 that has your ZIP Codes and Sheet 2 (or Book 2) that you're referencing (has both the Cities and the ZIPs)
Stating the obvious: If you're using circleofconfusion's census data, make sure that the data has been parsed into columns.
Then:
=vlookup([A],[B],[C],0)
[A]: What are you looking for? Sheet1 cell containing the ZIP Code
[B]: Where are you looking for it? This is the entire range in which the ZIP Code and the associated City can be found. Ensure the item you're looking up (ZIP Code) is in the first column you highlight as your data array and the value you want to bring back (City name) is included, i.e. In the census sample, you'll highlight Columns B to D and as many rows as exist. And then remember to hit F4 (no clue what key on the Mac) to lock the range to which you're looking up. Assuming the data on Sheet 2 has 500 rows, [B] will be B1:D500 (or $B$1:$D$14 when you set the range).
[C]: What do you want returned? When the look-up finds your ZIP Code, what value do you want brought back? In this case, you want it to bring back the City (column D). The trick here is to start counting the column numbers from where your highlight begins and not necessarily its position on the Excel sheet. In this case, it's the third column in the highlighted area of [B], so: 3.
The final requires a TRUE/FALSE on whether it'll allow a partial match. You want a full match, so this is 0.
Too-Basic-Didn't-Read? f you've done this and you're still getting #N/A on the lookup, it could be that the formatting is different in your source and reference sheets (e.g. one ZIP Code may be formatted as a number and another as text) or one of them might have spaces and the other not; fix this with a CTRL+F to find " " and Replace with "".
Good luck!
posted by kreestar at 11:56 AM on July 4, 2012 [2 favorites]
You've got Sheet 1 that has your ZIP Codes and Sheet 2 (or Book 2) that you're referencing (has both the Cities and the ZIPs)
Stating the obvious: If you're using circleofconfusion's census data, make sure that the data has been parsed into columns.
Then:
=vlookup([A],[B],[C],0)
[A]: What are you looking for? Sheet1 cell containing the ZIP Code
[B]: Where are you looking for it? This is the entire range in which the ZIP Code and the associated City can be found. Ensure the item you're looking up (ZIP Code) is in the first column you highlight as your data array and the value you want to bring back (City name) is included, i.e. In the census sample, you'll highlight Columns B to D and as many rows as exist. And then remember to hit F4 (no clue what key on the Mac) to lock the range to which you're looking up. Assuming the data on Sheet 2 has 500 rows, [B] will be B1:D500 (or $B$1:$D$14 when you set the range).
[C]: What do you want returned? When the look-up finds your ZIP Code, what value do you want brought back? In this case, you want it to bring back the City (column D). The trick here is to start counting the column numbers from where your highlight begins and not necessarily its position on the Excel sheet. In this case, it's the third column in the highlighted area of [B], so: 3.
The final requires a TRUE/FALSE on whether it'll allow a partial match. You want a full match, so this is 0.
Too-Basic-Didn't-Read? f you've done this and you're still getting #N/A on the lookup, it could be that the formatting is different in your source and reference sheets (e.g. one ZIP Code may be formatted as a number and another as text) or one of them might have spaces and the other not; fix this with a CTRL+F to find " " and Replace with "".
Good luck!
posted by kreestar at 11:56 AM on July 4, 2012 [2 favorites]
Kreestar has pretty much covered it all but just as an extra, if you have some bad formatting (Specifically extra spaces) just replace [A] with trim([A]). Trim will cut out all extraneous spaces, and is usually good for clearing up data.
There's also a few functions which will let you enforce formatting, =text( , ) for example.
posted by Just this guy, y'know at 1:33 PM on July 4, 2012
There's also a few functions which will let you enforce formatting, =text( , ) for example.
posted by Just this guy, y'know at 1:33 PM on July 4, 2012
Just want to add that you should make sure to format the ZIPs as text, and not numbers (you may have to force-add 0's in front if you don't want those pesky east coast zip codes to show up and vlookup properly.
posted by thewumpusisdead at 1:36 PM on July 4, 2012
posted by thewumpusisdead at 1:36 PM on July 4, 2012
Hello jeffmac.
To use the data from the workbook at my site that was posted by beyond_pink, you must factor in that the zip codes themselves are stored as text in column A. Not numbers.
This makes a difference when retrieving the look-ups with worksheet functions.
The advice I see posted here to use VLOOKUP can work, but it is not the best way to go in this case. Assuming you wish to use worksheet functions (which VLOOKUP is one of), this situation screams for the binary lookup available from the MATCH() function.
With a list tens of thousands of records long (such as this zip code list), the combination of INDEX/MATCH is hundreds to thousands of times faster, if you choose to use the binary look-up option of the MATCH function. Here's how:
First sort the data so that the zip codes are in ascending order. You may wish to sort the cities at the same time in the case of ties for the zip code (please see note at the end of this comment).
Using the data sheet from my workbook, the zip codes are in column A, Cities are in column B, and States are in column C. Row one is header info, and the data start in row 2 and extend all the way down to row 76783.
Let's say you place a copy of the header information in row one in columns F, G, and H. And you enter your 160 zip codes that you want to look-up the City and State info for into cells F2 through F161.
Now you can enter a City lookup formula in cell G2: =INDEX(B2:B76783,MATCH(F2,A2:A76783,1))
And a State lookup formula in cell H2:
=INDEX(C2:C76783,MATCH(F2,A2:A76783,1))
...and then copy these formulas all the way down through row 161.
This will result in lightning fast lookups for each of the zip codes you enter in column F.
The secret here is the binary look-up option of the MATCH function, and that is enabled by the "1" you see near the end of each of the Excel formulas. For convenience the zip code data are in this workbook.
There is a potential glitch here (that would still be a glitch with any other look-up formula). Many zip codes have more than one City listing associated with them. For example, 92807 has Anaheim, CA, and also Anaheim Hills, CA.
The formula based look-up methods will return ONLY the FIRST match found. Keep this in mind when sorting the data.
If you get stuck, please feel free to contact me at daniel.ferry@excelhero.com.
I had never heard of metafilter before today when I noticed some traffic from here on my website analytics program. A big thank you to beyond_pink for sharing the link.
Regards,
Daniel Ferry
Microsoft MVP - Excel
Owner, Excel Hero Academy
posted by excelhero at 3:10 PM on July 4, 2012 [3 favorites]
To use the data from the workbook at my site that was posted by beyond_pink, you must factor in that the zip codes themselves are stored as text in column A. Not numbers.
This makes a difference when retrieving the look-ups with worksheet functions.
The advice I see posted here to use VLOOKUP can work, but it is not the best way to go in this case. Assuming you wish to use worksheet functions (which VLOOKUP is one of), this situation screams for the binary lookup available from the MATCH() function.
With a list tens of thousands of records long (such as this zip code list), the combination of INDEX/MATCH is hundreds to thousands of times faster, if you choose to use the binary look-up option of the MATCH function. Here's how:
First sort the data so that the zip codes are in ascending order. You may wish to sort the cities at the same time in the case of ties for the zip code (please see note at the end of this comment).
Using the data sheet from my workbook, the zip codes are in column A, Cities are in column B, and States are in column C. Row one is header info, and the data start in row 2 and extend all the way down to row 76783.
Let's say you place a copy of the header information in row one in columns F, G, and H. And you enter your 160 zip codes that you want to look-up the City and State info for into cells F2 through F161.
Now you can enter a City lookup formula in cell G2: =INDEX(B2:B76783,MATCH(F2,A2:A76783,1))
And a State lookup formula in cell H2:
=INDEX(C2:C76783,MATCH(F2,A2:A76783,1))
...and then copy these formulas all the way down through row 161.
This will result in lightning fast lookups for each of the zip codes you enter in column F.
The secret here is the binary look-up option of the MATCH function, and that is enabled by the "1" you see near the end of each of the Excel formulas. For convenience the zip code data are in this workbook.
There is a potential glitch here (that would still be a glitch with any other look-up formula). Many zip codes have more than one City listing associated with them. For example, 92807 has Anaheim, CA, and also Anaheim Hills, CA.
The formula based look-up methods will return ONLY the FIRST match found. Keep this in mind when sorting the data.
If you get stuck, please feel free to contact me at daniel.ferry@excelhero.com.
I had never heard of metafilter before today when I noticed some traffic from here on my website analytics program. A big thank you to beyond_pink for sharing the link.
Regards,
Daniel Ferry
Microsoft MVP - Excel
Owner, Excel Hero Academy
posted by excelhero at 3:10 PM on July 4, 2012 [3 favorites]
And one final bit!
Make sure to format column F as text BEFORE you enter your zip codes.
posted by excelhero at 3:15 PM on July 4, 2012
Make sure to format column F as text BEFORE you enter your zip codes.
posted by excelhero at 3:15 PM on July 4, 2012
« Older Dance Dance Revolution Crush in a Music Video? | Legal thoughts for use of clips in a homeschool... Newer »
This thread is closed to new comments.
http://www.excelhero.com/blog/2011/02/excel-zip-code-lookup.html
posted by beyond_pink at 7:55 AM on July 4, 2012