Excel filter: Nearest neighbor identification.
August 2, 2010 9:34 AM   Subscribe

Excel filter: Nearest neighbor identification. Do I need a UDF?

I have an excel sheet with these fields:

name,lat,lon

For each name, I'd like to add fields identifying the 5 nearest neighbors. For example, if my data set looks like this:

name1,lat1,lon1
name2,lat2,lon2
...

I'd like my output to look like:

name1,lat1,lon1,[name of nearest neighbor],[name of 2nd nearest neighbor],...
name2,lat2,lon2,[name of nearest neighbor],[name of 2nd nearest neighbor],...
...

Is there some easy way to do this?
posted by sisquoc15 to Computers & Internet (8 answers total) 2 users marked this as a favorite
 
There may be a better way to do this, but I would start creating a lookup table with the great circle distance between every point. You can calculate great circle distance using one of the formulas found on this page depending on your lat/lon format. Once you have a lookup table with all the values you can use SMALL to find the lowest values but you will need some excel magic glue to get from there to the names.
posted by ChrisHartley at 9:44 AM on August 2, 2010


What is a UDF? How many entries do you have?

You could make a table (n x n entries) and calculate the distance between each neighbor using a straight line distance ( sqrt( (x1 -x)^2 + (y1 - y)^2 ) ), and then pick the five smallest values.
posted by a womble is an active kind of sloth at 9:45 AM on August 2, 2010


UDF is a user defined function, programmed in vba
posted by dfriedman at 9:49 AM on August 2, 2010


Response by poster: I have 800 entries.
posted by sisquoc15 at 9:49 AM on August 2, 2010


Response by poster: Ended up making an autohotkey macro. Picked a record, calculated distances to all other record, sorted on distance, populated nearest neighbor names, rinse and repeat. PM me for details.
posted by sisquoc15 at 11:16 AM on August 2, 2010


Do you have access to ArcGIS? It's pretty easy to do in arc...
posted by stratastar at 2:26 PM on August 2, 2010


Just for reference; this free tool for the not free ArcGIS; is really nice for creating point distance matrices.
posted by stratastar at 2:31 PM on August 2, 2010


Response by poster: I've gotten a few PM's on how my autohotkey worked. Here's a clunky explanation. My data looked like this.

name1, longitude1, latitude1
name2, longitude2, latitude2
...

I added a dummy column of zeros's. And I added a column that contained a formula that calculated the distance (a squared+b squared=c squared) for each point to the point in row 1. Data now looks like:

0,name1,longitude1,latitude1,calculated_distance_to_name1,
0,name2,longitude2,latitude2,calculated_distance_to_name1,
0,name3,longitude3,latitude3,calculated_distance_to_name1,
...

Finally I added columns that simply reported the names (name2, for example) in the rows below. Data:

0,name1, longitude1, latitude1,calculated_distance_to_name1,[name in one row down, column 2],[name in two rows down,c2], ...
0,name2, longitude2, latitude2,calculated_distance_to_name1,[name in one row down, column 2],[name in two rows down,c2], ...
0,name3, longitude3, latitude3,distance_to_name1,[name in one row down, column 2],[name in two rows down,c2], ...
...

Ok, now the autohotkey. I won't include the actual program (ask me if you'd like it) but I'll lay out the format.

Begin Loop
activate excel
select all and sort by the "calculated_distance_to_name1" column. This will give you all your records in order of distance to name1. It will also populate the name1 row with [name in one row down, column 2],[name in two rows down,c2],... which are name1's nearest neighbors, in order. Sweet.
change the 0 in row1,column1 to a 1. Copy row one and paste-special it back on itself as values. This locks in the nearest neighbor fields.
Now sort on column 1. This throws the completed row (marked with a one) to the bottom.
End Loop

Set the loop count for the number of rows you have.

I hope this makes sense, I have no experience in databases and I'm sure there is a more elegant way to explain this.
posted by sisquoc15 at 10:19 AM on August 20, 2010


« Older Bulk branded webcams?   |   Basic audio interface for Intel iMac Newer »
This thread is closed to new comments.