Parse freetext postal addresses to structured form for geocoding to KML?
November 1, 2007 7:53 PM

I have a bunch of contact information (harvested from a very user-unfriendly Aetna DocFind website) that I want to plot on a Google map. Naturally, the info is unstructured. To use batchgeocode.com, I need to get it into a structured CSV format.

Back when I used Windows, I used a product called ListGrabber to do this. Now I'm on the Mac, and I don't pirate software anymore, so are there any good options?

I came across Geo::StreetAddress::US, but I'm not so Perl-savvy anymore (I lean towards Python and Ruby), and it won't separate out the names nicely (though I can probably find a way around that).

How would you do this?
posted by joshwa to Computers & Internet (3 answers total) 2 users marked this as a favorite
 
sed
posted by pompomtom at 7:56 PM on November 1, 2007


Response by poster: >sed

I have no desire to write *that* many awful regexps. I've done it once before with this data, and it was very painful.

Looking for something more general purpose that I can use for other similar kinds of projects without writing a different set of regexps for every data source.

Also, forgot to mention Leopard's Data Detectors... sadly I will not be upgrading to Leopard anytime soon.
posted by joshwa at 8:12 PM on November 1, 2007


Well, once you've got the names off, the Perl would just be:

perl -MGeo::StreetAddress::US -lne 'print join ", ", map qq("$_"), @{Geo::StreetAddress::US->parse_location($_)}{qw(number street type state city zip)};' datafile.txt
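
If you'd rather stay in Python, here is a rough stdlib-only sketch of the same idea. It is not a port of Geo::StreetAddress::US: it assumes every line already looks like "123 Main St, Springfield, IL 62704", and the single pattern and the names parse_address and addresses_to_csv are made up for illustration. Anything messier (directionals, unit numbers, missing commas) will simply be skipped, so treat it as a starting point rather than a general-purpose parser.

```python
import csv
import io
import re

# Assumed line shape: "<number> <street name> <type>, <city>, <ST> <zip>".
# This is a hypothetical pattern for illustration, not a full address grammar.
ADDRESS_RE = re.compile(
    r"""^\s*
    (?P<number>\d+)\s+
    (?P<street>.+?)\s+
    (?P<type>[A-Za-z]+)\.?\s*,\s*
    (?P<city>[^,]+?)\s*,\s*
    (?P<state>[A-Za-z]{2})\s+
    (?P<zip>\d{5}(?:-\d{4})?)
    \s*$""",
    re.VERBOSE,
)

FIELDS = ["number", "street", "type", "city", "state", "zip"]


def parse_address(line):
    """Return a dict of address fields, or None if the line does not match."""
    match = ADDRESS_RE.match(line)
    return match.groupdict() if match else None


def addresses_to_csv(lines):
    """Build CSV text (with a header row) from the lines that parse; skip the rest."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=FIELDS, quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    writer.writeheader()
    for line in lines:
        fields = parse_address(line)
        if fields is not None:
            writer.writerow(fields)
    return out.getvalue()


if __name__ == "__main__":
    sample = ["123 Main St, Springfield, IL 62704", "not an address"]
    print(addresses_to_csv(sample), end="")
```

Letting the csv module do the quoting means a city or street containing a comma or quote won't break the output file.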

posted by nicwolff at 10:18 PM on November 1, 2007

