ParseFilter: I have a CSV file full of leads I need to parse into a more, er, concise format. What would the hive mind recommend?
It seems quite similar to this thread, except I've already got the data in CSV format. But that doesn't mean it's worth anything to me!
It looks like this: NAME, ADDR1, ADDR2, ADDR3, ADDR4. But it might as well be NAME, ONEBIGLONGSTRINGOFSTUFF. Sometimes city and state are in ADDR3 and sometimes in ADDR4. There might be email addresses or phone or fax numbers mixed in, too.
At first I thought I might just try to geocode each record, but I think there's probably a smarter option. Someone mentioned using sed in the other post, but I can't seem to figure out exactly how to go about doing that. Ruby would be peachy, too!
Are all of the addresses inside the United States? If so, http://geocoder.us/ is a good, free resource to use once you have your data in some useful (canonical or normalized) form.
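As for getting to a normalized form in the first place, here is a rough Ruby sketch rather than a finished tool. It reads each row as NAME followed by any number of loose ADDR fields, then sorts every field by what it looks like instead of which column it sits in. The patterns (EMAIL, PHONE, FAX_LABEL, CITY_STATE) and the method names normalize and normalize_csv are made up for illustration. They assume US-style 10-digit phone numbers, "City, ST" or "City, ST 12345" for the city line, and no header row in the file, so expect to tune them against the real data.

```ruby
require "csv"

# Hypothetical patterns for illustration; adjust them to the real data.
EMAIL      = /\A[^@\s]+@[^@\s]+\.[^@\s]+\z/
PHONE      = /\A(?:(?:tel|ph|phone|fax)[.:]?\s*)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\z/i
FAX_LABEL  = /\Afax/i
CITY_STATE = /\A(?<city>[A-Za-z .'-]+),\s*(?<state>[A-Z]{2})(?:\s+(?<zip>\d{5}(?:-\d{4})?))?\z/

# Sort the loose address fields of one row into named slots,
# based on the content of each field rather than its column.
def normalize(name, fields)
  rec = { name: name, street: [], city: nil, state: nil, zip: nil,
          email: nil, phone: nil, fax: nil }
  fields.each do |raw|
    field = raw.to_s.strip
    next if field.empty?

    if EMAIL.match?(field)
      rec[:email] = field
    elsif PHONE.match?(field)
      # Keep digits only; a leading "Fax" label decides the slot.
      key = FAX_LABEL.match?(field) ? :fax : :phone
      rec[key] = field.gsub(/\D/, "")
    elsif (m = CITY_STATE.match(field))
      rec[:city]  = m[:city].strip
      rec[:state] = m[:state]
      rec[:zip]   = m[:zip]
    else
      # Anything unrecognized is treated as part of the street address.
      rec[:street] << field
    end
  end
  rec[:street] = rec[:street].join(", ")
  rec
end

# Parse CSV text (no header row assumed) into an array of normalized hashes.
def normalize_csv(text)
  CSV.parse(text).map do |row|
    name, *rest = row
    normalize(name.to_s.strip, rest)
  end
end
```

To run it over a file, something like normalize_csv(File.read("leads.csv")) would do, where leads.csv is a placeholder name. Fields that match nothing land in :street, so skimming that column afterwards is a quick way to spot patterns the sketch misses.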
posted by cmiller at 3:13 PM on November 3, 2007