Queryable, up-to-month US real estate data?
January 30, 2021 1:17 PM Subscribe
Where can I find or buy access to real estate data? I'm thinking of something that I could run queries on like "Show me all forms of real estate that have sold in county X in the last month, grouped first by zip code, and then ordered by price desc" or "Show me the average assessed or sale home price by county in state Y for each of the last 5 years" or "Compute the ratio of homes to empty lots in municipality Z."
Services like Zillow and Redfin almost certainly have much of this data, but they seem largely focused on being portals that connect buyers and sellers with agents. Zillow is scrape-hostile (I get a reCAPTCHA semi-regularly even as a casual browser). Redfin has some decent built-in download facilities, but they will only get you active listings, so it's less useful for questions like home-to-lot ratio or assessed price. I assume there's *some* sort of MLS access like what I'm describing, but I'm not sure. I'd guess most agents & brokers care far more about facilitating the next transaction than about research questions like these, which are really sociological or investment research.
I suspect county records are the ultimate source here, and if that's what the project takes I'm certainly willing to contact county offices directly (there are maybe only 100-200 counties I'm most interested in!), but I worry a little about fees: I could semi-regularly pay double-digit per-county fees for a handful of counties, but it becomes a budget problem for more of them. I would also guess that other people in academia or investment have been interested in this information, and maybe a service catering to them exists.
Oh, and I'm perfectly comfortable with common software or data engineering tasks, so I'm willing to do some DIY. Dumps of CSVs? Cool, I'll import them into SQLite or Postgres myself, run scripts to clean or reformat, whatever.
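To make the kind of querying concrete, here is a rough sketch of the three example questions against SQLite, using Python's standard `sqlite3` module. Everything about the data is an assumption for illustration: the `sales` table, its column names, and the sample rows are all made up, and a real county or vendor dump would need its own cleaning and schema. Note also that the last query counts sale records, not all parcels, so a true home-to-lot ratio would need assessor or parcel data instead.

```python
import sqlite3

# Hypothetical schema; real county or vendor dumps will use different columns.
SCHEMA = """
CREATE TABLE sales (
    parcel_id     TEXT,
    state         TEXT,
    county        TEXT,
    municipality  TEXT,
    zip           TEXT,
    property_type TEXT,    -- assumed values: 'home' or 'lot'
    sale_price    INTEGER,
    sale_date     TEXT     -- ISO 8601 date, YYYY-MM-DD
)
"""

# Made-up rows, only to show the shape of the data.
SAMPLE = [
    ("p1", "Y", "X", "Z", "98101", "home", 900000, "2021-01-05"),
    ("p2", "Y", "X", "Z", "98101", "lot", 200000, "2021-01-10"),
    ("p3", "Y", "X", "Z", "98102", "home", 750000, "2021-01-12"),
    ("p4", "Y", "X", "Z", "98101", "home", 1100000, "2020-06-01"),
]


def build_db(rows):
    """Create an in-memory database and load the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    return conn


def sales_in_county(conn, county, start, end):
    """Sales in a county with start <= sale_date < end, by zip, then price desc."""
    return conn.execute(
        "SELECT zip, parcel_id, property_type, sale_price FROM sales "
        "WHERE county = ? AND sale_date >= ? AND sale_date < ? "
        "ORDER BY zip, sale_price DESC",
        (county, start, end),
    ).fetchall()


def avg_price_by_county_year(conn, state):
    """Average sale price per county and calendar year within one state."""
    return conn.execute(
        "SELECT county, substr(sale_date, 1, 4) AS yr, AVG(sale_price) "
        "FROM sales WHERE state = ? GROUP BY county, yr ORDER BY county, yr",
        (state,),
    ).fetchall()


def home_to_lot_ratio(conn, municipality):
    """Ratio of 'home' records to 'lot' records; None if there are no lots."""
    homes, lots = conn.execute(
        "SELECT SUM(property_type = 'home'), SUM(property_type = 'lot') "
        "FROM sales WHERE municipality = ?",
        (municipality,),
    ).fetchone()
    return homes / lots if lots else None
```

With real data, `build_db` would be replaced by a CSV import step, and the date window passed to `sales_in_county` would be computed from the current date rather than hard-coded.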
I believe there are real estate options for SimplyAnalytics and Simmons, but they may be pretty pricey. You can also request access to Redfin data, which I believe is free via an API for non-commercial purposes. On closer inspection I see you've already looked at Redfin and this may not fit the bill.
posted by aspersioncast at 3:31 PM on January 30, 2021
Try talking with Costar. Not sure if they make it available via API, but they are the gold standard for commercial real estate data.
posted by lohmannn at 4:36 AM on January 31, 2021
posted by Glomar response at 1:32 PM on January 30, 2021