Where can I find data?
June 7, 2015 9:32 AM
Where can I just find spreadsheets of interesting data?
I'm working on a data visualization project, but I need some real data to put in it. I looked everywhere but I have no idea where people actually find their data. I tried all the gov't websites (UNESCO etc.) but found nothing that interesting (or at least, nothing I can do anything with). I'm basically just looking for sites/APIs that have interesting, correlated data that I can put in scatterplots and stuff. Preferably in CSV/XLS/Excel format, but I'm desperate for anything that can be scraped/parsed! Thanks in advance!
PS I can't use infographics or bar graphs that have already been made. I'm specifically looking for spreadsheets of numbers that I can put in my own graph.
The Broad Institute has a lot of publicly available data sets but I don't know if it's the kind of thing you're looking for. You probably need to be more specific.
GEO has Gene Expression data sets, so microarrays and stuff. Again, no idea if this is appropriate for you though.
posted by shelleycat at 9:45 AM on June 7, 2015
How raw? What subject area? This is often called a "teaching dataset".
posted by unknowncommand at 9:52 AM on June 7, 2015
NYC Open Data has some good data sets, exportable in convenient formats.
posted by aparrish at 10:04 AM on June 7, 2015
100+ Interesting Data Sets for Statistics
Five Public Datasets, and Lots of Ideas for Exploring Them
posted by unknowncommand at 10:05 AM on June 7, 2015 [1 favorite]
I like these from the Pew Research Center.
Also ICPSR.
posted by pantarei70 at 10:05 AM on June 7, 2015
The Greater London Authority make a, frankly, obscene amount of data available on everything to do with the city - including a shedload of interesting data about usage of the London Underground and bus routes.
You can find it all here in the London Datastore
posted by garius at 10:25 AM on June 7, 2015
I'm a fan of the US Bureau of Labor Statistics: http://www.bls.gov/data/
posted by ndfine at 10:26 AM on June 7, 2015
The MeFi Wiki:
Infodump
Infodump and Excel
MetaAnalysis
posted by Little Dawn at 10:29 AM on June 7, 2015
Lahman baseball database, which is available in CSV format. It covers the standard statistics (no sabermetric stuff, no play-by-play) for every baseball player in history (with different entries for each season).
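For what it's worth, here's a rough sketch of turning a CSV like this into scatterplot points using Python's standard csv module. The column names (AB, H) and the sample rows are placeholders invented for illustration, not real records; check the header of the actual file before relying on them.

```python
import csv
import io

# Placeholder rows standing in for a downloaded CSV file; the values are invented.
SAMPLE_CSV = """playerID,yearID,AB,H
player01,2001,500,150
player02,2001,420,105
player03,2001,,
player04,2001,610,190
"""

def read_points(csv_file, x_col, y_col):
    """Return (x, y) float pairs, skipping rows where either value is missing or non-numeric."""
    points = []
    for row in csv.DictReader(csv_file):
        try:
            points.append((float(row[x_col]), float(row[y_col])))
        except (KeyError, TypeError, ValueError):
            # Unknown column, short row, or blank/non-numeric cell: skip this row.
            continue
    return points

points = read_points(io.StringIO(SAMPLE_CSV), "AB", "H")
print(points)
```

For a real file, pass `open(path, newline="")` instead of the `io.StringIO` wrapper; the resulting list of pairs can go straight into whatever plotting tool you're using.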
posted by vogon_poet at 11:29 AM on June 7, 2015
The Japanese Patent Office compiles statistics on the global Patent Prosecution Highway (.xlsx).
posted by invisible ink at 1:14 PM on June 7, 2015
The UCI datasets are aimed more at machine learning people, but they cover a range of subjects; you might find something useful there.
More generally, US government agencies are pretty brilliant about putting their data into the public domain. You could start with the USGS or NOAA, but simply Googling $ABBREVIATION + "datasets" (where $ABBREVIATION is the agency's acronym) is a good way to find huge quantities of CSV data. I'm away from my work machine at the moment but will check my bookmarks when I get back.
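Since the question asks for correlated data, one way to screen a pile of CSV columns before plotting is to compute a Pearson correlation coefficient for each pair. This is only a minimal sketch in plain Python; the function name `pearson_r` is made up for this example, and it assumes you've already parsed two equal-length columns of numbers.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length, at least 2 values each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant column has no defined correlation")
    return cov / math.sqrt(var_x * var_y)
```

Values near 1 or -1 suggest a strong linear relationship that should show up clearly in a scatterplot; values near 0 suggest there's no linear trend, though the plot may still be worth a look for other patterns.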
posted by Zeinab Badawi's Twenty Hotels at 6:17 AM on June 8, 2015
I know you said you had looked at government data, but maybe take a second look at these?
Federal Reserve Economic Data (FRED)
EUROSTAT
US Bureau of Labor Statistics (BLS)
The data is generally dry, but if you poke around they have some interesting things. For instance:
Federal Debt, Corporate Profits After Tax, Per Capita Disposable Personal Income (FRED)
And also if you like maps they now have GeoFRED.
European Migrant Integration, Accidents at Work, all broken down by country and region (EUROSTAT)
American Time Use Survey, Mass Layoff Statistics, Union Affiliation Data (BLS)
posted by BusyBusyBusy at 4:39 AM on June 9, 2015
This thread is closed to new comments.
posted by brainmouse at 9:39 AM on June 7, 2015 [2 favorites]