How to usefully download posts and comments from a secret Facebook group
August 29, 2013 12:13 PM

I have been running a Facebook group since 2012, and I got a grant to do research with the findings from it. I need to download the group's posts and responses into a table, and I think the most reliable way to do so is with SQL. However, I have only just started with SQL, and although I have found some ideas on GitHub for doing it, I have no idea how to get started with pulling the information. I am seeking advice and possible alternatives. Can you help?
posted by parmanparman to Computers & Internet (3 answers total) 5 users marked this as a favorite
You're going to need to pull the data down from Facebook somehow to get it into a SQL environment. I'd check out the Facebook API to see if you can use it to pull what you need. If that does not work, you'll need to write or buy a web scraper to get the data that way.

Once you have it down off of FB, importing it into a SQL instance is very straightforward.
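
To give a rough idea of what that import step might look like, here is a minimal sketch in Python using the built-in sqlite3 module. The record layout (the `id`, `author`, `created`, `message`, and `comments` keys) is an assumption made up for illustration; whatever the API or a scraper actually hands back will likely need to be mapped into something like it first.

```python
import sqlite3

# Hypothetical sample records. The real field names depend entirely on
# how the data was pulled from Facebook.
SAMPLE_POSTS = [
    {
        "id": "p1",
        "author": "Alice",
        "created": "2013-08-01T10:00:00",
        "message": "Welcome to the group",
        "comments": [
            {"id": "c1", "author": "Bob",
             "created": "2013-08-01T10:05:00", "message": "Thanks!"},
            {"id": "c2", "author": "Carol",
             "created": "2013-08-01T10:20:00", "message": "Glad to be here"},
        ],
    },
    {
        "id": "p2",
        "author": "Bob",
        "created": "2013-08-02T09:30:00",
        "message": "Has anyone read the report?",
    },
]


def load_posts(conn, posts):
    """Store posts and their comments in two linked tables."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id TEXT PRIMARY KEY, author TEXT, created TEXT, message TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS comments ("
        "id TEXT PRIMARY KEY, post_id TEXT REFERENCES posts(id), "
        "author TEXT, created TEXT, message TEXT)"
    )
    for post in posts:
        conn.execute(
            "INSERT OR REPLACE INTO posts VALUES (?, ?, ?, ?)",
            (post["id"], post["author"], post["created"], post["message"]),
        )
        # A post with no responses may simply lack the "comments" key.
        for comment in post.get("comments", []):
            conn.execute(
                "INSERT OR REPLACE INTO comments VALUES (?, ?, ?, ?, ?)",
                (comment["id"], post["id"], comment["author"],
                 comment["created"], comment["message"]),
            )
    conn.commit()


conn = sqlite3.connect(":memory:")  # use a file path to keep the data
load_posts(conn, SAMPLE_POSTS)

# Number of responses per post, including posts with none.
responses_per_post = conn.execute(
    "SELECT p.id, COUNT(c.id) FROM posts p "
    "LEFT JOIN comments c ON c.post_id = p.id "
    "GROUP BY p.id ORDER BY p.id"
).fetchall()
print(responses_per_post)
```

Keeping responses in their own table, linked back by `post_id`, is one reasonable design choice: it lets you count or search replies without cramming them into the post row.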

Depending on the size of the group you can probably do this all for free with a hosted app on AppFog or Heroku or something like that, using Python for the scraping and whatever DB they offer.
posted by Aizkolari at 12:44 PM on August 29, 2013 [1 favorite]

Best answer: SQL (which is just a language used to write queries against a data set) will help you analyze the data once you have it, but you have a couple of steps to go through before you get there -- namely, you have to get the Facebook information into a database first, and SQL won't help you do that.

There's probably no easy way to do that (unless Facebook offers an API to do so, which I doubt), and you may be reduced to copying-and-pasting the information from Facebook into some sort of text-parsing routine -- e.g., by continuing to scroll down the group "wall" until you hit the very end, then Ctrl-A (select all), then copying-and-pasting it into a text file.

But now you've got a bunch of weirdly-formatted text that you have to get into a structured format. That's probably where the bulk of the work lies, and your best bet is probably to find someone who knows their way around text parsing/processing and offer them a case of beer to convert the raw text into a structured format. The format, depending on which data you're looking to analyze, would probably be something like a single table that has fields for:
  • Post date/time;
  • Poster name;
  • Text of post;
  • Number of likes;
  • ... and whatever else.
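
As a very rough sketch of what that text-parsing step could look like in Python: the layout assumed here (poster name, then date, then the post text, then an optional "N likes" line, with a blank line between posts) is purely hypothetical. Text actually copied from Facebook will almost certainly look different, and the rules would need to be adjusted to match it.

```python
import re

# Hypothetical pasted text; real copied text will not be this tidy.
RAW = """Alice Example
August 1, 2013 at 10:00am
Welcome to the group
3 likes

Bob Example
August 2, 2013 at 9:30am
Has anyone read the report?
5 likes
"""

LIKES_RE = re.compile(r"^(\d+) likes?$")


def parse_posts(raw):
    """Split pasted text into one dict per post, using blank lines as separators."""
    posts = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if len(lines) < 3:
            continue  # not enough lines to be a post in the assumed layout
        likes = 0
        match = LIKES_RE.match(lines[-1])
        if match:
            likes = int(match.group(1))
            lines = lines[:-1]
        posts.append({
            "poster": lines[0],
            "posted_at": lines[1],
            "text": " ".join(lines[2:]),
            "likes": likes,
        })
    return posts


parsed = parse_posts(RAW)
for post in parsed:
    print(post)
```
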
Once you have a table that contains all of the data (whether it's in Excel or CSV or whatever), then you need to get it into a proper database (e.g., MySQL, PostgreSQL, SQL Server) that will allow you to execute the SQL statements you need to analyze the data.
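
For a sense of that last step, here is a small sketch that loads a CSV into SQLite (which ships with Python, so nothing needs installing) and runs one SQL query against it. The column names `posted_at`, `poster`, `text`, and `likes` are assumptions chosen to match the fields listed above, and the rows are made-up sample data.

```python
import csv
import io
import sqlite3

# Made-up sample data standing in for the real exported CSV file.
CSV_TEXT = """posted_at,poster,text,likes
2013-08-01 10:00,Alice,Welcome to the group,3
2013-08-02 09:30,Bob,Has anyone read the report?,5
2013-08-03 14:15,Alice,Meeting notes attached,2
"""


def import_csv(db, csv_file):
    """Create the posts table if needed and load every CSV row into it."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "posted_at TEXT, poster TEXT, text TEXT, likes INTEGER)"
    )
    reader = csv.DictReader(csv_file)
    rows = [
        (r["posted_at"], r["poster"], r["text"], int(r["likes"]))
        for r in reader
    ]
    db.executemany("INSERT INTO posts VALUES (?, ?, ?, ?)", rows)
    db.commit()
    return len(rows)


db = sqlite3.connect(":memory:")
# With a real file: import_csv(db, open("posts.csv", newline="", encoding="utf-8"))
import_csv(db, io.StringIO(CSV_TEXT))

# Posts and total likes per poster.
summary = db.execute(
    "SELECT poster, COUNT(*) AS n_posts, SUM(likes) AS total_likes "
    "FROM posts GROUP BY poster ORDER BY n_posts DESC, poster"
).fetchall()
print(summary)
```

The same `CREATE TABLE` and `SELECT` statements should carry over to MySQL or PostgreSQL with little change; only the connection code would differ.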
posted by Doofus Magoo at 12:46 PM on August 29, 2013

Best answer: You may be able to do it through ThinkUp if you have some server space.
posted by neilbert at 4:09 PM on August 29, 2013 [1 favorite]
