How to scrape comments and posts from a private Facebook group?
August 15, 2020 12:35 AM
I'm part of a private Facebook support group for folks with a rare medical condition. I'd like to do some data analysis on the posts and comments. Is there any way to scrape the posts and comments from the group?
There are some privacy concerns around medical info that were not applicable to automatronic’s question, but I’m assuming you’ve considered that.
posted by likethenight at 9:08 AM on August 15, 2020 [4 favorites]
You are unlikely to find any automatic solution for this, other than from really sketchy places. I was talking to someone on the commercial side of web scraping/social listening recently, and they explained that private Facebook groups are considered off limits for ethical reasons.
posted by JZig at 7:34 PM on August 15, 2020
I have a non-sketchy idea that might work, though it's clunky as hell. There is a tool called Page Vault. The tool itself is not clunky, but what I'm suggesting you do with it is.
Page Vault can expand comments and pages (after you're logged in) and then create PDFs of the entire thing. If it's really lengthy, you may need to do it in chunks. The PDFs are nice-looking--they essentially screenshot continuously. I'm pretty sure the PDFs are searchable too. You could then copy and paste it all into something else. That would give you a raw data dump at the very least.
posted by purple_bird at 9:00 AM on August 17, 2020
This thread is closed to new comments.
posted by automatronic at 2:21 AM on August 15, 2020