Examples of real-life data tables describing social media users
October 2, 2018 5:39 PM

I'm looking for examples of real-life, published database tables that have been, or are still, used to describe users of social media platforms. I want to use them as teaching examples, since that makes the material more real, but I'm having a hard time coming up with anything. So: does anyone know of any examples published online? Thank you!

Spreadsheets with attribute/value pairs are good. I can also parse basic XML, and should be able to cope with parsing basic JSON.
posted by carter to Computers & Internet (10 answers total)
 
Information on individuals or aggregate statistics?
posted by supercres at 5:43 PM on October 2, 2018


Gapminder has data on internet users around the world, although I don't think it is broken down by social media specifically.
posted by whimsicalnymph at 5:52 PM on October 2, 2018


Response by poster: Individual users ... which is why I might be having trouble finding anything. Of course an example that is just a fake Joe Bloggs record is fine.
posted by carter at 5:58 PM on October 2, 2018


Response by poster: That is, what does a record that describes a single user account look like? Of course (again) I may be naive to assume that such a thing actually exists.
posted by carter at 5:59 PM on October 2, 2018
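For teaching purposes, a fake Joe Bloggs record of the kind described above could be sketched roughly as follows. Every field name here (`user_id`, `username`, `follower_count`, and so on) is a hypothetical choice for illustration, not any real platform's schema, and the values are invented. The sketch turns the record into the attribute/value pairs the question asks for and writes them out as a two-column CSV.

```python
import csv
import io
from dataclasses import dataclass, asdict


@dataclass
class UserAccount:
    # Hypothetical fields chosen for illustration only;
    # this is not any real platform's schema.
    user_id: int
    username: str
    display_name: str
    email: str
    created_at: str  # ISO 8601 timestamp stored as text
    follower_count: int
    is_verified: bool


def to_attribute_value_rows(account):
    """Return the record as a list of (attribute, value) pairs."""
    return list(asdict(account).items())


def to_csv(rows):
    """Serialize attribute/value pairs as a two-column CSV string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["attribute", "value"])
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample values for a made-up account.
joe = UserAccount(
    user_id=1,
    username="joebloggs",
    display_name="Joe Bloggs",
    email="joe@example.com",
    created_at="2018-10-02T17:39:00Z",
    follower_count=42,
    is_verified=False,
)

rows = to_attribute_value_rows(joe)
print(to_csv(rows))
```

The resulting CSV opens directly in a spreadsheet, one attribute per row, which may be easier for students to read than a wide table with one column per attribute.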


Mastodon is open-source, so you can have a look at their schema here.
posted by pompomtom at 6:13 PM on October 2, 2018 [1 favorite]


Response by poster: Thanks, pompomtom; I had not heard of Mastodon, very interesting!
posted by carter at 6:24 PM on October 2, 2018


Twitter's TOS, for one, forbids sharing this sort of information. (Each interested party is expected to re-collect it themselves.) This is what a user object coming off the API looks like. Something like Tweepy can help you collect them.

What kinds of user data are you looking for? MyPersonality was an example of volunteered user data.

It’s not social media per se, but there’s the Blog Authorship Corpus: age and gender attached to a bunch of text.
posted by supercres at 7:13 PM on October 2, 2018
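Once a user object has been collected as JSON, it can be flattened into attribute/value pairs with the standard library alone. The sketch below uses an invented payload: the field names are modeled on ones that, as far as I know, appear in Twitter's documented v1.1 user object (`screen_name`, `followers_count`, `created_at`, and so on), but the values are made up and a real object carries many more fields. The `flatten` helper is a hypothetical name, not part of Tweepy or any Twitter library.

```python
import json

# Invented sample payload. Field names are modeled on Twitter's v1.1
# user object (an assumption); the values describe a made-up account.
SAMPLE_USER_JSON = """
{
  "id": 12345,
  "id_str": "12345",
  "name": "Joe Bloggs",
  "screen_name": "joebloggs",
  "location": "Springfield",
  "description": "A made-up account used for teaching.",
  "followers_count": 42,
  "friends_count": 17,
  "verified": false,
  "created_at": "Tue Oct 02 17:39:00 +0000 2018",
  "entities": {"description": {"urls": []}}
}
"""


def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict with dotted attribute names."""
    pairs = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted prefix.
            pairs.update(flatten(value, prefix=f"{name}."))
        else:
            # Scalars and lists are kept as-is.
            pairs[name] = value
    return pairs


user = json.loads(SAMPLE_USER_JSON)
flat = flatten(user)

# Print one tab-separated attribute/value pair per line.
for attribute, value in flat.items():
    print(f"{attribute}\t{value}")
```

Nested objects become dotted names such as `entities.description.urls`, so the whole record fits in a two-column spreadsheet.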


You could have your students create one-off accounts on Reddit or Facebook or Tumblr etc., and then have them scrape their own accounts and pool them for analysis.
posted by SaltySalticid at 7:14 PM on October 2, 2018


Does GitHub count as a social media platform? If so, you might be interested in the github_timeline dataset.
posted by batter_my_heart at 10:14 PM on October 2, 2018


OkCupid Data for Introductory Statistics and Data Science Courses

At the end of the article they link to this page for the data: https://github.com/rudeboybert/JSE_OkCupid

I'm sort of assuming based on your question you have access to the article through an institutional library, but if not I can send it to you.
posted by kochenta at 3:00 PM on October 3, 2018

