scrape web content from okcupid?
September 9, 2012 10:48 AM
How do I extract content from a website like OKCupid for personal use?
I basically want to spider the site and extract info from my matches into a table. This is purely for my own personal nerdy fun (i.e., doing some sort of clustering on the matches they give me, trying my own search algorithms, etc.), and hopefully not illegal.
I don't know anything about web programming, so I don't know where to start or even what to search for. For example, something about the match results page means the web scraping tools I've found online can only lift one profile at a time. I'm not sure about the terminology of any of this. Any advice or directions would be great! Thanks!
posted by ribboncake to computers & internet (3 answers total)
This may be the only way to get the correct output, since the website relies on things like JavaScript and cookies to generate content dynamically from information that a simple HTTP request does not include.
It is also a somewhat nontrivial programming task, as well as an exercise in frustration - lots of trial and error, things breaking for no apparent reason - and a really depressing idea overall.
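To give a rough idea of the "dump into a table" part only, here is a minimal sketch in Python using just the standard library. It assumes you have already saved the fully rendered results page somehow, which is the hard part described above. The markup, the class names (match, name, age, pct) and the sample data are all made up for illustration; the real page will look different and you would have to inspect it yourself.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup: each match is a <div class="match"> containing
# <span> elements with classes "name", "age" and "pct". Not the real site.
SAMPLE_HTML = """
<div class="match"><span class="name">alice</span><span class="age">29</span><span class="pct">91%</span></div>
<div class="match"><span class="name">bob</span><span class="age">34</span><span class="pct">78%</span></div>
"""

FIELDS = ("name", "age", "pct")


class MatchParser(HTMLParser):
    """Collects one dict per <div class="match"> block."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished matches
        self._current = None  # match currently being read, if any
        self._field = None    # field whose text we are inside, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        if tag == "div" and cls == "match":
            self._current = {}
        elif tag == "span" and cls in FIELDS and self._current is not None:
            self._field = cls

    def handle_data(self, data):
        # Only keep text that sits inside one of the fields we care about.
        if self._field is not None:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "div" and self._current is not None:
            self.rows.append(self._current)
            self._current = None


def matches_to_csv(html):
    """Return (list of row dicts, CSV text) for the given HTML."""
    parser = MatchParser()
    parser.feed(html)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(parser.rows)
    return parser.rows, out.getvalue()


rows, table = matches_to_csv(SAMPLE_HTML)
print(table)
```

Once the rows are in a CSV you can load them into whatever you like for the clustering. Getting the rendered HTML in the first place is where the trial and error comes in.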
posted by Dr Dracator at 11:18 AM on September 9, 2012