How to find scraped product reviews on my own sites.
January 10, 2013 10:10 AM

I work on a number of e-commerce websites. Of the ~6000 product reviews we have, a healthy percentage of them were manually scraped from other sites before my time. However, there is no indication and no log of which ones were scraped.

I now need to embark on the pleasant task of finding which ones were scraped and removing them from our sites.

Google can certainly help here, since I can search for the review text. I can do this programmatically, but given the daily API limit (100 requests a day), checking ~6000 reviews would take about two months at one query per review.
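One way to make each request count is to send a single exact-phrase query per review, built from its longest sentence, on the assumption that a long sentence is the most distinctive part of the text. The following is only a minimal sketch: the function names are hypothetical, the sentence splitting is naive, and it does not call any search API.

```python
import math
import re


def distinctive_phrase(review_text, max_words=12):
    """Return the longest sentence of a review, trimmed to max_words and quoted."""
    # Naive split: break after '.', '!' or '?' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", review_text) if s.strip()]
    if not sentences:
        return ""
    longest = max(sentences, key=lambda s: len(s.split()))
    words = longest.rstrip(".!?").split()[:max_words]
    # Double quotes ask the search engine for an exact-phrase match.
    return '"' + " ".join(words) + '"'


def days_needed(num_reviews, daily_limit=100, queries_per_review=1):
    """Estimate how many days a quota-limited run would take."""
    return math.ceil(num_reviews * queries_per_review / daily_limit)
```

With the figures from the question, `days_needed(6000)` gives 60, so the quota rather than the code is the bottleneck.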

My current thought is to generate, for each product, a list of links that run the in-site search of each likely review source, and then review the results by hand. That is still going to be excessively manual, so I'm looking for ideas or strategies for automating as much of this process as possible.
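The link-generation idea could be sketched roughly as below. The site names and URL patterns in `SEARCH_TEMPLATES` are placeholders, not real review sources; each would have to be replaced with the actual search URL format of a site suspected of being an original source.

```python
from urllib.parse import quote_plus

# Hypothetical templates: swap in the real search URL pattern of each suspected source.
SEARCH_TEMPLATES = {
    "example-reviews": "https://reviews.example.com/search?q={query}",
    "example-shop": "https://shop.example.org/s?keywords={query}",
}


def search_links(product_name, templates=SEARCH_TEMPLATES):
    """Build one in-site search link per source for a given product name."""
    query = quote_plus(product_name)  # URL-encode spaces and punctuation
    return {site: tpl.format(query=query) for site, tpl in templates.items()}
```

Calling `search_links("Acme Blender 3000")` returns one link per configured site, which could be written out as a checklist for whoever does the manual comparison.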
posted by anonymous to Computers & Internet (1 answer total) 1 user marked this as a favorite
 
Sounds like a job for Mechanical Turk. Just use their real people instead of doing it programmatically, assigning batches of reviews as tasks.
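Splitting the reviews into batches might look something like the sketch below, which writes one CSV row per task. The column names and the batch size are assumptions made for illustration; the thread does not say what input format the tasks would need, so the layout would have to be adapted to the actual task template.

```python
import csv
import io


def batch_rows(reviews, batch_size=5):
    """Yield one dict per task from a list of (review_id, text) pairs."""
    for start in range(0, len(reviews), batch_size):
        row = {}
        for i, (review_id, text) in enumerate(reviews[start:start + batch_size], start=1):
            row[f"review_id_{i}"] = review_id
            row[f"review_text_{i}"] = text
        yield row


def to_csv(reviews, batch_size=5):
    """Serialize the batches as CSV text, one row per task."""
    # Hypothetical column layout: review_id_1, review_text_1, review_id_2, ...
    fieldnames = [
        f"{name}_{i}"
        for i in range(1, batch_size + 1)
        for name in ("review_id", "review_text")
    ]
    buffer = io.StringIO()
    # restval fills the unused columns of a short final batch with empty strings.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(batch_rows(reviews, batch_size))
    return buffer.getvalue()
```

At five reviews per task, ~6000 reviews would come to about 1200 tasks.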
posted by Another Fine Product From The Nonsense Factory at 10:21 AM on January 10, 2013

