Help me populate a spreadsheet by scraping an RSS feed.
April 4, 2012 10:21 AM
I would like to scrape information from an RSS feed into an Excel-readable text file for a completely legal, non-copyright-violating use. In a better world, I'd have access to the database that generates the feed, but since this ain't a perfect world, it appears that scraping is my best bet. Are there tools that will help me automate this, or programming tutorials that will help me figure it out myself (it's been 15 years since I last wrote any code beyond simple SQL queries)?
posted by croutonsupafreak to Computers & Internet (9 answers total) 5 users marked this as a favorite
The XML is formatted thus, for each new post (but with angle brackets where I've put square brackets):
[description]Description, which may include embedded links and images.
I'd like to scrape this into an Excel-readable format, where each row consists of:
TITLE, URL (from "guid" Permalink, not from "link"), DESCRIPTION (First 50 characters, don't need links or images).
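For what it's worth, the row layout described above can be sketched in Python using only the standard library. This is a rough sketch under assumptions, not a tested solution for the real feed: the sample feed, the function names (`plain_text`, `feed_rows`, `rows_to_csv`), and the regex-based tag stripping are all made up for illustration, and it assumes a plain RSS 2.0 feed where each post is an `item` with `title`, `guid`, and `description` children.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET
from html import unescape

# Hypothetical sample standing in for the real feed (assumed RSS 2.0 layout).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/redirect/1</link>
      <guid isPermaLink="true">http://example.com/posts/1</guid>
      <description>&lt;p&gt;Hello &lt;a href="http://example.com"&gt;world&lt;/a&gt;, this is a long description with an image &lt;img src="x.png"/&gt; in it.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
"""

# Crude tag matcher; good enough for trimming a preview, not a full HTML parser.
TAG_RE = re.compile(r"<[^>]+>")


def plain_text(html_fragment, limit=50):
    """Drop tags (links, images), decode entities, and keep the first `limit` characters."""
    text = TAG_RE.sub("", html_fragment or "")
    text = unescape(text)
    text = " ".join(text.split())  # collapse runs of whitespace
    return text[:limit]


def feed_rows(xml_data):
    """Return one [title, url, description] row per item in the feed."""
    root = ET.fromstring(xml_data)
    rows = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        url = (item.findtext("guid") or "").strip()  # permalink from guid, not link
        description = plain_text(item.findtext("description"))
        rows.append([title, url, description])
    return rows


def rows_to_csv(rows):
    """Render the rows as CSV text that Excel can open."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["TITLE", "URL", "DESCRIPTION"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(feed_rows(SAMPLE_FEED)))
```

For a live feed, the bytes returned by `urllib.request.urlopen(feed_url).read()` could probably be passed straight to `feed_rows` (`feed_url` being whatever the real address is), and the CSV text written to a `.csv` file instead of printed.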
In an even more ideal world, I'd be able to do this in a smart enough manner that if I scrape the feed every day my software/widget/whatever tool can distinguish new content and only scrape that.
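One way the "only scrape what's new" part might work is to remember which permalinks have already been written and skip them on the next run. Here is a minimal Python sketch of that idea. The helper names (`load_seen`, `filter_new`, `save_seen`) and the one-URL-per-line state file are assumptions for illustration, and it assumes each row is a `[title, url, description]` list with the guid permalink in the second position.

```python
import os


def load_seen(path):
    """Read previously scraped permalinks from a text file (one per line)."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as fh:
        return {line.strip() for line in fh if line.strip()}


def filter_new(rows, seen):
    """Keep only rows whose URL has not been seen, and record them in `seen`."""
    fresh = []
    for row in rows:
        url = row[1]  # assumed position of the guid permalink
        if url not in seen:
            fresh.append(row)
            seen.add(url)
    return fresh


def save_seen(path, seen):
    """Write the updated set of permalinks back to the state file."""
    with open(path, "w", encoding="utf-8") as fh:
        for url in sorted(seen):
            fh.write(url + "\n")
```

A daily run would then call `load_seen`, pass the day's rows through `filter_new`, append whatever comes back to the spreadsheet file, and finish with `save_seen`. This relies on the feed's guid values staying stable from day to day, which is an assumption about the feed.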
I know this is possible, would be super easy for the right programmer, and that without any help I could probably even cobble something together in a month or two. But I'm a writer without access to the "right programmer," and I'd really prefer not to take 1-2 months to try to figure it out.