An aggregator that can also screen scrape?
March 29, 2010 8:07 AM

RSS aggregators that can also scrape pages without feeds?

I'm looking for an RSS reader that can also generate its own feeds by scraping pages. (The pages are on intranet sites, so I can't create a Yahoo Pipe, sadly.)

Syndirella would have been ideal for this, but it looks like it hasn't been updated since 2003, and it crashes on launch on my machine every time.

Windows is preferred here, but I could stretch to Mac if there's one there.
posted by bonaldi to Computers & Internet (7 answers total) 4 users marked this as a favorite
 
As of January 25th of this year, Google Reader will do it.
posted by komara at 8:53 AM on March 29, 2010 [2 favorites]


Though now that I'm thinking about it, since you mention platform-specific I'm assuming you want a standalone program not a web-based reader.
posted by komara at 8:54 AM on March 29, 2010


Also, komara, I'm not sure that would work for an intranet site, since Google's scrapers wouldn't be able to reach it.
posted by CharlesV42 at 8:57 AM on March 29, 2010


Response by poster: Yes, sadly that's no use -- same problem as Yahoo Pipes. Google's crawlers very definitely can't see these particular sites. I have trouble enough getting into them from inside the firewall :)
posted by bonaldi at 9:29 AM on March 29, 2010


You could code something up in a scripting language: run the page through HTML Tidy, then iterate over the meaningful chunks of the page. Of course, if you're going to go to that effort, you might as well try to get whatever system is producing those pages to generate an RSS feed as well.
posted by mmascolino at 10:30 AM on March 29, 2010
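
A minimal sketch of the scripting approach mmascolino describes, in Python, using only the standard library. It is not anyone's actual setup: it swaps HTML Tidy for Python's built-in html.parser (which tolerates messy markup), and it assumes the "meaningful chunks" are simply the links on the page. The names LinkCollector, build_feed and scrape_to_feed are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> on the page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the link currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the link text.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def build_feed(html, page_url, feed_title):
    """Turn the links found in `html` into a bare-bones RSS 2.0 document."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = page_url
    ET.SubElement(channel, "description").text = "Scraped from " + page_url
    for href, text in parser.links:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = text or href
        # Resolve relative links against the page they came from.
        ET.SubElement(item, "link").text = urljoin(page_url, href)
    return ET.tostring(rss, encoding="unicode")


def scrape_to_feed(page_url, feed_title):
    """Fetch a page (assumed UTF-8 and reachable without a login) and build a feed."""
    with urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return build_feed(html, page_url, feed_title)
```

Run from a machine inside the firewall, the output of scrape_to_feed could be written to a file that a desktop reader subscribes to. A real intranet page would probably need narrower selection than "every link", and possibly authentication, neither of which is handled here.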


Yes, well, apparently I am completely blind to the word 'intranet' - I saw "these pages are on mmmmnnnnmnmmmhh sites, so I can't create a Yahoo Pipe, sadly." My apologies.
posted by komara at 12:25 PM on March 29, 2010


I'm also using Google Reader for stuff like this.
posted by cowmix at 1:21 PM on March 29, 2010

