Magical feed services, how do they work?
April 8, 2011
How do services that create a full-text RSS feed when a site only offers partial feeds actually work? As an example, fulltextrssfeed.com.
Computers & Internet
(3 answers total)
Sites like that go to the links they find in the partial feed and grab the text. It works like Readability and Instapaper do as far as knowing what part of the page is the article and what's ads/navigation etc.
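To make the "knowing what part of the page is the article" step a bit more concrete, here is a rough sketch of one very simple heuristic: credit every paragraph to the container it sits in, then keep the container with the most paragraph text. This is only an assumption about the general approach, not how fulltextrssfeed.com, Readability, or Instapaper actually do it; real extractors weigh many more signals. The names ArticleGuesser and guess_article_text are made up for this example.

```python
from html.parser import HTMLParser


class ArticleGuesser(HTMLParser):
    """Collects <p> text per container so the wordiest container can be picked."""

    # Tags treated as possible article containers (an arbitrary choice).
    CONTAINERS = {"div", "article", "section", "td"}

    def __init__(self):
        super().__init__()
        self.blocks = []   # one list of paragraph strings per container seen
        self.stack = []    # indices into self.blocks for currently open containers
        self.in_p = False
        self.buf = []

    def _flush_paragraph(self):
        # Credit the finished paragraph to the innermost open container.
        text = " ".join("".join(self.buf).split())
        if self.in_p and text and self.stack:
            self.blocks[self.stack[-1]].append(text)
        self.in_p = False
        self.buf = []

    def handle_starttag(self, tag, attrs):
        if tag in self.CONTAINERS:
            self._flush_paragraph()
            self.blocks.append([])
            self.stack.append(len(self.blocks) - 1)
        elif tag == "p":
            self._flush_paragraph()  # copes with <p> tags that were never closed
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush_paragraph()
        elif tag in self.CONTAINERS and self.stack:
            self._flush_paragraph()
            self.stack.pop()

    def handle_data(self, data):
        if self.in_p:
            self.buf.append(data)


def guess_article_text(html):
    """Return the paragraphs of the container holding the most paragraph text."""
    parser = ArticleGuesser()
    parser.feed(html)
    parser.close()
    best = max(parser.blocks, key=lambda ps: sum(len(p) for p in ps), default=[])
    return "\n\n".join(best)
```

Navigation and ad blocks tend to hold only a few short strings, so they lose to the block holding the post itself. A feed service would run something like this on each page linked from the partial feed.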
on April 8, 2011
You can kind of get a sense for this by using the RSS feed scraping feature (it's not called that; can't remember that module's name off the top of my head) in Yahoo! Pipes. It lets you specify an RSS feed or website link to search within, then add certain opening and closing tags to look for text in between (e.g., everything between <div class="blogpost"> and </div>), among other options.
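Outside of Pipes, the "everything between an opening and a closing tag" idea can be sketched with plain string searching. This is a minimal illustration, not the Pipes module itself; text_between is a made-up helper name.

```python
def text_between(page, start_marker, end_marker):
    """Return the text between the first start_marker and the next end_marker.

    Returns None if either marker is missing.
    """
    start = page.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)          # skip past the opening marker itself
    end = page.find(end_marker, start)  # only look after the opening marker
    if end == -1:
        return None
    return page[start:end]
```

One caveat: it stops at the first closing marker it finds, so if the post contains nested div elements, the result gets cut short at the inner </div>. Picking a more distinctive closing marker for the site in question usually gets around that.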
on April 8, 2011
Yes, like limonaire said, Yahoo Pipes can do essentially the same thing if you need to fine-tune one that doesn't work on the automatic services. You want to use a Loop module with a Fetch Page module inside, and output the result to the Description field.
That works even better sometimes when you want to add something onto the URL (like, say, &pagenumber=all, or &page=printable). You can just put that step earlier in the pipe. I have a ton of these.
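For anyone who would rather script it than build a pipe, here is a rough Python equivalent of the Loop + Fetch Page setup, assuming a plain RSS 2.0 feed with item, link, and description elements. The function names and the ("page", "printable") parameter are made up for illustration; whether a given site honors such a parameter is something you would have to check yourself.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def add_query_param(url, key, value):
    """Append key=value to the URL's query string, keeping existing parameters."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


def fetch_page(url):
    """Download a page and return it as text (the 'Fetch Page' step)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def expand_feed(rss_xml, fetch=fetch_page, extra_param=None):
    """Replace each item's description with the content fetched from its link.

    extra_param is an optional (key, value) pair added to every link first.
    """
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):      # the 'Loop' step
        link = item.findtext("link")
        if not link:
            continue
        url = link.strip()
        if extra_param:
            url = add_query_param(url, *extra_param)
        desc = item.find("description")
        if desc is None:
            desc = ET.SubElement(item, "description")
        desc.text = fetch(url)          # output goes to the description field
    return ET.tostring(root, encoding="unicode")
```

As written this puts the whole fetched page into the description; in practice you would pass the page through an extraction step first so only the article text ends up in the feed.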
FYI - doing this with the New York Times appears to bypass the new paywall limits.
on April 13, 2011