How exactly do web feeds work?
July 10, 2010 2:38 PM

How do web feeds and news aggregators work? When a new blog post goes up does the aggregator automatically pick that up when it checks for updates to the feed or does each site's feed decide how often they update their feeds independent of how often the blog is updated? What is a good aggregator that lets me set how often to check feeds or even lets me check feeds manually?
posted by baking soda to Computers & Internet (3 answers total)
 
Usually a new blog post appears in the feed at the same time as it appears on the site. This is because both the feed and the website are usually working from the same database on the back end.

That said, on highly trafficked sites you might notice a post appear on the site slightly before it appears in the feed (or vice versa) due to caching rules. For example, on one of my sites I have the feed regenerate only once every few minutes, rather than on every single request, to keep it nice and fast.
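To make the caching idea concrete, here is a minimal sketch of a feed that is regenerated at most once every few minutes. The class name `CachedFeed`, the `render` callback, and the 300-second default are all made up for illustration; this is not how any particular blog engine does it.

```python
import time


class CachedFeed:
    """Serve a stored copy of the feed and rebuild it only when it is too old."""

    def __init__(self, render, max_age_seconds=300, clock=time.monotonic):
        self._render = render          # hypothetical function that builds the feed XML
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so the cache can be tested without waiting
        self._body = None
        self._rendered_at = None

    def get(self):
        now = self._clock()
        expired = (
            self._rendered_at is None
            or now - self._rendered_at >= self._max_age
        )
        if expired:
            # Only this request pays the cost of hitting the database.
            self._body = self._render()
            self._rendered_at = now
        return self._body
```

With a setup like this, a post published just after a rebuild would show up on the site right away but would not reach the feed until the cached copy expires.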

I don't know if there's any sort of tool that can tell you how often to check a feed for updates. It depends on what you're reading and what you're doing with the feed. Most web-based readers (like Google Reader) check once per hour. Most offline aggregators are configurable and will let you manually check for updates.
posted by meta_eli at 3:06 PM on July 10, 2010


Web feeds are just XML in a specific format (usually RSS or Atom). Each entry can carry several useful fields, such as the time it was posted and a GUID (which lets a reader tell an edited post from a new one, and keeps working if the entire blog moves to a new URL), and the feed itself can include a recommended refresh rate.
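As a rough illustration, this is what pulling those fields out of an RSS 2.0 feed might look like with Python's standard library. The sample feed, its URLs, and the `parse_feed` helper are invented for the example; in RSS 2.0 the suggested refresh rate lives in the channel's `<ttl>` element, in minutes.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed; the titles, URLs, and GUID are placeholders.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <ttl>60</ttl>
    <item>
      <title>First post</title>
      <link>http://example.com/first-post</link>
      <guid isPermaLink="false">example-blog-post-1</guid>
      <pubDate>Sat, 10 Jul 2010 14:38:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Return the feed's suggested refresh rate and its entries."""
    channel = ET.fromstring(xml_text).find("channel")
    ttl = channel.findtext("ttl")  # None when the feed states no interval
    items = [
        {
            "title": item.findtext("title"),
            "guid": item.findtext("guid"),
            "published": item.findtext("pubDate"),
        }
        for item in channel.findall("item")
    ]
    return {"ttl_minutes": int(ttl) if ttl else None, "items": items}


feed = parse_feed(SAMPLE_FEED)
print(feed["ttl_minutes"], [item["guid"] for item in feed["items"]])
```

An aggregator that remembers the GUIDs it has already seen can treat a known GUID with changed content as an edit rather than a new post.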

When it comes time to refresh, a good aggregator keeps a cached copy and asks the HTTP server whether the feed has changed (a conditional GET), downloading it again only if it has been updated. It decides how often to refresh from the feed's stated interval, falling back to a reasonable default if the feed doesn't provide one.
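The "ask the server if it has changed" step is ordinary HTTP: the client sends back the `ETag` and/or `Last-Modified` values it saw last time, and the server answers `304 Not Modified` with no body if nothing is new. A minimal sketch with Python's `urllib` follows; the function names and the shape of the `cached` dictionary are my own assumptions, not taken from any particular aggregator.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, etag=None, last_modified=None):
    """Build a GET request that lets the server reply 304 Not Modified."""
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    return request


def fetch_if_changed(url, cached):
    """Return (body, validators); body is None when the cached copy is current.

    `cached` is a hypothetical dict holding the "etag" and "last_modified"
    values saved from the previous successful download.
    """
    request = build_conditional_request(
        url, cached.get("etag"), cached.get("last_modified")
    )
    try:
        with urllib.request.urlopen(request) as response:
            validators = {
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
            }
            return response.read(), validators
    except urllib.error.HTTPError as err:
        if err.code == 304:  # unchanged: keep using the cached feed
            return None, cached
        raise
```

A 304 response costs the server almost nothing, which is why polling this way is much friendlier than downloading the whole feed every time.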

A few examples I use:

1. Planet Venus. Aggregates feeds into an output template, which I use for my website (and to generate an aggregated RSS feed). It caches data to reduce the cost of polling. I run it once an hour to pick up new feeds.

2. Liferea. A Linux GUI tool that I use to track over a hundred feeds. I can override refresh rates and specify how long to keep items after they disappear from the feed itself.
posted by pwnguin at 3:18 PM on July 10, 2010


What is a good aggregator that lets me set how often to check feeds or even lets me check feeds manually?

Those are features that every RSS reader will have -- I've never seen one that doesn't have them. Just remember that the feed itself can specify a suggested update interval and it's considered rude and abusive to update more often than that; some RSS readers might try to dissuade or disallow you from setting a lower value.
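One way a reader might enforce that politeness rule, sketched under my own assumptions (the function name, the 60-minute fallback, and the exact policy are illustrative, and real readers differ):

```python
DEFAULT_INTERVAL_MINUTES = 60  # assumed fallback; not a standard value


def effective_interval(user_minutes=None, feed_ttl_minutes=None,
                       default_minutes=DEFAULT_INTERVAL_MINUTES):
    """Pick a polling interval that never undercuts the feed's suggestion."""
    if user_minutes is None:
        # No user preference: follow the feed, or fall back to the default.
        return feed_ttl_minutes or default_minutes
    if feed_ttl_minutes is None:
        # The feed states no interval, so the user's setting stands.
        return user_minutes
    # Users may poll less often than suggested, but not more often.
    return max(user_minutes, feed_ttl_minutes)
```

So a user asking for 5 minutes on a feed that suggests 60 would still be polled hourly, while a manual "check now" button would sit outside this logic entirely.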

Wikipedia's comparison of feed readers.
posted by Rhomboid at 10:11 PM on July 10, 2010

