Fetch and Display remote content?
October 3, 2008 8:24 AM

What options do I have to fetch content from other websites for display on my own?

I am setting up a WordPress blog, probably a mostly static HTML site, and possibly also a Textpattern site (all hosted at Webfaction). I am interested in fetching content from up to 15 other websites, mostly in another language, for display (with proper crediting, of course) on my blogs.

The idea is for some automatic process to fetch and display a number of different articles from Kazak language sources. I would then add some commentary/translation of my own for readers who might be interested in that region/culture, so that they don't have to go to all 15+ sites themselves.
posted by steppe to Computers & Internet (9 answers total)
Are you looking for technical, or legal advice?
posted by nitsuj at 8:31 AM on October 3, 2008


If technical, this Wordpress plugin looks like it'll do what you need.
posted by nitsuj at 8:33 AM on October 3, 2008


If the sites you're pulling content from have a feed, you would be better off using that. It's pretty easy to grab a whole page and store it on your server, but it gets a lot trickier if you want to add your own content and navigation around it.

Oh, also, if the content you're pulling is copyrighted, there's a good chance you're breaking the law, credited or not.
posted by meta_eli at 9:25 AM on October 3, 2008


On second thought, maybe you just want to use HTML frames to add a navigation bar that stays in the browser as readers move between the different sites. This neatly avoids most of the technical and legal problems.
posted by meta_eli at 9:26 AM on October 3, 2008


This is what RSS is for, and why it is called Really Simple Syndication. As long as the site has a feed, you can pull it into your blog. You can also format the output, if for example you just want headlines, or only the first 250 characters of a post, etc.

Be warned that people are less attentive than you might hope. It is common for people to publish full entries via RSS and then freak the fuck out when you republish... their full entries. There has also been extensive debate on whether RSS is really a license to republish, and while there is no settling this argument, people do occasionally attach a Creative Commons or other usage license to their feed. These absolutely must be respected, but regardless, were I you I would seriously consider limiting your republishing to a specific, automated number of characters.
posted by DarlingBri at 11:48 AM on October 3, 2008
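
For sites that do offer a feed, the "first 250 characters" idea above could be sketched roughly like this in Python, using only the standard library. This is a minimal sketch, not a finished aggregator: it assumes a plain RSS 2.0 feed with `<item>`, `<title>`, `<link>` and `<description>` elements, and the function names (`truncate`, `excerpts_from_rss`, `fetch_feed`) are made up for illustration. The tag stripping is deliberately crude.

```python
import re
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def truncate(text, limit=250):
    """Keep at most `limit` characters, cut at a word boundary, then add '...'."""
    text = re.sub(r"<[^>]+>", " ", text)  # crude removal of HTML tags
    text = " ".join(text.split())         # collapse runs of whitespace
    if len(text) <= limit:
        return text
    cut = text[:limit].rsplit(" ", 1)[0]  # drop the trailing partial word
    return cut + "..."


def excerpts_from_rss(xml_text, limit=250):
    """Return (title, link, excerpt) tuples for each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        description = item.findtext("description", default="")
        results.append((title, link, truncate(description, limit)))
    return results


def fetch_feed(url):
    """Download the raw feed bytes; the URL is whatever feed you want to read."""
    with urlopen(url) as response:
        return response.read()
```

Each tuple keeps the original link, so every excerpt can be credited and linked back to its source. Atom feeds use different element names and would need their own handling.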


Anytime anyone mentions frames, it sounds like you'll have some problem in the future. My experience has been to avoid frames at all costs. What's wrong with the plugin nitsuj mentioned?
posted by schindyguy at 11:49 AM on October 3, 2008


Thanks for the replies thus far. I am looking for technical advice, not legal.

Unfortunately most of the Kazak sites I'd like to reference don't use RSS feeds. I have found something called kwout which is rather nifty, but I have to be the one cropping. Something more automatic would be cool.

Can I generate an RSS feed for someone else's site?
posted by steppe at 6:03 PM on October 3, 2008


Yes, it is possible. You need to scrape the data; a MeFi search for the scraping tag isn't a bad place to start.
posted by DarlingBri at 4:12 PM on October 5, 2008
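
To make "scrape the data" a bit more concrete, here is a rough Python sketch (standard library only) that pulls headline links out of a page and wraps them in a minimal RSS 2.0 document. The big assumption is the page layout: it pretends every article headline is a link inside an `<h2>`, which will differ from site to site. `HeadlineParser` and `page_to_rss` are hypothetical names, and downloading the page and decoding it with the right character set is left to the caller.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urljoin


class HeadlineParser(HTMLParser):
    """Collect (title, href) pairs for links found inside <h2> headings.

    The <h2><a href=...> layout is an assumption; adjust it per site.
    """

    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.current_href = None
        self.current_text = []
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_heading = True
        elif tag == "a" and self.in_heading:
            self.current_href = dict(attrs).get("href")
            self.current_text = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current_href is not None:
            title = "".join(self.current_text).strip()
            if title:
                self.headlines.append((title, self.current_href))
            self.current_href = None
        elif tag == "h2":
            self.in_heading = False


def page_to_rss(html_text, page_url, feed_title):
    """Build an RSS 2.0 string from the headline links found in html_text."""
    parser = HeadlineParser()
    parser.feed(html_text)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = page_url
    ET.SubElement(channel, "description").text = "Scraped headlines from " + page_url
    for title, href in parser.headlines:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        # Resolve relative links against the page they came from.
        ET.SubElement(item, "link").text = urljoin(page_url, href)
    return ET.tostring(rss, encoding="unicode")
```

Run on a schedule (a cron job, say), something like this would produce one feed per source site, which a feed-reading plugin could then consume. Scrapers of this kind tend to break whenever the source site changes its markup, so expect some upkeep.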


Thanks for the responses. It is handy to know what terms are in use related to this, such as "scraping".
posted by steppe at 8:52 PM on January 12, 2009

