

How to convert dynamic web pages into static ones?
February 24, 2005 3:44 AM   Subscribe

I need to convert several dynamic websites (database driven and templated) into static ones so that they can be put on a CD.

Is there a quick and easy way of doing this? Please save me from the insanity that would result from going through many hundreds of pages and saving the source of each one individually.

(The websites are PHP/MySQL)

(Oh and: pesky clients)
posted by Hartster to Computers & Internet (13 answers total)
 
Depends how dynamic the sites are. If the sites contain links to most of the pages, I suspect you could do some of it with wget.
posted by handee at 4:10 AM on February 24, 2005


If you're using Linux and the dynamic pages' URLs follow a regular pattern, a script using wget should work:


#!/bin/sh
# e.g. if the pages are numbered page.php?id=1 .. id=100
i=1
while [ "$i" -le 100 ]; do
  wget "http://www.yoursite.com/page.php?id=$i"
  i=$((i + 1))
done


Alternatively, there are tools like staticshell that might do the job more comfortably. The service isn't free, though.
posted by tcp at 4:24 AM on February 24, 2005


oh, handee beat me to it.
posted by tcp at 4:25 AM on February 24, 2005


I usually use httrack whenever I have to grab a site for work, and it usually works quite well for me. However, I have found that it doesn't grab embedded files (such as Flash) very well.

It's quite easy to use. And free.
posted by punkrockrat at 4:29 AM on February 24, 2005


Wget for Windows should allow you to recursively download a website.
Use something like:
wget --mirror -w 2 -p --convert-links -P c:\mirror http://website
posted by seanyboy at 5:10 AM on February 24, 2005


I recently used httrack, a free tool, to download a decesased friend's web site for posterity. It worked great and downloaded all pages and graphics automagically.
posted by SteveInMaine at 5:40 AM on February 24, 2005


...and yes, decesased is a new word. It means dead.
posted by SteveInMaine at 5:41 AM on February 24, 2005


When it comes to generating offline mirrors, HTTrack beats Wget hands down. (On Windows it has a decent GUI, too.)

Read the documentation first. Specifically, you will need to understand how to select content.

Obviously, grabbing dynamic pages is going to be problematic if they include forms, such as search forms, that are expected to work in the offline copy.
posted by gentle at 6:45 AM on February 24, 2005
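To illustrate the "select content" point above, here's a minimal sketch of an HTTrack command-line invocation. The URL, output directory, and filter pattern are placeholders, not from the thread; the command is built as a string so nothing is fetched while reading.

```shell
#!/bin/sh
# Hypothetical HTTrack invocation: -O sets the output directory,
# and the "+" filter keeps the crawl confined to one host.
cmd='httrack "http://www.example.com/" -O ./mirror "+www.example.com/*" -v'
echo "$cmd"
# eval "$cmd"   # uncomment to actually run the mirror
```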


Here's a (pretty old) article on this topic at phpbuilder.com. Note that in the comments, someone else also recommends httrack.
posted by danOstuporStar at 8:02 AM on February 24, 2005


wget --mirror --html-extension --convert-links http://www.example.com/

That's it.
posted by waldo at 8:05 AM on February 24, 2005


I'm also going to recommend wget. I've had no trouble with it in the past. You can set the recursion depth, and convert all the links on each page so that they work offline. I've used it several times to mirror a website to take with me later. It shouldn't matter how the site was generated, as long as it doesn't require any interaction to actually make the page, as gentle notes.
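A sketch of the wget flags odinsdream describes, with a placeholder URL: -r recurses, -l caps the recursion depth, -k converts links for offline use, -p grabs page requisites (images, CSS), and -np stays out of parent directories. The command is kept as a string so nothing is fetched while reading.

```shell
#!/bin/sh
# Hypothetical wget mirror command with a depth limit of 3.
cmd='wget -r -l 3 -k -p -np http://www.example.com/'
echo "$cmd"
# eval "$cmd"   # uncomment to actually run the download
```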
posted by odinsdream at 11:13 AM on February 24, 2005


Any good site-sucker program for offline browsing should do this. What platform are you on? I always look for programs of this sort on tucows.com. There you can see how users have RATED the programs (and which ones are freeware vs. shareware, etc.). Search for "offline". Betcha find a bunch.
(This assumes that the dynamic system links to all of the pages in the site; orphan pages would not be downloaded automatically as part of the whole site.)
posted by spock at 5:09 PM on February 24, 2005


Thanks everybody for your help: I've ended up using httrack this time, and it worked even better than I hoped (for example, it got all the JavaScript "zoom image" pages for an e-commerce site).
posted by Hartster at 1:11 AM on February 25, 2005


This thread is closed to new comments.