How to save the coding of an entire website?
July 29, 2014 5:56 PM

I'm not referring to saving a page or website to view offline, but viewing/saving the coding of an entire website without manually clicking the view source on every single page. Is this possible? Is there a program?
posted by atinna to Computers & Internet (5 answers total) 4 users marked this as a favorite
 
The venerable command-line tool wget (available on most platforms, including Windows and Mac) can handle this for you. Lifehacker has a few pointers to useful sites in their article and the comments.
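
For a concrete starting point, here is a rough sketch of how a wget mirror run could be assembled from Python. The flags are standard wget options, but the function names, the example URL, and the list of image extensions are my own assumptions, so adjust them for the site you have in mind.

```python
import subprocess


def build_wget_command(url, skip_images=True):
    """Return a wget argument list that mirrors `url` for source inspection.

    Hypothetical helper: the flag selection is one reasonable choice,
    not the only way to drive wget.
    """
    cmd = [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--no-parent",         # never climb above the starting directory
        "--page-requisites",   # also fetch the CSS and JS each page references
        "--adjust-extension",  # save HTML/CSS with matching file extensions
        "--wait=1",            # pause one second between requests
    ]
    if skip_images:
        # Assumed extension list; extend it if the site uses other formats.
        cmd.append("--reject=jpg,jpeg,png,gif,svg,ico")
    cmd.append(url)
    return cmd


def mirror(url, skip_images=True):
    """Run wget with the arguments above; wget must be installed and on PATH."""
    return subprocess.run(build_wget_command(url, skip_images), check=False)
```

Calling `mirror("https://example.com/")` would write the downloaded files into a folder named after the host, and printing `" ".join(build_wget_command(...))` gives you the equivalent line to paste into a terminal.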
posted by jenkinsEar at 6:12 PM on July 29, 2014


I use the venerable WinHTTrack -- it tends to be "greedy", meaning it'll probably download stuff from other people's websites, too, but it does the job thoroughly. If you just want source code, you can tell it to ignore images, too.
posted by AzraelBrown at 6:23 PM on July 29, 2014


A couple thoughts:
  1. I'm confused by your statement that you're not "saving a page to view offline." If you download all of the source code, you would be able to recreate the page exactly as you saved it by opening the files individually. Unless you're describing something else...?
  2. If you do just want to download the pages (to view offline or to look at the source code), there are numerous tools to do that, including those mentioned above. Check out the Download Managers section here: http://portableapps.com/apps/internet/ for more.
  3. It sounds like you already know that if you're looking to download the code from a page that is dynamically generated — say you wanted to download all of Google or something — the best you can do is download a snapshot of a results page, not the backend code that actually creates the pages during the search process, so you'll never really be able to get "the coding of an entire website" this way.

posted by FreelanceBureaucrat at 7:10 PM on July 29, 2014 [1 favorite]


Just in case you don't know, the majority of the code for a website can live server-side, where it isn't accessible unless you get it from the owners of the site. If the site is newer, it may also have code that is written in one language and compiled to JavaScript. The same goes for CSS: a lot of responsive design uses some kind of compiled CSS (Less or Sass).
posted by rockindata at 7:13 PM on July 29, 2014 [1 favorite]


Note that the notion of "an entire website" is somewhat arbitrary - parts of what you see may be coming from other servers. If you go with a crawler like wget, it might take a few tries and some fiddling with the parameters to get everything you want without accidentally downloading the entire internet.
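
To illustrate the "without downloading the entire internet" part, here is a minimal sketch of a crawler that only follows links on the starting host and stops after a fixed number of pages. It uses only the Python standard library; the names `crawl_same_host`, `fetch_html`, and `max_pages` are hypothetical, and a real crawl would also want politeness delays and robots.txt handling.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download one page and return its source as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_same_host(start_url, fetch=fetch_html, max_pages=50):
    """Return {url: source} for pages on the same host as start_url."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            source = fetch(url)
        except OSError:
            continue  # skip pages that fail to load
        pages[url] = source
        parser = LinkCollector()
        parser.feed(source)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            # Links pointing at other servers are ignored entirely.
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

The host check and the `max_pages` cap are the two knobs that keep the crawl bounded; loosening either one is how you end up pulling in other people's sites.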
posted by Dr Dracator at 1:49 AM on July 30, 2014

