

I want to download all files from a page, but there's a catch
February 1, 2008 7:42 AM

How can I download a bunch of .pdfs from a webpage all at once? The page I'm looking at is a list of .pdfs available (for a class I'm in - course materials), with a link to download each .pdf file. I've tried using the Firefox extension downthemall, but there's a catch - the links I click to download the .pdfs individually are javascript popups that look like this: javascript:pop('docs/12004/andes_vegetation.pdf','yes',",",'12004'); Is there a way to grab all of these at once, or am I doomed to clicking each and every link to open a new window, save as, etc.?
posted by entropic to computers & internet (9 comments total)
If you're using Firefox, install the WebDeveloper addon.
Then once it's up, go to your download page and on the Webdev bar above you'll see a menu that says INFORMATION.

Click that. When the menu drops down, choose VIEW LINK INFORMATION.

A new tab will pop up that will have every link listed in the order it's placed. Copy the links and voila!
posted by damiano99 at 7:54 AM on February 1


Try going to http://the.url/and/path/docs/12004 and see if you can see an index there. If so, just grab Downthemall or wget that (and various other 12004 substitutes).

If not it would be pretty trivial to hack up a script to munge the data as needed given the source document.
posted by Skorgu at 8:02 AM on February 1


Are the pdfs all located in the same directory? (i.e., docs/12004/andes_vegetation.pdf) If so, just hack the URL, navigate to that directory and use downthemall, completely bypassing the javascript crap.

For example: http://cms.csc.com/cwf/downloads/docs/pdfs/
posted by desjardins at 8:03 AM on February 1


If you have grep, cut and wget available, viewing the page source and running it through them will probably make this very simple.
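
A rough sketch of the same idea in Python instead of grep and cut, in case those tools aren't handy. It assumes the links in the saved page source look like the pop('...pdf', ...) call quoted in the question, and that the paths are relative to some base URL. The base URL, the sample file names and the function name extract_pdf_urls are made up for illustration.

```python
import re
from urllib.parse import urljoin

# Captures the first quoted argument of pop('....pdf', ...), the link
# shape quoted in the question. Other link shapes would need another pattern.
POP_LINK = re.compile(r"pop\('([^']+\.pdf)'", re.IGNORECASE)


def extract_pdf_urls(html, base_url):
    """Return absolute URLs for each pop('....pdf', ...) call, in page order, without duplicates."""
    urls = []
    for path in POP_LINK.findall(html):
        url = urljoin(base_url, path)
        if url not in urls:
            urls.append(url)
    return urls


if __name__ == "__main__":
    # Hypothetical page source and base URL, for illustration only.
    sample = """
    <a href="javascript:pop('docs/12004/andes_vegetation.pdf','yes');">Andes</a>
    <a href="javascript:pop('docs/12004/second_reading.pdf','yes');">Second</a>
    """
    for url in extract_pdf_urls(sample, "http://example.edu/eres/"):
        print(url)
```

Redirecting the output to a text file gives a list that wget can read with its -i option. If the files sit behind a login or a proxy, the download step will probably also need the session cookies from the browser.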
posted by Cat Pie Hurts at 8:12 AM on February 1


You could use the RegEx Coach (for Windows--uses regex, but more gui than grep and easier to see what you're doing) to pull a list of files out of the source and format them in a neat list, and then use the Windows version of wget.
posted by anaelith at 8:32 AM on February 1


I tried finding an index of the files, but the link to get to the page that has all the javascript popups is this:

http://*******.schoolname.edu.ezproxy.schoolname.edu:2048/eres/coursepage.aspx?cid=591&page=docs/12004#

which is the exact same link to get to any of the pages of course materials for this course. I tried several ways of hacking the url with /docs/12004 but it just takes me back to the listing of .pdfs.

When I tried damiano99's suggestion, I did get a new tab with all the links, but they're still all javascript popups as described in my original question and downthemall can't see them/do anything with them.

Too bad my school uses such a shitty system of linking to documents.
posted by entropic at 8:41 AM on February 1


wget can do this. There's no pretty gui for it, but configuration isn't too hard to figure out. Try appending 'docs/12004/' to the webpage. Or contact your school's support and see if they can put these files up on FTP or something.
posted by damn dirty ape at 9:11 AM on February 1


I've had to do this before - save the main html page to your desktop, then open it in a text editor and do a Find & Replace to change the javascript links to <a href=... links. Then resave the page, open it in Firefox, and use Down Them All to grab the newly created links.
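
If the Find & Replace gets tedious by hand, something like this Python sketch might do the same rewrite on the saved page. It assumes each link is a double-quoted href of the form javascript:pop('path', ...), as in the question. The function name plain_links and the base URL are made up for illustration, and the real page may need a different pattern.

```python
import re
from urllib.parse import urljoin

# Matches href="javascript:pop('PATH', ...)" and captures PATH.
# Assumes the href attribute is double-quoted.
JS_HREF = re.compile(r'href="javascript:pop\(\'([^\']+)\'[^"]*"', re.IGNORECASE)


def plain_links(html, base_url=""):
    """Replace javascript:pop(...) hrefs with ordinary hrefs pointing at the file."""
    def to_href(match):
        # With an empty base_url the path is left relative, as in the source.
        return 'href="%s"' % urljoin(base_url, match.group(1))
    return JS_HREF.sub(to_href, html)


if __name__ == "__main__":
    # Hypothetical snippet of the saved page, for illustration only.
    page = '<a href="javascript:pop(\'docs/12004/andes_vegetation.pdf\',\'yes\');">Andes</a>'
    print(plain_links(page, "http://example.edu/eres/"))
```

Passing a base URL matters here: once the page is saved to the desktop, relative paths like docs/12004/... would point at the local disk rather than the school's server.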
posted by Gortuk at 9:23 AM on February 1


Can't you do this with the download function of Acrobat Standard?
posted by ZenMasterThis at 9:48 AM on February 1

