How can I get a list of all files in a directory on a webserver?
April 20, 2007 8:09 AM

I think the title explains it all: I'm looking to get a list of all files in a specific directory on a web server (www.site.com/docs/). The thing is, there's an index.htm file in that directory, so if I browse to it, I just get the index file. Is there any way to find out the names of all the files in that directory, by scripting or otherwise?

I don't have access to the system other than through the standard web interface (i.e., no shell, etc.). Any ideas?
posted by newatom to Computers & Internet (13 answers total) 1 user marked this as a favorite
 
Best answer: Directory browsing is a server-level thing. If they've put up an index.html file for that directory, the server hands back that page instead of a listing, and there's not a lot you can do.

The only thing that might help you is a Google search (or the Yahoo/MSN equivalent) that looks something like one of these:

site:site.com inurl:docs
site:www.site.com inurl:docs

What that's saying is: search only the site.com domain, and require the word "docs" in the URL. This will only work, however, if the files in that directory were linked from some page at some point and the search engine's spider crawled that page.
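
For what it's worth, here's a minimal sketch in Python of building those queries for any domain and directory. The function names and the Google search URL pattern are my own assumptions for illustration, and other engines may spell the operators differently.

```python
from urllib.parse import urlencode


def listing_queries(domain, directory):
    """Build site:/inurl: queries for one directory on one domain.

    `domain` is the bare domain (e.g. "site.com") and `directory` is the
    word expected in the URL (e.g. "docs"). Both are placeholders.
    """
    hosts = [domain, "www." + domain]
    return [f"site:{host} inurl:{directory}" for host in hosts]


def google_search_url(query):
    """Turn a query into a search URL (assumed URL pattern, not an official API)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    for query in listing_queries("site.com", "docs"):
        print(query, "->", google_search_url(query))
```

Anything this turns up is still limited to files the spider found through a link somewhere.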
posted by alana at 8:22 AM on April 20, 2007


Best answer: If they're not linked from the index.htm file, or from elsewhere on the site or the web, then no, not really.

However, if you know that they have a guessable naming scheme:
doc01.txt
doc02.txt
doc03.txt
...then it would be trivial to write a shell script that would attempt to grab all these files. It would be blatantly obvious in the web logs too, if anyone was checking.
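
A rough sketch of that idea, in Python rather than shell. Everything here is an assumption for illustration: the "doc" prefix, two-digit numbering, ".txt" extension, the range 1 to 20, and the helper names. It only makes sense on a site you have permission to poke at.

```python
import time
import urllib.request


def candidate_urls(base_url, prefix="doc", ext=".txt", start=1, stop=20, width=2):
    """Generate guesses like <base_url>/doc01.txt through <base_url>/doc20.txt."""
    base = base_url.rstrip("/")
    return [f"{base}/{prefix}{n:0{width}d}{ext}" for n in range(start, stop + 1)]


def exists(url, timeout=10):
    """Return True if requesting `url` yields a 2xx status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers HTTP errors (404 etc.), DNS failures, and timeouts.
        return False


def find_existing(urls, check=exists, delay=1.0):
    """Try each URL in turn and return the ones for which `check` is true."""
    found = []
    for url in urls:
        if check(url):
            found.append(url)
        time.sleep(delay)  # pause between requests rather than hammering the server
    return found
```

Calling find_existing(candidate_urls("http://www.site.com/docs")) would try doc01.txt through doc20.txt one at a time. Every miss shows up as a 404 in the server's logs, which is exactly the "blatantly obvious" part.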
posted by unixrat at 8:23 AM on April 20, 2007


The only way to do it is to find all references elsewhere to that path, possibly by searching for it on Google or somewhere else that has a big crawler.

Or you could mail the owner and ask them.

The feature you're running into is specifically and deliberately intended to prevent outsiders like you from getting a listing of the directory contents, so that files can be put in there without being seen by you or anyone else. And there's no way I've ever heard of to bypass it. (Speaking as a web site owner who uses this feature for precisely the purpose of hiding the directory listing of some of my directories from outsiders I don't want poking around in them.)
posted by Steven C. Den Beste at 8:24 AM on April 20, 2007


You could use HTTrack and grab a local copy, assuming you've got the disk space.
posted by JaredSeth at 8:24 AM on April 20, 2007


Best answer: Solutions involving HTTrack and similar programs, or Google and similar crawlers, rely on a page somewhere linking to the files in the directory.

If the file isn't linked anywhere, i.e. you can't work out the filename, you're out of luck.
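
To show what "rely on a page linking to the files" means in practice, here's a minimal sketch of the link-following step such tools perform, using only Python's standard library. The class and function names are hypothetical, and the HTML is a made-up sample, not the real index page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_under(page_url, html, directory_url):
    """Return sorted absolute URLs linked from `html` that sit under `directory_url`."""
    collector = LinkCollector()
    collector.feed(html)
    prefix = directory_url.rstrip("/") + "/"
    found = set()
    for href in collector.hrefs:
        # Resolve relative links against the page and drop any #fragment.
        absolute = urljoin(page_url, href).split("#", 1)[0]
        if absolute.startswith(prefix) and absolute != prefix:
            found.add(absolute)
    return sorted(found)


if __name__ == "__main__":
    sample = (
        '<a href="manual.pdf">Manual</a> '
        '<a href="/docs/faq.txt">FAQ</a> '
        '<a href="http://example.org/">Elsewhere</a>'
    )
    page = "http://www.site.com/docs/index.htm"
    for url in links_under(page, sample, "http://www.site.com/docs/"):
        print(url)
```

A file sitting in /docs/ that no page mentions never appears in the output, because nothing in the HTML names it.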
posted by Aloysius Bear at 8:31 AM on April 20, 2007


This would be a pretty serious security flaw if it were possible.
posted by rhizome at 8:51 AM on April 20, 2007


Good point about the limitations of HTTrack, Aloysius. I guess I just naturally assumed he was looking for the stuff referenced from other pages on the site.
posted by JaredSeth at 9:06 AM on April 20, 2007


The impossibility of this is one of the main reasons I put an empty index.html in most directories full of "shared stuff" on my web server.
posted by jozxyqk at 9:15 AM on April 20, 2007


The PHP function opendir() can do it.
posted by tremolo1970 at 9:21 AM on April 20, 2007


Oops...actually, I guess you would use opendir() and readdir().
posted by tremolo1970 at 9:26 AM on April 20, 2007


"The PHP function opendir() can do it."

The server would need to be running PHP and you'd have to have permission to upload a PHP script to the server, so that solution is out.
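
To make that concrete: opendir() and readdir() read the directory from the server's own filesystem, so whatever does the listing has to run on the server. Here's a rough Python analogue, purely as an illustration (the function name is mine); it only works on a machine that can see the files directly.

```python
import os


def list_files(directory):
    """Return the sorted names of regular files in `directory` (subdirectories skipped)."""
    with os.scandir(directory) as entries:
        return sorted(entry.name for entry in entries if entry.is_file())


if __name__ == "__main__":
    # Lists the current directory; on the server you would pass the
    # filesystem path behind /docs/, which an outside visitor cannot reach.
    for name in list_files("."):
        print(name)
```

Pointing this at a URL does nothing useful, since it needs a filesystem path, not a web address.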
posted by turaho at 9:29 AM on April 20, 2007


No.
posted by chundo at 10:00 AM on April 20, 2007


Google is the closest you'll get, but it can be surprisingly complete, at least if you weren't trying to do anything underhand.
posted by reklaw at 8:01 PM on April 20, 2007

