how many pages in a website?
July 15, 2009 6:38 PM
geeks, help! how can I (relatively quickly) get a rough idea of how many pages are in a particular website?
Googling around, I found and tried this site:
http://www.xml-sitemaps.com/
which basically does what I'm looking for (it's a site-map generator; I don't care about the sitemap itself, but it does tell you roughly the number of pages in the site, which is what I'd like to know). However, the free version is limited to sites that have fewer than 500 pages. Booh.
Any ideas what I could use for larger sites? A program or some kind of analytics tool online? Preferably free of course. Thanks!
if you're on a Mac or another Unix machine (and have wget installed) you can use the command:
wget -r http://www.websitename.com
to download said website.
then, something like
ls -R | grep '\.html$' | wc -l
to list everything in the directory, grep for .html files, and count the number of results.
it's not foolproof, but it works
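a slightly stricter version of the count, as a rough sketch: it uses find instead of ls so that only regular files are counted, and it also picks up .htm and upper-case extensions. count_html_pages is just a name made up for this example, and it assumes the mirror ended up in a directory named after the site.

```shell
#!/bin/sh
# count_html_pages is a hypothetical helper name, not a standard command.
# It counts files ending in .html or .htm under a directory
# (default: the current directory).
count_html_pages() {
    dir="${1:-.}"
    # -type f skips directories; -iname matches the extension case-insensitively.
    # tr strips the padding that some versions of wc put around the number.
    find "$dir" -type f \( -iname '*.html' -o -iname '*.htm' \) | wc -l | tr -d ' '
}

# Example, after mirroring with: wget -r http://www.websitename.com
# count_html_pages www.websitename.com
```

this still only counts what the crawl managed to reach and save, so treat the number as a ballpark.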
posted by jimmy0x52 at 6:42 PM on July 15, 2009
I've done that using Visio. Here are instructions for generating a sitemap with Visio 2002. You can find similar guides for newer versions of Visio as well.
posted by ttyn at 6:46 PM on July 15, 2009
thanks jess, that worked very well as a ballpark.
if anyone has any ideas for methods that can count only content pages, I'd be happy to hear those suggestions too.
posted by jak68 at 6:48 PM on July 15, 2009
The trouble you might have is that most websites, especially large ones, are dynamically generated. For all intents and purposes my own website only has one page (Default.aspx), and the content management system fills in the content of that page based on context. The days of static HTML pages are long gone.
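One practical consequence for a mirrored copy of such a site: the saved files often won't end in .html (a crawler like wget typically stores each distinct URL as its own file, for example Default.aspx?id=1), so counting by extension can miss them. A rough sketch that counts by content instead, assuming the site has already been mirrored into a local directory; count_html_by_content is a made-up name for this example:

```shell
#!/bin/sh
# count_html_by_content is a hypothetical helper name.
# It counts files whose contents include an <html tag, whatever the file is called.
count_html_by_content() {
    dir="${1:-.}"
    # -r recurses, -l prints each matching file name once, -i ignores case.
    # tr strips the padding that some versions of wc put around the number.
    grep -rli '<html' "$dir" | wc -l | tr -d ' '
}

# Example: count_html_by_content www.websitename.com
```

This only sees pages the crawler could reach by following links, so content that sits behind forms or search boxes is not counted.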
posted by Lokheed at 8:10 PM on July 15, 2009
Xenu's Link Sleuth. If you select "Statistics" in the preferences, you'll get a list showing the number of each type of page (HTML, image, etc.). We use it for link checking, and it's free.
posted by Joleta at 8:54 PM on July 15, 2009
This thread is closed to new comments.
posted by jessamyn at 6:40 PM on July 15, 2009