How do I figure out which pages on a web site are wider than X amount of pixels?
March 18, 2008 2:26 PM

How can I analyze the width of a specific site's web pages? I'm envisioning either a list of links for pages exceeding a certain number of pixels or a table that lists every page title, along with the link and the page's width.
posted by boombot to Computers & Internet (7 answers total)
 
How conversant are you with HTML and CSS?

Different websites will use different techniques to govern width. Some will use table-based layout; these may define the width for the table as a whole, or for the columns.

Some will use CSS. These may use floats to stack several columns next to each other, or absolute positioning, or a combination of the two, with widths set on the columns, on a wrapper around them, or on the body.

Some won't have a fixed width: Metafilter will expand to fill your screen (AFAICT). Some may specify the width in ems, in which case the width depends on your font size. Some may have a max-width and a min-width, but no hard fixed width.

So going at it on a code level will require you to analyze each page and do a lot of math.

The alternative (and probably more realistic) would be to render each page, take a screenshot of it, and use some super-smart technique for figuring out the "live" part of the page. Of course, you'll get some discrepancies depending on which rendering engine you use, but that's another matter.

Neither approach has an out-of-the-box solution for you that you can just fire and forget, AFAIK.
posted by adamrice at 2:44 PM on March 18, 2008


Are the pages all built around the same, or a few, templates?

The problem is that a web page is a text file - it doesn't have a "width" until a browser renders it. So, for example, a page can stretch infinitely (liquid layout, repeating background) or can be 800px wide on one browser (desktop PC) and 320px wide on another (mobile phone).

But given some specific details, we might be able to help you lash something together. Is there one particular element that's causing the pages to be too wide? We could search for "problem" elements.
posted by Leon at 2:48 PM on March 18, 2008


Hmm. Ok, how about this.

You spider the site to get a list of URLs, then use [some kind of scriptable tool] to pass each URL to your browser with a custom stylesheet that forces the background to #010101, and take a screenshot of each page.

Then each screenshot gets passed over to a Photoshop action which (a) crops everything that's not the browser canvas and (b) trims everything that's #010101.

Then you just have to map the width of each output image back to its input URL. Some kind of scripting language again.
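
For what it's worth, the trim-and-measure step might look roughly like the Python below. This is only a sketch under some assumptions: the screenshot has already been cropped to the browser canvas and decoded into rows of (R, G, B) tuples by whatever imaging tool you have, and the names content_width and pages_wider_than are made up for illustration.

```python
# Sketch only. Assumes each screenshot is already decoded into a list of
# rows, each row a list of (R, G, B) tuples. Function names are hypothetical.

SENTINEL = (1, 1, 1)  # the #010101 colour forced by the custom stylesheet


def content_width(rows, background=SENTINEL):
    """Width in pixels of the span of columns holding any non-background pixel."""
    if not rows:
        return 0
    # A column counts as "used" if at least one pixel in it is not the sentinel.
    used = [x for x in range(len(rows[0]))
            if any(row[x] != background for row in rows)]
    if not used:
        return 0
    return used[-1] - used[0] + 1


def pages_wider_than(widths_by_url, limit):
    """Sorted list of URLs whose measured width exceeds the limit."""
    return sorted(url for url, width in widths_by_url.items() if width > limit)
```

You would build the widths_by_url dict by calling content_width on each screenshot and keying the result by the URL it came from.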

Whew. Do-able, I guess. I'd much rather find a solution that didn't involve rendering the pages, though.
posted by Leon at 2:57 PM on March 18, 2008


s/background/body
posted by Leon at 2:58 PM on March 18, 2008


Leon, your technique makes the assumption that the page consists of a visible background overlaid by a layer which exactly fits the page content. This may not be the case here. It won't account for margins and padding, will not work if the content is rendered directly onto the page body, or if it is rendered onto a layer which has no background of its own. It won't work for a site where an outer layer is used over the page body as a container for the content layers.

Anyway, if you're just concerned with the page widths in a single site, you'll probably find that there is a specific, consistently-named element that contains the content. Determining the width of this element is a simple bit of scripting. It probably comes down to how tidily the pages were constructed though. Give us the URL and we'll take it from there...
posted by le morte de bea arthur at 3:24 PM on March 18, 2008


Response by poster: The pages are laid out in an elaborate mess of tables - top nav, left nav, a bar on the right for specific content, and a center copy area. Everything is fixed width except the center content area. Width variations there are almost exclusively caused by wide images and tables. Le Morte, the site isn't public at this time.

Rendering-wise, I'd be happy with anything that approximates IE or a more standards-compliant browser. Mobile versions aren't a priority.
posted by boombot at 6:33 AM on March 19, 2008


Wide images. Figure out which images are too wide, then grep the code for links to those images.
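
A rough Python sketch of the "which images are too wide" step, reading the width straight out of the file header. It only handles PNG and GIF (JPEG needs more parsing and is left out), for GIF it reports the logical screen width, and image_width is a name invented for this example.

```python
import struct


def image_width(data):
    """Return the pixel width for PNG or GIF bytes, or None if unrecognised."""
    # PNG: 8-byte signature, 4-byte chunk length, b"IHDR", then a
    # big-endian 32-bit width.
    if data[:8] == b"\x89PNG\r\n\x1a\n" and data[12:16] == b"IHDR":
        return struct.unpack(">I", data[16:20])[0]
    # GIF: 6-byte header, then a little-endian 16-bit logical screen width.
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return struct.unpack("<H", data[6:8])[0]
    return None
```

Read the first 24 bytes or so of each image file in binary mode, pass them in, and collect the filenames whose width is over your limit. Those are the names to grep for.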

Wide tables. If it's 'table width="1000"' or something, again, grep the code for that string. I don't think you're going to get a one-size-fits-all approach... better to pick off the problems one class at a time.
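
If plain grep gets unwieldy, here is a hedged Python sketch that does the same hunt with the standard library's html.parser. It assumes widths are given as plain pixel numbers in a width attribute (percentages and CSS widths are ignored), and WidthScanner and find_wide_elements are hypothetical names.

```python
from html.parser import HTMLParser


class WidthScanner(HTMLParser):
    """Collects tags whose width attribute is a pixel value above a limit."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit
        self.hits = []  # (tag, width, line number) tuples

    def handle_starttag(self, tag, attrs):
        if tag not in ("table", "td", "th", "img"):
            return
        width = dict(attrs).get("width")
        # Only plain pixel values count; values like "100%" are skipped.
        if width and width.strip().isdecimal() and int(width) > self.limit:
            self.hits.append((tag, int(width), self.getpos()[0]))


def find_wide_elements(html, limit):
    """Return (tag, width, line) for each table, cell or image over the limit."""
    scanner = WidthScanner(limit)
    scanner.feed(html)
    scanner.close()
    return scanner.hits
```

Run it over each page's source and any page with a non-empty result is a candidate for being too wide. It won't catch an unsized image that is wide in its own right, which is why the image check is a separate pass.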
posted by Leon at 11:28 AM on March 20, 2008

