Problems with XHTML content type.
May 30, 2007 2:04 PM
I seem to have hit a bit of a brick wall in approximating W3C XHTML standards compliance on my website. From what I've read, XHTML should be served with content type application/xhtml+xml, rather than text/html, and recent versions of Microsoft and Mozilla browsers should support serving them as such. Well, they aren't.
I set the content type and character set for all of my pages using the header statement in my primary include file, as shown here.
When I try to switch the content type (currently by commenting one line and uncommenting the other), however, the following happens:
Firefox 2.0.0.3 complains that this XML file does not appear to have any style information associated with it, and displays a bare document tree.
Internet Explorer 7.0.5450.4 opens an Open/Save/Cancel for a file of type php_auto_file.
Opera for Wii shows the bare interface, stripped of all styling.
Can anybody help me figure out what's going wrong?
I use this PHP code on my company's web site:
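// Serve XHTML when the Accept header lists application/xhtml+xml followed by a q-value of at least 0.1.
if (preg_match('{\bapplication/xhtml\+xml\b[^;]*;q=(1\.0|0\.[1-9])}', $_SERVER['HTTP_ACCEPT']))
    header('Content-Type: application/xhtml+xml; charset=UTF-8');
else
    header('Content-Type: text/html; charset=UTF-8');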
It ends up serving application/xhtml+xml to Firefox and text/html to Internet Explorer. I'm not sure if IE7 supports application/xhtml+xml, but IE6 definitely doesn't.
posted by Khalad at 2:15 PM on May 30, 2007
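For anyone who would rather not lean on a single regular expression, here is a rough sketch of the same negotiation done by splitting the Accept header into its parts. The function names accept_quality and choose_content_type are made up for illustration, and the code assumes PHP 7 or later; it is one possible approach, not what the poster above runs.

```php
<?php
// Hypothetical helper (not from the thread): returns the q-value a client
// assigns to an exact media type in an Accept header, or 0.0 if it is absent.
function accept_quality(string $accept, string $type): float
{
    foreach (explode(',', $accept) as $range) {
        $params = array_map('trim', explode(';', $range));
        $name = strtolower(array_shift($params));
        if ($name !== $type) {
            continue; // wildcards such as */* are deliberately ignored
        }
        $q = 1.0; // a media range without an explicit q parameter counts as 1
        foreach ($params as $param) {
            if (preg_match('/^q\s*=\s*([0-9.]+)$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        return $q;
    }
    return 0.0;
}

// Hypothetical helper: picks XHTML only when the client lists it explicitly
// with a q-value above zero, and falls back to text/html otherwise.
function choose_content_type(string $accept): string
{
    return accept_quality($accept, 'application/xhtml+xml') > 0.0
        ? 'application/xhtml+xml; charset=UTF-8'
        : 'text/html; charset=UTF-8';
}

// Usage in an include file, before any output is sent:
// header('Content-Type: ' . choose_content_type($_SERVER['HTTP_ACCEPT'] ?? ''));
```

Because the response now depends on a request header, it is probably worth sending a Vary: Accept header as well so that caches keep the two variants apart.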
Best answer: The first question you should be asking is why you need XHTML - if you don't have a specific reason (the need for XML parsers to be able to consume your pages) you're much better off with HTML 4.01.
Seconded.
If you want to read a very comprehensive & informative take on the issue, check out Tommy Olson's article.
Also: Sending XHTML as text/html Considered Harmful
posted by deern the headlice at 2:46 PM on May 30, 2007 [1 favorite]
Forgot to mention: the best reason to use XHTML is if you want to embed SVG or MathML in your pages. Again, that trick won't work in any existing version of Internet Explorer.
Generally though it's best avoided. I served my personal site in XHTML with the correct content type for several years; it was a nuisance. The XML error model is fundamentally unsuited to the Web.
One of the reasons HTML 5 is so interesting is that it attempts to specify an error model that is as close as possible to the way actual browsers work today.
posted by simonw at 3:07 PM on May 30, 2007
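To make the point about the XML error model concrete, here is a small sketch that feeds the same slightly broken fragment to PHP's XML and HTML parsers. It assumes the DOM extension (libxml) is available, and the fragment itself is invented for the demonstration.

```php
<?php
// Illustrative fragment with a deliberately unclosed <b> element.
$markup = '<p>Hello <b>world</p>';

// Collect parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$xml = new DOMDocument();
$xmlOk = $xml->loadXML($markup);      // XML parser: the mismatched tag is fatal
$xmlErrors = count(libxml_get_errors());
libxml_clear_errors();

$html = new DOMDocument();
$htmlOk = $html->loadHTML($markup);   // HTML parser: recovers and builds a tree
libxml_clear_errors();

$recovered = $html->getElementsByTagName('b')->item(0);

var_dump($xmlOk);                     // bool(false)
var_dump($htmlOk);                    // bool(true)
echo $recovered->textContent, "\n";   // world
```

A browser treating the page as application/xhtml+xml behaves like the first parser: one stray tag and the reader gets an error instead of the page.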
Response by poster: Figured out my problem using deern the headlice's link to Tommy Olson's article. 'Twas missing the xmlns attribute in my html tag.
posted by The Confessor at 3:23 PM on May 30, 2007
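The missing xmlns explains the Firefox symptom: without the XHTML namespace the elements are just generic XML, so the browser shows a bare tree. A quick check along these lines can catch it before deploying. This is only a sketch; is_namespaced_xhtml and the XHTML_NS constant are names invented here, and it assumes PHP's DOM extension.

```php
<?php
// The namespace URI that XHTML 1.x documents declare on the root element.
const XHTML_NS = 'http://www.w3.org/1999/xhtml';

// Hypothetical helper: true when $markup is well-formed XML whose root is
// an <html> element in the XHTML namespace.
function is_namespaced_xhtml(string $markup): bool
{
    $previous = libxml_use_internal_errors(true); // keep parse errors quiet
    $doc = new DOMDocument();
    $ok = $doc->loadXML($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);        // restore the caller's setting

    if (!$ok) {
        return false; // not even well-formed XML
    }
    $root = $doc->documentElement;
    return $root->localName === 'html' && $root->namespaceURI === XHTML_NS;
}

$without = '<html><head><title>t</title></head><body><p>hi</p></body></html>';
$with = '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title></head>'
      . '<body><p>hi</p></body></html>';

var_dump(is_namespaced_xhtml($without)); // bool(false)
var_dump(is_namespaced_xhtml($with));    // bool(true)
```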
Response by poster: And not only does it bork IE, it also completely kills Google Maps API in every browser! So I guess I'll abandon that content type declaration at least until Google decides to support it...
I'm still grateful for the assistance, though, as I was able to check the XML well-formedness of markup that I would have never taken the time to expose to an actual validation service.
posted by The Confessor at 5:29 PM on May 30, 2007
Something to keep in mind when you make the switch back into "html-thinking" is that a lot of tags aren't supposed to be closed. For example, according to the W3C, end tags are forbidden for line breaks (<br>) as well as for <input> elements.
posted by Civil_Disobedient at 6:29 PM on May 30, 2007
That just means you can't write <br></br> or <input></input>.
posted by Khalad at 1:52 PM on June 1, 2007
No, it also means you can't write
<br />
or <input type="text" value="Something" />
posted by Civil_Disobedient at 2:11 PM on June 1, 2007
Or, for that matter,
<img src="something.jpg" />
posted by Civil_Disobedient at 2:12 PM on June 1, 2007
End tags are forbidden. HTML 4.x is not aware at all of "empty element syntax", which is what the slash is. In XHTML the slash at the end indicates an empty element and is equivalent to an immediate end tag; the slash itself is not an end tag, however. In HTML 4.x the slash is bad syntax, but is valid because of the very precise rules the HTML specification lays out for handling parsing errors. Strictly speaking, the slash is actually supposed to be parsed as an invalid attribute by a proper HTML 4.x parser.
The end result is that an XHTML parser will see an empty element and an HTML parser will see a malformed attribute. For always-empty elements like <link> and <br> and <input> that works great, and is a nice way to write XHTML that won't choke a plain HTML parser.
<br /> might not validate as proper HTML, since '/' is not a valid attribute, but syntactically, it is just fine.
posted by Khalad at 6:53 PM on June 1, 2007
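The XML half of that claim is easy to check: to an XML parser, the empty-element form and an immediate end tag produce the same element. A minimal sketch, assuming PHP's DOM extension and using canonical XML (C14N) as the comparison; the canonical helper is a name made up here.

```php
<?php
// Hypothetical helper: parses an XML fragment and returns its canonical form.
function canonical(string $xml): string
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    return $doc->documentElement->C14N();
}

$slash = canonical('<p>one<br />two</p>');    // empty-element syntax
$pair  = canonical('<p>one<br></br>two</p>'); // immediate end tag

var_dump($slash === $pair); // bool(true)
echo $slash, "\n";          // <p>one<br></br>two</p>
```

This says nothing about how an HTML 4 parser treats the slash; it only shows that on the XHTML side the two spellings are interchangeable.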
Um, just because it works doesn't mean it's syntactically correct. For example, when using XSLT to transform a document into HTML, there is a property to define the output method (<xsl:output method="html"/>). And what does it transform hard breaks into? No closing slash. Why's that? Because it's wrong.
Go try your end slash HTML in Navigator 3 and see what happens. It'll break, that's what. And why? Because it's wrong. Just because self-closing tags work in your modern, fancy-pants browser, does not make it correct.
The following tags are not closed in valid HTML (this includes self-closing): area, base, basefont, br, col, frame, hr, img, input, isindex, link, meta, param.
posted by Civil_Disobedient at 2:20 PM on June 3, 2007
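The XSLT output-method behaviour has a close cousin in PHP's DOM extension, which can write the same tree with either an XML or an HTML serializer. The sketch below is not XSLT; it assumes the libxml-backed DOMDocument class and just prints both serializations of one node so the difference in how <br> is written can be compared.

```php
<?php
// Build one tree, then serialize it two ways.
$doc = new DOMDocument();
$doc->loadXML('<p>one<br/>two</p>');
$node = $doc->documentElement;

$asXml  = $doc->saveXML($node);  // XML serializer: empty-element syntax
$asHtml = $doc->saveHTML($node); // HTML serializer: follows HTML rules for br

echo $asXml, "\n";
echo $asHtml, "\n";
```

With libxml the HTML serializer is expected to write the line break without a trailing slash, much as the html output method does in XSLT.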
Everything I said is according to the W3C's HTML 4 specification. <br /> is syntactically valid, and is equivalent to <br /="/">. Any conforming HTML 4 user agent, which Navigator 3 is not, must handle <br /> properly.
There's a distinction to be made between code that validates and code that is well-defined. The Acid2 Test, for instance, tests how browsers handle various constructs that are completely invalid HTML/CSS, but nevertheless must be handled in a well-defined manner by conforming HTML user agents.
The following tags are not closed in valid HTML (this includes self-closing): area, base, basefont, br, col, frame, hr, img, input, isindex, link, meta, param.
The tags are not closed, but the elements are.
posted by Khalad at 7:02 PM on June 3, 2007
This thread is closed to new comments.
As for serving up XHTML, you need to check the browser's Accept header ($_SERVER['HTTP_ACCEPT'] in PHP) to decide whether or not to serve it (IE, including IE 7, can't handle it at all and will offer to download it). Be warned: there are a TON of gotchas involved in serving pages with the application/xhtml+xml content type, especially relating to JavaScript. This article from 2003 is still very relevant.
posted by simonw at 2:13 PM on May 30, 2007