HTML or XHTML?

In the course of developing ReviseMRI.com, I’ve read a lot about how to make an interesting, interactive website that is user (and browser) friendly. Most of the noise on the subject is about web standards.

If you’re designing websites, the following may provide some interesting reading. (If not…maybe skip this post.)

So perhaps you, like me, discovered that it is a good idea to separate style from content with CSS. Then you learned that web pages should not just work when you browsed to them, but that they should conform to standards so that they would work in any user agent. You learned that HTML 4.01 was out, and XHTML was the way forward because it ensured that your HTML was “good HTML”. You dutifully closed all your empty elements, nested tags properly, went tag-lowercase, specified a DOCTYPE, used HTML entities, et cetera. Finally you could sit back and exclaim “I validate, therefore I am.” The web standards illuminati would be proud.

Then you learned that your advanced, clean, logical, interoperable, accessible markup was, as long as it was served as text/html, just tag soup: browsers were treating it like HTML with lots of funny slashes. You tore your robe and covered your head in ashes; you wanted error-free pages! But wait, you could fix it! Just serve your spiffy new XHTML markup as application/xhtml+xml! You learned about MIME types and HTTP headers (as opposed to meta tag declarations), and fixed it.

But there were more problems. Now your JavaScript didn’t work. document.write no longer worked in pages parsed as XML, so you obliged with document.createElementNS. Using Firebug, you fixed other JavaScript errors, too. When at last you thought everything was finished, you quickly checked your site in Internet Explorer. Oh dear.

Your site didn’t work. Whilst blazing a trail for everything good and pure in the world of web coding, you had inadvertently locked out about 80% of your visitors. So you learned about content negotiation, and after a bit of a struggle, you were serving application/xhtml+xml only to user agents that listed it in their Accept header, and text/html to legacy browsers. Peace ruled the land of Markup once more.
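A minimal sketch of that negotiation step, in Python purely for illustration. It is not the code used on the site, and the function name pick_content_type is hypothetical. It assumes the simple rule described above: send application/xhtml+xml only when the Accept header names it explicitly with a non-zero q value, and fall back to text/html otherwise.

```python
def pick_content_type(accept_header: str) -> str:
    """Choose a MIME type from the request's Accept header.

    Hypothetical helper: returns application/xhtml+xml only when the
    client lists it explicitly (a bare */* does not count), otherwise
    falls back to text/html.
    """
    for item in accept_header.split(","):
        # Each item looks like "type/subtype;param=value;q=0.9"
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        q = 1.0  # q defaults to 1 when the client does not give one
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        if q > 0:
            return "application/xhtml+xml"
    return "text/html"
```

A browser sending "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" would get application/xhtml+xml; one sending only "image/gif, image/jpeg, */*" would get text/html. A response chosen this way should probably also carry a Vary: Accept header, so that caches keep the two variants apart.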

I did all these things. I finally reached the green pastures of XHTML for browsers that would accept the correct MIME type. Browsers that couldn’t got text/html (minus the XML prolog, which b0rks IE), which was second best, but not by much, I thought.

But what benefit did the visitors to my site experience? Pages loaded 1/10th of a second faster, perhaps? Guaranteed consistent rendering? What with all the CSS bugs in various browsers, that was hardly a given. In fact, perhaps things were actually worse when serving XHTML with the proper MIME type; one markup mistake, and visitors see only an error message.
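To make the “one markup mistake” point concrete, here is a small Python sketch. It uses the standard library’s strict XML parser and its lenient HTML parser as stand-ins for the two modes. This is only an analogy (neither is a real browser engine), and the helper names are made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Lenient parser that simply gathers the text it encounters."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parse_as_xml(markup: str) -> str:
    """Strict mode: any well-formedness error aborts the whole parse."""
    try:
        ET.fromstring(markup)
        return "ok"
    except ET.ParseError as err:
        return "error: %s" % err


def parse_as_html(markup: str) -> str:
    """Forgiving mode: keep going and return whatever text was found."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


# The <b> element is never closed: fine for tag soup, fatal for XML.
broken = "<p>Hello <b>world</p>"
```

Here parse_as_xml(broken) reports an error and yields nothing usable, while parse_as_html(broken) still recovers the text "Hello world". That is roughly the difference a visitor sees between an XML error page and a slightly wonky HTML page.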

I was beginning to think like this:

“I used to be very much into the whole XHTML bandwagon. It’s clean—I can be purist—someone 8000 miles away may think I’m cool. But in all honesty what is the real point?”

Actually, the most convincing reason to use XHTML is if you want to use other XML technologies alongside it (such as MathML), and I wasn’t doing that. I decided to go back to HTML 4.01, for the following reasons:

  • If a markup error exists, users are likely to still see a page. Browsers’ error handling for HTML is mature.
  • Popular legacy browsers (e.g. IE) can cope well.
  • I can still validate at an XHTML 1.0 Strict level.

Wondering about that last point? I have gotten used to writing to XHTML rules, and I prefer them. So I still write pages as XHTML 1.0 Strict. But I write the DOCTYPE dynamically using PHP. Pages are served to browsers as HTML 4.01 Strict, with a text/html MIME type, and trailing slashes are automagically removed from singleton tags such as br and img. Incidentally, I also detect if the W3C Validator is requesting a page, and serve XHTML 1.0 Strict as application/xhtml+xml to it, to check my code properly. Here’s the PHP include file, if you’d like to do the same.
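For anyone who wants the gist without opening the include file, here is a rough Python sketch of the same idea. It is not a port of the PHP, and the names render, is_validator and SELF_CLOSING are invented for the example. It assumes the validator can be recognised by the string "W3C_Validator" in its User-Agent header, and it only covers the two transformations mentioned above: swapping the DOCTYPE and stripping trailing slashes from empty elements.

```python
import re

XHTML_DOCTYPE = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
                 '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">')
HTML4_DOCTYPE = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
                 '"http://www.w3.org/TR/html4/strict.dtd">')

# Empty ("singleton") elements allowed in HTML 4.01 Strict.
VOID_TAGS = "area|base|br|col|hr|img|input|link|meta|param"
# Matches e.g. <br /> or <img src="a.png" alt="A" /> and captures
# the tag name and its attributes without the trailing slash.
SELF_CLOSING = re.compile(r"<(%s)\b([^<>]*?)\s*/>" % VOID_TAGS, re.IGNORECASE)


def is_validator(user_agent: str) -> bool:
    # Assumption: the W3C Validator identifies itself this way.
    return "W3C_Validator" in user_agent


def render(xhtml_source: str, user_agent: str):
    """Return (content_type, body) for a page written as XHTML 1.0 Strict."""
    if is_validator(user_agent):
        # The validator gets the page exactly as written.
        return "application/xhtml+xml", xhtml_source
    # Everyone else gets HTML 4.01 Strict: new DOCTYPE, no trailing slashes.
    html = xhtml_source.replace(XHTML_DOCTYPE, HTML4_DOCTYPE, 1)
    html = SELF_CLOSING.sub(r"<\1\2>", html)
    return "text/html", html
```

A regex is a blunt tool for this (it would also rewrite a matching tag that appears inside a comment or script), and a real converter would need to deal with things this sketch ignores, such as the xmlns attribute on the html element. It is meant to show the shape of the trick, nothing more.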

To summarise: I write XHTML, browsers (you!) get HTML 4.01 Strict, the W3C Validator gets XHTML 1.0 Strict, and everyone’s happy. Should I ever decide to serve XHTML again, it’ll take 30 seconds to change the PHP include file.

[WordPress serves XHTML as text/html, I know, I know. This discussion applies to the main sections of ReviseMRI.com.]