XML Browser Differences

13th August 2005 · Last updated: 5th October 2016
 

Sections/Permalinks

  1. Introduction
  2. XML Test 1
  3. XML Test 1 Screenshots
  4. XHTML Test
  5. XHTML Test Screenshots
  6. XML Test 2
  7. XML Test 2 Screenshots
  8. XML Test 3
  9. XML Test 3 Screenshots
  10. Parsing Revelation
  11. Missing Tag Screenshots
  12. Validator Blues

Introduction

In a bid to serve XHTML as application/xhtml+xml (as it should be, according to Ian Hickson's classic article Sending XHTML as text/html Considered Harmful), I looked for a cross-browser solution. The problem lies in browsers like IE6, which don't support application/xhtml+xml. They require XHTML pages to be served as text/html, or the page to be converted to HTML 4.

I also wanted to remind myself of the benefits of serving XHTML as XML, so I went back to a great article debating this issue by Roger Johansson, entitled The perils of using XHTML properly. It's a tricky area because there aren't many real benefits yet to serving XHTML as XML, but there are a lot of hurdles for the designer to overcome before they can claim their pages are valid. Many people have even abandoned XHTML altogether and gone back to HTML, as it works in many more browsers and allows for errors.

But I had one key reason to think about upgrading my XHTML to application/xhtml+xml - if the page has an error in it, the browser will stop processing the page. So you'd know your pages weren't valid without having to run them through an online validator again and again. This is not good news for the user though, as any loading error would stop the page before it was displayed - they'd get a blank screen along with an error message. Also, long pages would have to load completely before they could be displayed. Or so I thought... more on this later!

I successfully found a solution to the problem that used PHP. The script below checks if the browser can handle application/xhtml+xml, then sends the appropriate header for the page.

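<?php
// Tell caches that the response depends on the request's Accept header.
header("Vary: Accept");
// Serve XHTML as XML only to browsers that advertise support for it.
if (isset($_SERVER["HTTP_ACCEPT"]) && stripos($_SERVER["HTTP_ACCEPT"], "application/xhtml+xml") !== false) {
 header("Content-Type: application/xhtml+xml; charset=UTF-8");
 // XML declaration, followed by a newline before the doctype.
 echo '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
} else {
 // Fallback for browsers such as IE6: the same XHTML markup, sent as text/html.
 header("Content-Type: text/html; charset=UTF-8");
}
?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">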
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
 
<head>
<title>XHTML Test</title>
</head>
 
<body>
 
<p>Hello.</p>
 
</body>
</html>

You can download the script or try it now. A better approach would be to convert the code to HTML if the browser doesn't support XHTML. But it's a simple hack that works for now. Capable browsers get XHTML served as XML, lesser ones get XHTML as text/html. But that's pointless, you cry! Indeed - why bother? Why not just serve the page as text/html every time and be done with it? This is of course what a majority of XHTML sites are doing today (ahem).
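
One caveat: the header check above is a plain substring match, so it would also fire for a browser that lists application/xhtml+xml with a quality value of zero (q=0), which in HTTP means "not acceptable". As a rough sketch of a stricter check, here is the same decision written in Python rather than PHP, purely for illustration. The function names xhtml_quality and choose_content_type are made up for this example and are not part of the script above.

```python
def xhtml_quality(accept_header):
    """Return the q-value a client gives application/xhtml+xml (0.0 if absent)."""
    for media_range in (accept_header or "").split(","):
        parts = [part.strip() for part in media_range.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        quality = 1.0  # q defaults to 1 when the client does not state it
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unreadable q-value as "not acceptable"
        return quality
    return 0.0


def choose_content_type(accept_header):
    """Pick the Content-Type to send, falling back to text/html."""
    if xhtml_quality(accept_header) > 0:
        return "application/xhtml+xml"
    return "text/html"


print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
print(choose_content_type("text/html, application/xhtml+xml;q=0"))
print(choose_content_type("*/*"))
```

A bare */* (roughly what IE6 sends) is deliberately not treated as support here, which matches the behaviour of the PHP version.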

Now if you try to view a page of XHTML served as application/xhtml+xml in IE6, it tries to download the page as a file instead of displaying it. So XHTML served as XML in IE6 is a no-go. But then I remembered that IE6 can handle XML. If an XML page is opened in IE6, it shows a tree of the contents, unless there is a stylesheet attached, in which case the styled page is shown. I decided to make a test page of XML that would work in IE6, not just the other main browsers.

Breaking away from XHTML, I put together a series of XML tests to see how each browser handled them. Each test links to its stylesheet using the XML method (an xml-stylesheet processing instruction), not the XHTML one (a <link> element). I was expecting near identical results, but what I found surprised me. There were many differences between the browsers - I used Opera 8.02, Firefox 1.0.6 and IE6 on Windows XP. I also found out some interesting things that I will reveal further on. For now, on with the tests!

XML Test 1

This tests a simple group of nested elements as shown below:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="xml-test.css" type="text/css"?>
<main>
 <item>
  <details>
   <name>XML Test</name>
   <dude>By Chris Hester</dude>
   <when>10 August 2005</when>
  </details>
 </item>
</main>

As you can see it links to a separate stylesheet. Try the test yourself and see what you get in your browser:

XML Test 1

For comparison, here are the relevant screenshots from each browser I ran the test with:

XML Test 1 Screenshots

The first thing to note is that if this page is served as text/xml but with an XHTML doctype, the viewport background is white. Of course none of the tags are valid XHTML. The point is that when the page is served as plain XML, the entire viewport takes on the background colour of the first element - in this case yellow. So this test proves straight away that XHTML is not treated the same as XML! If XHTML is served as text/xml, shouldn't it behave in exactly the same way as XML? I had thought XHTML was XML (being HTML reformulated as XML) but not so. Declaring the document as XHTML clearly affects the browser's handling of it, even though the mime type is the same, most likely because XHTML requires backwards compatibility with HTML. So it is wrong to assume XHTML and XML are treated in exactly the same way.

XHTML Test

Here is the first test repackaged as XHTML, followed by the screenshots:

XHTML Test

XHTML Test Screenshots

The screenshots show the differences between the two tests (the background colour being white, not yellow). Also the viewport padding (not shown) accurately revealed the browsers were working in XHTML mode, where a default margin and padding for the body are set by the browser. And of course, the default background colour modern browsers use is white. This is fine for XHTML, but how do you style the background in XML? There is no <body> or <html> element! Since the first element in any XML document may be given a width and height within the viewport, why then does the background colour spill out to fill the whole page? Shouldn't it be white unless styled otherwise? (Note that this is the case with IE6 - the first element has a yellow background, the rest of the screen is white.)

Notice also how IE6 displays the text between the <title> tags! This appears to show that Opera and Firefox have correctly switched from XML to XHTML mode (even though the file is served as text/xml). Yet IE6 cannot handle XHTML, so it carries on in XML mode. Hence the text is displayed as if it were any other element.

XML Test 2

I discovered that Opera does not completely switch out of XHTML mode when displaying XML. Using HTML tags in an XML document causes them to be displayed as if the document were HTML! In other words, you have to avoid using tags with the same names as those reserved for HTML, such as <title>. The browser also adds the default styles for each element, so any content between <address> tags will be italic!

Here is the second test for you to see.

XML Test 2

I added the following HTML tags to this test (all of which are also legal XML element names):

<title>
<iframe>
<img>
<marquee>
<address>
<blink>
<textarea>

Since the HTML tags I've put in aren't styled in the CSS, Firefox and IE just show the text unstyled. But look at the Opera screenshot below! It is acting on the elements as if the test is an HTML document. The text between the <title> tags even appears at the top of the browser. (You can tell the test is still in XML mode though because the viewport background is yellow. If it was running in XHTML mode, the background would be white.)

One more thing - avoid calling an XML element <head>. In Opera, it disappears! (Because the head section is not meant to be displayed in HTML.)
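
To catch such clashes before a browser does, an XML file could be scanned for element names that HTML also uses. Here is a minimal Python sketch, assuming the document is well-formed and uses no namespaces. The HTML_NAMES set only holds the names mentioned in this article (it is nowhere near a full list of HTML elements), and find_html_name_clashes is a hypothetical helper invented for the example.

```python
import xml.etree.ElementTree as ET

# Only the element names discussed in this article, not a complete HTML list.
HTML_NAMES = {
    "title", "head", "iframe", "img",
    "marquee", "address", "blink", "textarea",
}


def find_html_name_clashes(xml_text):
    """Return the sorted element names in xml_text that are also in HTML_NAMES."""
    root = ET.fromstring(xml_text)
    # root.iter() walks the root element and every descendant.
    return sorted({element.tag for element in root.iter() if element.tag in HTML_NAMES})


SAMPLE = """<main>
 <item>
  <title>XML Test 2</title>
  <name>XML Test</name>
  <address>Somewhere</address>
 </item>
</main>"""

print(find_html_name_clashes(SAMPLE))
```

Any name it reports is a candidate for renaming (for example <heading> instead of <title>) before the file goes anywhere near Opera.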

XML Test 2 Screenshots

XML Test 3

For this test I was trying to create a default styling that could be applied to any XML documents. I found a 3D border effect worked well. I also added the universal selector "*" to style all elements at once. What I found then was that Opera and IE were also styling the viewport itself! In Opera this led to a bordered line at the top, above the actual XML elements used. IE took it further and wrapped the whole test in the viewport styled not only with borders, but also a scrollbar! This appeared within the borders, and was shaded out. Only Firefox displayed just the XML elements I'd used. So sadly the universal selector cannot be used without caution to style XML.

Here is the third test:

XML Test 3

Take a look at the screenshots, as once again, each browser displays the test differently. I also added some notes within the demo itself. Alas only Opera handles the generated content used for the ordered list.

XML Test 3 Screenshots

Parsing Revelation

I also realised something truly wonderful from the tests. As I said in the introduction, one drawback of serving XHTML as XML is that the slightest error means the page is no longer well-formed and cannot be shown. I've argued before for browsers to attempt to repair broken XML rather than just throw up an error screen (so extremely long documents or ones that failed to load correctly would still show something). And now I'm happy to report that's just what Opera and IE do! It is only Firefox that stops dead and shows you nothing more than a yellow screen with an error message on it (that the poor user can't do anything about until the webmaster has fixed the error).

Don't believe me? Here are screenshots of what happens when the first XML test has a missing end tag from one of the elements:

Missing Tag Screenshots

Opera and IE are allowing the user to read the content, even though there is an error on the page. Firefox denies all access to the content. (How inaccessible!) I also found that if the outer elements are made invalid by removing the end tags, the document may still appear intact, complete with full borders and background colours. Opera even ignores a missing end tag on the first element, not giving any error at all! (I shall report this bug.) Firefox still displays nothing, meaning a single error is all you need to prevent the display of the document.
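
One plausible reason a browser can show the content above an error is that XML parsers typically work through a document as a stream, reporting elements and text as they go, and only hit the error when they reach it. Here is a small Python sketch of that behaviour using the standard xml.parsers.expat module. It is not a description of how any of these browsers is actually implemented (that part is an assumption), and parse_until_error is a name invented for the example.

```python
import xml.parsers.expat


def parse_until_error(xml_text):
    """Return (text seen before any error, error message or None)."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # Called with each piece of character data as soon as it is parsed.
    parser.CharacterDataHandler = chunks.append
    error = None
    try:
        parser.Parse(xml_text, True)  # True marks this as the final chunk of input
    except xml.parsers.expat.ExpatError as exc:
        error = str(exc)
    # Collapse the runs of whitespace between elements into single spaces.
    text = " ".join("".join(chunks).split())
    return text, error


# The first XML test with the end tag of <dude> removed.
BROKEN = """<main>
 <item>
  <details>
   <name>XML Test</name>
   <dude>By Chris Hester
   <when>10 August 2005</when>
  </details>
 </item>
</main>"""

text, error = parse_until_error(BROKEN)
print("Text received:", text)
print("Error:", error)
```

The text handler has already received the readable content by the time the parser gives up, so whether to show it or throw it away is a choice made by the browser, not something forced by the parser.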

If you view the screenshots above, you might also have noticed that when an error is present in Opera, the viewport is white. Yet when the error is removed, it becomes yellow. This could mean that the errors force the document into XHTML mode. Either that or white is simply chosen so that the error message is always legible.

Note also how IE displays the error message inside the offending element. A good job I had used a readable background colour.

Now you're probably thinking 'Yes that's fine for XML documents with errors in them, but what about XHTML served as XML?'. Again, Opera comes to the rescue by displaying the whole document, even with a missing end tag on an element. I found that this would often carry a style on for the rest of the document. So an open <address> tag would result in the rest of the document being italic.

Validator Blues

Lastly, I must direct your attention to the W3C's online validator. Sadly, it appears unable (at the time of writing) to validate plain XML! (Surely this is easier than XHTML?) I attempted to validate my tests, only to receive the message "No DOCTYPE found! Attempting validation with XHTML 1.0 Transitional". Also "This page is not Valid XML!". I've news for the validator - it is most certainly well-formed XML! What's more, there is no need for a doctype, as I'm not using XHTML, but XML, which has a clear XML declaration at the top. So it seems the silly validator can only validate XHTML and HTML documents, not XML. I was hoping it might at least check for missing end tags, but no. I'm glad we have browsers that can check XML today then. If an error is found, the browser tells you.
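
For what it's worth, a missing end tag can be caught without a validator or a browser, since any XML parser has to reject it. Here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The name check_well_formed is a hypothetical helper, and the check covers well-formedness only, not validation against a DTD.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses, else a (message, line, column) tuple."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        line, column = exc.position  # where the parser gave up
        return (str(exc), line, column)
    return None


GOOD = "<main><item><name>XML Test</name></item></main>"
BAD = "<main><item><name>XML Test</item></main>"  # </name> is missing

print(check_well_formed(GOOD))
print(check_well_formed(BAD))
```

The second call reports the position at which the parser noticed the mismatch, which is about as much as the browsers' own error screens tell you.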