Thursday, November 12, 2009

Beware of XML PrettyPrinting Service

The XML file format has become very popular recently. One of the features that makes it popular is the fact that since .xml files contain plain text, they can be read by anyone using a simple text editor. This is very useful when you are trying to debug a complex system.

Unfortunately, many large XML files are not really readable because they are too complex to be easily understood. This problem is made worse by the fact that many machine-generated XML files try to save space by writing the XML as a single line, without any line breaks or indentation.

Recently I was trying to understand a large, complex XML file and was a little frustrated that it was hard to see how the elements had been nested, because no indentation had been used to help poor humans like me (software programs are generally unaffected by aesthetic concerns like this). I found a free on-line service for pretty-printing XML. At first glance it seemed to do what I wanted, because it produced a nicely formatted XML file and displayed it to me in multiple colours.

Unfortunately when I copied the text back into my development environment, I discovered that the nicely formatted XML did not actually match the original and my program started to report XML parsing errors. There were two errors that I saw:
  1. The first error (which was easy to fix) was that the XML file was tagged as using the utf-16 encoding while I was using utf-8 in my editor (probably they were using utf-16 on their web site, so this was technically correct).
  2. The second error (which was trickier to fix) involved XML tags which were both a begin and an end tag (for example, using the syntax <tag/>). The pretty printer converted these into end tags (for my example, </tag>), which have a very different meaning and caused the XML parsing errors.
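
To illustrate the second error, here is a minimal sketch in Python using the standard library parser. The element names `root` and `item` are made up for illustration (the original file is not shown), and the "mangled" string is only a guess at what that kind of output looks like.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: "root" and "item" are illustrative names only.
original = "<root><item/></root>"
# The same document with the empty-element tag turned into a bare end tag.
mangled = "<root></item></root>"


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(original))  # True
print(is_well_formed(mangled))   # False: </item> has no matching start tag
```

Running a check like this on the pretty-printed text before pasting it back would have caught the problem straight away.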

So be warned, prettier does not always mean better!!!
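
One way to act on this warning is to compare the pretty-printed text against the original before trusting it. The following is a rough sketch, assuming Python 3.8 or later; the helper names `pretty_print` and `same_content` are hypothetical, and the local pretty printer from `xml.dom.minidom` stands in for whatever tool produced the formatted text. Note that `strip_text=True` ignores leading and trailing whitespace in text, which is only appropriate when that whitespace is not significant in your document.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def pretty_print(xml_text):
    """Indent XML locally (hypothetical helper, not the on-line service)."""
    return minidom.parseString(xml_text).toprettyxml(indent="  ")


def same_content(xml_a, xml_b):
    """Compare two XML strings in canonical form, ignoring layout whitespace."""
    canon_a = ET.canonicalize(xml_a, strip_text=True)
    canon_b = ET.canonicalize(xml_b, strip_text=True)
    return canon_a == canon_b


# Illustrative sample only; not the file from the post.
sample = "<root><item/><name>x</name></root>"
pretty = pretty_print(sample)
print(pretty)
print(same_content(sample, pretty))
```

If `same_content` returns False, or either call raises a parse error, the formatted text should not be pasted back in place of the original.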

3 comments:

  1. Hi Brian,

    I've done this before using XSLT. The advantage of this method is that it is local and you don't need to share your XML with any online tool. It is also faster.

    I don't have the XSLT file I used any more (it was a long time ago), but a quick Google search shows some examples like this one:

    http://faq.javaranch.com/java/HowToPrettyPrintXmlWithXsl

  2. Jacobo,

    Thanks for the tip. That XSLT code is so simple that it should be easy to implement a pretty printer which did just that and could not go wrong.

    The reason I went for an on-line pretty printing service was that I was lazy and didn't want to bother installing anything. The better solution would be to use something more powerful than a text editor to edit my XML. Is there an Eclipse plugin that you would recommend for editing XML?

    Brian

  3. Sorry, can't help you there, I don't edit XML frequently, and when I do, I use Vim (as for basically everything else :P). If you ever want to turn to the "dark side", this can help you set it all up:

    http://www.pinkjuice.com/howto/vimxml/
