Upgrading libxml client code from 1.x to 2.x

Version 2 of libxml is the first version to introduce serious backward-incompatible changes. The main goals were:

So client code designed to run with libxml version 1.x may have to be changed to compile against version 2.x. Here is the list of changes that I have collected. It may not be complete, so if you find other changes that are required, drop me a mail:

  1. The node childs field has been renamed children, so s/childs/children/g should be applied (the probability of having "childs" anywhere else is close to 0).
  2. The document no longer has a root field. It has been replaced by children, and you will usually get a list of nodes there. For example, a Dtd node for the internal subset and its declaration may be found in that list, as well as processing instructions or comments found before or after the document root element. Use xmlDocGetRootElement(doc) to get the root element of a document. Alternatively, if you are sure your documents do not reference Dtds and have no PIs or comments before or after the root element, s/->root/->children/g will probably do it.
  3. The white space issue. This one is more complex: except in the special case of validating parsing, the line breaks and spaces usually used for indenting and formatting the document content become significant. So they are reported by SAX, and if you are using the DOM tree, corresponding nodes are generated. Two approaches can be taken:
    1. the lazy one: use the compatibility call xmlKeepBlanksDefault(0), but be aware that you are relying on a special (and possibly broken) set of heuristics in libxml to detect ignorable blanks. Don't complain if it breaks or makes your application not 100% clean w.r.t. its input.
    2. the Right Way: change your code to accept possibly insignificant blank characters, or a tree populated with weird blank text nodes. You can spot them using the convenience function xmlIsBlankNode(node), which returns 1 for such blank nodes.

    Note also that with the new default, the output functions don't add any extra indentation when saving a tree, so that a document can round trip (read and save) without being inflated with extra formatting chars.

  4. The include path has changed to $prefix/libxml/ and the includes themselves use this new prefix in their include instructions... If you are using (as expected) the
    xml-config --cflags

    output to generate your compile commands, this will probably work out of the box.

Let me put some emphasis on the fact that there are far more changes from libxml 1.x to 2.x than the ones you may have to patch for. The overall code has been considerably improved and the conformance to the XML specification has been drastically improved. Don't take those changes as an excuse not to upgrade, it may cost a lot in the long term ...

Daniel Veillard

$Id: upgrade.html,v 1.2 2000/03/06 07:41:49 veillard Exp $