From 10c6a8fdde3006c509fa4573bfc4af49a0b2c0a7 Mon Sep 17 00:00:00 2001
From: Daniel Veillard
-XML is a standard for markup based structured documents, here is an
-example:
+XML is a standard for markup based structured documents, here is an example:
 <?xml version="1.0"?>
 <EXAMPLE prop1="gnome is great" prop2="&linux; too">
   <head>
@@ -32,9 +32,12 @@ example:
   </chapter>
 </EXAMPLE>
-
@@ -53,13 +56,88 @@ one ELEMENT under the root):
-
+In the source package there is a small program (not installed by default)
+called tester, which parses the XML files given as arguments and prints them
+back as parsed. This is useful for detecting errors both in XML code and in
+the XML parser itself. It has a --debug option which prints the actual
+in-memory structure of the document. Here is the result for the example given
+before:
+DOCUMENT
+version=1.0
+standalone=true
+  ELEMENT EXAMPLE
+    ATTRIBUTE prop1
+      TEXT
+      content=gnome is great
+    ATTRIBUTE prop2
+      ENTITY_REF
+      TEXT
+      content= too
+    ELEMENT head
+      ELEMENT title
+        content=Welcome to Gnome
+    ELEMENT chapter
+      ELEMENT title
+        content=The Linux adventure
+      ELEMENT p
+        content=bla bla bla ...
+      ELEMENT image
+        ATTRIBUTE href
+          TEXT
+          content=linus.gif
+      ELEMENT p
+        content=...
+
+This should be useful for learning the internal representation model.
-
+This section is intended to help programmers get started using the XML
+library from the C language. It is not intended to be exhaustive; I hope the
+automatically generated docs will provide the required completeness, but as a
+separate set of documents. The interfaces of the XML library are low level by
+design: there is nearly zero abstraction. Those interested in a higher level
+API should look at DOM (unfortunately not completed).
-
+Usually, the first thing to do is to read an XML input. The parser can parse
+both documents already held in memory and files directly. The functions are
+defined in "parser.h":
++parse a zero terminated string containing the document
++parse an XML document contained in a file (possibly compressed)
++ This returns a pointer to the document structure (or NULL in case of +failure).
+
+A couple of comments can be made. This means that the parser is
+memory-hungry, first to load the document in memory, second to build the tree.
+Reading a document without building the tree will be possible in the future by
+plugging the code into the SAX interface (see SAX.c).
-
+Basically, by including "tree.h" your code has access to the internal
+structure of all the elements of the tree. The names should be fairly simple:
+parent, childs, next, prev, properties, etc.
+
+DOM stands for the Document Object Model; this is an API for accessing XML or
+HTML structured documents.
diff --git a/tree.c b/tree.c
index 4de4ee57..b358a5dc 100644
--- a/tree.c
+++ b/tree.c
@@ -747,7 +747,7 @@ xmlNewReference(xmlDocPtr doc, const CHAR *name) {
     }
     cur->type = XML_ENTITY_REF_NODE;
-    cur->doc = NULL;
+    cur->doc = doc;
     cur->parent = NULL;
     cur->next = NULL;
     cur->prev = NULL;