From 10c6a8fdde3006c509fa4573bfc4af49a0b2c0a7 Mon Sep 17 00:00:00 2001
From: Daniel Veillard
Date: Wed, 28 Oct 1998 01:00:12 +0000
Subject: [PATCH] A small patch and more doc, Daniel.

---
 ChangeLog    |  5 +++
 doc/xml.html | 96 +++++++++++++++++++++++++++++++++++++++++++++++-----
 tree.c       |  2 +-
 3 files changed, 93 insertions(+), 10 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index f4d4ca08..95ca5ef9 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+Tue Oct 27 17:54:00 EST 1998 Daniel Veillard
+
+	* tree.c: corrected a small bug
+	* doc/xml.html: continuing writing documentation.
+
 Tue Oct 27 17:54:00 EST 1998 Daniel Veillard
 
 	* debugXML.h debugXML.c: added debugging utilities.
diff --git a/doc/xml.html b/doc/xml.html
index d0f57a4c..93d85ff3 100644
--- a/doc/xml.html
+++ b/doc/xml.html
@@ -17,8 +17,8 @@ href="http://www.w3.org/DOM/">DOM interfaces.

xml

-XML is a standard for markup based structured documents, here is an
-example:

+XML is a standard for markup-based structured documents. Here is an example:

<?xml version="1.0"?>
 <EXAMPLE prop1="gnome is great" prop2="&linux; too">
   <head>
@@ -32,9 +32,12 @@ example:

</chapter> </EXAMPLE>

-

- -

Invoking the parser

+The first line specifies that it is an XML document and gives useful information
+about its encoding. The rest of the document is text whose structure is
+specified by tags between angle brackets. Every opened tag has to be
+closed; XML is pedantic about this. Note that, for example, the image
+tag has no content (just an attribute) and is closed by ending the tag
+with />.

The tree output

@@ -53,13 +56,88 @@ one ELEMENT under the root):

 structure.gif

-

+In the source package there is a small program (not installed by default)
+called tester which parses the XML files given as arguments and
+prints them back as parsed. This is useful for detecting errors both in XML code
+and in the XML parser itself. It has a --debug option which
+prints the actual in-memory structure of the document. Here is the result with
+the example given before:

+
DOCUMENT
+version=1.0
+standalone=true
+  ELEMENT EXAMPLE
+    ATTRIBUTE prop1
+      TEXT
+      content=gnome is great
+    ATTRIBUTE prop2
+      ENTITY_REF
+      TEXT
+      content= too
+    ELEMENT head
+      ELEMENT title
+      content=Welcome to Gnome
+    ELEMENT chapter
+      ELEMENT title
+      content=The Linux adventure
+      ELEMENT p
+      content=bla bla bla ...
+      ELEMENT image
+        ATTRIBUTE href
+          TEXT
+          content=linus.gif
+      ELEMENT p
+      content=...
+

+This should be useful to learn the internal representation model.

-

Modifying the tree

+

The XML library interfaces

+

+This section is intended to help programmers get started
+using the XML library from the C language. It is not meant to be exhaustive;
+I hope the automatically generated docs will provide the completeness
+required, but as a separate set of documents. The interfaces of the XML
+library are low level by design, with nearly zero abstraction. Those
+interested in a higher level API should look at DOM
+(unfortunately not completed).

-

Saving a tree

+

Invoking the parser

+

+Usually, the first thing to do is to read an XML input. The parser can
+parse both documents held in a memory buffer and files directly. The functions
+are defined in "parser.h":

+
+
xmlDocPtr xmlParseMemory(char *buffer, int size);
+

+parse a zero terminated string containing the document

+
+
+
+
xmlDocPtr xmlParseFile(const char *filename);
+

+parse an XML document contained in a file (possibly compressed)

+
+
+

+Both functions return a pointer to the document structure (or NULL in case of
+failure).

+

+A couple of comments can be made. This means that the parser is
+memory-hungry: first to load the document in memory, second to build the tree.
+Reading a document without building the tree will be possible in the future by
+plugging code into the SAX interface (see SAX.c).

-

DOM interfaces

+

Traversing the tree

+

+Basically, by including "tree.h" your code has access to the internal structure
+of all the elements of the tree. The field names should be fairly simple, like
+parent, childs, next,
+prev, properties, etc.

+ +

Modifying the tree

+ +

Saving a tree

+ +

DOM interfaces

 DOM stands for the Document Object Model this is an API for accessing XML or HTML structured documents.
diff --git a/tree.c b/tree.c
index 4de4ee57..b358a5dc 100644
--- a/tree.c
+++ b/tree.c
@@ -747,7 +747,7 @@ xmlNewReference(xmlDocPtr doc, const CHAR *name) {
     }
 
     cur->type = XML_ENTITY_REF_NODE;
-    cur->doc = NULL;
+    cur->doc = doc;
     cur->parent = NULL;
     cur->next = NULL;
     cur->prev = NULL;
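
Note on the tree.c hunk: the following is a minimal, self-contained C sketch of why a freshly created node should record its owning document, and of how a tree built from the fields named in the documentation (parent, childs, next, prev) can be walked. It does not use the library itself. The types and functions prefixed with sketch_ are hypothetical stand-ins invented for illustration; only the field names and the `cur->doc = doc` assignment come from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the library's document and node types.
 * Only the field names doc, parent, childs, next and prev are taken
 * from the patch; everything else is an assumption. */
struct sketch_node;

typedef struct sketch_doc {
    const char *version;
    struct sketch_node *root;
} sketch_doc;

typedef struct sketch_node {
    const char *kind;              /* e.g. "ELEMENT", "ENTITY_REF" */
    const char *name;
    sketch_doc *doc;               /* owning document */
    struct sketch_node *parent;
    struct sketch_node *childs;    /* first child */
    struct sketch_node *next;      /* next sibling */
    struct sketch_node *prev;      /* previous sibling */
} sketch_node;

/* Create a node that remembers its document, mirroring the patched
 * assignment cur->doc = doc (the old code stored NULL here). */
static sketch_node *sketch_new_node(sketch_doc *doc, const char *kind,
                                    const char *name) {
    sketch_node *cur = malloc(sizeof(*cur));
    if (cur == NULL)
        return NULL;
    cur->kind = kind;
    cur->name = name;
    cur->doc = doc;
    cur->parent = NULL;
    cur->childs = NULL;
    cur->next = NULL;
    cur->prev = NULL;
    return cur;
}

/* Append child at the end of parent's child list. */
static void sketch_add_child(sketch_node *parent, sketch_node *child) {
    assert(parent != NULL && child != NULL);
    child->parent = parent;
    if (parent->childs == NULL) {
        parent->childs = child;
        return;
    }
    sketch_node *last = parent->childs;
    while (last->next != NULL)
        last = last->next;
    last->next = child;
    child->prev = last;
}

/* Depth-first walk over node and its following siblings. Prints one
 * indented line per node and returns how many nodes do not point back
 * to the expected document. */
static int sketch_dump(const sketch_node *node, const sketch_doc *doc,
                       int depth) {
    int detached = 0;
    for (; node != NULL; node = node->next) {
        printf("%*s%s %s\n", depth * 2, "", node->kind, node->name);
        if (node->doc != doc)
            detached++;
        detached += sketch_dump(node->childs, doc, depth + 1);
    }
    return detached;
}
```

To try it, call sketch_new_node and sketch_add_child from a small main of your own and pass the root to sketch_dump. With the doc pointer set at creation time, a walk that starts from any node can always reach the owning document; a node created with a NULL doc is counted as detached by sketch_dump.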