libxml2

mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-01-13 13:17:36 +03:00

Author	SHA1	Message	Date
Pranjal Jumde	a820dbeac2	Bug 758605: Heap-based buffer overread in xmlDictAddString <https://bugzilla.gnome.org/show_bug.cgi?id=758605> Reviewed by David Kilzer. * HTMLparser.c: (htmlParseName): Add bounds check. (htmlParseNameComplex): Ditto. * result/HTML/758605.html: Added. * result/HTML/758605.html.err: Added. * result/HTML/758605.html.sax: Added. * runtest.c: (pushParseTest): The input for the new test case was so small (4 bytes) that htmlParseChunk() was never called after htmlCreatePushParserCtxt(), thereby creating a false positive test failure. Fixed by using a do-while loop so we always call htmlParseChunk() at least once. * test/HTML/758605.html: Added.	2016-05-23 15:01:07 +08:00
Jan Pokorný	bb654feb9a	Fix typos: dictio{ nn -> n }ar{y,ies} Signed-off-by: Jan Pokorný <jpokorny@redhat.com>	2016-04-15 22:22:48 +08:00
Hugh Davenport	8fb4a77007	CVE-2015-8242 Buffer overread with HTML parser in push mode For https://bugzilla.gnome.org/show_bug.cgi?id=756372 Error in the code pointing to the codepoint in the stack for the current char value instead of the pointer in the input that the SAX callback expects Reported and fixed by Hugh Davenport	2015-11-20 17:16:06 +08:00
Daniel Veillard	e724879d96	Fix parsing short unclosed comment uninitialized access For https://bugzilla.gnome.org/show_bug.cgi?id=746048 The HTML parser was too optimistic when processing comments and didn't check for the end of the stream on the first 2 characters	2015-10-30 21:14:55 +08:00
Daniel Veillard	140c251e8e	Recover unescaped less-than character in HTML recovery parsing As pointed out by Christian Schoenebeck <schoenebeck@crudebyte.com> on the list and based on some of his early patches, this preserves content when opening angle brackets are not escaped in textual content like: <p> a < b </p> <p> a <0 </p> <p> a <=0 </p> while still reporting the error.	2015-06-30 11:36:28 +08:00
Daniel Veillard	292a9f293d	Possible overflow in HTMLParser.c For https://bugzilla.gnome.org/show_bug.cgi?id=720615 make sure that the encoding string passed is of reasonable size	2014-10-06 18:51:04 +08:00
Philip Withnall	579ebbcb3c	HTMLparser: Correctly initialise a stack allocated structure If not initialised, the ‘node’ member remains undefined. Coverity issue: #60466 https://bugzilla.gnome.org/show_bug.cgi?id=731990	2014-07-26 20:09:42 +08:00
Gaurav	3e0eec4319	Adding some missing NULL checks in SAX2 DOM building code and in the HTML parser	2014-06-13 14:45:20 +08:00
Daniel Veillard	b0c7e7e57f	Fix a typo 'onrest' in htmlScriptAttributes As pointed out by "Laurent <guitarneck@free.fr>"	2014-02-06 10:50:35 +01:00
Daniel Veillard	4e1476c5ea	adding init calls to xml and html Read parsing entry points As pointed out by "Tassyns, Bram <BramT@enfocus.com>" on the list, some calls had it and others didn't; clean it up and add it to all the missing ones	2013-12-09 15:23:40 +08:00
Arnold Hendriks	826bc32020	Fix HTML push parser to accept HTML_PARSE_NODEFDTD For https://bugzilla.gnome.org/show_bug.cgi?id=719515 fixes htmlParseTryOrFinish to interpret HTML_PARSE_NODEFDTD, and updates xmllint to actually pass --nodefdtd to the push version of the HTML parser	2013-11-29 14:12:12 +08:00
Daniel Veillard	bf058dce13	Fix the flushing out of raw buffers on encoding conversions https://bugzilla.gnome.org/show_bug.cgi?id=692915 the new set of converting functions tried to limit the encoding conversion of the raw buffer to the consumption one to work in a more progressive fashion. Unfortunately this was bad for performance and led to errors on progressive parsing when a very large chunk was close to the end of the document. Fix the new internal function and switch back to the old way of converting. Fix another bug in the process.	2013-02-13 18:19:42 +08:00
Daniel Veillard	de0cc20c29	Fix some buffer conversion issues https://bugzilla.gnome.org/show_bug.cgi?id=690202 Buffer overflow errors originating from xmlBufGetInputBase in 2.9.0 The pointers from the context input were not properly reset after that call which can do reallocations.	2013-02-12 16:55:34 +08:00
Daniel Veillard	f8e3db0445	Big space and tab cleanup Remove all space before tabs and space and tabs at end of lines.	2012-09-11 13:26:36 +08:00
Daniel Veillard	f933c89813	Keep non-significant blanks node in HTML parser For https://bugzilla.gnome.org/show_bug.cgi?id=681822 Regardless if the option HTML_PARSE_NOBLANKS is set or not, blank nodes are removed from a HTML document, for example: <html> <head> <title>This is a test.</title> </head> <body> <p>This is a test.</p> </body> </html> is read as: <html><head><title>This is a test.</title></head><body> <p>This is a test.</p> </body></html> This changes the default behaviour but the old behaviour is available as expected when using the parser flag HTML_PARSE_NOBLANKS Based on original patch from Igor Ignatyuk <igor_ignatiouk@hotmail.com> * HTMLparser.c: change various places in the parser where ignorable_space SAX callback was called without checking for the parser flag preference * xmllint.c: make sure we use the new flag even for HTML parsing * result/HTML/*: this modifies the output of a number of tests	2012-09-07 19:32:12 +08:00
Conrad Irwin	b60061a7a5	Visible HTML elements close the head tag In HTML email it's common to find arbitrary fragments of HTML, the one that triggered this change was of the form: <meta><font></font><div>... Before this change the <font> tag was part of the implicit <head> that gets created for the <meta> tag, after this change, it is part of the <body>, which more closely matches the behaviour of modern HTML implementations.	2012-08-17 19:14:29 +08:00
Daniel Veillard	00ac0d3b96	More cleanups for input/buffers code When calling xmlParserInputBufferPush, the buffer may be reallocated and at the input level the pointers for base, cur and end need to be reevaluated. * buf.c buf.h: add two new functions, one to get the base from the input of the buffer, and another one to reset the pointers based on the cur and base indexes * HTMLparser.c parser.c: cleanup to use the new helper functions as well as making sure size_t is used for the indexes computations	2012-07-23 14:24:27 +08:00
Daniel Veillard	61551a1eb7	Cleanup function xmlBufResetInput() to set input from Buffer This was scattered in a number of modules, xmlParserInputPtr have usually their base, cur and end pointer set from an xmlBuf used as input. * buf.c buf.h: add a new function implementing this setup * parser.c HTMLparser.c catalog.c parserInternals.c xmlreader.c use the new function instead of digging into the buffer in all those modules	2012-07-23 14:24:27 +08:00
Daniel Veillard	a78d803639	Convert of the HTML parser to new input buffers Changes similar to the ones done in the XML parser for the routines which are not shared.	2012-07-23 14:24:27 +08:00
Denis Pauk	a0cd075d94	HTML parser error with <noscript> in the <head> For https://bugzilla.gnome.org/show_bug.cgi?id=615785 When the <noscript> is found, <head> is closed and a <body> element is created. The real <body id="xxx"> gets skipped over, so I can't see any of the body's attributes. Just don't close <head> when encountering a <noscript> Add a regression test too	2012-05-11 19:31:12 +08:00
Denis Pauk	fdf990c2ef	Allow to parse 1 byte HTML files For https://bugzilla.gnome.org/show_bug.cgi?id=605740 Files 1 byte long were not accepted by the HTML push parser	2012-05-10 20:40:49 +08:00
Martin Schröder	b91111b475	Patch that fixes the skipping of the HTML_PARSE_NOIMPLIED flag For https://bugzilla.gnome.org/show_bug.cgi?id=642916 I just noticed that the HTML_PARSE_NOIMPLIED flag that you can pass to the HTML-Parser methods doesn't do anything. Its intended purpose is to stop the HTML-parser from forcibly adding a pair of html/body tags if the stream does not contain any. This is highly useful when you don't need this level of strictness. Unfortunately, specifying it doesn't work, because the option is not copied into the parsing context.	2012-05-10 18:52:37 +08:00
Lin Yi-Li	24464be639	Avoid memory leak if xmlParserInputBufferCreateIO fails For https://bugzilla.gnome.org/show_bug.cgi?id=643949 In case of error on an IO creation input the given context is terminated with the given close function, except if the error happened in xmlParserInputBufferCreateIO. This can lead to a resource leak which is fixed by this patch.	2012-05-10 16:14:55 +08:00
Denis Pauk	868d92da89	Add HTML parser support for HTML5 meta charset encoding declaration For https://bugzilla.gnome.org/show_bug.cgi?id=655218 http://www.w3.org/TR/2011/WD-html5-20110525/semantics.html#the-meta-element """ The charset attribute specifies the character encoding used by the document. This is a character encoding declaration. If the attribute is present in an XML document, its value must be an ASCII case-insensitive match for the string "UTF-8" (and the document is therefore forced to use UTF-8 as its encoding). """ However, while <meta http-equiv="Content-Type" content="text/html; charset=utf8"> works, <meta charset="utf8"> does not. While libxml2 HTML parser is not tuned for HTML5, this is a simple addition Also added a testcase	2012-05-10 15:34:57 +08:00
Pavel Andrejs	8ad4da5f56	HTML element position is not detected properly The data in node_seq in xmlParserCtxt was not updated properly when parsing HTML. This patch fixes the accounting for both pull and push mode of HTML parsing.	2012-05-08 11:01:12 +08:00
Daniel Veillard	c62efc847c	Add options to ignore the internal encoding For both XML and HTML, the document can provide an encoding either in XMLDecl in XML, or as a meta element in HTML head. This adds options to ignore those encodings if the encoding is known in advance, for example if the content had been converted before being passed to the parser. * parser.c include/libxml/parser.h: add XML_PARSE_IGNORE_ENC option for XML parsing * include/libxml/HTMLparser.h HTMLparser.c: adds the HTML_PARSE_IGNORE_ENC for HTML parsing * HTMLtree.c: fix the handling of saving when an unknown encoding is defined in meta document header * xmllint.c: add a --noenc option to activate the new parser options	2011-05-26 11:47:37 +08:00
Denis Pauk	91d239c5cf	617468 fix progressive HTML parsing with style using "'" Style and script can contain ',"". This patch fixes call htmlParseLookupSequence with set flag 'ignoreattrval' to ignore this char	2010-11-04 12:39:18 +01:00
Pierre Belzile	d4b5447141	614005 Possible erroneous HTML parsing on unterminated script Fix a nasty error handling problem when an error happens at the end of the input buffer.	2010-11-04 10:18:17 +01:00
Daniel Veillard	8ad2930f62	make sure htmlCtxtReset does reset the disableSAX field As pointed out by Stefan Behnel <stefan_ml@behnel.de>	2010-10-28 11:51:22 +02:00
Michael Day	af58ee130f	Fix a couple of typos in HTML parser error messages	2010-08-02 13:43:28 +02:00
Daniel Veillard	f1121c48af	Add an HTML parser option to avoid a default doctype - include/libxml/HTMLparser.h: defines the new HTML parser option HTML_PARSE_NODEFDTD - HTMLparser.c: if option is set don't add a default DTD - xmllint.c: add the corresponding --nodefdtd option in xmllint	2010-07-26 14:02:42 +02:00
Daniel Veillard	06c93b7509	Remove a few warnings	2010-03-15 16:08:44 +01:00
Daniel Veillard	3c080d6d72	Don't give default HTML boolean attribute values in parser * HTMLparser.c: don't default value of HTML boolean attributes in the parser * SAX2.c: move this to SAX2 tree building backend * result/HTML/doc2.htm.sax result/HTML/doc3.htm.sax result/HTML/wired.html.sax: this changes a few HTML SAX regression tests	2010-03-15 15:47:50 +01:00
Eugene Pimenov	615904f582	Switch the HTML parser to be non-recursive * HTMLparser.c: new htmlParseElementInternal non recursive, with htmlParseContentInternal and new function to handle node info and element end. * include/libxml/parser.h: add new stack for element info in parser context * parserInternals.c: free element info stack	2010-03-15 15:16:02 +01:00
Eugene Pimenov	ef9c636ac1	Cleanup a couple of weirdness in HTML parser	2010-03-15 11:37:48 +01:00
Eugene Pimenov	1e60fbcb6f	htmlCheckEncoding doesn't update input-end after shrink * HTMLparser.c: add the missing update to the end pointer	2010-03-10 18:10:49 +01:00
Daniel Veillard	e20fb5a72c	Fix xmlParseInNodeContext for HTML content xmlParseInNodeContext notices that the enclosing document is an HTML document, so invoke the HTML parser for that fragment, and the HTML parser finding a "<p>hello world!</p>" document automatically augments it with defaulted <html> and <body>. This defaulting should be turned off in the HTML parser for this to work, but there is no such HTML parser option. There is an htmlOmittedDefaultValue global variable that you could use, but really we should not rely on global variable for processing options anymore, best is to add an HTML_PARSE_NOIMPLIED. * include/libxml/HTMLparser.h: add the HTML_PARSE_NOIMPLIED parser flag * HTMLparser.c: do not add implied elements if HTML_PARSE_NOIMPLIED is set * parser.c: add HTML_PARSE_NOIMPLIED to options for xmlParseInNodeContext on HTML documents	2010-01-29 20:47:08 +01:00
Eugene Pimenov	4b41f15dcd	Fix some missing commas in HTML element lists * HTMLparser.c: fix the macros BLOCK and INLINE to use commas and avoid transparent concatenation of strings	2010-01-20 14:25:59 +01:00
Daniel Veillard	13cee4e37b	Fix a bunch of scan 'dead increments' and cleanup * HTMLparser.c c14n.c debugXML.c entities.c nanohttp.c parser.c testC14N.c uri.c xmlcatalog.c xmllint.c xmlregexp.c xpath.c: fix unused variables, or unneeded increments as well as a couple of space issues * runtest.c: check for NULL before calling unlink()	2009-09-05 14:52:55 +02:00
Daniel Veillard	eeb9932990	444994 HTML chunked failure for attribute with <> * HTMLparser.c: fix htmlParseLookupSequence to not save ctxt->checkIndex when the current buffer ends within an attribute value, as this information would be missed in next pass.	2009-08-25 14:42:16 +02:00
Adiel Mittmann	8a103793f2	Non ASCII character may be split at buffer end * HTMLparser.c: make sure when we call xmlParserInputGrow in htmlCurrentChar, to reset the current pointer	2009-08-25 11:27:13 +02:00
Markus Kull	56a03035bf	572129 speed up parsing of large HTML text nodes * HTMLparser.c: use a different lookup function htmlParseLookupChars() to avoid the quadratic behaviour	2009-08-24 19:00:23 +02:00
Daniel Veillard	b468f7444c	Remove a pedantic warning	2009-08-24 18:45:33 +02:00
Daniel Veillard	856c668c1a	Fix HTML parsing with 0 character in CDATA * HTMLparser.c: 0 before the end of the input needs some special case handling, raise the error and return a space instead	2009-08-24 18:16:56 +02:00
Daniel Veillard	029a04d265	541335 HTML avoid creating 2 head or 2 body element * HTMLparser.c: check when we see a head or a body tag and avoid autogenerating them * include/libxml/parser.h: the values for ctxt->html change depending on the head or body tags being seen	2009-08-24 12:50:23 +02:00
Daniel Veillard	6339c1a886	541237 error correcting missing end tags in HTML * HTMLparser.c: make sure /p closes the FONTSTYLE list of elements	2009-08-24 11:59:51 +02:00
Daniel Veillard	db4ac221f0	Fix a small problem on previous HTML parser patch	2009-08-22 17:58:31 +02:00
Daniel Veillard	e77db16ab1	592430 - HTML parser runs into endless loop * HTMLparser.c: fix the problem with detection erroring absolutely, and properly popping up the stack when in EOF, also passes XML_PARSE_HUGE when decoding options.	2009-08-22 11:32:38 +02:00
Daniel Veillard	7459c595a0	588441 allow '.' in HTML Names even if invalid * HTMLparser.c: just allow '.' in htmlParseHTMLName list of characters	2009-08-13 10:10:29 +02:00
Daniel Veillard	533ec0e073	579317 Try to find the HTML encoding information * HTMLparser.c: if we hit an encoding error before parsing a potential <meta> with the info look in the input buffer to see if we can find it instead of forcing a blind switch to ISO-8859-1	2009-08-12 23:00:22 +02:00
Jiri Netolicky	446e126de5	576368 – htmlChunkParser with special attributes * HTMLparser.c: htmlChunkParsing failed when the chunk ends inside element after some attribute which has a '>' char in its value.	2009-08-07 17:05:36 +02:00
Daniel Veillard	4d3e2da7f8	* HTMLparser.c: make sure we keep line numbers, fixes #580705, based on Aaron Patterson's patch Daniel	2009-05-15 17:55:45 +02:00
Roland Steiner	04f8eef852	* HTMLparser.c: a broken HTML table attributes initialization, fixes #581803, by Roland Steiner <rolandsteiner@google.com> Daniel	2009-05-12 09:16:16 +02:00
Daniel Veillard	7f4547cdbd	preparing the release of 2.7.2 fix the Solaris portability issue * configure.in doc/* NEWS: preparing the release of 2.7.2 * dict.c: fix the Solaris portability issue * parser.c: additional cleanup on #554660 fix * test/ent13 result/ent13* result/noent/ent13: added the example in the regression test suite. HTMLparser.c: handle leading BOM in htmlParseElement() Daniel svn path=/trunk/; revision=3799	2008-10-03 07:58:23 +00:00
Daniel Veillard	a57ba4ce96	fix an HTML parsing error on large data sections reported by Mike Day add * HTMLparser.c: fix an HTML parsing error on large data sections reported by Mike Day * test/HTML/utf8bug.html result/HTML/utf8bug.html.err result/HTML/utf8bug.html.sax result/HTML/utf8bug.html: add the reproducer to the test suite daniel svn path=/trunk/; revision=3797	2008-09-25 16:06:18 +00:00
Daniel Veillard	4cc67bb77e	patch from Robert Schwebel , allows to compile the example if configured * doc/examples/reader3.c: patch from Robert Schwebel , allows to compile the example if configured without output support fixes #545582 * Makefile.am: add testrecurse to the make check tests * HTMLparser.c: if the parser got a encoding argument it should be used over what the meta specifies, patch fixing #536346 Daniel svn path=/trunk/; revision=3785	2008-08-29 19:58:23 +00:00
Daniel Veillard	ae0765b681	more progress against the official regression tests small cleanup for * runxmlconf.c: more progress against the official regression tests * runsuite.c: small cleanup for non-leak reports * include/libxml/tree.h: parsing flags and other properties are now added to the document node, this is generally useful and allow to make Name and NmToken validations based on the parser flags, more specifically the 5th edition of XML or not * HTMLparser.c tree.c: small side effects for the previous changes * parser.c SAX2.c valid.c: the bulk of the changes are here, the parser and validation behaviour can be affected, parsing flags need to be copied, lot of changes. Also fixing various validation problems in the regression tests. Daniel svn path=/trunk/; revision=3762	2008-07-31 19:54:59 +00:00
Daniel Veillard	ed86dc2383	applied patch from Ashwin fixing a number of realloc problems improve * uri.c: applied patch from Ashwin fixing a number of realloc problems * HTMLparser.c: improve handling for misplaced html/head/body Daniel svn path=/trunk/; revision=3740	2008-04-24 11:58:41 +00:00
Daniel Veillard	36de63e71d	apparently it's okay to forget the semicolon after entity refs in HTML, * HTMLparser.c: apparently it's okay to forget the semicolon after entity refs in HTML, fixing char refs parsing accordingly based on T. Manske patch, this should fix #517653 Daniel svn path=/trunk/; revision=3726	2008-04-03 09:05:05 +00:00
Daniel Veillard	35fcbb84d2	patch from Arnold Hendriks improving parsing of html within html bogus * HTMLparser.c: patch from Arnold Hendriks improving parsing of html within html bogus data, still not a complete fix though Daniel svn path=/trunk/; revision=3704	2008-03-12 21:43:39 +00:00
Daniel Veillard	c5b43cc03a	avoid stopping parsing when encountering out of range characters in an * HTMLparser.c: avoid stopping parsing when encountering out of range characters in an HTML file, report and continue processing instead, should fix #472696 Daniel svn path=/trunk/; revision=3675	2008-01-11 07:41:39 +00:00
Daniel Veillard	640f89ef61	fix definition for <embed> to avoid error when saving back, patch from * HTMLparser.c: fix definition for <embed> to avoid error when saving back, patch from Stefan Behnel fixing 495213 Daniel svn path=/trunk/; revision=3671	2008-01-11 06:24:09 +00:00
Daniel Veillard	861101d1fa	fixed bug #381877 , avoid reading over the end of stream when generating an * HTMLparser.c: fixed bug #381877, avoid reading over the end of stream when generating an UTF-8 encoding error. Daniel svn path=/trunk/; revision=3627	2007-06-12 08:38:57 +00:00
Daniel Veillard	491e58e575	applied patch from Michael Day to add support for <embed> Daniel * HTMLparser.c: applied patch from Michael Day to add support for <embed> Daniel svn path=/trunk/; revision=3611	2007-05-02 16:15:18 +00:00
Daniel Veillard	739e9d0981	Dohh ! Daniel svn path=/trunk/; revision=3610	2007-04-27 09:33:58 +00:00
Daniel Veillard	4d1320fa5b	Jean-Daniel Dupas pointed a couple of problems in htmlCreateDocParserCtxt. * HTMLparser.c: Jean-Daniel Dupas pointed a couple of problems in htmlCreateDocParserCtxt. Daniel svn path=/trunk/; revision=3609	2007-04-26 08:55:33 +00:00
Daniel Veillard	42720248e6	change the way script/style are parsed to not try to detect comments, * HTMLparser.c: change the way script/style are parsed to not try to detect comments, reported by Mike Day * result/HTML/doc3.*: affects the result of that test Daniel svn path=/trunk/; revision=3598	2007-04-16 07:02:31 +00:00
William M. Brack	e978ae25ca	fixed memory access error on parsing of meta data which had errors (bug * HTMLparser.c: fixed memory access error on parsing of meta data which had errors (bug #382206). Also cleaned up a few warnings by adding some additional DECL macros. svn path=/trunk/; revision=3593	2007-03-21 06:16:02 +00:00
Daniel Veillard	1032ac4c5c	applied patch from Steven Rainwater to fix UTF8ToHtml behaviour on code * HTMLparser.c: applied patch from Steven Rainwater to fix UTF8ToHtml behaviour on code points which are not mappable to predefined HTML entities, fixes #377544 Daniel	2006-11-23 16:18:30 +00:00
Daniel Veillard	772869fe10	change htmlCtxtReset() following Michael Day bug report and suggestion. * HTMLparser.c: change htmlCtxtReset() following Michael Day bug report and suggestion. Daniel	2006-11-08 09:16:56 +00:00
Daniel Veillard	890fd9f9f3	applied a reworked version of Usamah Malik patch to avoid growing the * HTMLparser.c: applied a reworked version of Usamah Malik patch to avoid growing the parser stack in some autoclose cases, should fix #361221 Daniel	2006-10-27 12:53:28 +00:00
Daniel Veillard	af616a7386	fix one problem found in htmlCtxtUseOptions() and pointed in #340591 * HTMLparser.c: fix one problem found in htmlCtxtUseOptions() and pointed in #340591 Daniel	2006-10-17 20:18:39 +00:00
Daniel Veillard	8a82ae12c3	fixed the 2 stupid bugs affecting htmlReadDoc() and htmlReadIO() this * HTMLparser.c: fixed the 2 stupid bugs affecting htmlReadDoc() and htmlReadIO() this should fix #340322 Daniel	2006-10-17 20:04:10 +00:00
Daniel Veillard	c47d263049	fixing HTML minimized attribute values to be generated internally if not * HTMLparser.c: fixing HTML minimized attribute values to be generated internally if not present, fixes bug #332124 * result/HTML/doc2.htm.sax result/HTML/doc3.htm.sax result/HTML/wired.html.sax: this affects the SAX event stream for a few test cases Daniel	2006-10-17 16:13:27 +00:00
Daniel Veillard	48519092e5	fixing HTML entities in attributes parsing bug #362552 added to the * HTMLparser.c: fixing HTML entities in attributes parsing bug #362552 * result/HTML/entities2.html* test/HTML/entities2.html: added to the regression suite Daniel	2006-10-17 15:56:35 +00:00
Daniel Veillard	7e30356556	fix #348252 if the document claims to be in a different encoding in the * HTMLparser.c: fix #348252 if the document claims to be in a different encoding in the meta tag and it's obviously wrong, don't screw up the end of the content. Daniel	2006-10-16 13:14:55 +00:00
Daniel Veillard	68716a772c	fix a chunking and script bug #347708 Daniel * HTMLparser.c: fix a chunking and script bug #347708 Daniel	2006-10-16 09:32:17 +00:00
Daniel Veillard	28aac0b0f4	remove a warning check with uppercase for AIX iconv() should fix #352644 * HTMLparser.c: remove a warning * encoding.c: check with uppercase for AIX iconv() should fix #352644 * doc/examples/Makefile.am: partially handle one bug report Daniel	2006-10-16 08:31:18 +00:00
Daniel Veillard	f1a27c659e	added --html --memory to test htmlReadMemory to test #321632 added various * xmllint.c: added --html --memory to test htmlReadMemory to test #321632 * HTMLparser.c: added various initialization calls which may help #321632 but not conclusive * testapi.c tree.c include/libxml/tree.h: fixed compilation with --with-minimum --with-sax1 and --with-minimum --with-schemas fixing #326442 Daniel	2006-10-13 22:33:03 +00:00
Daniel Veillard	34c647cfae	exports htmlNewParserCtxt() as Michael Day pointed out this is needed to * HTMLparser.c include/libxml/HTMLparser.h: exports htmlNewParserCtxt() as Michael Day pointed out this is needed to use htmlCtxtRead*() Daniel	2006-09-21 06:53:59 +00:00
Daniel Veillard	065abe8565	applied const'ification of strings patch from Matthias Clasen Daniel * HTMLparser.c: applied const'ification of strings patch from Matthias Clasen Daniel	2006-07-03 08:55:04 +00:00
Daniel Veillard	30e7607b7a	a bunch of small cleanups based on coverity reports. Daniel * HTMLparser.c parser.c parserInternals.c pattern.c uri.c: a bunch of small cleanups based on coverity reports. Daniel	2006-03-09 14:13:55 +00:00
Daniel Veillard	499cc9204f	try to fix xmlParseInNodeContext when operating on an HTML document. * HTMLparser.c libxml.h parser.c: try to fix xmlParseInNodeContext when operating on an HTML document. Daniel	2006-01-18 17:22:35 +00:00
Daniel Veillard	6a0baa0cd8	fixed a number of warnings shown by HP-UX compiler and reported by Rick * HTMLparser.c configure.in parserInternals.c runsuite.c runtest.c testapi.c xmlschemas.c xmlschemastypes.c xmlstring.c: fixed a number of warnings shown by HP-UX compiler and reported by Rick Jones Daniel	2005-12-10 11:11:12 +00:00
Daniel Veillard	b990008f05	script HTML parser error fix, corrects bug #319715 added test from Michael * HTMLparser.c: script HTML parser error fix, corrects bug #319715 * result/HTML/53867* test/HTML/53867.html: added test from Michael Day to the regression suite Daniel	2005-10-25 12:36:29 +00:00
Daniel Veillard	2cf36a1cc1	typo fix from Michael Day Daniel * HTMLparser.c: typo fix from Michael Day Daniel	2005-10-25 12:21:29 +00:00
Daniel Veillard	36d73403ff	Applied the last patch from Gary Coady for #304637 changing the behaviour * HTMLparser.c: Applied the last patch from Gary Coady for #304637 changing the behaviour when text nodes are found in body * result/HTML/*: this changes the output of some tests Daniel	2005-09-01 09:52:30 +00:00
Daniel Veillard	8874b94cd2	added a parser XML_PARSE_COMPACT option to allocate small text nodes (less * HTMLparser.c parser.c SAX2.c debugXML.c tree.c valid.c xmlreader.c xmllint.c include/libxml/HTMLparser.h include/libxml/parser.h: added a parser XML_PARSE_COMPACT option to allocate small text nodes (less than 8 bytes on 32bits, less than 16bytes on 64bits) directly within the node, various changes to cope with this. * result/XPath/tests/* result/XPath/xptr/* result/xmlid/*: this slightly change the output Daniel	2005-08-25 13:19:21 +00:00
Daniel Veillard	ea4b0baef2	added a recovery mode for the HTML parser based on the suggestions of bug * HTMLparser.c include/libxml/HTMLparser.h: added a recovery mode for the HTML parser based on the suggestions of bug #169834 by Paul Loberg Daniel	2005-08-23 16:06:08 +00:00
Daniel Veillard	d2755a8134	fixed an uninitialized memory access spotted by valgrind Daniel * HTMLparser.c: fixed an uninitialized memory access spotted by valgrind Daniel	2005-08-07 23:42:39 +00:00
Daniel Veillard	24505b0f5c	a lot of small cleanups based on Linus' sparse check output. Daniel * HTMLparser.c SAX2.c encoding.c globals.c parser.c relaxng.c runsuite.c runtest.c schematron.c testHTML.c testReader.c testRegexp.c testSAX.c testThreads.c valid.c xinclude.c xmlIO.c xmllint.c xmlmodule.c xmlschemas.c xpath.c xpointer.c: a lot of small cleanups based on Linus' sparse check output. Daniel	2005-07-28 23:49:35 +00:00
Daniel Veillard	7d2b323ed6	fixed a potential buffer overrun error introduced on last commit to * HTMLparser.c: fixed a potential buffer overrun error introduced on last commit to htmlParseScript() c.f. #310229 Daniel	2005-07-14 08:57:39 +00:00
Daniel Veillard	358fef4b1e	applied UTF-8 script parsing bug #310229 fix from Jiri Netolicky added the * HTMLparser.c: applied UTF-8 script parsing bug #310229 fix from Jiri Netolicky * result/HTML/script2.html* test/HTML/script2.html: added the test case from the regression suite Daniel	2005-07-13 16:37:38 +00:00
Daniel Veillard	597f1c1f34	applied patch from James Bursa fixing an html parsing bug in push mode * HTMLparser.c: applied patch from James Bursa fixing an html parsing bug in push mode * result/HTML/repeat.html* test/HTML/repeat.html: added the test to the regression suite Daniel	2005-07-03 23:00:18 +00:00
Daniel Veillard	5d4644ef6e	revamped the elfgcchack.h format to cope with gcc4 change of aliasing * doc/apibuild.py doc/elfgcchack.xsl: revamped the elfgcchack.h format to cope with gcc4 change of aliasing allowed scopes, had to add extra information to doc/libxml2-api.xml to separate the header from the c module source. * *.c: updated all c library files to add a #define bottom_xxx and reimport elfgcchack.h thereafter, and a bit of cleanups. doc//* testapi.c: regenerated when rebuilding the API Daniel	2005-04-01 13:11:58 +00:00
William M. Brack	21e4ef20f6	Re-examined the problems of configuring a "minimal" library. Synchronized the header files with the library code in order to assure that all the various conditionals (LIBXML_xxxx_ENABLED) were the same in both. Modified the API database content to more accurately reflect the conditionals. Enhanced the generation of that database. Although there was no substantial change to any of the library code's logic, a large number of files were modified to achieve the above, and the configuration script was enhanced to do some automatic enabling of features (e.g. --with-xinclude forces --with-xpath). Additionally, all the format errors discovered by apibuild.py were corrected. * configure.in: enhanced cross-checking of options * doc/apibuild.py, doc/elfgcchack.xsl, doc/libxml2-refs.xml, doc/libxml2-api.xml, gentest.py: changed the usage of the <cond> element in module descriptions * elfgcchack.h, testapi.c: regenerated with proper conditionals * HTMLparser.c, SAX.c, globals.c, tree.c, xmlschemas.c, xpath.c, testSAX.c: cleaned up conditionals * include/libxml/[SAX.h, SAX2.h, debugXML.h, encoding.h, entities.h, hash.h, parser.h, parserInternals.h, schemasInternals.h, tree.h, valid.h, xlink.h, xmlIO.h, xmlautomata.h, xmlreader.h, xpath.h]: synchronized the conditionals with the corresponding module code * doc/examples/tree2.c, doc/examples/xpath1.c, doc/examples/xpath2.c: added additional conditions required for compilation * doc/.html, doc/html/.html: rebuilt the docs	2005-01-02 09:53:13 +00:00
Daniel Veillard	29614c7040	make sure xmlCtxtReadFile and htmlCtxtReadFile go through the catalog * HTMLparser.c parser.c: make sure xmlCtxtReadFile and htmlCtxtReadFile go through the catalog resolution. * gentest.py testapi.c: fix a side-effect warning of the change Daniel	2004-11-26 10:47:26 +00:00
Daniel Veillard	a521d28751	better handling of conditional features more testing on parser contexts * gentest.py testapi.c: better handling of conditional features * HTMLparser.c SAX2.c parserInternals.c xmlwriter.c: more testing on parser contexts closed leaks, error messages Daniel	2004-11-09 14:59:59 +00:00
Daniel Veillard	4259532303	more types, more coverage more problems fixed Daniel * gentest.py testapi.c: more types, more coverage * parser.c parserInternals.c relaxng.c valid.c xmlIO.c xmlschemastypes.c: more problems fixed Daniel	2004-11-08 10:52:06 +00:00
Daniel Veillard	ce682bc24b	autogenerate a minimal NULL value sequence for unknown pointer types This * gentest.py testapi.c: autogenerate a minimal NULL value sequence for unknown pointer types * HTMLparser.c SAX2.c chvalid.c encoding.c entities.c parser.c parserInternals.c relaxng.c valid.c xmlIO.c xmlreader.c xmlsave.c xmlschemas.c xmlschemastypes.c xmlstring.c xpath.c xpointer.c: This uncovered an impressive amount of entry points not checking for NULL pointers when they ought to, closing all the open gaps. Daniel	2004-11-05 17:22:25 +00:00
Daniel Veillard	a03e36566b	more developments on the API testing more cleanups rebuilt Daniel * gentest.py testapi.c: more developments on the API testing * HTMLparser.c tree.c: more cleanups * doc/*: rebuilt Daniel	2004-11-02 18:45:30 +00:00
Daniel Veillard	eff45a92da	register xmlSchemaSetValidErrors, patch from Brent Hendricks in the * python/libxml.c: register xmlSchemaSetValidErrors, patch from Brent Hendricks in the mailing-list * include/libxml/valid.h HTMLparser.c SAX2.c valid.c parserInternals.c: fix #156626 and more generally how to find out if a validation contect is part of a parsing context or not. This can probably be improved to make 100% sure that vctxt->userData is the parser context too. It's a bit hairy because we can't change the xmlValidCtxt structure without breaking the ABI since this change xmlParserCtxt information indexes. Daniel	2004-10-29 12:10:55 +00:00
Daniel Veillard	fc484dd0a0	added support for HTML PIs #156087 added specific tests Daniel * HTMLparser.c: added support for HTML PIs #156087 * test/HTML/python.html result/HTML/python.html*: added specific tests Daniel	2004-10-22 14:34:23 +00:00
William M. Brack	d1757abcb8	added two new macros IS_ASCII_LETTER and IS_ASCII_DIGIT used with (html) * include/libxml/parserInternals.h: added two new macros IS_ASCII_LETTER and IS_ASCII_DIGIT used with (html) parsing and xpath for testing data not necessarily unicode. * HTMLparser.c, xpath.c: changed use of IS_LETTER_CH and IS_DIGIT_CH macros to ascii versions (bug 153936).	2004-10-02 22:07:48 +00:00
Daniel Veillard	079f6a7559	more memory related code cleanups. Daniel * HTMLparser.c parser.c relaxng.c xmlschemas.c: more memory related code cleanups. Daniel	2004-09-23 13:15:03 +00:00
Daniel Veillard	7a5e0dd1fc	removed some extern before function code reported by Kjartan Maraas on IRC * parser.c: removed some extern before function code reported by Kjartan Maraas on IRC * legacy.c: fixed compiling when configuring out the HTML parser * Makefile.am: added a declaration for CVS_EXTRA_DIST * HTMLparser.c: beginning of an attempt at cleaning up the construction of the HTML parser data structures, current data generate a huge amount of ELF relocations at loading time. Daniel	2004-09-17 08:45:25 +00:00
William M. Brack	d43cdcd6a2	fixed initialisation problem for htmlReadMemory (bug 149041) * HTMLparser.c: fixed initialisation problem for htmlReadMemory (bug 149041)	2004-08-03 15:13:29 +00:00
Daniel Veillard	7cc235722c	1 line patch, apparently htmlNewDoc() was not setting doc->charset. Daniel * HTMLparser.c: 1 line patch, apparently htmlNewDoc() was not setting doc->charset. Daniel	2004-07-29 11:20:30 +00:00
Daniel Veillard	18a65095e0	fix to the fix for #141864 from Paul Elseth apply fix from David Gatwood * xmlIO.c: fix to the fix for #141864 from Paul Elseth * HTMLparser.c result/HTML/doc3.htm: apply fix from David Gatwood for #141195 about text between comments. Daniel	2004-05-11 15:57:42 +00:00
Daniel Veillard	25d5d9ac65	applied patch from James Bursa, frameset should close head. Daniel * HTMLparser.c: applied patch from James Bursa, frameset should close head. Daniel	2004-04-05 07:08:42 +00:00
Daniel Veillard	500a1de533	applied patch from Alfred Mickautsch for better DTD support. fixed bug * xmlwriter.c include/libxml/xmlwriter.h doc/* : applied patch from Alfred Mickautsch for better DTD support. * SAX2.c HTMLparser.c parser.c xinclude.c xmllint.c xmlreader.c xmlschemas.c: fixed bug #137867 i.e. fixed properly the way reference counting is handled in the XML parser which had the side effect of removing a lot of hazardous cruft added to try to fix the problems associated as they popped up. * xmlIO.c: FILE * close fixup for stderr/stdout Daniel	2004-03-22 15:22:58 +00:00
Daniel Veillard	d3669b2fd1	avoid ID error message if using HTML_PARSE_NOERROR should fix #130762 * valid.c HTMLparser.c: avoid ID error message if using HTML_PARSE_NOERROR should fix #130762 Daniel	2004-02-25 12:34:55 +00:00
William M. Brack	edb65a7ad0	added initialisation for ctxt->vctxt in HTMLInitParser (bug 133127) minor * HTMLparser.c: added initialisation for ctxt->vctxt in HTMLInitParser (bug 133127) * valid.c: minor cosmetic change (removed ATTRIBUTE_UNUSED from several function params)	2004-02-06 07:36:04 +00:00
Daniel Veillard	87247e8740	applied patch from Mark Vadoc to not use SAX1 unless necessary. Daniel * HTMLparser.c relaxng.c testRelax.c testSchemas.c: applied patch from Mark Vadoc to not use SAX1 unless necessary. Daniel	2004-01-13 20:42:02 +00:00
Daniel Veillard	c59d826ef9	applied two parsing fixes from James Bursa Daniel * HTMLparser.c: applied two parsing fixes from James Bursa Daniel	2003-11-20 21:59:12 +00:00
Daniel Veillard	157fee019d	previous fix for #124044 was broken, correct fix provided. fix * python/libxml.c: previous fix for #124044 was broken, correct fix provided. * HTMLparser.c parser.c parserInternals.c xmlIO.c: fix xmlStopParser() and the error handlers to address #125877 Daniel	2003-10-31 10:36:03 +00:00
Daniel Veillard	652f9aa966	Fix #124907 by simply backporting the same fix as for the XML parser * HTMLparser.c: Fix #124907 by simply backporting the same fix as for the XML parser * result/HTML/doc3.htm.err: change to ID detecting modified one test result. Daniel	2003-10-28 22:04:45 +00:00
Daniel Veillard	05bcb7ed30	fixed to not send NULL to %s printing cleaning up some of the regression * HTMLparser.c: fixed to not send NULL to %s printing * python/tests/error.py result/HTML/doc3.htm.err result/HTML/test3.html.err result/HTML/wired.html.err result/valid/t8.xml.err result/valid/t8a.xml.err: cleaning up some of the regression tests error Daniel	2003-10-19 14:26:34 +00:00
William M. Brack	76e95df055	Changed all (?) occurences where validation macros (IS_xxx) had * include/libxml/parserInternals.h HTMLparser.c HTMLtree.c SAX2.c catalog.c debugXML.c entities.c parser.c relaxng.c testSAX.c tree.c valid.c xmlschemas.c xmlschemastypes.c xpath.c: Changed all (?) occurences where validation macros (IS_xxx) had single-byte arguments to use IS_xxx_CH instead (e.g. IS_BLANK changed to IS_BLANK_CH). This gets rid of many warning messages on certain platforms, and also high- lights places in the library which may need to be enhanced for proper UTF8 handling.	2003-10-18 16:20:14 +00:00
Daniel Veillard	659e71ec24	Setting up the framework for structured error reporting, touches a lot of * HTMLparser.c c14n.c catalog.c error.c globals.c parser.c parserInternals.c relaxng.c valid.c xinclude.c xmlIO.c xmlregexp.c xmlschemas.c xpath.c xpointer.c include/libxml/globals.h include/libxml/parser.h include/libxml/valid.h include/libxml/xmlerror.h: Setting up the framework for structured error reporting, touches a lot of modules, but little code now the error handling trail has been cleaned up. Daniel	2003-10-10 14:10:40 +00:00
Daniel Veillard	f403d298c3	more code cleanup, especially around error messages, the HTML parser has * HTMLparser.c Makefile.am legacy.c parser.c parserInternals.c include/libxml/xmlerror.h: more code cleanup, especially around error messages, the HTML parser has now been upgraded to the new handling. * result/HTML/*: a few changes in the resulting error messages Daniel	2003-10-05 13:51:35 +00:00
Daniel Veillard	73b013fc17	added a new configure option --with-push, some cleanups, chased code size * HTMLparser.c Makefile.am configure.in legacy.c parser.c parserInternals.c testHTML.c xmllint.c include/libxml/HTMLparser.h include/libxml/parser.h include/libxml/parserInternals.h include/libxml/xmlversion.h.in: added a new configure option --with-push, some cleanups, chased code size anomalies. Now a library configured --with-minimum is around 150KB, sounds good enough. Daniel	2003-09-30 12:36:01 +00:00
William M. Brack	899e64aa2f	minor change to avoid compilation warnings on some (e.g. AIX) systems * HTMLparser.c, entities.c, xmlreader.c: minor change to avoid compilation warnings on some (e.g. AIX) systems	2003-09-26 18:03:42 +00:00
Daniel Veillard	9475a352bd	added the same htmlRead APIs than their XML counterparts new parser * HTMLparser.c testHTML.c xmllint.c include/libxml/HTMLparser.h: added the same htmlRead APIs than their XML counterparts * include/libxml/parser.h: new parser options, not yet implemented, added an options field to the context. * tree.c: patch from Shaun McCance to fix bug #123238 when ]]> is found within a cdata section. * result/noent/cdata2 result/cdata2 result/cdata2.rdr result/cdata2.sax test/cdata2: add one more cdata test Daniel	2003-09-26 12:47:50 +00:00
Daniel Veillard	092643b52d	preparing a beta3 solving the ABI problems make sure the global variables * configure.in: preparing a beta3 solving the ABI problems * globals.c parser.c parserInternals.c testHTML.c HTMLparser.c SAX.c include/libxml/globals.h include/libxml/SAX.h: make sure the global variables for the default SAX handler are V1 ones to avoid ABI compat problems. * xmlreader.c: cleanup of uneeded code * hash.c: fix a comment Daniel	2003-09-25 14:29:29 +00:00
Daniel Veillard	40412cda44	when creating a DOCTYPE use "html" lowercase by default instead of "HTML" * HTMLparser.c: when creating a DOCTYPE use "html" lowercase by default instead of "HTML" * parser.c xmlreader.c: optimization, gain a few % parsing speed by avoiding calls to "areBlanks" when not needed. * include/libxml/parser.h include/libxml/tree.h: some structure extensions for future work on using per-document dictionaries. Daniel	2003-09-03 13:28:32 +00:00
Igor Zlatkovic	d37c1394a7	added few casts to shut the compiler warnings	2003-08-28 10:34:33 +00:00
Daniel Veillard	2fdbd32d51	new dictionary module to keep a single instance of the names used by the * dict.c include/libxml/dict.h Makefile.am include/libxml/Makefile.am: new dictionary module to keep a single instance of the names used by the parser * DOCBparser.c HTMLparser.c parser.c parserInternals.c valid.c: switched all parsers to use the dictionary internally * include/libxml/HTMLparser.h include/libxml/parser.h include/libxml/parserInternals.h include/libxml/valid.h: Some of the interfaces changed as a result to receive or return "const xmlChar *" instead of "xmlChar *", this is either insignificant from an user point of view or when the returning value changed, those function are really parser internal methods that no user code should really change * doc/libxml2-api.xml doc/html/*: the API interface changed and the docs were regenerated Daniel	2003-08-18 12:15:38 +00:00
Daniel Veillard	e8ed62033c	allocation error #119784 raised by Oliver Stoeneberg Daniel * HTMLparser.c: allocation error #119784 raised by Oliver Stoeneberg Daniel	2003-08-14 23:39:01 +00:00
Daniel Veillard	b19ba83f07	fixed the serious CPU usage problem reported by Grant Goodale applied * parser.c: fixed the serious CPU usage problem reported by Grant Goodale * HTMLparser.c: applied patch from Oliver Kidman about a free missing in htmlSAXParseDoc Daniel	2003-08-14 00:33:46 +00:00
Daniel Veillard	14f752c2b7	fixed a nasty bug #119387 , bad heuristic from the progressive HTML parser * HTMLparser.c: fixed a nasty bug #119387, bad heuristic from the progressive HTML parser front-end on large character data island leading to an erroneous end of data detection by the parser. Some cleanup too to get closer from the XML progressive parser. Daniel	2003-08-09 11:44:50 +00:00
William M. Brack	c193956ee1	small changes to syntax to get rid of compiler warnings. No changes to * error.c HTMLparser.c testC14N.c testHTML.c testURI.c xmlcatalog.c xmlmemory.c xmlreader.c xmlschemastypes.c python/libxml.c include/libxml/xmlmemory.h: small changes to syntax to get rid of compiler warnings. No changes to logic.	2003-08-05 15:52:22 +00:00
Daniel Veillard	8d73bcb50f	added a new API to split a QName without generating any memory allocation * tree.c include/libxml/tree.h: added a new API to split a QName without generating any memory allocation * valid.c: fixed another problem with namespaces on element in mixed content case * python/tests/reader2.py: updated the testcase with Bjorn Reese fix to reader for unsignificant white space * parser.c HTMLparser.c: cleanup. Daniel	2003-08-04 01:06:15 +00:00
William M. Brack	78637da0ea	fixing bug 118559	2003-07-31 14:47:38 +00:00
Daniel Veillard	97e018861b	applied a patch from William Brack about the problem of parsing very large * HTMLparser.c: applied a patch from William Brack about the problem of parsing very large HTML instance with comments as raised by Nick Kew Daniel	2003-07-30 18:59:19 +00:00
William M. Brack	4a557d97bf	fixed problem with comments reported by Nick Kew added routines * HTMLparser.c: fixed problem with comments reported by Nick Kew * encoding.c: added routines xmlUTF8Size and xmlUTF8Charcmp for some future cleanup of UTF8 handling	2003-07-29 04:28:04 +00:00
Daniel Veillard	34ba387936	removed some warnings by casting xmlChar to unsigned int and a couple of * DOCBparser.c HTMLparser.c entities.c parser.c relaxng.c xmlschemas.c xpath.c: removed some warnings by casting xmlChar to unsigned int and a couple of others. * xmlschemastypes.c: fixes a segfault on empty hexBinary strings Daniel	2003-07-15 13:34:05 +00:00
Daniel Veillard	d9d32aebd3	use the character() SAX callback if the cdataBlock ain't defined. fix bug * parser.c HTMLparser.c: use the character() SAX callback if the cdataBlock ain't defined. * xpath.c: fix bug #115349 allowing compilation when configured with --without-xpath since the Schemas code needs NAN and co. Daniel	2003-07-05 20:32:43 +00:00
Daniel Veillard	104caa3df0	oops last commit introduced a memory leak. Daniel * HTMLparser.c: oops last commit introduced a memory leak. Daniel	2003-05-13 22:54:05 +00:00
Daniel Veillard	e8b09e40f7	added --nonet option fixing #112803 by adding --nonet when calling * xmllint.c doc/xmllint.xml: added --nonet option * doc/Makefile.am: fixing #112803 by adding --nonet when calling xsltproc or xmllint * doc/xmllint.xml doc/xmllint.1: also added --schema doc and rebuilt * HTMLparser.c: cleaned up the HTML parser context build when using an URL Daniel	2003-05-13 22:14:13 +00:00
Daniel Veillard	45269b8bb9	tried to fix #98879 again in a more solid way. Daniel * HTMLparser.c: tried to fix #98879 again in a more solid way. Daniel	2003-04-22 13:21:57 +00:00
Daniel Veillard	3c908dca47	added xmlMallocAtomic() to be used when allocating blocks which do not * DOCBparser.c HTMLparser.c c14n.c catalog.c encoding.c globals.c nanohttp.c parser.c parserInternals.c relaxng.c tree.c uri.c xmlmemory.c xmlreader.c xmlregexp.c xpath.c xpointer.c include/libxml/globals.h include/libxml/xmlmemory.h: added xmlMallocAtomic() to be used when allocating blocks which do not contains pointers, add xmlGcMemSetup() and xmlGcMemGet() to allow registering the full set of functions needed by a garbage collecting allocator like libgc, ref #109944 Daniel	2003-04-19 00:07:51 +00:00
Daniel Veillard	02ea141495	exported htmlCreateMemoryParserCtxt() it was static Daniel * HTMLparser.c include/libxml/HTMLparser.h: exported htmlCreateMemoryParserCtxt() it was static Daniel	2003-04-09 12:08:47 +00:00
Daniel Veillard	6560a42c7b	two patches from James Bursa on the HTML parser and a typo reindenting, * HTMLparser.c tree.c: two patches from James Bursa on the HTML parser and a typo * xmlschemastypes.c: reindenting, fixing a memory access problem with dates. Daniel	2003-03-27 21:25:38 +00:00
Daniel Veillard	77a90a7f8e	patch from johan@evenhuis.nl for #107937 fixing some line counting * HTMLparser.c parser.c parserInternals.c: patch from johan@evenhuis.nl for #107937 fixing some line counting problems, and some other cleanups. * result/HTML/: this result in some line number changes Daniel	2003-03-22 00:04:05 +00:00
Daniel Veillard	5f704afe98	made powten array static it should not be exported fix bug #107361 by * xmlschemastype.c: made powten array static it should not be exported * HTMLparser.c: fix bug #107361 by reusing the code from the XML parser function. * testHTML.c: get rid of valgrind messages on the HTML SAX tests Daniel	2003-03-05 10:01:43 +00:00
Igor Zlatkovic	5f9fada355	obsoleted xmlNormalizeWindowsPath	2003-02-19 14:51:00 +00:00
Daniel Veillard	1703c5fc23	OASIS RelaxNG testsuite python script to run regression against OASIS * test/relaxng/OASIS/spectest.xml: OASIS RelaxNG testsuite * check-relaxng-test-suite.py: python script to run regression against OASIS RelaxNG testsuite * relaxng.c: some cleanup tweaks * HTMLparser.c globals.c: cleanups in comments * doc/libxml2-api.xml: updated the API * result/relaxng/*: errors moved files, so large diffs but no changes at the semantic level. Daniel	2003-02-10 14:28:44 +00:00
Daniel Veillard	71531f3345	comments cleanups use xmllint for doing the RelaxNG tests preparing 2.5.2 * HTMLparser.c tree.c xmlIO.c: comments cleanups * Makefile.am: use xmllint for doing the RelaxNG tests * configure.in: preparing 2.5.2 made schemas support default to on instead of off * relaxng.c: removed the verbosity * xmllint.c: added --relaxng option * python/generator.py python/libxml_wrap.h: prepared the integration of the new RelaxNG module and schemas * result/relaxng/*: less verbose output Daniel	2003-02-05 13:19:53 +00:00
Daniel Veillard	930dfb6324	applied HTML improvements from Nick Kew, allowing to do more checking to * HTMLparser.c include/libxml/HTMLparser.h: applied HTML improvements from Nick Kew, allowing to do more checking to HTML elements and attributes. Daniel	2003-02-05 10:17:38 +00:00
Daniel Veillard	358a98961b	applied patch from Arne de Bruijn fixing bug #103827 Daniel * HTMLparser.c: applied patch from Arne de Bruijn fixing bug #103827 Daniel	2003-02-04 15:22:32 +00:00
Daniel Veillard	eb1371795f	updating a comment, fixing #103776 Daniel * HTMLparser.c: updating a comment, fixing #103776 Daniel	2003-02-04 15:18:06 +00:00
Daniel Veillard	e5b110b384	try to fix # 105049 a couple of changes and extensions updated a function * HTMLparser.c: try to fix # 105049 * relaxng.c xmlschemastypes.c: a couple of changes and extensions * tree.c: updated a function comment Daniel	2003-02-04 14:43:39 +00:00
Daniel Veillard	e55e8e4833	fixed bug #102960 by reusing the XML name parsing routines. Daniel * HTMLparser.c: fixed bug #102960 by reusing the XML name parsing routines. Daniel	2003-01-10 12:50:02 +00:00
Daniel Veillard	01c13b5be2	code cleanup, especially the function comments. fixed a small bug when * DOCBparser.c HTMLparser.c c14n.c debugXML.c encoding.c hash.c nanoftp.c nanohttp.c parser.c parserInternals.c testC14N.c testDocbook.c threads.c tree.c valid.c xmlIO.c xmllint.c xmlmemory.c xmlreader.c xmlregexp.c xmlschemas.c xmlschemastypes.c xpath.c: code cleanup, especially the function comments. * tree.c: fixed a small bug when freeing nodes which are XInclude ones. Daniel	2002-12-10 15:19:08 +00:00
Daniel Veillard	1c732d2e10	code cleanup Daniel * DOCBparser.c HTMLparser.c parser.c valid.c xpath.c: code cleanup Daniel	2002-11-30 11:22:59 +00:00
Daniel Veillard	fee408f5eb	final touch at closing #87235 </p> end tags need to be generated. this * HTMLparser.c: final touch at closing #87235 </p> end tags need to be generated. * result/HTML/cf_128.html result/HTML/test2.html result/HTML/test3.html: this change slightly the output of a few tests * doc/*: regenerated Daniel	2002-11-22 13:18:30 +00:00
Daniel Veillard	bc6e1a3857	fixed bug #98879 a corner case when 0 is included in HTML documents and * HTMLparser.c: fixed bug #98879 a corner case when 0 is included in HTML documents and using the push parser. Daniel	2002-11-18 15:07:25 +00:00
Daniel Veillard	dad3f680e5	preparing release 2.4.27 updated and rebuilt the docs try to make sure the * configure.in: preparing release 2.4.27 * doc/* : updated and rebuilt the docs * doc/Makefile.am libxml.spec.in: try to make sure the tutorial and all the docs are actually packaged and in the final RPMs * parser.c parserInternals.c include/libxml/parser.h: restore xmllint --recover feature. Daniel	2002-11-17 16:47:27 +00:00
Daniel Veillard	8dd86a5b61	strengthen the guard in the Pop macros, like in the XML parser, closes bug * HTMLparser.c: strengthen the guard in the Pop macros, like in the XML parser, closes bug #97315 Daniel	2002-11-12 21:14:17 +00:00
Daniel Veillard	ce02dbc430	Mikhail Sogrine pointed out a bug in HTML parsing, applied his patch added * HTMLparser.c: Mikhail Sogrine pointed out a bug in HTML parsing, applied his patch * result/HTML/attrents.html result/HTML/attrents.html.err result/HTML/attrents.html.sax test/HTML/attrents.html: added the test and result case provided by Mikhail Sogrine Daniel	2002-10-22 19:14:58 +00:00
Daniel Veillard	e645e8c141	Applied the VMS update patch from Craig A. Berry update Daniel * vms/build_libxml.com vms/config.vms vms/readme.vms include/libxml/parser.h include/libxml/parserInternals.h include/libxml/tree.h include/libxml/xmlIO.h HTMLparser.c catalog.c debugXML.c parser.c parserInternals.c tree.c triodef.h trionan.c uri.c xmlIO.c xpath.c: Applied the VMS update patch from Craig A. Berry * doc/*.html: update Daniel	2002-10-22 17:35:37 +00:00
Daniel Veillard	a646cfdb14	small cleanup switched DTD validation to use only regexp when configured * HTMLparser.c: small cleanup * valid.c xmlregexp.c: switched DTD validation to use only regexp when configured with them. A bit of debugging around the determinism checks is still needed Daniel	2002-09-17 21:50:03 +00:00
Daniel Veillard	f4862f0f36	messing around with support for Windows path, cleanups, trying to identify * include/libxml/xmlIO.h xmlIO.c parser.c HTMLparser.c DOCBparser.c: messing around with support for Windows path, cleanups, trying to identify and fix the various code path to the filename access. Added xmlNormalizeWindowsPath() Daniel	2002-09-10 11:13:43 +00:00
Daniel Veillard	3487c8d9bb	get rid of all the perror() calls made in the library execution paths. * DOCBparser.c HTMLparser.c c14n.c entities.c list.c parser.c parserInternals.c xmlIO.c: get rid of all the perror() calls made in the library execution paths. This should fix both #92059 and #92385 Daniel	2002-09-05 11:33:25 +00:00
Daniel Veillard	1d9952716d	fixing bug #84876 based on the xml working code. Daniel * HTMLparser.c: fixing bug #84876 based on the xml working code. Daniel	2002-07-22 16:43:32 +00:00
Daniel Veillard	8c9872ca2e	trying to fix 87235 about discarded white spaces in the HTML parser. this * HTMLparser.c: trying to fix 87235 about discarded white spaces in the HTML parser. * result/HTML/*: this changes the output of a number of HTML regression tests Daniel	2002-07-05 18:17:10 +00:00
Aleksey Sanin	49cc97565f	replaced sprintf() with snprintf() to prevent possible buffer overflow * DOCBparser.c HTMLparser.c debugXML.c encoding.c nanoftp.c nanohttp.c parser.c tree.c uri.c xmlIO.c xmllint.c xpath.c: replaced sprintf() with snprintf() to prevent possible buffer overflow (the bug was pointed out by Anju Premachandran)	2002-06-14 17:07:10 +00:00
Daniel Veillard	1b31e4a0b2	fixing #79334 making htmlParseDocument a public entry point. rebuilt the * HTMLparser.c win32/libxml2.def.src win32/dsp/libxml2.def.src include/libxml/HTMLparser.h: fixing #79334 making htmlParseDocument a public entry point. * doc/*: rebuilt the API and docs Daniel	2002-05-27 14:44:50 +00:00
Daniel Veillard	561b7f883e	dohh I really didn't intended to commit this test version :-( Daniel * HTMLparser.c error.c parser.c parserInternals.c tree.c xmlIO.c include/libxml/tree.h: dohh I really didn't intended to commit this test version :-( Daniel	2002-03-20 21:55:57 +00:00
Daniel Veillard	e50f3b5d54	I wanted to see the real speed at the SAX interface after a little too * testSAX.c: I wanted to see the real speed at the SAX interface after a little too many Ximianer started complaining about the parser speed. added a --quiet option: paphio:~/XML -> ls -l db100000.xml -rw-rw-r-- 1 veillard www 20182040 Mar 20 10:30 db100000.xml paphio:~/XML -> time ./testSAX --quiet db100000.xml 3200006 callbacks generated real 0m1.270s Which means 16MBytes/s and 3Mcallback/s Daniel	2002-03-20 19:24:21 +00:00
Daniel Veillard	34ce8bece2	preparing 2.4.18 updated and rebuilt the web site implement the new * configure.in: preparing 2.4.18 * doc/*: updated and rebuilt the web site * *.c libxml.h: implement the new IN_LIBXML scheme discussed with the Windows and Cygwin maintainers. * parser.c: humm, changed the way the SAX parser work when xmlSubstituteEntitiesDefault(1) is set, it will then do the entity registration and loading by itself in case the user provided SAX getEntity() returns NULL. * testSAX.c: added --noent to test the behaviour. Daniel	2002-03-18 19:37:11 +00:00
Daniel Veillard	044fc6b747	fixing #61290 "namespace nodes have no parent" long standing divergence * xpath.c: fixing #61290 "namespace nodes have no parent" long standing divergence from the XPath REC. NodeSets simply hold a copy of namespace nodes and those node ->next points to the parent (which may not be the node carrying the definition). * include/libxml/xpath.h: flagged but didn't added a possible speedup * DOCBparser.c HTMLparser.c: removed some warnings from push parser due to new state being added. * tree.c: new fix from Boris Erdmann * configure.in c14n.c include/libxml/c14n.h testC14N.c: added the XML Canonalization support from Aleksey Sanin Daniel	2002-03-04 17:09:44 +00:00
Daniel Veillard	cbaf399537	applied 42 documentation patches from Charlie Bozeman. Regenerated the * *.c include/libxml/*.h doc/html/*: applied 42 documentation patches from Charlie Bozeman. Regenerated the HTML docs. Daniel	2001-12-31 16:16:02 +00:00
Daniel Veillard	c1f78343b6	fix comment in scripts element parsing. updated the results. Daniel * HTMLparser.c: fix comment in scripts element parsing. * result/HTML/doc3*: updated the results. Daniel	2001-11-10 11:43:05 +00:00
Daniel Veillard	957fdcf2a3	handle the case of < in quoted attributes, Bastian Kleineidam Daniel * HTMLparser.c test/HTML/lt.html result/HTML/lt.html*: handle the case of < in quoted attributes, Bastian Kleineidam Daniel	2001-11-06 22:50:19 +00:00
Daniel Veillard	635ef72a94	apply fixes to close #63271 and avoid segfaults when the error routine * parser.c globals.c DOCBparser.c HTMLparser.c error.c: apply fixes to close #63271 and avoid segfaults when the error routine gets callbed before xmlInitParser() get called. * nanoftp.c error.c: Applied patches from Justin Fletcher correcting some xmlGenericError misuses. Daniel	2001-10-29 11:48:19 +00:00
Daniel Veillard	5151c06f30	fixed an erroneous validation bug when PE refs occurs in external parsed * parser.c: fixed an erroneous validation bug when PE refs occurs in external parsed entities referenced from the internals subset * test/valid/index.xml test/valid/dtds/nitf-2-5.dtd test/valid/dtds/NewsMLv1.0.dtd result/valid/index.xml: added the associated testcase, it's a nice one. HTMLparser.c: generate the DTD node as HTML still ... * HTMLtree.c: fixed errors in Set/GetMetaEncoding Daniel	2001-10-23 13:10:19 +00:00
Daniel Veillard	b6b0fd8962	Fixed a bug when creating a new HTML document, doc->children was set to NULL with a DTD child Daniel	2001-10-22 12:31:11 +00:00
Daniel Veillard	3c01b1d81b	- include/libxml/globals.h include/libxml/threads.h threads.c testThreads.c: far more testing, cleaning up bugs - *.c : make sure globals.h is always included. Daniel	2001-10-17 15:58:35 +00:00
Daniel Veillard	7cc95c0b6a	try to get rid of parser loops for good. Daniel * HTMLparser.c: try to get rid of parser loops for good. Daniel	2001-10-17 15:45:12 +00:00
Daniel Veillard	d046356030	Applied the last patches from Gary, cleanup, activated threading all user * include/libxml/SAX.h include/libxml/globals.h include/libxml/parser.h include/libxml/parserInternals.h include/libxml/tree.h include/libxml/xmlerror.h HTMLparser.c SAX.c error.c globals.c nanoftp.c nanohttp.c parser.c parserInternals.c testDocbook.c testHTML.c testSAX.c tree.c uri.c xlink.c xmlmemory.c: Applied the last patches from Gary, cleanup, activated threading all user accessible global variables are now handled in globals.[ch] Still a bit rought but make tests passes with either --with-threads defined at configure time or not. * Makefile.am example/Makefile.am: added globals.[ch] and threads linking options Daniel	2001-10-13 09:15:48 +00:00
Daniel Veillard	60087f30f3	preparing 2.4.6 release updated and rebuilt the docs fixed a number of * configure.in: preparing 2.4.6 release * doc/xml.html doc/html/*: updated and rebuilt the docs * include/libxml/*.h *.c: fixed a number of teh/the widht/width typos Daniel	2001-10-10 09:45:09 +00:00
Daniel Veillard	3fbe8e30c1	closing bug #61832 removed a warning Daniel * configure.in: closing bug #61832 * HTMLparser.c: removed a warning Daniel	2001-10-06 13:30:33 +00:00
William M. Brack	1633d187cd	fixed HTMLparser.c	2001-10-05 15:41:19 +00:00
Daniel Veillard	f6ed8bc7b2	Igor Zlatkovic patches fixed typos Daniel * win32/dsp/libxml2.def.src: Igor Zlatkovic patches * DOCBparser.c HTMLparser.c parser.c: fixed typos Daniel	2001-10-02 09:22:47 +00:00
William M. Brack	d28e48ab49	fix loop in HTMLparser.c	2001-09-23 01:55:08 +00:00
Daniel Veillard	dc2cee29d0	Added the part about section 7.2 on URI resolution, fixed a side effect in * include/libxml/catalog.h catalog.c xmlIO.c HTMLparser.c: Added the part about section 7.2 on URI resolution, fixed a side effect in the HTML parser, look complete and ready to rock except the URI/SystemID part! Daniel	2001-08-22 16:30:37 +00:00
Daniel Veillard	bb3712974b	trying to fix some troubles w.r.t. function returning const xxxPtr. Daniel * HTMLparser.c HTMLtree.c include/libxml/HTMLparser.h: trying to fix some troubles w.r.t. function returning const xxxPtr. Daniel	2001-08-16 23:26:59 +00:00
Daniel Veillard	5e2dace1ca	Cleanup, cleanup .. removed libxml softlink for good cleanup to get 100% Cleanup, cleanup .. * configure.in Makefile.am: removed libxml softlink for good * include/libxml/*.h *.c doc/Makefile.am: cleanup to get 100% coverage by gtk-doc Daniel	2001-07-18 19:30:27 +00:00
Daniel Veillard	220907319a	cleanup of global variables, marking some const or private. Daniel * include/libxml/parserInternals.h include/libxml/HTMLparser.h xmlIO.c tree.c parserInternals.c entities.c encoding.c HTMLparser.c: cleanup of global variables, marking some const or private. Daniel	2001-07-16 00:06:07 +00:00
Daniel Veillard	7db3773a5c	store the line numbder in element->content, may break some software, need * DOCBparser.c HTMLparser.c HTMLtree.c SAX.c debugXML.c parser.c tree.c xpointer.c: store the line numbder in element->content, may break some software, need a configuration mechanism Daniel	2001-07-12 01:20:08 +00:00
Daniel Veillard	4d65a1c55b	- parser.c: improved the description of a couple of interfaces upon Larry Stamper suggestion Daniel	2001-07-04 22:06:23 +00:00
Daniel Veillard	f420ac55f8	fixing a too early root closing problem raised byt Prashanth Naidu Daniel * HTMLparser.c: fixing a too early root closing problem raised byt Prashanth Naidu Daniel	2001-07-04 16:04:09 +00:00
Daniel Veillard	c5d64345cf	Summer's cleanup, a really big one: * AUTHORS: added William and Bjorn * include/libxml/*.h *.c README doc/*.html etc.: changed old email to daniel@veillard.com hopefully I won't have to do this again * doc/Makefile.am doc/html/*.html: cleanup makefile, checked that docs can be rebuilt cleanly now * include/libxml/xmlversion.h: removed include/libxml/xmlversion.h from CVs it's generated, added include/libxml/xmlwin32version.h also generated but which should change far less frequently. * catalog.c nanoftp.c: made sure to include libxml.h not libxml/xmlversion.h directly * include/libxml/*.h: include xmlwin32version.h instead of xmlversion.h when compiling on WIN32 and MSC Daniel	2001-06-24 12:13:24 +00:00
Daniel Veillard	017b108fcf	- Makefile.am: cleanup when --without-debug is specified - xinclude.c xpath.c xpathInternals.h xpointer.c: cleanup w.r.t. --without-debug and other include points - catalog.h testCatalog.c: a bit of cleanup and prepare for XML Catalogs - configure.in entities.h tree.h HTMLparser.c: removed --without-corba, made the _private field mandatory Daniel	2001-06-21 11:20:21 +00:00
Daniel Veillard	02bb170a8b	- HTMLparser.[ch] HTMLtree.c: stored the inline/block property of element and use it to avoid outputting formatting spaces at the wrong place. Implemented the format parameter for HTML save. - result/HTML/doc2.htm result/HTML/doc3.htm result/HTML/fp40.htm result/HTML/script.html result/HTML/test2.html result/HTML/test3.html result/HTML/wired.html: of course this impact the result of a number of HTML tests Daniel	2001-06-13 21:11:59 +00:00
Daniel Veillard	f69bb4b5bf	- HTMLparser.c: Closed bug #54891 - result/HTML/cf_128.html* test/HTML/cf_128.html: added the test to the suite forgot to commit this one yesterday - encoding.h hash.c nanoftp.h parser.h tree.h uri.h xlink.h xpointer.c: applied a documentation patch from LotR and filled in a few missing descriptions Daniel	2001-05-19 13:24:56 +00:00
Daniel Veillard	0a2a163d2e	- HTMLparser.c: Patch from Jonas Borgström (htmlGetEndPriority): New function, returns the priority of a certain element. (htmlAutoCloseOnClose): Only close inline elements if they all have lower or equal priority. - result/HTML: this of course changed a number of tests results. Daniel	2001-05-11 14:18:03 +00:00
Daniel Veillard	6426935a9a	- HTMLparser.c: fixed htmlNewDoc SYSTEM and PUBLIC ID inversion when both parameters are NULL. Daniel	2001-05-04 17:52:34 +00:00

432 Commits