libxml2

mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2024-12-27 03:21:26 +03:00

Author	SHA1	Message	Date
Nick Wellnhofer	76d6b0d768	html: Don't escape ASCII chars in href attributes In several cases, href attributes can contain ASCII characters which are illegal in URIs. Escaping them often does more harm than good. Fixes #321.	2022-11-20 21:16:03 +01:00
Nick Wellnhofer	ad338ca737	Remove explicit integer casts Remove explicit integer casts as final operation - in assignments - when passing arguments - when returning values Remove casts - to the same type - from certain range-bound values The main motivation is that these explicit casts don't change the result of operations and only render UBSan's implicit-conversion checks useless. Removing these casts allows UBSan to detect cases where truncation or sign-changes occur unexpectedly. Document some explicit casts as truncating and add a few missing ones.	2022-09-01 02:33:57 +02:00
Nick Wellnhofer	0f568c0b73	Consolidate private header files Private functions were previously declared - in header files in the root directory - in public headers guarded with IN_LIBXML - in libxml.h - redundantly in source files that used them. Consolidate all private header files in include/private.	2022-08-26 02:11:56 +02:00
David Kilzer	054e46b097	Restore behavior of htmlDocContentDumpFormatOutput() Patch by J Pascoe of Apple. * HTMLtree.c: (htmlDocContentDumpFormatOutput): - Prior to commit `b79ab6e6d9`, xmlDoc.type was set to XML_HTML_DOCUMENT_NODE before dumping the HTML output, then restored before returning.	2022-05-14 08:56:47 -07:00
David Kilzer	21561e833a	Mark more static data as `const` Similar to `8f5710379`, mark more static data structures with `const` keyword. Also fix placement of `const` in encoding.c. Original patch by Sarah Wilkin.	2022-04-07 12:01:23 -07:00
Nick Wellnhofer	776d15d383	Don't check for standard C89 headers Don't check for - ctype.h - errno.h - float.h - limits.h - math.h - signal.h - stdarg.h - stdlib.h - string.h - time.h Stop including non-standard headers - malloc.h - strings.h	2022-03-02 00:43:54 +01:00
Nick Wellnhofer	346c3a930c	Remove elfgcchack.h The same optimization can be enabled with -fno-semantic-interposition since GCC 5. clang has always used this option by default.	2022-02-20 21:49:04 +01:00
Nick Wellnhofer	92d9ab4c28	Fix whitespace when serializing empty HTML documents The old, non-recursive HTML serialization code would always terminate the output with a newline. The new implementation omitted the newline if the document node had no children. Readd the newline when serializing empty documents. Fixes #266.	2021-06-07 15:09:53 +02:00
Nick Wellnhofer	85b1792e37	Work around lxml API abuse Make xmlNodeDumpOutput and htmlNodeDumpFormatOutput work with corrupted parent pointers. This used to work with the old recursive code but the non-recursive rewrite required parent pointers to be set correctly. Unfortunately, lxml relies on the old behavior and passes subtrees with a corrupted structure. Fall back to a recursive function call if an invalid parent pointer is detected. Fixes #255.	2021-05-21 12:19:25 +02:00
Nick Wellnhofer	e6495e4789	Remove unused encoding parameter of HTML output functions The encoding string is unused. Encodings are set by way of the output buffer.	2021-02-07 14:39:55 +01:00
Nick Wellnhofer	0b3c64d9f2	Handle dumps of corrupted documents more gracefully Check parent pointers for NULL after the non-recursive rewrite of the serialization code. This avoids segfaults with corrupted documents which can apparently be seen with lxml, see issue #187.	2020-09-29 18:08:37 +02:00
Nick Wellnhofer	c1ba6f54d3	Revert "Do not URI escape in server side includes" This reverts commit `960f0e2756`. This commit introduced - an infinite loop, found by OSS-Fuzz, which could be easily fixed. - an algorithm with quadratic runtime - a security issue, see https://bugzilla.gnome.org/show_bug.cgi?id=769760 A better approach is to add an option not to escape URLs at all which libxml2 should have possibly done in the first place.	2020-08-15 18:32:29 +02:00
Nick Wellnhofer	b79ab6e6d9	Make htmlNodeDumpFormatOutput non-recursive Fixes stack overflow with deeply nested HTML documents. Found by OSS-Fuzz.	2020-07-28 03:44:30 +02:00
Nick Wellnhofer	20c60886e4	Fix typos Resolves #133.	2020-03-08 17:41:53 +01:00
Jared Yanovich	2a350ee9b4	Large batch of typo fixes Closes #109.	2019-09-30 18:04:38 +02:00
Nick Wellnhofer	d459831c1b	Fix HTML serialization with UTF-8 encoding If the encoding is specified as UTF-8, make sure to use a NULL encoding handler.	2018-10-13 16:47:13 +02:00
Nick Wellnhofer	ee501f5449	Stop using doc->charset outside parser code doc->charset does not specify the in-memory encoding which is always UTF-8.	2018-10-13 16:47:01 +02:00
Shaun McCance	7607d9dd45	Allow HTML serializer to output HTML5 DOCTYPE For https://bugzilla.gnome.org/show_bug.cgi?id=747301 Use simple HTML5 DOCTYPE for about:legacy-compat HTML5 uses a DOCTYPE without a PUBLIC or SYSTEM identifier. It looks like this: <!DOCTYPE html> I can't use XSLT to output this, because to get a DOCTYPE I have to provide a PUBLIC or SYSTEM identifier. Luckily, the standards folks recognized this and provided this semantically equivalent form for the HTML DOCTYPE: <!DOCTYPE html SYSTEM "about:legacy-compat"> But people don't like seeing the "legacy" identifier in their output. They'd rather see the shiny new DOCTYPE. Since we know that about:legacy-compat is defined by the W3C to be semantically equivalent to the sans-SYSTEM DOCTYPE, we could just special-case it in the HTML serializer in libxml2. So if you set the SYSTEM identifier to "about:legacy-compat", you get an HTML5 short-form DOCTYPE.	2015-04-03 22:52:36 +08:00
Romain Bondue	960f0e2756	Do not URI escape in server side includes	2013-04-23 20:44:55 +08:00
Daniel Veillard	f8e3db0445	Big space and tab cleanup Remove all space before tabs and space and tabs at end of lines.	2012-09-11 13:26:36 +08:00
Daniel Veillard	7d4c529a33	Improve HTML escaping of attribute on output Handle special cases of &{...} constructs as hinted in the spec http://www.w3.org/TR/html401/appendix/notes.html#h-B.7.1 and special values as comment <!-- ... --> used for server side includes This is limited to attribute values in HTML content.	2012-09-05 12:11:43 +08:00
Daniel Veillard	7b9b07198f	Convert the HTML tree module to the new buffers The new input buffers induced a couple of changes, the others are related to the switch to xmlBuf in saving routines.	2012-07-23 14:24:27 +08:00
Daniel Veillard	39d027cdb7	Fix html serialization error and htmlSetMetaEncoding() For https://bugzilla.gnome.org/show_bug.cgi?id=630682 The python tests were reporting errors, some of it was due to a small change in case encoding, but the main one was about htmlSetMetaEncoding(doc, NULL) being broken by not removing the associated meta tag anymore	2012-05-11 12:38:23 +08:00
Daniel Veillard	c62efc847c	Add options to ignore the internal encoding For both XML and HTML, the document can provide an encoding either in XMLDecl in XML, or as a meta element in HTML head. This adds options to ignore those encodings if the encoding is known in advance, for example if the content had been converted before being passed to the parser. * parser.c include/libxml/parser.h: add XML_PARSE_IGNORE_ENC option for XML parsing * include/libxml/HTMLparser.h HTMLparser.c: adds the HTML_PARSE_IGNORE_ENC for HTML parsing * HTMLtree.c: fix the handling of saving when an unknown encoding is defined in meta document header * xmllint.c: add a --noenc option to activate the new parser options	2011-05-26 11:47:37 +08:00
Daniel Veillard	8d7c1b7ab2	582913 Fix htmlSetMetaEncoding() to be nicer * HTMLtree.c: htmlSetMetaEncoding should not destroy existing meta encoding elements, plus it should not change things at all if the encoding is the same. Also fixed htmlSaveFileFormat() to ask for change if outputting to UTF-8.	2009-08-12 23:03:23 +02:00
Daniel Veillard	74eb54b5b7	575875 don't output charset=html * HTMLtree.c: don't output charset=html in htmlSetMetaEncoding() as this is clearly a libxml2 only thing used for import only	2009-08-12 15:59:01 +02:00
Daniel Veillard	da3fee406d	Borland C fix from Moritz Both regenerate, workaround a problem for buffer * trionan.c: Borland C fix from Moritz Both * testapi.c: regenerate, workaround a problem for buffer testing * xmlIO.c HTMLtree.c: new internal entry point to hide even better xmlAllocOutputBufferInternal * tree.c: harden the code around buffer allocation schemes * parser.c: restore the warning when namespace names are not absolute URIs * runxmlconf.c: continue regression tests if we get the expected number of errors * Makefile.am: run the python tests on make check * xmlsave.c: handle the HTML documents and trees * python/libxml.c: convert python serialization to the xmlSave APIs and avoid some horrible hacks Daniel svn path=/trunk/; revision=3790	2008-09-01 13:08:57 +00:00
Daniel Veillard	fcd02adb71	htmlNodeDumpFormatOutput didn't handle XML_ATTRIBUTE_NODE fixes bug * HTMLtree.c: htmlNodeDumpFormatOutput didn't handle XML_ATTRIBUTE_NODE fixes bug #438390 Daniel svn path=/trunk/; revision=3631	2007-06-12 09:49:40 +00:00
Rob Richards	417b74d0b1	Add linefeeds to error messages allowing for consistent handling. * HTMLtree.c xmlsave.c: Add linefeeds to error messages allowing for consistent handling.	2006-08-15 23:14:24 +00:00
Rob Richards	77b92ff6a8	fix bug #322136 in xmlNodeBufGetContent when entity ref is a child of an * tree.c: fix bug #322136 in xmlNodeBufGetContent when entity ref is a child of an element (fix by Oleksandr Kononenko). * HTMLtree.c include/libxml/HTMLtree.h: Add htmlDocDumpMemoryFormat.	2005-12-20 15:55:14 +00:00
Daniel Veillard	b8c8016044	fixed bug #310333 with a patch close to the provided patch for HTML UTF-8 * HTMLtree.c: fixed bug #310333 with a patch close to the provided patch for HTML UTF-8 serialization * result/HTML/script2.html: this changed the output of that test Daniel	2005-08-08 13:46:45 +00:00
Daniel Veillard	5d4644ef6e	revamped the elfgcchack.h format to cope with gcc4 change of aliasing * doc/apibuild.py doc/elfgcchack.xsl: revamped the elfgcchack.h format to cope with gcc4 change of aliasing allowed scopes, had to add extra informations to doc/libxml2-api.xml to separate the header from the c module source. * .c: updated all c library files to add a #define bottom_xxx and reimport elfgcchack.h thereafter, and a bit of cleanups. doc//* testapi.c: regenerated when rebuilding the API Daniel	2005-04-01 13:11:58 +00:00
Daniel Veillard	aa9a983dbd	fixing bug 168196, <a name=""> must be URI escaped too Daniel * HTMLtree.c: fixing bug 168196, <a name=""> must be URI escaped too Daniel	2005-03-29 20:30:17 +00:00
Daniel Veillard	d5cc0f7f51	augmented types supported a number of new bug fixes and documentation * gentest.py testapi.c: augmented types supported * HTMLtree.c tree.c xmlreader.c xmlwriter.c: a number of new bug fixes and documentation updates. Daniel	2004-11-06 19:24:28 +00:00
Daniel Veillard	ce244ad595	fixed the way the generator works, extended the testing, especially with * gentest.py testapi.c: fixed the way the generator works, extended the testing, especially with more real trees and nodes. * HTMLtree.c tree.c valid.c xinclude.c xmlIO.c xmlsave.c: a bunch of real problems found and fixed. * entities.c: fix error reporting to go through the new handlers Daniel	2004-11-05 10:03:46 +00:00
Daniel Veillard	3d97e669ec	extending the tests coverage more fixes and cleanups Daniel * gentest.py testapi.c: extending the tests coverage * HTMLtree.c tree.c xmlsave.c xpointer.c: more fixes and cleanups Daniel	2004-11-04 10:49:00 +00:00
Daniel Veillard	36e5cd5064	adding xmlMemBlocks() work on generator of an automatic API regression * xmlmemory.c include/libxml/xmlmemory.h: adding xmlMemBlocks() * Makefile.am gentest.py testapi.c: work on generator of an automatic API regression test tool. * SAX2.c nanoftp.c parser.c parserInternals.c tree.c xmlIO.c xmlstring.c: various API hardening changes as a result of running the first set of automatic API regression tests. * test/slashdot16.xml: apparently missing from CVS, committed it Daniel	2004-11-02 14:52:23 +00:00
William M. Brack	13dfa87e91	added the routine xmlNanoHTTPContentLength to the external API * nanohttp.c, include/libxml/nanohttp.h: added the routine xmlNanoHTTPContentLength to the external API (bug151968). * parser.c: fixed unnecessary internal error message (bug152060); also changed call to strncmp over to xmlStrncmp. * encoding.c: fixed compilation warning (bug152307). * tree.c: fixed segfault in xmlCopyPropList (bug152368); fixed a couple of compilation warnings. * HTMLtree.c, debugXML.c, xmlmemory.c: fixed a few compilation warnings; no change to logic.	2004-09-18 04:52:08 +00:00
Daniel Veillard	42fd412637	change --html to make sure we use the HTML serialization rule by default * xmllint.c: change --html to make sure we use the HTML serialization rule by default when HTML parser is used, add --xmlout to allow to force the XML serializer on HTML. * HTMLtree.c: ugly tweak to fix the output on <p> element and solve #125093 * result/HTML/*: this changes the output of some tests Daniel	2003-11-04 08:47:48 +00:00
William M. Brack	76e95df055	Changed all (?) occurrences where validation macros (IS_xxx) had * include/libxml/parserInternals.h HTMLparser.c HTMLtree.c SAX2.c catalog.c debugXML.c entities.c parser.c relaxng.c testSAX.c tree.c valid.c xmlschemas.c xmlschemastypes.c xpath.c: Changed all (?) occurrences where validation macros (IS_xxx) had single-byte arguments to use IS_xxx_CH instead (e.g. IS_BLANK changed to IS_BLANK_CH). This gets rid of many warning messages on certain platforms, and also highlights places in the library which may need to be enhanced for proper UTF8 handling.	2003-10-18 16:20:14 +00:00
Daniel Veillard	e2238d5617	converted too small cleanup Daniel * HTMLtree.c include/libxml/xmlerror.h: converted too * tree.c: small cleanup Daniel	2003-10-09 13:14:55 +00:00
Daniel Veillard	a9cce9cd0d	Okay this is scary but it is just adding a configure option to disable * HTMLtree.c SAX2.c c14n.c catalog.c configure.in debugXML.c encoding.c entities.c nanoftp.c nanohttp.c parser.c relaxng.c testAutomata.c testC14N.c testHTML.c testRegexp.c testRelax.c testSchemas.c testXPath.c threads.c tree.c valid.c xmlIO.c xmlcatalog.c xmllint.c xmlmemory.c xmlreader.c xmlschemas.c example/gjobread.c include/libxml/HTMLtree.h include/libxml/c14n.h include/libxml/catalog.h include/libxml/debugXML.h include/libxml/entities.h include/libxml/nanohttp.h include/libxml/relaxng.h include/libxml/tree.h include/libxml/valid.h include/libxml/xmlIO.h include/libxml/xmlschemas.h include/libxml/xmlversion.h.in include/libxml/xpathInternals.h python/libxml.c: Okay this is scary but it is just adding a configure option to disable output, this touches most of the files. Daniel	2003-09-29 13:20:24 +00:00
William M. Brack	3a6da760c5	Fixed bug 121394 - missing ns on attributes * HTMLtree.c: Fixed bug 121394 - missing ns on attributes	2003-09-15 04:58:14 +00:00
Daniel Veillard	70bcb0ea24	hum try to avoid some troubles when the library is not initialized and one * HTMLtree.c tree.c threads.c: hum try to avoid some troubles when the library is not initialized and one try to save, the locks in threaded env might not been initialized, playing safe * xmlschemastypes.c: apply patch for hexBinary from Charles Bozeman * test/schemas/hexbinary_* result/schemas/hexbinary_*: also added his tests to the regression suite. Daniel	2003-08-08 14:00:28 +00:00
Daniel Veillard	5f5b7bb78e	fixing bug #112904 : html output method escaped plus sign character in URI * HTMLtree.c: fixing bug #112904: html output method escaped plus sign character in URI attribute. Daniel	2003-05-16 17:19:40 +00:00
Daniel Veillard	645c690d49	patch from Vasily Tchekalkin to fix #109865 Daniel * HTMLtree.c: patch from Vasily Tchekalkin to fix #109865 Daniel	2003-04-10 21:40:49 +00:00
Daniel Veillard	c7e9b194e7	Fixed reopening of #78662 <form action="..."> is an URI reference Daniel * HTMLtree.c: Fixed reopening of #78662 <form action="..."> is an URI reference Daniel	2003-03-27 14:08:24 +00:00
Daniel Veillard	04ee2f2d00	avoid escaping ',' in URIs Daniel * HTMLtree.c: avoid escaping ',' in URIs Daniel	2003-03-23 20:31:46 +00:00
Daniel Veillard	5ecaf7f9a7	fixes #102920 about namespace handling in HTML output and section 16.2 * HTMLtree.c tree.c: fixes #102920 about namespace handling in HTML output and section 16.2 "HTML Output Method" of XSLT-1.0 * README: fixed a link Daniel	2003-01-09 13:19:33 +00:00
Daniel Veillard	024b57019f	patch from Mark Vadok about htmlNodeDumpOutput location. removed an * HTMLtree.c include/libxml/HTMLtree.h: patch from Mark Vadok about htmlNodeDumpOutput location. * xpath.c: removed an undefined function signature * doc/apibuild.py doc/libxml2-api.xml: the script was exporting too many symbols in the API breaking the python bindings. Updated with the libxslt/libexslt changes. Daniel	2002-12-12 00:15:55 +00:00


129 Commits