1
0
mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-03-14 22:50:08 +03:00

125 Commits

Author SHA1 Message Date
Nick Wellnhofer
e5107772ff Fix pathological performance when outputting charrefs
If a character can't be represented in the output encoding, it is
converted to a character reference. This used to to replace the
character in the input stream by calling xmlBufAddHead or
xmlBufferAddHead. These functions shifted the entire input array
around, leading to quadratic performance when converting a run of
non-representable characters. This is most pronounced when dumping to
memory.

Output the charref directly instead.

Found with libFuzzer.
2017-06-19 16:06:21 +02:00
Nick Wellnhofer
c9ccbd6a6d Deduplicate code in encoding.c
Introduce static functions xmlEncInputChunk and xmlEncOutputChunk
that handle the internal/iconv/ICU branching.
2017-06-19 16:06:21 +02:00
David Kilzer
4472c3a5a5 Fix some format string warnings with possible format string vulnerability
For https://bugzilla.gnome.org/show_bug.cgi?id=761029

Decorate every method in libxml2 with the appropriate
LIBXML_ATTR_FORMAT(fmt,args) macro and add some cleanups
following the reports.
2016-05-23 15:01:07 +08:00
Gaurav
080a22c5ea Avoid a possibility of dangling encoding handler
For https://bugzilla.gnome.org/show_bug.cgi?id=711149

In Function:
int xmlCharEncCloseFunc(xmlCharEncodingHandler *handler)

If the freed handler is any one of handlers[i] list, then it will make that
hanldlers[i] as dangling. This may lead to crash issues at places where
handlers is read.
2013-11-29 23:10:50 +08:00
Denis Pauk
e28c8a1ace #705267 - add additional defines checks for support "./configure --with-minimum"
https://bugzilla.gnome.org/show_bug.cgi?id=705267
2013-08-03 22:00:17 +08:00
Daniel Veillard
bf058dce13 Fix the flushing out of raw buffers on encoding conversions
https://bugzilla.gnome.org/show_bug.cgi?id=692915

the new set of converting functions tried to limit the encoding
conversion of the raw buffer to the consumption one to work in
a more progressive fashion. Unfortunately this was bad for
performances and led to errors on progressive parsing when
a very large chunk was close to the end of the document. Fix
the new internal function and switch back to the old way of
converting. Fix another bug in the process.
2013-02-13 18:19:42 +08:00
Petr Sumbera
6f49c73b53 Try IBM-037 when looking for EBCDIC handlers
http://en.wikipedia.org/wiki/EBCDIC_037
as it is another variat of EBCDIC
2012-12-12 15:41:30 +08:00
Daniel Veillard
f8e3db0445 Big space and tab cleanup
Remove all space before tabs and space and tabs at end of lines.
2012-09-11 13:26:36 +08:00
Daniel Veillard
28cc42d068 Regenerating docs and API files
Various cleanups
* configure.in: force regeneration of APIs in my environment
* buf.c buf.h enc.h encoding.c include/libxml/tree.h
  include/libxml/xmlerror.h save.h tree.c: various comment cleanups
  pointed by apibuild
* doc/apibuild.py: added the 3 new internal headers in the excludes
* doc/libxml2-api.xml doc/libxml2-refs.xml: regenerated the API
* doc/symbols.xml: listing new entry points for 2.9.0
* doc/devhelp/*: regenerated
2012-08-10 10:00:18 +08:00
Daniel Veillard
18d0db2503 Adding new encoding function to deal with the new structures
* encoding.c: adds xmlCharEncFirstLineInput, xmlCharEncInput and
  xmlCharEncOutput
* enc.h: the functions are not made public but added to this new header
2012-07-23 14:24:26 +08:00
Timothy Elliott
689408bd86 Prevent an infinite loop when dumping a node with encoding problems
When a node is dumped with a new encoding, we may encounter characters
that are not supported in the new encoding. libxml2 handles this by
replacing the character with character references, but in some encodings
this can result in an infinite loop when the character references
themselves contain unsupported characters.

This fixes the infinite loop by undoing a character reference substitution
when it cannot be inserted, and returning an encoder error.

This bug was noticed when looking into an infinite loop bug report for
the Ruby Nokogiri project. The original bug report, "nokogiri process
hangs on call to inner_html" is here:
https://github.com/tenderlove/nokogiri/issues/400
2012-05-08 22:03:22 +08:00
Daniel Veillard
69f04562f7 Fix an off by one error in encoding
this off by one error doesn't seems to reproduce on linux
but the error is real.
2011-08-19 11:05:04 +08:00
Giuseppe Iuculano
48f7dcb724 480323 add code to plug in ICU converters by default
This is not configured in by default but after some serious massaging
incorporate that patch from Chromium/Chrome.
2010-11-04 17:42:42 +01:00
Daniel Veillard
ad4f0a2dc8 630140 better fix for iso995x encoding error
Changing semantic of xmlCharEncInFunc() wasn't the proper way to
do this, better change UTF8ToISO8859x() appropriately
2010-11-03 20:40:46 +01:00
Daniel Veillard
1cc912ec7e Various cleanups on encoding handling
Done while chasing previous bug
2010-11-03 19:26:35 +01:00
Daniel Veillard
083caf5ec8 630140 fix iso995x encoding error
https://bugzilla.gnome.org/show_bug.cgi?id=630140
Fix the bug, which happen when using the embedded converters and
not iconv
2010-11-03 19:24:05 +01:00
Daniel Veillard
d44b936499 A few more safety cleanup raised by scan
* SAX2.c encoding.c parser.c xmlschemas.c: a few more safety checks
* relaxng.c: remove an unused intitialization
2009-09-07 12:15:08 +02:00
Daniel Veillard
76d364583e Fixing assorted potential problems raised by scan
* encoding.c parser.c relaxng.c runsuite.c tree.c xmlreader.c
  xmlschemas.c: nothing really serious but better safe than sorry
2009-09-07 11:19:33 +02:00
Daniel Veillard
7e385bd4e2 566012 autodetected encoding and encoding conflict
* encoding.c parser.c parserInternals.c: when we autodetect an encoding
  but it's actually not completely compatible with the one declared
  great care must be taken to not convert more than just the first line.
  Led to some refactoring, more private functions and a bit of cleanup.
2009-08-26 11:38:49 +02:00
Martin Kögler
c78988acb7 566012 Incomplete EBCDIC parsing support
* encoding.c: the iconv converter is sometimes only found as "EBCDIC-US"
2009-08-24 16:47:48 +02:00
Daniel Veillard
e83e93e715 make a new kind of buffer where shrinking and adding in head can avoid
* include/libxml/tree.h tree.c: make a new kind of buffer where
  shrinking and adding in head can avoid reallocation or full
  buffer memmoves
* encoding.c xmlIO.c: use the new kind of buffers for output
  buffers
Daniel

svn path=/trunk/; revision=3787
2008-08-30 12:52:26 +00:00
Daniel Veillard
f124539f7a buffer may not be large enough to convert to UCS4, patch from Christian
* encoding.c: buffer may not be  large enough to convert to
  UCS4, patch from Christian Fruth , fixes #504015
Daniel

svn path=/trunk/; revision=3727
2008-04-03 09:46:34 +00:00
Daniel Veillard
57c9db0725 poblem with encoding detection for UTF-16 reported by Ashwin and found by
* encoding.c: poblem with encoding detection for UTF-16 reported by
  Ashwin and found by Bill
* test/valid/dtds/utf16b.ent test/valid/dtds/utf16l.ent
  test/valid/UTF16Entity.xml result/valid/UTF16Entity.xml*: added
  the example to the regression tests
Daniel

svn path=/trunk/; revision=3700
2008-03-06 14:37:10 +00:00
Daniel Veillard
8e1a46d526 patch from Roumen Petrov to detect if iconv() needs a const for the second
* config.h.in configure.in encoding.c: patch from Roumen Petrov
  to detect if iconv() needs a const for the second parameter
Daniel

svn path=/trunk/; revision=3693
2008-02-15 07:47:26 +00:00
William M. Brack
38d452ac1c Fixed typo in xmlCharEncFirstLine pointed out by Mark Rowe (bug #440159)
* encoding.c: Fixed typo in xmlCharEncFirstLine pointed out
  by Mark Rowe (bug #440159)
* include/libxml/xmlversion.h.in: Added check for definition of
  _POSIX_C_SOURCE to avoid warnings on Apple OS/X (patch from
  Wendy Doyle and Mark Rowe, bug #346675)
* schematron.c, testapi.c, tree.c, xmlIO.c, xmlsave.c: minor
  changes to fix compilation warnings - no change to logic.

svn path=/trunk/; revision=3618
2007-05-22 16:00:06 +00:00
Daniel Veillard
28aac0b0f4 remove a warning check with uppercase for AIX iconv() should fix #352644
* HTMLparser.c: remove a warning
* encoding.c: check with uppercase for AIX iconv() should fix #352644
* doc/examples/Makefile.am: partially handle one bug report
Daniel
2006-10-16 08:31:18 +00:00
Daniel Veillard
df750627d6 fixing bug #340398 xmlCharEncOutFunc writing to input buffer Daniel
* encoding.c: fixing bug #340398 xmlCharEncOutFunc writing to
  input buffer
Daniel
2006-05-02 12:24:06 +00:00
Daniel Veillard
aac7c68e87 fix a few warning raised by gcc-4.1 and latests changes Daniel
* c14n.c encoding.c xmlschemas.c xpath.c xpointer.c: fix a few
  warning raised by gcc-4.1 and latests changes
Daniel
2006-03-10 13:40:16 +00:00
Daniel Veillard
2728f845c5 more cleanups based on coverity reports. Daniel
* SAX2.c catalog.c encoding.c entities.c example/gjobread.c
  python/libxml.c: more cleanups based on coverity reports.
Daniel
2006-03-09 16:49:24 +00:00
Daniel Veillard
2e7598cb06 avoid passing a char[] as snprintf first argument. implemented
* encoding.c parserInternals.c: avoid passing a char[] as snprintf
  first argument.
* threads.c include/libxml/threads.h: implemented xmlIsThreadsEnabled()
  based on Andrew W. Nosenko idea.
* doc/* elfgcchack.h: regenerated the API
Daniel
2005-09-02 12:28:34 +00:00
Daniel Veillard
2644ab270e applied the patch suggested #309565 which can avoid looping in error
* encoding.c: applied the patch suggested #309565 which can avoid
  looping in error conditions.
Daniel
2005-08-24 14:22:55 +00:00
Daniel Veillard
1fc3ed0280 finally converted the encoding module to the common error reporting
* encoding.c error.c include/libxml/xmlerror.h: finally converted
  the encoding module to the common error reporting mechanism
* doc/* doc/html/libxml-xmlerror.html: rebuilt
Daniel
2005-08-24 12:46:09 +00:00
Daniel Veillard
24505b0f5c a lot of small cleanups based on Linus' sparse check output. Daniel
* HTMLparser.c SAX2.c encoding.c globals.c parser.c relaxng.c
  runsuite.c runtest.c schematron.c testHTML.c testReader.c
  testRegexp.c testSAX.c testThreads.c valid.c xinclude.c xmlIO.c
  xmllint.c xmlmodule.c xmlschemas.c xpath.c xpointer.c: a lot of
  small cleanups based on Linus' sparse check output.
Daniel
2005-07-28 23:49:35 +00:00
Daniel Veillard
5d4644ef6e revamped the elfgcchack.h format to cope with gcc4 change of aliasing
* doc/apibuild.py doc/elfgcchack.xsl: revamped the elfgcchack.h
  format to cope with gcc4 change of aliasing allowed scopes, had
  to add extra informations to doc/libxml2-api.xml to separate
  the header from the c module source.
* *.c: updated all c library files to add a #define bottom_xxx
  and reimport elfgcchack.h thereafter, and a bit of cleanups.
* doc//* testapi.c: regenerated when rebuilding the API
Daniel
2005-04-01 13:11:58 +00:00
Daniel Veillard
394902e0d2 fix unitinialized variable in not frequently used code bug #172182 Daniel
* encoding.c: fix unitinialized variable in not frequently used
  code bug #172182
Daniel
2005-03-31 08:43:44 +00:00
Daniel Veillard
cffc1c7af1 removed a static buffer in xmlByteConsumed(), as pointed by Ben Maurer,
* encoding.c: removed a static buffer in xmlByteConsumed(),
  as pointed by Ben Maurer, fixes #170086
* xmlschemas.c: remove a potentially uninitialized pointer warning
Daniel
2005-03-12 18:54:55 +00:00
Daniel Veillard
56de87ee0d fix the comment to describe the real return values lot of work on the
* encoding.c: fix the comment to describe the real return values
* pattern.c xpath.c include/libxml/pattern.h: lot of work on
  the patterns, pluggin in the XPath default evaluation, but
  disabled right now because it's not yet good enough for XSLT.
  pattern.h streaming API are likely to be changed to handle
  relative and absolute paths in the same expression.
Daniel
2005-02-16 00:22:29 +00:00
Daniel Veillard
aba37dffd7 forgot a $(srcdir) stupid error wrong name #157976 Daniel
* Makefile.am: forgot a $(srcdir)
* encoding.c: stupid error wrong name #157976
Daniel
2004-11-11 20:42:04 +00:00
Daniel Veillard
01ca83cd4c fixed a regression in iconv support. Daniel
* encoding.c: fixed a regression in iconv support.
Daniel
2004-11-06 13:26:59 +00:00
Daniel Veillard
ce682bc24b autogenerate a minimal NULL value sequence for unknown pointer types This
* gentest.py testapi.c: autogenerate a minimal NULL value sequence
  for unknown pointer types
* HTMLparser.c SAX2.c chvalid.c encoding.c entities.c parser.c
  parserInternals.c relaxng.c valid.c xmlIO.c xmlreader.c
  xmlsave.c xmlschemas.c xmlschemastypes.c xmlstring.c xpath.c
  xpointer.c: This uncovered an impressive amount of entry points
  not checking for NULL pointers when they ought to, closing all
  the open gaps.
Daniel
2004-11-05 17:22:25 +00:00
Daniel Veillard
05f9735ba3 Fixed bug #153937, making sure the conversion functions return the number
* encoding.c doc/examples/testWriter.c: Fixed bug #153937, making
  sure the conversion functions return the number of byte written.
  Had to fix one of the examples.
Daniel
2004-10-31 15:35:32 +00:00
William M. Brack
13dfa87e91 added the routine xmlNanoHTTPContentLength to the external API
* nanohttp.c, include/libxml/nanohttp.h: added the routine
  xmlNanoHTTPContentLength to the external API (bug151968).
* parser.c: fixed unnecessary internal error message (bug152060);
  also changed call to strncmp over to xmlStrncmp.
* encoding.c: fixed compilation warning (bug152307).
* tree.c: fixed segfault in xmlCopyPropList (bug152368); fixed
  a couple of compilation warnings.
* HTMLtree.c, debugXML.c, xmlmemory.c: fixed a few compilation
  warnings; no change to logic.
2004-09-18 04:52:08 +00:00
William M. Brack
f54924bd7e applied fixes for the UTF8ToISO8859x transcoding routine suggested by Mark
* encoding.c: applied fixes for the UTF8ToISO8859x transcoding
  routine suggested by Mark Itzcovitz
2004-09-09 14:35:17 +00:00
William M. Brack
a3215c7ae6 many further little changes for OOM problems. Now seems to be getting
* SAX2.c, encoding.c, error.c, parser.c, tree.c, uri.c, xmlIO.c,
  xmlreader.c, include/libxml/tree.h: many further little changes
  for OOM problems.  Now seems to be getting closer to "ok".
* testOOM.c: added code to intercept more errors, found more
  problems with library. Changed method of flagging / counting
  errors intercepted.
2004-07-31 16:24:01 +00:00
Daniel Veillard
b5da42af42 small patch to try to fix a warning with Sun One compiler Daniel
* encoding.c: small patch to try to fix a warning with Sun One compiler
Daniel
2004-02-21 14:57:44 +00:00
Daniel Veillard
3288882e8a small patch removing a warning with MS compiler. Daniel
* encoding.c: small patch removing a warning with MS compiler.
Daniel
2004-02-21 14:21:50 +00:00
Daniel Veillard
3671190b54 added xmlByteConsumed() interface updated the benchmark rebuilt the docs
* parserInternals.c xmlIO.c encoding.c include/libxml/parser.h
  include/libxml/xmlIO.h: added xmlByteConsumed() interface
* doc/*: updated the benchmark rebuilt the docs
* python/tests/Makefile.am python/tests/indexes.py: added a
  specific regression test for xmlByteConsumed()
* include/libxml/encoding.h rngparser.c tree.c: small cleanups
Daniel
2004-02-11 13:25:26 +00:00
William M. Brack
030a7a1729 applied patch supplied by Christophe Dubach to fix problem with
* encoding.c: applied patch supplied by Christophe Dubach
  to fix problem with --with-minimum configuration
  (bug 133773)
* nanoftp.c: fixed potential buffer overflow problem,
  similar to fix just applied to nanohttp.c.
2004-02-10 12:48:57 +00:00
Daniel Veillard
182d32a537 applied a small patch from Alfred Mickautsch to avoid an out of bound
* encoding.c: applied a small patch from Alfred Mickautsch
  to avoid an out of bound error in isolat1ToUTF8()
Daniel
2004-02-09 12:42:55 +00:00
William M. Brack
a2e844a3b3 moved string and UTF8 routines out of parser.c and encoding.c into a new
* encoding.c, parser.c, xmlstring.c, Makefile.am,
  include/libxml/Makefile.am, include/libxml/catalog.c,
  include/libxml/chvalid.h, include/libxml/encoding.h,
  include/libxml/parser.h, include/libxml/relaxng.h,
  include/libxml/tree.h, include/libxml/xmlwriter.h,
  include/libxml/xmlstring.h:
  moved string and UTF8 routines out of parser.c and encoding.c
  into a new module xmlstring.c with include file
  include/libxml/xmlstring.h mostly using patches from Reid
  Spencer.  Since xmlChar now defined in xmlstring.h, several
  include files needed to have a #include added for safety.
* doc/apibuild.py: added some additional sorting for various
  references displayed in the APIxxx.html files.  Rebuilt the
  docs, and also added new file for xmlstring module.
* configure.in: small addition to help my testing; no effect on
  normal usage.
* doc/search.php: added $_GET[query] so that persistent globals
  can be disabled (for recent versions of PHP)
2004-01-06 11:52:13 +00:00