1
0
mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2024-10-26 12:25:09 +03:00
Commit Graph

211 Commits

Author SHA1 Message Date
Nick Wellnhofer
98840d40da parser: Rework EBCDIC code page detection
To detect EBCDIC code pages, we used to switch the encoding twice and
had to be very careful not to decode data after the XML declaration
before the second switch. This relied on a hard-coded expected size of
the XML declaration and was complicated and unreliable.

Now we convert the first 200 bytes to EBCDIC-US and parse the encoding
declaration manually.
2023-03-21 21:35:15 +01:00
Nick Wellnhofer
1c5e1fc194 malloc-fail: Check for malloc failure in xmlFindCharEncodingHandler
Don't return encoding handlers with a NULL name.

Found with libFuzzer, see #344.
2023-02-17 17:16:50 +01:00
Nick Wellnhofer
d18f9c1102 malloc-fail: Fix leak of xmlCharEncodingHandler
Also free handler if its name is NULL.

Found with libFuzzer, see #344.
2023-02-17 17:16:50 +01:00
Nick Wellnhofer
3cc900f098 encoding: Cast toupper argument to unsigned char
Fixes undefined behavior.

Also cast return value explicitly to fix implicit-integer-sign-change
checks.
2023-02-17 17:16:50 +01:00
Nick Wellnhofer
2355eac59e malloc-fail: Fix null deref if growing input buffer fails
Also add some error checks.

Found with libFuzzer, see #344.
2023-01-24 11:32:15 +01:00
Nick Wellnhofer
0f54af7494 encoding.c: Fix for documentation generator
Top-level macro invocations throw off the documentation parser.
2022-12-08 18:40:58 +01:00
Nick Wellnhofer
53ab38408d encoding: Make init function private 2022-11-27 02:11:07 +01:00
Nick Wellnhofer
3e9d5e4f7f encoding: Remove unused variable xmlDefaultCharEncodingHandler 2022-11-27 02:11:07 +01:00
Nick Wellnhofer
1406b20fe9 encoding: Allocate default handlers statically 2022-11-24 19:21:01 +01:00
Nick Wellnhofer
2059df5358 buf: Deprecate static/immutable buffers 2022-11-20 21:16:03 +01:00
Nick Wellnhofer
ad338ca737 Remove explicit integer casts
Remove explicit integer casts as final operation

- in assignments
- when passing arguments
- when returning values

Remove casts

- to the same type
- from certain range-bound values

The main motivation is that these explicit casts don't change the result
of operations and only render UBSan's implicit-conversion checks
useless. Removing these casts allows UBSan to detect cases where
truncation or sign-changes occur unexpectedly.

Document some explicit casts as truncating and add a few missing ones.
2022-09-01 02:33:57 +02:00
Nick Wellnhofer
0f568c0b73 Consolidate private header files
Private functions were previously declared

- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.

Consolidate all private header files in include/private.
2022-08-26 02:11:56 +02:00
David Kilzer
c14cac8bba xmlBufAvail() should return length without including a byte for NUL terminator
* buf.c:
(xmlBufAvail):
- Return the number of bytes available in the buffer, but do not
  include a byte for the NUL terminator so that it is reserved.

* encoding.c:
(xmlCharEncFirstLineInput):
(xmlCharEncInput):
(xmlCharEncOutput):
* xmlIO.c:
(xmlOutputBufferWriteEscape):
- Remove code that subtracts 1 from the return value of
  xmlBufAvail().  It was implemented inconsistently anyway.
2022-05-25 18:25:19 -07:00
David Kilzer
21561e833a Mark more static data as const
Similar to 8f5710379, mark more static data structures with
`const` keyword.

Also fix placement of `const` in encoding.c.

Original patch by Sarah Wilkin.
2022-04-07 12:01:23 -07:00
Nick Wellnhofer
40483d0ce2 Deprecate module init and cleanup functions
These functions shouldn't be part of the public API. Most init
functions are only thread-safe when called from xmlInitParser. Global
variables should only be cleaned up by calling xmlCleanupParser.
2022-03-06 15:59:43 +01:00
Nick Wellnhofer
f2072a8b2f Fix memory leak in xmlFindCharEncodingHandler
Fix memory leak in an unlikely error condition. Thanks to Wentao Liang
for the report.

Fixes #342.
2022-03-05 18:27:12 +01:00
Nick Wellnhofer
21ddad5284 Remove ICONV_CONST test
We can simply cast the offending pointer to (void *).
2022-03-04 22:08:58 +01:00
Nick Wellnhofer
776d15d383 Don't check for standard C89 headers
Don't check for

- ctype.h
- errno.h
- float.h
- limits.h
- math.h
- signal.h
- stdarg.h
- stdlib.h
- string.h
- time.h

Stop including non-standard headers

- malloc.h
- strings.h
2022-03-02 00:43:54 +01:00
Nick Wellnhofer
b66ce0bba8 Don't include ICU headers in public headers
There's no need to make these implementation details public.
2022-03-01 13:02:49 +01:00
Nick Wellnhofer
c41bc10da3 Fix unused variable warnings with disabled features 2022-02-22 19:57:12 +01:00
Nick Wellnhofer
346c3a930c Remove elfgcchack.h
The same optimization can be enabled with -fno-semantic-interposition
since GCC 5. clang has always used this option by default.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
7abc6e6a24 Fix integer conversion warning in xmlIconvWrapper
Use size_t for return value of iconv(3) to avoid an UBSan integer
conversion warning.
2022-01-25 03:07:30 +01:00
Mohammad Razavi
eb4c1bf855 Fix random dropping of characters on dumping ASCII encoded XML
Fix a bug in xmlCharEncOutput return value which will cause
xmlNodeDumpOutput to drop characters randomly.

xmlCharEncOutput returns zero if the length of the input buffer is
zero but ignores the fact that it may already encoded the input buffer
and the input's length is zero due to the fact that xmlEncOutputChunk
returned -2 errors and underlying code tries to fix the error by
encoding the input.

xmlCharEncOutput is collecting the number of bytes written to the
output buffer but is returning zero instead of the total number of
bytes in this situation. This commit will fix this issue by returning
the total number of bytes instead. So the xmlNodeDumpOutput will also
continue writing and will not stop due to the fact that it mistakenly
thinks the output buffer is not changed in that iteration.

Fixes #314
2022-01-16 15:08:44 +01:00
David Kilzer
03bb929390 Fix parse failure when 4-byte character in UTF-16 BE is split across a chunk
This makes the logic in UTF16BEToUTF8() match UTF16LEToUTF8().

* encoding.c:
(UTF16LEToUTF8):
- Fix comment to describe what the code does.
(UTF16BEToUTF8):
- Fix undefined behavior which was applied to UTF16LEToUTF8() in
  2f9382033e.
- Add bounds check to while() loop which was applied to
  UTF16LEToUTF8() in be803967db.
- Do not return -2 when (in >= inend) to fix the bug.  This was
  applied to UTF16LEToUTF8() in 496a1cf592.
- Inline (<< 8) statements to match UTF16LEToUTF8().

Add the following tests and results:

  test/text-4-byte-UTF-16-BE-offset.xml
  test/text-4-byte-UTF-16-BE.xml
  test/text-4-byte-UTF-16-LE-offset.xml
  test/text-4-byte-UTF-16-LE.xml
2022-01-16 14:07:17 +01:00
David King
b92b16f659 Remove unused variable in xmlCharEncOutFunc
Fixes a compiler warning:

encoding.c: In function 'xmlCharEncOutFunc__internal_alias':
encoding.c:2632:9: warning: unused variable 'output' [-Wunused-variable]
 2632 |     int output = 0;

https://gitlab.gnome.org/GNOME/libxml2/-/issues/254
2021-05-23 11:55:32 +02:00
Nick Wellnhofer
dcb80b92da Fix slow parsing of HTML with encoding errors
Under certain circumstances, the HTML parser would try to guess and
switch input encodings multiple times, leading to slow processing of
documents with encoding errors. The repeated scanning of the input
buffer when guessing encodings could even lead to quadratic behavior.

The code htmlCurrentChar probably assumed that if there's an encoding
handler, it is guaranteed to produce valid UTF-8. This holds true in
general, but if the detected encoding was "UTF-8", the UTF8ToUTF8
encoding handler simply invoked memcpy without checking for invalid
UTF-8. This still must be fixed, preferably by not using this handler
at all.

Also leave a note that switching encodings twice seems impossible to
implement correctly. Add a check when handling UTF-8 encoding errors
in htmlCurrentChar to avoid this situation, even if encoders produce
invalid UTF-8.

Found by OSS-Fuzz.
2021-02-20 21:28:56 +01:00
Xiaoming Ni
649d02eaa4 encoding: fix memleak in xmlRegisterCharEncodingHandler()
The return type of xmlRegisterCharEncodingHandler() is void. The invoker
cannot determine whether xmlRegisterCharEncodingHandler() is executed
successfully. when nbCharEncodingHandler >= MAX_ENCODING_HANDLERS, the
"handler" is not added to the array "handlers". As a result, the memory
of "handler" cannot be managed and released: memory leakage.

so add "xmlfree(handler)" to fix memory leakage on the failure branch of
xmlRegisterCharEncodingHandler().

Reported-by: wuqing <wuqing30@huawei.com>
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
2020-12-07 14:38:14 +01:00
Frederik Seiffert
b516ed189e Fix building with ICU 68.
ICU 68 no longer defines the TRUE macro.

Closes #204.
2020-11-19 18:10:32 +01:00
Nick Wellnhofer
1e41e4fa8e Fix return values and documentation in encoding.c
Make xmlEncInputChunk and xmlEncOutputChunk return 0 on success and
never a positive value.

Make xmlCharEncFirstLineInt, xmlCharEncFirstLineInt and
xmlCharEncOutFunc return the number of bytes written.
2020-07-06 15:06:13 +02:00
Nick Wellnhofer
2f9382033e Fix undefined behavior in UTF16LEToUTF8
Don't perform arithmetic on null pointer.

Found with libFuzzer and UBSan.
2020-06-15 21:23:54 +02:00
Nick Wellnhofer
a697ed1e24 Fix return value of xmlCharEncOutput
Commit 407b393d introduced a regression caused by xmlCharEncOutput
returning 0 in case of success instead of the number of bytes written.
Always use its return value for nbchars in xmlOutputBufferWrite.

Fixes #166.
2020-06-15 15:23:38 +02:00
Nick Wellnhofer
20c60886e4 Fix typos
Resolves #133.
2020-03-08 17:41:53 +01:00
Jared Yanovich
2a350ee9b4 Large batch of typo fixes
Closes #109.
2019-09-30 18:04:38 +02:00
Andrey Bienkowski
d2293cdbc8 Remove a misleading line from xmlCharEncOutput
Closes: https://bugzilla.gnome.org/show_bug.cgi?id=793028

It seams this line was accidentally copied over from xmlCharEncOutFunc.
In xmlCharEncOutput output is a pointer so incrementing it by ret can
point it where it wasn't supposed to be pointing. Luckily the current
implementation doesn't dereference the pointer after advancing it.

Signed-off-by: Daniel Veillard <veillard@redhat.com>
2018-07-23 10:21:38 +08:00
Nick Wellnhofer
772c06487b Fix unused parameter warning without ICU 2017-11-09 17:56:31 +01:00
Joel Hockey
0b19f236a2 Fixed ICU to set flush correctly and provide pivot buffer.
By always setting flush=TRUE when doing multiple reads, ICU
will not correctly handle truncated utf8 chars across read
boundaries.

The fix is to set flush=TRUE only on final read, and to
provide a pivot buffer which is maintained by libxml
between calls to ucnv_convertEx.
2017-11-04 15:25:31 +01:00
Nick Wellnhofer
e5107772ff Fix pathological performance when outputting charrefs
If a character can't be represented in the output encoding, it is
converted to a character reference. This used to to replace the
character in the input stream by calling xmlBufAddHead or
xmlBufferAddHead. These functions shifted the entire input array
around, leading to quadratic performance when converting a run of
non-representable characters. This is most pronounced when dumping to
memory.

Output the charref directly instead.

Found with libFuzzer.
2017-06-19 16:06:21 +02:00
Nick Wellnhofer
c9ccbd6a6d Deduplicate code in encoding.c
Introduce static functions xmlEncInputChunk and xmlEncOutputChunk
that handle the internal/iconv/ICU branching.
2017-06-19 16:06:21 +02:00
David Kilzer
4472c3a5a5 Fix some format string warnings with possible format string vulnerability
For https://bugzilla.gnome.org/show_bug.cgi?id=761029

Decorate every method in libxml2 with the appropriate
LIBXML_ATTR_FORMAT(fmt,args) macro and add some cleanups
following the reports.
2016-05-23 15:01:07 +08:00
Gaurav
080a22c5ea Avoid a possibility of dangling encoding handler
For https://bugzilla.gnome.org/show_bug.cgi?id=711149

In Function:
int xmlCharEncCloseFunc(xmlCharEncodingHandler *handler)

If the freed handler is any one of handlers[i] list, then it will make that
hanldlers[i] as dangling. This may lead to crash issues at places where
handlers is read.
2013-11-29 23:10:50 +08:00
Denis Pauk
e28c8a1ace #705267 - add additional defines checks for support "./configure --with-minimum"
https://bugzilla.gnome.org/show_bug.cgi?id=705267
2013-08-03 22:00:17 +08:00
Daniel Veillard
bf058dce13 Fix the flushing out of raw buffers on encoding conversions
https://bugzilla.gnome.org/show_bug.cgi?id=692915

the new set of converting functions tried to limit the encoding
conversion of the raw buffer to the consumption one to work in
a more progressive fashion. Unfortunately this was bad for
performances and led to errors on progressive parsing when
a very large chunk was close to the end of the document. Fix
the new internal function and switch back to the old way of
converting. Fix another bug in the process.
2013-02-13 18:19:42 +08:00
Petr Sumbera
6f49c73b53 Try IBM-037 when looking for EBCDIC handlers
http://en.wikipedia.org/wiki/EBCDIC_037
as it is another variat of EBCDIC
2012-12-12 15:41:30 +08:00
Daniel Veillard
f8e3db0445 Big space and tab cleanup
Remove all space before tabs and space and tabs at end of lines.
2012-09-11 13:26:36 +08:00
Daniel Veillard
28cc42d068 Regenerating docs and API files
Various cleanups
* configure.in: force regeneration of APIs in my environment
* buf.c buf.h enc.h encoding.c include/libxml/tree.h
  include/libxml/xmlerror.h save.h tree.c: various comment cleanups
  pointed by apibuild
* doc/apibuild.py: added the 3 new internal headers in the excludes
* doc/libxml2-api.xml doc/libxml2-refs.xml: regenerated the API
* doc/symbols.xml: listing new entry points for 2.9.0
* doc/devhelp/*: regenerated
2012-08-10 10:00:18 +08:00
Daniel Veillard
18d0db2503 Adding new encoding function to deal with the new structures
* encoding.c: adds xmlCharEncFirstLineInput, xmlCharEncInput and
  xmlCharEncOutput
* enc.h: the functions are not made public but added to this new header
2012-07-23 14:24:26 +08:00
Timothy Elliott
689408bd86 Prevent an infinite loop when dumping a node with encoding problems
When a node is dumped with a new encoding, we may encounter characters
that are not supported in the new encoding. libxml2 handles this by
replacing the character with character references, but in some encodings
this can result in an infinite loop when the character references
themselves contain unsupported characters.

This fixes the infinite loop by undoing a character reference substitution
when it cannot be inserted, and returning an encoder error.

This bug was noticed when looking into an infinite loop bug report for
the Ruby Nokogiri project. The original bug report, "nokogiri process
hangs on call to inner_html" is here:
https://github.com/tenderlove/nokogiri/issues/400
2012-05-08 22:03:22 +08:00
Daniel Veillard
69f04562f7 Fix an off by one error in encoding
this off by one error doesn't seems to reproduce on linux
but the error is real.
2011-08-19 11:05:04 +08:00
Giuseppe Iuculano
48f7dcb724 480323 add code to plug in ICU converters by default
This is not configured in by default but after some serious massaging
incorporate that patch from Chromium/Chrome.
2010-11-04 17:42:42 +01:00
Daniel Veillard
ad4f0a2dc8 630140 better fix for iso995x encoding error
Changing semantic of xmlCharEncInFunc() wasn't the proper way to
do this, better change UTF8ToISO8859x() appropriately
2010-11-03 20:40:46 +01:00
Daniel Veillard
1cc912ec7e Various cleanups on encoding handling
Done while chasing previous bug
2010-11-03 19:26:35 +01:00
Daniel Veillard
083caf5ec8 630140 fix iso995x encoding error
https://bugzilla.gnome.org/show_bug.cgi?id=630140
Fix the bug, which happen when using the embedded converters and
not iconv
2010-11-03 19:24:05 +01:00
Daniel Veillard
d44b936499 A few more safety cleanup raised by scan
* SAX2.c encoding.c parser.c xmlschemas.c: a few more safety checks
* relaxng.c: remove an unused intitialization
2009-09-07 12:15:08 +02:00
Daniel Veillard
76d364583e Fixing assorted potential problems raised by scan
* encoding.c parser.c relaxng.c runsuite.c tree.c xmlreader.c
  xmlschemas.c: nothing really serious but better safe than sorry
2009-09-07 11:19:33 +02:00
Daniel Veillard
7e385bd4e2 566012 autodetected encoding and encoding conflict
* encoding.c parser.c parserInternals.c: when we autodetect an encoding
  but it's actually not completely compatible with the one declared
  great care must be taken to not convert more than just the first line.
  Led to some refactoring, more private functions and a bit of cleanup.
2009-08-26 11:38:49 +02:00
Martin Kögler
c78988acb7 566012 Incomplete EBCDIC parsing support
* encoding.c: the iconv converter is sometimes only found as "EBCDIC-US"
2009-08-24 16:47:48 +02:00
Daniel Veillard
e83e93e715 make a new kind of buffer where shrinking and adding in head can avoid
* include/libxml/tree.h tree.c: make a new kind of buffer where
  shrinking and adding in head can avoid reallocation or full
  buffer memmoves
* encoding.c xmlIO.c: use the new kind of buffers for output
  buffers
Daniel

svn path=/trunk/; revision=3787
2008-08-30 12:52:26 +00:00
Daniel Veillard
f124539f7a buffer may not be large enough to convert to UCS4, patch from Christian
* encoding.c: buffer may not be  large enough to convert to
  UCS4, patch from Christian Fruth , fixes #504015
Daniel

svn path=/trunk/; revision=3727
2008-04-03 09:46:34 +00:00
Daniel Veillard
57c9db0725 poblem with encoding detection for UTF-16 reported by Ashwin and found by
* encoding.c: poblem with encoding detection for UTF-16 reported by
  Ashwin and found by Bill
* test/valid/dtds/utf16b.ent test/valid/dtds/utf16l.ent
  test/valid/UTF16Entity.xml result/valid/UTF16Entity.xml*: added
  the example to the regression tests
Daniel

svn path=/trunk/; revision=3700
2008-03-06 14:37:10 +00:00
Daniel Veillard
8e1a46d526 patch from Roumen Petrov to detect if iconv() needs a const for the second
* config.h.in configure.in encoding.c: patch from Roumen Petrov
  to detect if iconv() needs a const for the second parameter
Daniel

svn path=/trunk/; revision=3693
2008-02-15 07:47:26 +00:00
William M. Brack
38d452ac1c Fixed typo in xmlCharEncFirstLine pointed out by Mark Rowe (bug #440159)
* encoding.c: Fixed typo in xmlCharEncFirstLine pointed out
  by Mark Rowe (bug #440159)
* include/libxml/xmlversion.h.in: Added check for definition of
  _POSIX_C_SOURCE to avoid warnings on Apple OS/X (patch from
  Wendy Doyle and Mark Rowe, bug #346675)
* schematron.c, testapi.c, tree.c, xmlIO.c, xmlsave.c: minor
  changes to fix compilation warnings - no change to logic.

svn path=/trunk/; revision=3618
2007-05-22 16:00:06 +00:00
Daniel Veillard
28aac0b0f4 remove a warning check with uppercase for AIX iconv() should fix #352644
* HTMLparser.c: remove a warning
* encoding.c: check with uppercase for AIX iconv() should fix #352644
* doc/examples/Makefile.am: partially handle one bug report
Daniel
2006-10-16 08:31:18 +00:00
Daniel Veillard
df750627d6 fixing bug #340398 xmlCharEncOutFunc writing to input buffer Daniel
* encoding.c: fixing bug #340398 xmlCharEncOutFunc writing to
  input buffer
Daniel
2006-05-02 12:24:06 +00:00
Daniel Veillard
aac7c68e87 fix a few warning raised by gcc-4.1 and latests changes Daniel
* c14n.c encoding.c xmlschemas.c xpath.c xpointer.c: fix a few
  warning raised by gcc-4.1 and latests changes
Daniel
2006-03-10 13:40:16 +00:00
Daniel Veillard
2728f845c5 more cleanups based on coverity reports. Daniel
* SAX2.c catalog.c encoding.c entities.c example/gjobread.c
  python/libxml.c: more cleanups based on coverity reports.
Daniel
2006-03-09 16:49:24 +00:00
Daniel Veillard
2e7598cb06 avoid passing a char[] as snprintf first argument. implemented
* encoding.c parserInternals.c: avoid passing a char[] as snprintf
  first argument.
* threads.c include/libxml/threads.h: implemented xmlIsThreadsEnabled()
  based on Andrew W. Nosenko idea.
* doc/* elfgcchack.h: regenerated the API
Daniel
2005-09-02 12:28:34 +00:00
Daniel Veillard
2644ab270e applied the patch suggested #309565 which can avoid looping in error
* encoding.c: applied the patch suggested #309565 which can avoid
  looping in error conditions.
Daniel
2005-08-24 14:22:55 +00:00
Daniel Veillard
1fc3ed0280 finally converted the encoding module to the common error reporting
* encoding.c error.c include/libxml/xmlerror.h: finally converted
  the encoding module to the common error reporting mechanism
* doc/* doc/html/libxml-xmlerror.html: rebuilt
Daniel
2005-08-24 12:46:09 +00:00
Daniel Veillard
24505b0f5c a lot of small cleanups based on Linus' sparse check output. Daniel
* HTMLparser.c SAX2.c encoding.c globals.c parser.c relaxng.c
  runsuite.c runtest.c schematron.c testHTML.c testReader.c
  testRegexp.c testSAX.c testThreads.c valid.c xinclude.c xmlIO.c
  xmllint.c xmlmodule.c xmlschemas.c xpath.c xpointer.c: a lot of
  small cleanups based on Linus' sparse check output.
Daniel
2005-07-28 23:49:35 +00:00
Daniel Veillard
5d4644ef6e revamped the elfgcchack.h format to cope with gcc4 change of aliasing
* doc/apibuild.py doc/elfgcchack.xsl: revamped the elfgcchack.h
  format to cope with gcc4 change of aliasing allowed scopes, had
  to add extra informations to doc/libxml2-api.xml to separate
  the header from the c module source.
* *.c: updated all c library files to add a #define bottom_xxx
  and reimport elfgcchack.h thereafter, and a bit of cleanups.
* doc//* testapi.c: regenerated when rebuilding the API
Daniel
2005-04-01 13:11:58 +00:00
Daniel Veillard
394902e0d2 fix unitinialized variable in not frequently used code bug #172182 Daniel
* encoding.c: fix unitinialized variable in not frequently used
  code bug #172182
Daniel
2005-03-31 08:43:44 +00:00
Daniel Veillard
cffc1c7af1 removed a static buffer in xmlByteConsumed(), as pointed by Ben Maurer,
* encoding.c: removed a static buffer in xmlByteConsumed(),
  as pointed by Ben Maurer, fixes #170086
* xmlschemas.c: remove a potentially uninitialized pointer warning
Daniel
2005-03-12 18:54:55 +00:00
Daniel Veillard
56de87ee0d fix the comment to describe the real return values lot of work on the
* encoding.c: fix the comment to describe the real return values
* pattern.c xpath.c include/libxml/pattern.h: lot of work on
  the patterns, pluggin in the XPath default evaluation, but
  disabled right now because it's not yet good enough for XSLT.
  pattern.h streaming API are likely to be changed to handle
  relative and absolute paths in the same expression.
Daniel
2005-02-16 00:22:29 +00:00
Daniel Veillard
aba37dffd7 forgot a $(srcdir) stupid error wrong name #157976 Daniel
* Makefile.am: forgot a $(srcdir)
* encoding.c: stupid error wrong name #157976
Daniel
2004-11-11 20:42:04 +00:00
Daniel Veillard
01ca83cd4c fixed a regression in iconv support. Daniel
* encoding.c: fixed a regression in iconv support.
Daniel
2004-11-06 13:26:59 +00:00
Daniel Veillard
ce682bc24b autogenerate a minimal NULL value sequence for unknown pointer types This
* gentest.py testapi.c: autogenerate a minimal NULL value sequence
  for unknown pointer types
* HTMLparser.c SAX2.c chvalid.c encoding.c entities.c parser.c
  parserInternals.c relaxng.c valid.c xmlIO.c xmlreader.c
  xmlsave.c xmlschemas.c xmlschemastypes.c xmlstring.c xpath.c
  xpointer.c: This uncovered an impressive amount of entry points
  not checking for NULL pointers when they ought to, closing all
  the open gaps.
Daniel
2004-11-05 17:22:25 +00:00
Daniel Veillard
05f9735ba3 Fixed bug #153937, making sure the conversion functions return the number
* encoding.c doc/examples/testWriter.c: Fixed bug #153937, making
  sure the conversion functions return the number of byte written.
  Had to fix one of the examples.
Daniel
2004-10-31 15:35:32 +00:00
William M. Brack
13dfa87e91 added the routine xmlNanoHTTPContentLength to the external API
* nanohttp.c, include/libxml/nanohttp.h: added the routine
  xmlNanoHTTPContentLength to the external API (bug151968).
* parser.c: fixed unnecessary internal error message (bug152060);
  also changed call to strncmp over to xmlStrncmp.
* encoding.c: fixed compilation warning (bug152307).
* tree.c: fixed segfault in xmlCopyPropList (bug152368); fixed
  a couple of compilation warnings.
* HTMLtree.c, debugXML.c, xmlmemory.c: fixed a few compilation
  warnings; no change to logic.
2004-09-18 04:52:08 +00:00
William M. Brack
f54924bd7e applied fixes for the UTF8ToISO8859x transcoding routine suggested by Mark
* encoding.c: applied fixes for the UTF8ToISO8859x transcoding
  routine suggested by Mark Itzcovitz
2004-09-09 14:35:17 +00:00
William M. Brack
a3215c7ae6 many further little changes for OOM problems. Now seems to be getting
* SAX2.c, encoding.c, error.c, parser.c, tree.c, uri.c, xmlIO.c,
  xmlreader.c, include/libxml/tree.h: many further little changes
  for OOM problems.  Now seems to be getting closer to "ok".
* testOOM.c: added code to intercept more errors, found more
  problems with library. Changed method of flagging / counting
  errors intercepted.
2004-07-31 16:24:01 +00:00
Daniel Veillard
b5da42af42 small patch to try to fix a warning with Sun One compiler Daniel
* encoding.c: small patch to try to fix a warning with Sun One compiler
Daniel
2004-02-21 14:57:44 +00:00
Daniel Veillard
3288882e8a small patch removing a warning with MS compiler. Daniel
* encoding.c: small patch removing a warning with MS compiler.
Daniel
2004-02-21 14:21:50 +00:00
Daniel Veillard
3671190b54 added xmlByteConsumed() interface updated the benchmark rebuilt the docs
* parserInternals.c xmlIO.c encoding.c include/libxml/parser.h
  include/libxml/xmlIO.h: added xmlByteConsumed() interface
* doc/*: updated the benchmark rebuilt the docs
* python/tests/Makefile.am python/tests/indexes.py: added a
  specific regression test for xmlByteConsumed()
* include/libxml/encoding.h rngparser.c tree.c: small cleanups
Daniel
2004-02-11 13:25:26 +00:00
William M. Brack
030a7a1729 applied patch supplied by Christophe Dubach to fix problem with
* encoding.c: applied patch supplied by Christophe Dubach
  to fix problem with --with-minimum configuration
  (bug 133773)
* nanoftp.c: fixed potential buffer overflow problem,
  similar to fix just applied to nanohttp.c.
2004-02-10 12:48:57 +00:00
Daniel Veillard
182d32a537 applied a small patch from Alfred Mickautsch to avoid an out of bound
* encoding.c: applied a small patch from Alfred Mickautsch
  to avoid an out of bound error in isolat1ToUTF8()
Daniel
2004-02-09 12:42:55 +00:00
William M. Brack
a2e844a3b3 moved string and UTF8 routines out of parser.c and encoding.c into a new
* encoding.c, parser.c, xmlstring.c, Makefile.am,
  include/libxml/Makefile.am, include/libxml/catalog.c,
  include/libxml/chvalid.h, include/libxml/encoding.h,
  include/libxml/parser.h, include/libxml/relaxng.h,
  include/libxml/tree.h, include/libxml/xmlwriter.h,
  include/libxml/xmlstring.h:
  moved string and UTF8 routines out of parser.c and encoding.c
  into a new module xmlstring.c with include file
  include/libxml/xmlstring.h mostly using patches from Reid
  Spencer.  Since xmlChar now defined in xmlstring.h, several
  include files needed to have a #include added for safety.
* doc/apibuild.py: added some additional sorting for various
  references displayed in the APIxxx.html files.  Rebuilt the
  docs, and also added new file for xmlstring module.
* configure.in: small addition to help my testing; no effect on
  normal usage.
* doc/search.php: added $_GET[query] so that persistent globals
  can be disabled (for recent versions of PHP)
2004-01-06 11:52:13 +00:00
William M. Brack
f9415e4989 Enhanced the handling of UTF-16, UTF-16LE and UTF-16BE encodings. Now
* encoding.c, include/libxml/encoding.h: Enhanced the handling of UTF-16,
  UTF-16LE and UTF-16BE encodings.  Now UTF-16 output is handled internally
  by default, with proper BOM and UTF-16LE encoding.  Native UTF-16LE and
  UTF-16BE encoding will not generate a BOM on output, and will be
  automatically recognized on input.
* test/utf16lebom.xml, test/utf16bebom.xml, result/utf16?ebom*: added
  regression tests for above.
2003-11-28 09:39:10 +00:00
Daniel Veillard
d0c9c32f64 cleanup fix a funny typo converted the Schemas code to the new error
* Makefile.am: cleanup
* encoding.c: fix a funny typo
* error.c xmlschemas.c xmlschemastypes.c include/libxml/xmlerror.h:
  converted the Schemas code to the new error handling. PITA,
  still need to check output from regression tests.
Daniel
2003-10-10 00:49:42 +00:00
Daniel Veillard
a9cce9cd0d Okay this is scary but it is just adding a configure option to disable
* HTMLtree.c SAX2.c c14n.c catalog.c configure.in debugXML.c
  encoding.c entities.c nanoftp.c nanohttp.c parser.c relaxng.c
  testAutomata.c testC14N.c testHTML.c testRegexp.c testRelax.c
  testSchemas.c testXPath.c threads.c tree.c valid.c xmlIO.c
  xmlcatalog.c xmllint.c xmlmemory.c xmlreader.c xmlschemas.c
  example/gjobread.c include/libxml/HTMLtree.h include/libxml/c14n.h
  include/libxml/catalog.h include/libxml/debugXML.h
  include/libxml/entities.h include/libxml/nanohttp.h
  include/libxml/relaxng.h include/libxml/tree.h
  include/libxml/valid.h include/libxml/xmlIO.h
  include/libxml/xmlschemas.h include/libxml/xmlversion.h.in
  include/libxml/xpathInternals.h python/libxml.c:
  Okay this is scary but it is just adding a configure option
  to disable output, this touches most of the files.
Daniel
2003-09-29 13:20:24 +00:00
William M. Brack
7b9154b01e further (final?) minor changes for compilation warnings. No change to
* encoding.c, parser.c, relaxng.c: further (final?) minor
  changes for compilation warnings. No change to logic.
2003-09-27 19:23:50 +00:00
William M. Brack
7a82165d2e Minor changes to comments, etc. for improving documentation generation
* encoding.c, threads.c, include/libxml/HTMLparser.h,
  doc/libxml2-api.xml: Minor changes to comments, etc. for
  improving documentation generation
* doc/Makefile.am: further adjustment to auto-generation of
  win32/libxml2.def.src
2003-08-15 07:27:40 +00:00
Daniel Veillard
ab1ae3a768 applied UTF-16 encoding handling patch provided by Mark Itzcovitz more
* encoding.c: applied UTF-16 encoding handling patch provided by
  Mark Itzcovitz
* encoding.c parser.c: more cleanup and fixes for UTF-16 when
  not having iconv support.
Daniel
2003-08-14 12:19:54 +00:00
William M. Brack
16db7b6ee4 further small changes for warnings when configured with --with-iconv=no
* encoding.c: further small changes for warnings when
  configured with --with-iconv=no
2003-08-07 13:12:49 +00:00
Daniel Veillard
01fc1a9d5a applying patch from Peter Jacobi to added ISO-8859-x encoding support when
* encoding.c: applying patch from Peter Jacobi to added
  ISO-8859-x encoding support when iconv is not available
* configure.in include/libxml/xmlversion.h.in
  include/libxml/xmlwin32version.h.in: added the glue needed
  at the configure level and made it the default for Windows
Daniel
2003-07-30 15:12:01 +00:00
Daniel Veillard
9ff7de14ae fix the previous commit Daniel
* encoding.c: fix the previous commit
Daniel
2003-07-29 13:30:42 +00:00
William M. Brack
4a557d97bf fixed problem with comments reported by Nick Kew added routines
* HTMLparser.c: fixed problem with comments reported by Nick Kew
* encoding.c: added routines xmlUTF8Size and xmlUTF8Charcmp for
  some future cleanup of UTF8 handling
2003-07-29 04:28:04 +00:00
Daniel Veillard
8caa9c2c97 small fix fixed an error message Daniel
* encoding.c: small fix
* xmlIO.c: fixed an error message
Daniel
2003-06-02 13:35:24 +00:00
Daniel Veillard
3c908dca47 added xmlMallocAtomic() to be used when allocating blocks which do not
* DOCBparser.c HTMLparser.c c14n.c catalog.c encoding.c globals.c
  nanohttp.c parser.c parserInternals.c relaxng.c tree.c uri.c
  xmlmemory.c xmlreader.c xmlregexp.c xpath.c xpointer.c
  include/libxml/globals.h include/libxml/xmlmemory.h: added
  xmlMallocAtomic() to be used when allocating blocks which
  do not contains pointers, add xmlGcMemSetup() and xmlGcMemGet()
  to allow registering the full set of functions needed by
  a garbage collecting allocator like libgc, ref #109944
Daniel
2003-04-19 00:07:51 +00:00
Igor Zlatkovic
73267db52d applied Gennady's patch against buffer overrun 2003-03-08 13:29:24 +00:00
Daniel Veillard
809faa5237 fixing bug #104646 about iconv based encoding conversion when the input
* encoding.c xmlIO.c: fixing bug #104646 about iconv based
  encoding conversion when the input buffer stops in the
  middle of a multibyte char
Daniel
2003-02-10 15:43:53 +00:00