mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-01-25 06:03:34 +03:00

130 Commits

Author SHA1 Message Date
Nick Wellnhofer
e75e878e02 doc: Update and fix documentation 2024-05-20 14:23:39 +02:00
Nick Wellnhofer
ab63197149 uri: Keep fragment intact when resolving filesystem paths 2023-12-28 17:07:03 +01:00
Nick Wellnhofer
8ab1b122c4 Fix filename and URI handling
Many strings are passed to the library that could be either URIs or
filesystem paths. We now assume that strings are a URI if they contain
the substring "://". This means that they have a scheme and an
authority. Otherwise, URI resolution wouldn't make much sense.

Fix xmlBuildURI to work with filesystem paths. If the base URI doesn't
contain "://" it is treated as filename. The resolved URI is unescaped,
appended and the result is normalized. Rewrite xmlNormalizePath to
handle Windows quirks.

All special handling for Windows paths is removed in xmlCanonicPath.
If the path looks like a URI, only escape characters allowed in Legacy
Extended IRIs.

Make xmlPathToURI only call xmlCanonicPath. The additional round-trip
through URI parser and serializer seems useless.

Add a helper function xmlConvertUriToPath in xmlIO.c which checks for
file URIs and unescapes them.

Always process strings with xmlCanonicPath in xmlLoadExternalEntity.
This should be harmless now.

Should help with #334, #387, #611.
2023-12-25 23:38:40 +01:00
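The "://" rule described in this commit can be sketched as follows. This is only an illustration of the heuristic as the message states it; `looksLikeUri` is a hypothetical name and not a libxml2 function, and the real code may apply further checks.

```c
#include <string.h>

/*
 * Hypothetical helper (not part of libxml2): returns 1 if the string
 * contains "://" and would therefore be treated as a URI, 0 if it
 * would be treated as a filesystem path.
 */
static int
looksLikeUri(const char *str) {
    return str != NULL && strstr(str, "://") != NULL;
}
```

Under this rule `http://example.org/a.xml` counts as a URI, while `C:\docs\a.xml` and `file:/a` (scheme but no authority) are handled as filenames.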
Nick Wellnhofer
28913232f6 uri: Clean up special parsing modes
Add function to handle unreserved check. Give flags meaningful names.
Add support to allow ucschars from Legacy Extended IRIs.
2023-12-25 23:38:40 +01:00
Nick Wellnhofer
da996c8d0f uri: Report malloc failures
Fix many places where malloc failures weren't reported, for example
after calling xmlStrdup.

Introduce new public API functions that return a separate error code if
a memory allocation fails:

- xmlParseURISafe
- xmlBuildURISafe
- xmlBuildRelativeURISafe

Update the fuzzer to check whether malloc failures are reported.
2023-12-11 22:05:47 +01:00
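A minimal sketch of the calling convention this commit describes: the result is passed back through an out-parameter and the return value distinguishes a memory failure from other outcomes. Everything here is an assumption made for illustration: `dupStringSafe`, `allocFn`, `failingAlloc` and the specific return codes are hypothetical and are not the signatures of the functions listed above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator hook so the failure path can be exercised. */
typedef void *(*allocFn)(size_t);

/*
 * Hypothetical sketch of a "Safe"-style function. The copy is stored
 * in *out and the return value reports what happened:
 *    0  success
 *    1  invalid input (NULL string)
 *   -1  memory allocation failed
 */
static int
dupStringSafe(const char *str, allocFn alloc, char **out) {
    size_t size;
    char *copy;

    *out = NULL;
    if (str == NULL)
        return 1;

    size = strlen(str) + 1;
    copy = alloc(size);
    if (copy == NULL)
        return -1;

    memcpy(copy, str, size);
    *out = copy;
    return 0;
}

/* Allocator that always fails, to simulate an out-of-memory condition. */
static void *
failingAlloc(size_t size) {
    (void) size;
    return NULL;
}
```

With a single pointer return value, a NULL result cannot tell the caller whether the input was bad or memory ran out; the separate code makes that distinction testable, which is what a fuzzer checking for reported malloc failures needs.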
Nick Wellnhofer
699299cae3 globals: Stop including globals.h 2023-09-20 22:07:40 +02:00
Nick Wellnhofer
f65133fc04 uri: Add explicit cast in xmlSaveUri
Fix -fsanitize=implicit-conversion error. We should probably
percent-escape the host name here.
2023-01-24 11:32:15 +01:00
Nick Wellnhofer
ae0c9cfa05 uri: Fix handling of port numbers
Allow port number without host, real fix for #71.

Also compare port numbers in xmlBuildRelativeURI.

Fix handling of port numbers in xmlUriEscape.
2022-12-13 01:43:49 +01:00
Nick Wellnhofer
8ed40c621b Revert "uri: Allow port without host"
This reverts commit f30adb54f55e4e765d58195163f2a21f7ac759fb.

Fixes #460.
2022-12-13 00:51:33 +01:00
Nick Wellnhofer
f30adb54f5 uri: Allow port without host
Don't set port to -1 when host is missing. Host can be empty according
to spec.

Fixes #71.
2022-11-20 21:16:03 +01:00
Nick Wellnhofer
76d6b0d768 html: Don't escape ASCII chars in href attributes
In several cases, href attributes can contain ASCII characters which are
illegal in URIs. Escaping them often does more harm than good.

Fixes #321.
2022-11-20 21:16:03 +01:00
Nick Wellnhofer
6843fc726f Remove or annotate char casts 2022-09-01 04:31:30 +02:00
Nick Wellnhofer
2cac626976 Don't use sizeof(xmlChar) or sizeof(char) 2022-09-01 03:35:19 +02:00
Nick Wellnhofer
0f568c0b73 Consolidate private header files
Private functions were previously declared

- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.

Consolidate all private header files in include/private.
2022-08-26 02:11:56 +02:00
Nick Wellnhofer
2489c1d024 Remove useless __CYGWIN__ checks
From what I can tell, some really early Cygwin versions from around
1998-2000 used to erroneously define _WIN32. This was eventually fixed,
but these days, the `defined(_WIN32) && !defined(__CYGWIN__)` idiom is
unnecessary.

Now, we only check for __CYGWIN__ in xmlexports.h when deciding whether
to use __declspec.
2022-02-28 22:58:35 +01:00
Nick Wellnhofer
346c3a930c Remove elfgcchack.h
The same optimization can be enabled with -fno-semantic-interposition
since GCC 5. clang has always used this option by default.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
0596d67ddc Add explicit cast in xmlURIUnescapeString
Avoids an integer conversion warning with UBSan.
2022-01-25 01:39:41 +01:00
Elliott Hughes
7c06d99e1f Fix xmlURIEscape memory leaks.
Found by running the fuzz/uri.c fuzzer under asan (internal Android bug
171610679).

Always free `ret` when exiting on failure. I've moved the definition of
NULLCHK down past where ret is always initialized to make it clear that
this is safe.

This patch also fixes the indentation of two of the NULLCHK call sites
to make it more obvious that NULLCHK isn't `if`-like.
2020-11-09 18:17:01 +01:00
Nick Wellnhofer
b46016b870 Allow port numbers up to INT_MAX
Also return an error on overflow.
2020-10-17 18:03:09 +02:00
Nick Wellnhofer
20c60886e4 Fix typos
Resolves #133.
2020-03-08 17:41:53 +01:00
Jared Yanovich
2a350ee9b4 Large batch of typo fixes
Closes #109.
2019-09-30 18:04:38 +02:00
Nick Wellnhofer
f9fce96313 Fix unsigned integer overflow
It's defined behavior but -fsanitize=unsigned-integer-overflow is
useful to discover bugs.
2019-05-20 13:38:22 +02:00
Thomas Holder
a71b98ec9d cleanup: remove some unreachable code 2018-11-29 22:25:35 +01:00
Thomas Holder
b1f87c0e43 Fix building relative URIs
Examples:

testURI --relative --base file:///a file:///b
New correct result: b
Old incorrect result: ../b

testURI --relative --base file:///a file:///
New correct result: ./
Old incorrect result: ../

testURI --relative --base file:///a/b file:///a/
New correct result: ./
Old incorrect result: ../../a/
2018-11-29 22:19:44 +01:00
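The corrected results above follow from resolving against the directory part of the base, that is, everything up to and including its last '/'. A simplified sketch of that idea, working on absolute paths only; `relativePath` is a hypothetical helper and not the library's implementation, and it ignores schemes, authorities, queries and fragments.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch: compute a relative reference from the absolute
 * path `base` to the absolute path `target`. The caller frees the
 * result. Returns NULL if the allocation fails.
 */
static char *
relativePath(const char *base, const char *target) {
    const char *lastSlash = strrchr(base, '/');
    /* Directory part of the base: up to and including the last '/'. */
    size_t dirLen = (lastSlash != NULL) ? (size_t) (lastSlash - base) + 1 : 0;
    size_t common = 0, ups = 0, i;
    const char *rest;
    char *ret, *cur;

    /* Longest common prefix that ends on a '/'. */
    for (i = 0; i < dirLen && target[i] != '\0' && base[i] == target[i]; i++) {
        if (base[i] == '/')
            common = i + 1;
    }

    /* One "../" for every base directory below the common prefix. */
    for (i = common; i < dirLen; i++) {
        if (base[i] == '/')
            ups++;
    }
    rest = target + common;

    ret = malloc(3 * ups + strlen(rest) + 3);
    if (ret == NULL)
        return NULL;

    cur = ret;
    for (i = 0; i < ups; i++) {
        memcpy(cur, "../", 3);
        cur += 3;
    }
    /* An empty result would be ambiguous, so emit "./" instead. */
    if (ups == 0 && *rest == '\0') {
        memcpy(cur, "./", 2);
        cur += 2;
    }
    strcpy(cur, rest);
    return ret;
}
```

For the path parts of the three examples, this sketch yields `b` for base `/a` and target `/b`, `./` for base `/a` and target `/`, and `./` for base `/a/b` and target `/a/`.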
Nick Wellnhofer
41c0a13fe7 Fix Windows compiler warnings in xmlCanonicPath
The code handling Windows paths assigned some char/xmlChar pointers
without explicit casts. Also remove an unused variable.
2017-10-09 13:46:44 +02:00
Daniel Veillard
3daee3f159 Problem resolving relative URIs
Raised by Matthias Pigulla <mp@webfactory.de>

In a nutshell we had that bug on URI composition after some fixes in
the area of localhost empty shortcuts :

./testURI --base file:///some/where file

Without patch: file:/some/file
With patch: file:///some/file
2017-08-28 21:12:14 +02:00
Nick Wellnhofer
91e5496780 Fix xmlBuildRelativeURI for URIs starting with './'
If the relative URI started with './', the 'pos' index was increased
which also affected indexing into the base path. Aside from producing
wrong results, this could also lead to a heap overread of the base
path buffer. The data read from beyond the buffer was only compared
to some char values, so this is mostly harmless.

Inside libxml2, xmlBuildRelativeURI is only called from xinclude.c.

Found with libFuzzer and ASan.
2017-06-10 17:41:42 +02:00
Nick Wellnhofer
d6b3645f9b Fix memory leak in xmlCanonicPath
Found with libFuzzer and ASan.
2017-05-27 15:59:18 +02:00
Michael Paddon
846cf015a7 Integer overflow parsing port number in URI
For https://bugzilla.gnome.org/show_bug.cgi?id=765566

in xmlParse3986Port(), uri->port can overflow when parsing the port number.
The type of uri->port is int, so the consequent behavior is undefined and
may differ between compilers and architectures
2016-05-21 17:18:15 +08:00
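One common way to avoid the undefined behavior described here is to test the bound before multiplying, so the signed addition can never overflow. The sketch below is an assumption about how such a check can look, not the code of this commit; `parsePort` is a hypothetical name.

```c
#include <limits.h>

/*
 * Hypothetical sketch: parse the leading decimal digits of `str` into
 * *port. Returns 0 on success and -1 if the value would exceed
 * INT_MAX. *port is left untouched on failure.
 */
static int
parsePort(const char *str, int *port) {
    int value = 0;

    for (; *str >= '0' && *str <= '9'; str++) {
        int digit = *str - '0';

        /* Check first: value * 10 + digit must not exceed INT_MAX. */
        if (value > (INT_MAX - digit) / 10)
            return -1;
        value = value * 10 + digit;
    }

    *port = value;
    return 0;
}
```

Because the comparison happens on values that are already known to fit in an int, the result is the same on every compiler and architecture.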
Daniel Veillard
beb7281055 Fix a problem properly saving URIs
As written by Martin Kletzander <mkletzan@redhat.com>:
Since commit 8eb55d782a2b9afacc7938694891cc6fad7b42a5, when you parse
and save an URI that has no server (or similar) part, two slashes
after the 'schema:' get lost.  It means 'uri:///noserver' is turned
into 'uri:/noserver'.

basically
   foo:///only/path

means a host of "" while

   foo:/only/path

means no host at all

  So the best fix IMHO is to fix the URI parser to record the first
case and an empty host string and the second case as a NULL host string

 I would not revert the initial patch, we should not 'invent' those
slash, but we should instead when parsing keep the information that
it's a host based path and that foo:/// means the presence of a host
but an empty one.

With the resulting patch below applied, all cases seem to be saved
properly:

thinkpad:~/XML -> ./testURI uri:/noserver
uri:/noserver
thinkpad:~/XML -> ./testURI uri:///noserver
uri:///noserver
thinkpad:~/XML -> ./testURI uri://server/foo
uri://server/foo
thinkpad:~/XML -> ./testURI uri:/noserver/foo
uri:/noserver/foo
thinkpad:~/XML -> ./testURI uri:///
uri:///
thinkpad:~/XML -> ./testURI uri://
uri://
thinkpad:~/XML -> ./testURI uri:/
uri:/
thinkpad:~/XML ->

  If you revert the initial patch that last case fails

The problem is that I don't want to change the xmlURI structure to
minimize ABI breakage, so I could not extend the field. The natural
solution is to denote that uri:/// has an empty host by making
the uri server field an empty string which works very well but breaks
applications (like libvirt ;-) who blindly look at uri->server
not being NULL to try to reach it !
The simplest option was to set the port to -1 in that case instead of 0.
Applications don't bother looking at the port if there is no server
string. This makes the patch more complex than a one-liner, but
is better for ABI.
2014-10-03 19:22:39 +08:00
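The three cases this message distinguishes (no host, empty host, named host) can be told apart by looking at what follows the scheme. A simplified sketch, assuming the input starts with a scheme and ignoring userinfo and port; `classifyHost` and `enum hostKind` are hypothetical names. Per the message, the library itself records the empty-host case by leaving the server field NULL and setting the port to -1.

```c
#include <string.h>

enum hostKind { HOST_NONE, HOST_EMPTY, HOST_PRESENT };

/*
 * Hypothetical sketch: "foo:/path" has no host at all, "foo:///path"
 * has a host that is the empty string, "foo://server/path" names one.
 */
static enum hostKind
classifyHost(const char *uri) {
    const char *colon = strchr(uri, ':');
    const char *auth;

    /* No "//" after the scheme means there is no authority part. */
    if (colon == NULL || colon[1] != '/' || colon[2] != '/')
        return HOST_NONE;

    /* The authority ends at '/', '?', '#' or the end of the string. */
    auth = colon + 3;
    if (*auth == '\0' || *auth == '/' || *auth == '?' || *auth == '#')
        return HOST_EMPTY;

    return HOST_PRESENT;
}
```

Applied to the testURI inputs above, `uri:/noserver` and `uri:/` have no host, `uri:///noserver`, `uri:///` and `uri://` have an empty host, and `uri://server/foo` has a named host, which is why the serializer must keep the two slashes in the second group.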
Dennis Filder
8eb55d782a xmlSaveUri() incorrectly recomposes URIs with rootless paths
For https://bugzilla.gnome.org/show_bug.cgi?id=731063

xmlSaveUri() of libxml2 (snapshot 2014-05-31 and earlier) returns
bogus values when called with URIs that have rootless paths
(e.g. "urx:b:b" becomes "urx://b%3Ab" where "urx:b%3Ab" would be
correct)
2014-06-13 14:56:14 +08:00
Michael Stahl
55b899a23a Support long path names on WNT
so we've got this patch to libxml2 2.7.6 in the LibreOffice code base,
inherited from OOo.  it fixes a definite problem, which is that Windows
has a rather low maximum path length restriction, and there is a special
trick on NT whereby path names can be prefixed with "\\?\", in which
case the maximum length is 32k, which ought to be sufficient even for
bloated office suites :)

I'll attach the patch to the xmlCanonicPath function.  note that i
didn't write this and am by no means an expert on either Microsoftean
platforms or libxml so maybe it's not the best way to do it.
2012-09-07 12:19:25 +08:00
Daniel Veillard
5756038650 Cleanup URI module memory allocation code
* uri.c: cleanup the code doing the allocations, set up a structured
  error handler to report memory errors, and set up an arbitrary
  limit on URI saving size
* error.c include/libxml/xmlerror.h: add a new FROM_URI indication
  for structured error reporting, also adding strings for schematron
  and buffer which were missing
2012-07-24 11:44:23 +08:00
Daniel Veillard
fc74a6f5c2 URI handling code is not OOM resilient
as pointed out by Dan Berrange, add a small comment in the header
2012-05-07 15:02:25 +08:00
Nico Weber
cedf84d35a Fix -Wempty-body warning from clang
clang recently grew a warning on `for (...);`. This patch
fixes both instances of this pattern in libxml. The changes
don't modify the code semantics.
2012-03-05 16:36:59 +08:00
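For illustration, the flagged pattern is a loop whose whole body is the `;` on the same line as the `for`. One way to keep the behavior while making the empty body explicit is a braced block; this is an assumption about a possible fix, not necessarily the one this commit chose, and `countLeadingSpaces` is a hypothetical function.

```c
#include <stddef.h>

/* Hypothetical example: count the spaces at the start of a string. */
static size_t
countLeadingSpaces(const char *str) {
    size_t n;

    /* Flagged form:  for (n = 0; str[n] == ' '; n++);  */
    for (n = 0; str[n] == ' '; n++) {
        /* empty on purpose: all the work happens in the loop header */
    }

    return n;
}
```

Both forms run the same iterations; only the way the empty body is written changes.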
Daniel Veillard
2ee91eb658 Fix handling of apos in URIs
François Delyon <f.delyon@satimage.fr> pointed out a divergence between
the URI code and RFC 3986, fix trivial and seems to not break regression
tests
2010-06-04 09:14:16 +08:00
Daniel Veillard
1358fef9aa URI with no path parsing problem
* uri.c: Ralf Junker pointed out that URI with no path
  like http://www.domain.com when parsed ended up with an
  empty path value instead of NULL, this fixes the problem
2009-10-02 17:29:48 +02:00
Daniel Veillard
13cee4e37b Fix a bunch of scan 'dead increments' and cleanup
* HTMLparser.c c14n.c debugXML.c entities.c nanohttp.c parser.c
  testC14N.c uri.c xmlcatalog.c xmllint.c xmlregexp.c xpath.c:
  fix unused variables, or unneeded increments as well as a couple
  of space issues
* runtest.c: check for NULL before calling unlink()
2009-09-05 14:52:55 +02:00
Daniel Veillard
f582d14fbc bug in parsing RFC 3986 uris with port numbers Daniel
* uri.c: bug in parsing RFC 3986 uris with port numbers
Daniel

svn path=/trunk/; revision=3781
2008-08-27 17:23:41 +00:00
Daniel Veillard
84c45df8d8 allow [ and ] in fragment identifiers, 3986 disallow them but it's widely
* uri.c: allow [ and ] in fragment identifiers, 3986 disallow them
  but it's widely used for XPointer, and would break DocBook
  processing among others
Daniel

svn path=/trunk/; revision=3765
2008-08-06 10:26:06 +00:00
Daniel Veillard
d7af555327 rewrite the URI parser to update to rfc3986 (from 2396) removed the error
* uri.c include/libxml/uri.h: rewrite the URI parser to update to
  rfc3986 (from 2396)
* test/errors/webdav.xml result/errors/webdav.xml*: removed the
  error test, 'DAV:' is a correct URI under 3986
* Makefile.am: small cleanup in make check
Daniel

svn path=/trunk/; revision=3763
2008-08-04 15:29:44 +00:00
Daniel Veillard
ed86dc2383 applied patch from Ashwin fixing a number of realloc problems improve
* uri.c: applied patch from Ashwin fixing a number of realloc problems
* HTMLparser.c: improve handling for misplaced html/head/body
Daniel

svn path=/trunk/; revision=3740
2008-04-24 11:58:41 +00:00
Daniel Veillard
e54c3173b8 fix saving for file:///X:/ URI embedding Windows file paths should fix
* uri.c: fix saving for file:///X:/ URI embedding Windows file paths
  should fix #524253 
Daniel

svn path=/trunk/; revision=3714
2008-03-25 13:22:41 +00:00
Daniel Veillard
69f8a13e52 applied a patch based on Petr Sumbera one to avoid a problem with paths
* uri.c: applied a patch based on Petr Sumbera one to avoid a 
  problem with paths starting with //
Daniel

svn path=/trunk/; revision=3683
2008-02-05 08:37:56 +00:00
William M. Brack
504201966d applied patch from Patrik Fimml. Fixes bug #458268
* uri.c: applied patch from Patrik Fimml.  Fixes bug #458268

svn path=/trunk/; revision=3645
2007-07-20 01:09:08 +00:00
Daniel Veillard
e61d75f11e fix bug reported by François Delyon Daniel
* uri.c: fix bug reported by François Delyon
Daniel

svn path=/trunk/; revision=3619
2007-05-28 14:16:33 +00:00
Daniel Veillard
a1413b84f7 patch from Richard Jones to save the query part in raw form. Daniel
* uri.c include/libxml/uri.h: patch from Richard Jones to save
  the query part in raw form.
Daniel

svn path=/trunk/; revision=3607
2007-04-26 08:33:28 +00:00
Daniel Veillard
7918765454 More doc cleanup, Daniel
svn path=/trunk/; revision=3604
2007-04-24 10:19:52 +00:00
Daniel Veillard
a44294f10b fix xmlURIUnescapeString comments which was confusing Daniel
* uri.c: fix xmlURIUnescapeString comments which was confusing
Daniel

svn path=/trunk/; revision=3603
2007-04-24 08:57:54 +00:00
William M. Brack
2224227818 implemented patch from S. Bidoul for uri.c (bug #389767)
* implemented patch from S. Bidoul for uri.c (bug #389767)

svn path=/trunk/; revision=3576
2007-01-27 07:59:37 +00:00