1
0
mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-01-20 14:03:33 +03:00

17 Commits

Author SHA1 Message Date
Nick Wellnhofer
8a85263f13 Add fuzzing dictionaries to EXTRA_DIST
Also add static seed corpus for the URI fuzzer.
2020-10-25 20:08:16 +01:00
Nick Wellnhofer
6f1470a5d6 Hardcode maximum XPath recursion depth
Always limit nested functions calls to 5000. This avoids call stack
overflows with deeply nested expressions.

The expression parser produces about 10 nested function calls when
parsing a subexpression in parentheses, so the effective nesting limit
is about 500 which should be more than enough.

Use a lower limit when fuzzing to account for increased memory usage
when using sanitizers.
2020-08-26 00:22:25 +02:00
Nick Wellnhofer
8c3ef083ca Pass URL of main entity in XML fuzzer 2020-08-24 23:17:34 +02:00
Nick Wellnhofer
0d5f3710fb Consolidate seed corpus generation
Implement file handling in C to speed up corpus generation.
2020-08-24 21:14:55 +02:00
Nick Wellnhofer
0d9da0290c Test fuzz targets with dummy driver
Run fuzz targets with files in seed corpus during test.
2020-08-24 03:57:03 +02:00
Nick Wellnhofer
804c52978f Stop using maxParserDepth in xpath.c
Only use a single maxDepth value.
2020-08-17 03:39:51 +02:00
Nick Wellnhofer
0ff527482d Fix autotools warnings 2020-08-17 02:54:28 +02:00
Nick Wellnhofer
10a0794878 Fix XPath fuzzer 2020-08-08 17:46:11 +02:00
Nick Wellnhofer
6c128fd58a Fuzz XInclude engine 2020-08-08 14:32:44 +02:00
Nick Wellnhofer
ad26a60f95 Add XPath and XPointer fuzzer 2020-08-06 14:12:32 +02:00
Nick Wellnhofer
905820a44c Update fuzzing code
- Shorten timeouts
- Align options from Makefile and options files
- Add section headers to Makefile
- Skip invalid UTF-8 in regexp fuzzer
- Update regexp.dict
- Generate HTML seed corpus in correct format
2020-07-31 11:55:13 +02:00
Nick Wellnhofer
93ce33c2b8 Fix several quadratic runtime issues in HTML push parser
Fix a few remaining cases where the HTML push parser would scan more
content during lookahead than being parsed later.

Make sure that htmlParseDocTypeDecl consumes all content up to the
final '>' in case of errors. The old comment said "We shouldn't try to
resynchronize", but ignoring invalid content is also what the HTML5
spec mandates.

Likewise, make htmlParseEndTag skip to the final '>' in invalid end
tags even if not in recovery mode. This is probably the most visible
change in practice and leads to different output for some tests but is
also more in line with HTML5.

Make sure that htmlParsePI and htmlParseComment don't abort if invalid
characters are encountered but log an error and ignore the character.

Change some other end-of-buffer checks to test for a zero byte instead
of relying on IS_CHAR.

Fix usage of IS_CHAR macro in htmlParseScript.
2020-07-23 20:47:35 +02:00
Nick Wellnhofer
eac1c7e2e5 Fuzz target for XML Schemas
This only tests the schema parser for now.
2020-06-23 16:20:27 +02:00
Nick Wellnhofer
ffd31dbefd Move entity recorder to fuzz.c 2020-06-21 12:15:46 +02:00
Nick Wellnhofer
536f421d37 Fuzz target for HTML parser 2020-06-15 15:23:38 +02:00
Nick Wellnhofer
e98150d444 Add options file for xml fuzzer
This will be picked up OSS-Fuzz, limiting the maximum input size to
80 KB and hopefully avoiding timeouts. Some of the timeouts seem to be
related to our suboptimal handling of excessive entity expansion.
The new fuzzers support external entities and make this problem even
more prominent.
2020-06-09 13:53:06 +02:00
Nick Wellnhofer
00ed736eec Add a couple of libFuzzer targets
- XML fuzzer
  Currently tests the pull parser, push parser and reader, as well as
  serialization. Supports splitting fuzz data into multiple documents
  for things like external DTDs or entities. The seed corpus is built
  from parts of the test suite.

- Regexp fuzzer
  Seed corpus was statically generated from test suite.

- URI fuzzer
  Tests parsing and most other functions from uri.c.
2020-06-05 13:53:11 +02:00