1
0
mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2024-10-26 12:25:09 +03:00
Commit Graph

123 Commits

Author SHA1 Message Date
Nick Wellnhofer
0fb3ae5840 Revert "Improve HTML fuzzer stability"
This reverts commit de1b51eddc.
2021-02-22 17:31:05 +01:00
Nick Wellnhofer
0987001c1b Add charset names to fuzzing dictionaries 2021-02-22 13:21:38 +01:00
Nick Wellnhofer
de1b51eddc Improve HTML fuzzer stability
Call htmlInitAutoClose during fuzzer initialization to fix stability
issue. Leave a note concerning problems with this function.
2021-02-22 13:21:38 +01:00
Nick Wellnhofer
ec808a4415 Speed up HTML fuzzer
htmlDocDumpMemory uses the "HTML" encoding if no other encoding was
specified in the source HTML. This encoding can be extremely slow
because of an inefficiency in htmlEntityValueLookup. Stop encoding
the output for now.
2021-02-07 14:39:55 +01:00
Nick Wellnhofer
e2b975c317 Handle malloc failures in fuzzing code
Avoid misdiagnosis in OOM situations.
2020-12-18 14:10:13 +01:00
Nick Wellnhofer
9086988ffa Enforce maximum length of fuzz input
Remove the libfuzzer max_len option which doesn't apply to other
fuzzing engines. Enforce the maximum length directly in the fuzz
targets. For the xml target, lower the maximum when expanding entities
to avoid timeout and OOM errors.
2020-12-16 16:12:07 +01:00
Nick Wellnhofer
8a85263f13 Add fuzzing dictionaries to EXTRA_DIST
Also add static seed corpus for the URI fuzzer.
2020-10-25 20:08:16 +01:00
Nick Wellnhofer
6f1470a5d6 Hardcode maximum XPath recursion depth
Always limit nested functions calls to 5000. This avoids call stack
overflows with deeply nested expressions.

The expression parser produces about 10 nested function calls when
parsing a subexpression in parentheses, so the effective nesting limit
is about 500 which should be more than enough.

Use a lower limit when fuzzing to account for increased memory usage
when using sanitizers.
2020-08-26 00:22:25 +02:00
Nick Wellnhofer
8c3ef083ca Pass URL of main entity in XML fuzzer 2020-08-24 23:17:34 +02:00
Nick Wellnhofer
0d5f3710fb Consolidate seed corpus generation
Implement file handling in C to speed up corpus generation.
2020-08-24 21:14:55 +02:00
Nick Wellnhofer
0d9da0290c Test fuzz targets with dummy driver
Run fuzz targets with files in seed corpus during test.
2020-08-24 03:57:03 +02:00
Nick Wellnhofer
804c52978f Stop using maxParserDepth in xpath.c
Only use a single maxDepth value.
2020-08-17 03:39:51 +02:00
Nick Wellnhofer
0ff527482d Fix autotools warnings 2020-08-17 02:54:28 +02:00
Nick Wellnhofer
10a0794878 Fix XPath fuzzer 2020-08-08 17:46:11 +02:00
Nick Wellnhofer
6c128fd58a Fuzz XInclude engine 2020-08-08 14:32:44 +02:00
Nick Wellnhofer
ad26a60f95 Add XPath and XPointer fuzzer 2020-08-06 14:12:32 +02:00
Nick Wellnhofer
905820a44c Update fuzzing code
- Shorten timeouts
- Align options from Makefile and options files
- Add section headers to Makefile
- Skip invalid UTF-8 in regexp fuzzer
- Update regexp.dict
- Generate HTML seed corpus in correct format
2020-07-31 11:55:13 +02:00
Nick Wellnhofer
93ce33c2b8 Fix several quadratic runtime issues in HTML push parser
Fix a few remaining cases where the HTML push parser would scan more
content during lookahead than being parsed later.

Make sure that htmlParseDocTypeDecl consumes all content up to the
final '>' in case of errors. The old comment said "We shouldn't try to
resynchronize", but ignoring invalid content is also what the HTML5
spec mandates.

Likewise, make htmlParseEndTag skip to the final '>' in invalid end
tags even if not in recovery mode. This is probably the most visible
change in practice and leads to different output for some tests but is
also more in line with HTML5.

Make sure that htmlParsePI and htmlParseComment don't abort if invalid
characters are encountered but log an error and ignore the character.

Change some other end-of-buffer checks to test for a zero byte instead
of relying on IS_CHAR.

Fix usage of IS_CHAR macro in htmlParseScript.
2020-07-23 20:47:35 +02:00
Nick Wellnhofer
eac1c7e2e5 Fuzz target for XML Schemas
This only tests the schema parser for now.
2020-06-23 16:20:27 +02:00
Nick Wellnhofer
ffd31dbefd Move entity recorder to fuzz.c 2020-06-21 12:15:46 +02:00
Nick Wellnhofer
536f421d37 Fuzz target for HTML parser 2020-06-15 15:23:38 +02:00
Nick Wellnhofer
e98150d444 Add options file for xml fuzzer
This will be picked up OSS-Fuzz, limiting the maximum input size to
80 KB and hopefully avoiding timeouts. Some of the timeouts seem to be
related to our suboptimal handling of excessive entity expansion.
The new fuzzers support external entities and make this problem even
more prominent.
2020-06-09 13:53:06 +02:00
Nick Wellnhofer
00ed736eec Add a couple of libFuzzer targets
- XML fuzzer
  Currently tests the pull parser, push parser and reader, as well as
  serialization. Supports splitting fuzz data into multiple documents
  for things like external DTDs or entities. The seed corpus is built
  from parts of the test suite.

- Regexp fuzzer
  Seed corpus was statically generated from test suite.

- URI fuzzer
  Tests parsing and most other functions from uri.c.
2020-06-05 13:53:11 +02:00