Swap arguments in initial call to xmlFARecurseDeterminism.
Fix the check whether we revisit the initial state in
xmlFARecurseDeterminism.
If there are transitions with equal atoms and targets but different
counters, treat the regex as deterministic but mark the transitions as
non-deterministic internally.
Don't overwrite zero return value of xmlFAComputesDeterminism
with non-zero value from xmlFARecurseDeterminism.
Most of these errors lead to non-deterministic regexes not being
detected, which typically isn't an issue. The improved code may break
users who relied on the buggy behavior or cause other bugs to become
visible.
Fixes #469.
The visited flag must only be reset after the first call to
xmlFAReduceEpsilonTransitions has finished. Visiting states multiple
times could lead to unnecessary processing of duplicate transitions.
Similar to 68eadabd.
Private functions were previously declared
- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.
Consolidate all private header files in include/private.
This adds support for some non-standard escape sequences observed
in Microsoft's MSXML DLLs and used by Windows apps, and thus
needed by Wine. Some are also used in other XML implementations,
e.g. Java's.
This isn't intended to be final. We probably wish to toggle these
non-standard escape sequences on and off somehow, as needed by
the caller.
Further discussion: https://gitlab.gnome.org/GNOME/libxml2/-/issues/260
When building the internal representation of a regexp, it is possible
that a lot of empty transitions are created. Therefore there is a step
to reduce them in the function xmlFAEliminateSimpleEpsilonTransitions.
There is an error there for this case:
* State 1 has a transition with an atom (in this case "a") to state 2.
* State 2 is final and has an epsilon transition to state 1.
After reduction it looked like:
* State 1 has a transition with an atom (in this case "a") to itself
and is final.
In other words, the empty string is accepted when it shouldn't be.
The attached patch skips the reduction step for final states.
An alternative would be to insert or increment counters when reducing a
final state, but this seemed error prone and unnecessary, since there
aren't that many final states.
Fixes #282
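The two-state case described above can be reproduced with a toy automaton. This is only an illustrative sketch, not libxml2 code: the EpsNfa type and every eps_* function are hypothetical names, and the matcher is a plain bitmask NFA simulation rather than the library's internal representation. It builds the automaton before reduction and the incorrectly reduced one, so the two can be compared on the empty string.

```c
#include <assert.h>

#define EPS_MAX_TRANS 8
#define EPS_LABEL 0          /* label 0 marks an epsilon transition */

/* Hypothetical toy types, unrelated to libxml2's internal structures. */
typedef struct {
    int from;
    int to;
    char label;
} EpsTrans;

typedef struct {
    int ntrans;
    EpsTrans trans[EPS_MAX_TRANS];
    unsigned final_mask;     /* bit i set => state i is final */
    int start;
} EpsNfa;

/* Extend a state set with everything reachable over epsilon transitions. */
static unsigned eps_closure(const EpsNfa *nfa, unsigned set) {
    unsigned prev;
    do {
        prev = set;
        for (int i = 0; i < nfa->ntrans; i++) {
            const EpsTrans *t = &nfa->trans[i];
            if (t->label == EPS_LABEL && (set & (1u << t->from)))
                set |= 1u << t->to;
        }
    } while (set != prev);
    return set;
}

/* Return 1 if the automaton accepts the whole input string, else 0. */
static int eps_accepts(const EpsNfa *nfa, const char *input) {
    unsigned cur;
    assert(nfa->ntrans <= EPS_MAX_TRANS);
    cur = eps_closure(nfa, 1u << nfa->start);
    for (; *input != '\0'; input++) {
        unsigned next = 0;
        for (int i = 0; i < nfa->ntrans; i++) {
            const EpsTrans *t = &nfa->trans[i];
            if (t->label == *input && (cur & (1u << t->from)))
                next |= 1u << t->to;
        }
        cur = eps_closure(nfa, next);
    }
    return (cur & nfa->final_mask) != 0;
}

/* State 1 --a--> state 2 (final), state 2 --epsilon--> state 1. */
static EpsNfa eps_before_reduction(void) {
    EpsNfa nfa = {
        .ntrans = 2,
        .trans = { { 1, 2, 'a' }, { 2, 1, EPS_LABEL } },
        .final_mask = 1u << 2,
        .start = 1
    };
    return nfa;
}

/* Result of the faulty reduction: state 1 --a--> state 1, state 1 final. */
static EpsNfa eps_after_bad_reduction(void) {
    EpsNfa nfa = {
        .ntrans = 1,
        .trans = { { 1, 1, 'a' } },
        .final_mask = 1u << 1,
        .start = 1
    };
    return nfa;
}
```

In this sketch, eps_accepts rejects "" for the automaton from eps_before_reduction but accepts it for the one from eps_after_bad_reduction, while both accept "a" and "aa". Folding the final state into state 1 is what makes the empty string acceptable.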
In order to prevent visiting a state twice, states must be marked as
visited for the whole duration of graph traversal because states might
be reached by different paths. Otherwise state graphs like the
following can lead to exponential runtime:
->O-->O-->O-->O-->O->
   \ / \ / \ / \ /
    O   O   O   O
Reset the "visited" flag only after the graph was traversed.
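The effect can be measured on a toy version of the graph shape drawn above. This is a minimal sketch under stated assumptions, not the libxml2 traversal: ToyGraph and all toy_* functions are hypothetical names, and the walk only counts how often a state is entered. The reset_on_return parameter selects between clearing the flag as soon as the recursion leaves a state (the slow behavior) and keeping it set for the whole traversal.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_STATES 64
#define TOY_MAX_EDGES 2

/* Hypothetical toy types, unrelated to libxml2's internal structures. */
typedef struct {
    int nedges;
    int to[TOY_MAX_EDGES];
    int visited;
} ToyState;

typedef struct {
    int nstates;
    ToyState states[TOY_MAX_STATES];
} ToyGraph;

static void toy_add_edge(ToyGraph *g, int from, int to) {
    assert(from >= 0 && from < g->nstates);
    assert(to >= 0 && to < g->nstates);
    assert(g->states[from].nedges < TOY_MAX_EDGES);
    g->states[from].to[g->states[from].nedges++] = to;
}

/*
 * Build a chain of `links` diamonds. Top-row states are 0..links,
 * bottom-row states are links+1..2*links. Each top state i can reach
 * top state i+1 directly or through one bottom state.
 */
static void toy_build_chain(ToyGraph *g, int links) {
    assert(links >= 1 && 2 * links + 1 <= TOY_MAX_STATES);
    memset(g, 0, sizeof(*g));
    g->nstates = 2 * links + 1;
    for (int i = 0; i < links; i++) {
        int bottom = links + 1 + i;
        toy_add_edge(g, i, i + 1);
        toy_add_edge(g, i, bottom);
        toy_add_edge(g, bottom, i + 1);
    }
}

/* Clear every visited flag, i.e. reset only after the traversal is done. */
static void toy_reset(ToyGraph *g) {
    for (int i = 0; i < g->nstates; i++)
        g->states[i].visited = 0;
}

/*
 * Depth-first walk from state s. Returns how many times a state was
 * entered. With reset_on_return != 0 the flag is cleared when the
 * recursion leaves the state, so states reached by several paths are
 * processed again for every path.
 */
static long toy_walk(ToyGraph *g, int s, int reset_on_return) {
    long visits = 1;
    ToyState *state = &g->states[s];
    if (state->visited)
        return 0;
    state->visited = 1;
    for (int i = 0; i < state->nedges; i++)
        visits += toy_walk(g, state->to[i], reset_on_return);
    if (reset_on_return)
        state->visited = 0;
    return visits;
}
```

For the four-diamond graph shown above (toy_build_chain with links set to 4), toy_walk enters 46 states when the flag is reset on return and 9 when it is kept until toy_reset is called afterwards. The first count roughly doubles with every additional diamond, while the second grows linearly.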
xmlFAComputesDeterminism still has massive performance problems when
handling fuzzed input. By design, it has quadratic time complexity in
the number of reachable states. Some issues might also stem from
redundant epsilon transitions. With this fix, fuzzing regexes with a
maximum length of 100 becomes feasible at least.
Found with libFuzzer.