7745 Commits

Author SHA1 Message Date
Konstantin Kozoriz
f41b1356c0 123-alt1
* Tue Apr 01 2025 Konstantin Kozoriz <kozorizki@altlinux.org> 123-alt1
 - Initial RPM package for Binaryen
2025-04-08 14:03:35 +03:00
Konstantin Kozoriz
efe2f50952 gear-remotes-save 2025-04-08 13:50:16 +03:00
Alon Zakai
3ca38d8231
[NFC] Add an explicit ModuleRunner.start() (#7463)
Previously the constructor ran the start method and other initialization. Doing
that in an explicit function that the user calls makes it possible to do things in
the middle. The specific thing I would like to do in the middle is have a function
to change the interpreter instance's mode to "reject as nonconstant relaxed SIMD",
but I've wanted this in the past too, and worked around it - seems best to just
fix it.

Adding more flags to the constructor is another option, but there are already
several, and more in the parent classes, and several levels of inheritance through
which all such options must be forwarded, which is annoying.
2025-04-07 16:07:15 -07:00
Alon Zakai
69e665eda3
[GC] Fix order of operations in TypeRefiningGUFA: Restrictions must propagate (#7462)
We must first find global restrictions, then propagate, so that the
restrictions propagate.
2025-04-07 12:49:26 -07:00
Ruiyang Xu
b0d51f5af6
OptimizeInstructions: Optimize unsigned(x) >= 0 => i32(1) even with side effects (#7429)
Fixes #7425
2025-04-07 11:18:59 -07:00
Ruiyang Xu
16dbac101b
OptimizeInstructions: Optimize x | -1 ==> -1 even with side effects (#7439)
Fixes: #7438
2025-04-04 16:32:11 -07:00
Ruiyang Xu
8e4d3c1e3f
OptimizeInstructions: Optimize (unsigned)x > -1 ==> i32(0) even with side effects (#7437)
Fixes: #7435
2025-04-04 15:32:22 -07:00
Ruiyang Xu
7e9e9dd55c
OptimizeInstructions: Optimize (unsigned)x <= -1 ==> i32(1) even with side effects (#7436)
Fixes #7434
2025-04-04 13:43:48 -07:00
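The four OptimizeInstructions rules above (#7429, #7439, #7437, #7436) rest on identities that hold for every 32-bit value. A minimal C++ sketch, with hypothetical function names, treating `-1` as the all-ones bit pattern; it does not model how the pass preserves the operand's side effects, only that the result is constant:

```cpp
#include <cstdint>

// Each function mirrors one pattern from the commit titles above.
// The names are illustrative only and are not part of Binaryen.

// unsigned(x) >= 0 is true for every x.
constexpr bool unsignedGeZero(uint32_t x) { return x >= 0u; }

// x | -1 is all ones (-1) for every x.
constexpr uint32_t orAllOnes(uint32_t x) { return x | 0xFFFFFFFFu; }

// (unsigned)x > -1 is false for every x, since -1 is the largest unsigned value.
constexpr bool unsignedGtAllOnes(uint32_t x) { return x > 0xFFFFFFFFu; }

// (unsigned)x <= -1 is true for every x, for the same reason.
constexpr bool unsignedLeAllOnes(uint32_t x) { return x <= 0xFFFFFFFFu; }
```

Because the result never depends on `x`, the only thing an optimizer has to keep from the operand is whatever side effects evaluating it may have.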
Alon Zakai
4734c0e742
ClusterFuzz: Bundle mimalloc (#7450)
The emsdk's linux builds now include mimalloc, so we must bundle
it for ClusterFuzz, same as we already do for libc++.
2025-04-04 13:43:02 -07:00
Alon Zakai
63186dcf70
[GC] Fix global handling in TypeRefiningGUFA (#7451)
TypeRefiningGUFA refines struct types based on the types that flow into fields,
and looks past the immediate types in the IR. As a result, it can require
casts when we end up storing something that appears not-sufficiently-
refined to wasm validation rules, in ways that cannot occur with normal
TypeRefining.

The specific problem that can happen is that casts are disallowed in
globals, so we must be careful to not refine too much there.
2025-04-04 12:48:01 -07:00
Thomas Lively
62b55de0e6
Avoid overflow UB in random.cpp (#7449)
The random number generator adds into `xorFactor` in various places.
This variable was previously signed, so overflows from these adds were
UB. Make it unsigned to avoid UB.
2025-04-04 15:58:44 +00:00
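The reasoning in #7449 above can be illustrated in isolation: in C++, overflow on a signed add is undefined behavior, while unsigned arithmetic wraps modulo 2^N. A minimal sketch, where `addWrapping` is a hypothetical helper and not code from random.cpp:

```cpp
#include <cstdint>

// Unsigned addition is defined to wrap modulo 2^32, so accumulating into an
// unsigned variable can never trigger overflow UB.
// (Hypothetical helper for illustration; not taken from random.cpp.)
constexpr uint32_t addWrapping(uint32_t acc, uint32_t delta) {
  return acc + delta;
}
```

With a signed accumulator, the same add at the maximum value would be undefined rather than wrapping to zero.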
Thomas Lively
5dd2d41aa6
[NFC] Use HeapTypeDef more widely (#7446)
Update `TypeBuilder::build` to return a vector of `HeapTypeDef`.
Although `HeapTypeDef` is implicitly convertible to `HeapType`,
containers of `HeapTypeDef` are not implicitly convertible to containers
of `HeapType`, so this change requires several other utilities and
passes to start using `HeapTypeDef` directly and transitively.
2025-04-04 08:30:18 -07:00
Thomas Lively
5ad90cbcb8
Use standard type printing for BINARYEN_PRINT_FULL (#7447)
`printTypeOrName`, used to print types for BINARYEN_PRINT_FULL and a few
other niche use cases, previously had bespoke printing for references
that did not follow the standard format. To improve the output and
reduce the number of ways we print types, change it to use the standard
printing utility.
2025-04-04 08:13:26 -07:00
Alon Zakai
e0c20f3320
Add missing SIMD fuzzing (#7445)
This fixes almost all the current TODOs.
2025-04-03 16:43:05 -07:00
Thomas Lively
fbf08efb9b
[NFC] Introduce HeapTypeDef and use it for printing (#7444)
Now that we have exact heap types, non-abstract heap types no longer
correspond 1:1 with heap type definitions. We previously used a single
type, `HeapType`, to represent both heap types and heap type
definitions. Now that `HeapType` can represent multiple heap types
corresponding to the same definition, there is a new class of potential
bugs in which code that expects `HeapType` values to map 1:1 with heap
type definitions observes both an exact and inexact heap type for the
same definition.

To eliminate this class of bugs, introduce a new type, `HeapTypeDef`,
whose values do correspond 1:1 with heap type definitions. `HeapTypeDef`
is a subclass of `HeapType`, so it supports all the same functionality,
but it cannot represent exact heap types because its constructor clears
the exact bit.

As an initial proof-of-concept, use HeapTypeDef in the type printing
machinery, which only cares about heap type names corresponding to heap
type definitions. Future PRs will use HeapTypeDef in more places.
2025-04-03 15:33:07 -07:00
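The design described in #7444 above, a subclass whose constructor clears the exact bit, can be sketched roughly as follows. All names and the bit position here are assumptions for illustration; they are not Binaryen's actual `HeapType` layout:

```cpp
#include <cstdint>

// Hypothetical bit position; the real representation is not given above.
constexpr uintptr_t kExactBit = 1u << 1;

// Stand-in for a heap type that may or may not be exact.
struct SketchHeapType {
  uintptr_t id;
  explicit SketchHeapType(uintptr_t id) : id(id) {}
  bool isExact() const { return (id & kExactBit) != 0; }
};

// Stand-in for a heap type definition: constructing it from any heap type
// drops the exact bit, so exact and inexact variants of the same definition
// compare equal.
struct SketchHeapTypeDef : SketchHeapType {
  SketchHeapTypeDef(SketchHeapType type)
    : SketchHeapType(type.id & ~kExactBit) {}
};
```

Code that needs a 1:1 mapping to definitions, such as name printing, can then take the `Def` type and never observe two values for one definition.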
Thomas Lively
3ce02b645f
[NFC] printHeapType => printHeapTypeName in Print.cpp (#7443)
Rename the function in anticipation of exact heap types appearing in the
IR. When an expression like `StructNew` has an exact heap type, the
`exact` does not appear in `struct.new $foo`. In this case `$foo` is not
the full heap type, but rather the name of the heap type definition.
2025-04-03 13:22:17 -07:00
Thomas Lively
9cef51c10e
[NFC] Take a HeapType instead of Type in RefFunc::finalize (#7442)
This removes from the callers the burden of constructing the type with
the correct nullability and other future attributes.
2025-04-03 18:47:53 +00:00
Alon Zakai
01e0eea42f
[GC] Add a TypeRefiningGUFA pass (#7433)
This variation of TypeRefining uses GUFA to determine what types to refine
struct fields to. GUFA does a (slow) whole-program analysis which can infer
things the normal pass cannot, e.g. refinements that contain cycles through
things like locals or globals.

This is mainly a proof of concept, as it is pretty slow to compute GUFA just
for this, and while I see improvements on real-world code, they are minor.
If we find that the benefits here are worth it, a larger refactoring could
do this optimization in the existing GUFA pass (which already does the
computation of the graph anyhow).
2025-04-03 09:30:20 -07:00
Thomas Lively
862aeb9ecc
Parse and emit exact heap types (#7432)
Implement text and binary parsing for exact heap types as well as binary
emitting.
2025-04-02 16:55:09 -07:00
Gulg
570de7968d
[C/JS APIs] Allow JS and C to read the start function of a module (#7424)
This PR adds `BinaryenGetStart` to the C API and the corresponding
`module.getStart` JS wrapper.
2025-04-02 16:41:41 -07:00
Alon Zakai
85fc8bbb01
Fix typo in fuzz skipping of a test (#7430) 2025-04-02 14:10:58 -07:00
Jérôme Vouillon
df26f96333
Stack switching: fix some optimization passes (#7271)
This continues #7041 by adapting the optimization passes to work with
the stack switching instructions.
2025-04-02 11:19:44 -07:00
Alon Zakai
52ac4c11e2
[GC] RemoveUnusedBrs must not un-refine sent types (#7421)
The pass refines cast types of br_on, but that will un-refine the sent
type of a br_on_cast_fail - a more refined cast type means more things
fail it, so more things are sent. We need to avoid that.
2025-04-02 08:58:11 -07:00
Ashley Nelson
05a8c8d0da
[Outlining] Filter in nested control flow (#7411)
Fixes a bug where stringify walker was not removing outlining candidates
with restricted expressions because they were in nested control flow.
2025-04-01 16:21:55 -07:00
Alon Zakai
a7d93efd82
[Strings] Handle encoding in JSON parsing so StringLifting can handle arbitrary custom section content (#7414)
Rather than encode to WTF8 and re-encode, instead make the unescaping logic go
from UTF8 straight to WTF16. That makes it simpler and more efficient.

Make the JSON parser get a parameter for which encoding to use for strings, so
we can use ASCII in old places.
2025-04-01 16:07:14 -07:00
Alon Zakai
f77a69d7f6
OptimizeInstructions: Handle all binary and unary expressions emitting zero bits (#7413)
We have many hardcoded rules, but also have general logic that computes
the max bits. If the max bits are 0, we can replace with a 0.

Fixes #7406
2025-03-31 11:55:22 -07:00
Derek Schuff
1fd0085381
Allow using mimalloc with dynamic linking (#7391)
With dynamic linking, build and link mimalloc's dynamic library, and include it
in the installation (this also brings along the headers and CMake files, but it
seemed like more trouble than it was worth to try to manually install just the
library or remove the extras).
The static build remains the same.

Remove the restriction that mimalloc can only be linked into a static-lib build.
2025-03-28 18:01:11 -07:00
Thomas Lively
d8b4c569c7
Test StringLowering with a surrogate pair (#7415)
We tested isolated surrogates, but we did not test with a valid
surrogate pair.
2025-03-28 00:20:24 +00:00
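For context on #7415 above: a valid surrogate pair is the UTF-16 encoding of a code point above U+FFFF, as opposed to an isolated surrogate that has no partner. A minimal sketch of the standard encoding, with `toSurrogatePair` as a hypothetical helper that is not part of Binaryen:

```cpp
#include <cstdint>
#include <utility>

// Split a code point in the range U+10000..U+10FFFF into its UTF-16
// surrogate pair: {high surrogate, low surrogate}.
// (Hypothetical helper for illustration; the caller must pass a code point
// in that range.)
inline std::pair<uint16_t, uint16_t> toSurrogatePair(uint32_t codePoint) {
  uint32_t v = codePoint - 0x10000;  // 20 significant bits remain
  uint16_t high = static_cast<uint16_t>(0xD800 + (v >> 10));   // top 10 bits
  uint16_t low = static_cast<uint16_t>(0xDC00 + (v & 0x3FF));  // low 10 bits
  return {high, low};
}
```

For example, U+1F600 encodes as the pair 0xD83D, 0xDE00.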
Alon Zakai
a12de94e27
[Strings] Unescape in JSON parsing, so StringLifting can read escaped strings (#7410) 2025-03-27 13:12:36 -07:00
Thomas Lively
43d635c544
Subtyping and LUBs for exact heap types (#7412)
Also fix `with(Inexact)` to no longer strip sharedness from basic heap
types. Add exact heap types, defined subtypes, and exact subtypes to the
gtest for heap type relations.
2025-03-27 09:42:29 -07:00
Ashley Nelson
387552935a
[Outlining] Insert Unreachable (#7400)
When the last instruction of the outlined sequence is unreachable, we
need to insert an unreachable instruction immediately after the call to
the outlined function. This maintains the unreachable type in the
original scope of the outlined sequence.
2025-03-26 17:35:37 -07:00
Thomas Lively
5f6ba291de
Initial support for exact heap types (#7396)
The custom descriptors proposal has moved exactness from reference types
to defined (but not abstract) heap types. Since we only use a bit in the
heap type representation to represent sharedness for abstract heap
types, we can conveniently reuse the same bit to represent exactness for
heap types.

Implement basic support for representing exact heap types and taking
them into account in canonicalization. Also ensure that other operations
like getting the rec group of a heap type or looking up its structure
work properly on exact heap types.
2025-03-26 17:19:59 -07:00
Alon Zakai
2f075b5c27
Avoid repeated work in RemoveUnusedModuleElements::addReferences() [NFC] (#7407)
Globals that refer to globals lead to more work. This work cannot be
infinite, as globals only refer to previous ones, but this can end up as
exponential time, so this is actually important to optimize here.

For exponential time, it is enough to have a chain of these:

 (global $global$N+1 (ref $A) (struct.new $A
  (global.get $global$N)
  (global.get $global$N)
 ))

That is, two references from each global to its predecessor, causing
us to double the work each time we scan back.

Fixes #7405 (where the above pattern appears)
2025-03-26 14:49:51 -07:00
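The doubling described in #7407 above can be modeled without any Binaryen code. In this sketch, global `n` refers twice to global `n - 1`; the function names and the `seen` vector are assumptions for illustration, not the pass's actual data structures:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Count how many globals are scanned starting from global n when nothing is
// remembered: each global rescans its predecessor twice, so work doubles.
uint64_t visitsNaive(size_t n) {
  if (n == 0) {
    return 1;
  }
  return 1 + 2 * visitsNaive(n - 1);
}

// The same scan, but each global is processed at most once.
// `seen` must have at least n + 1 entries, all initially false.
uint64_t visitsMemoized(size_t n, std::vector<bool>& seen) {
  if (seen[n]) {
    return 0;  // already handled, no repeated work
  }
  seen[n] = true;
  uint64_t visits = 1;
  if (n > 0) {
    visits += visitsMemoized(n - 1, seen);
    visits += visitsMemoized(n - 1, seen);
  }
  return visits;
}
```

For a chain of length `n`, the naive scan touches 2^(n+1) - 1 globals while the memoized one touches n + 1.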
Alon Zakai
ca5d9dba1e
[Strings] Support the custom section for strings in StringLifting (#7409) 2025-03-26 14:26:22 -07:00
Alon Zakai
d262701cde
[Strings] Allow customizing the module name for string constants in StringLifting/Lowering (#7399)
Previously we hardcoded "'" (a single quote).
2025-03-26 11:43:22 -07:00
Thomas Lively
a5f0423cde
Use fewer bits for BasicHeapType (#7404)
Now that we aren't supporting exact reference types, we no longer need
to leave bit 2 free for use by the Type representation. Shift the basic
HeapType representations down to start at bit 2 instead of bit 3.
2025-03-26 11:11:30 -07:00
Thomas Lively
fbf20108c9
Revert exact reference types (#7402)
We decided that Custom Descriptors should introduce exact heap types
rather than exact reference types. Although these new features are very
similar, the APIs we need to change for them are completely different.

One option would have been to keep the existing exact reference type
implementation while additionally implementing exact heap types, but
there are not enough free bits in the type implementation to have both
at once without increasing the alignment of HeapTypeInfo allocations.
Portably increasing the alignment is annoying enough that it's easier to
just eagerly remove exact reference types to free up the bit for use
with exact heap types.

Fully or partially revert the following PRs:

 - #7371
 - #7365
 - #7360
 - #7357
 - #7356
 - #7355
 - #7354
 - #7353
 - #7347
 - #7342
 - #7328

Keep the new `.with(...)` Type APIs and the relevant parts of the type
relations gtest that were introduced as part of the reverted work.
2025-03-26 09:36:57 -07:00
Ashley Nelson
0997f9b3f1
[Outlining] Separating Filter Branch & Return Tests (#7401)
In the interest of improved test readability, moved the return part of
the FilterBranches test into its own test. Also condensed the test to
remove the unnecessary consts.
2025-03-26 04:14:34 +00:00
Ashley Nelson
1f01a77521
[Outlining] Remove overlapping sequences (#7146)
While determining whether repeat sequences of instructions are
candidates for outlining, remove sequences that overlap, giving weight
to sequences that are longer and appear more frequently.
2025-03-25 17:40:24 -07:00
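One way to picture the overlap removal in #7146 above is a greedy filter. This is only a sketch under stated assumptions: the `Candidate` struct, the `length * count` weight, and the single `start` position per candidate are all simplifications for illustration and may differ from what the Outlining pass actually does:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical, simplified outlining candidate.
struct Candidate {
  size_t start;   // index of the first instruction in the sequence
  size_t length;  // number of instructions in the sequence
  size_t count;   // how many times the sequence repeats in the program
};

// Keep the heaviest candidates first and drop any later candidate whose
// instruction range overlaps one that was already kept.
std::vector<Candidate> dropOverlaps(std::vector<Candidate> candidates) {
  // Assumed weight: longer and more frequent sequences rank higher.
  std::stable_sort(candidates.begin(), candidates.end(),
                   [](const Candidate& a, const Candidate& b) {
                     return a.length * a.count > b.length * b.count;
                   });
  std::vector<Candidate> kept;
  for (const Candidate& c : candidates) {
    bool overlaps = std::any_of(
      kept.begin(), kept.end(), [&](const Candidate& k) {
        // Half-open ranges [start, start + length) intersect.
        return c.start < k.start + k.length && k.start < c.start + c.length;
      });
    if (!overlaps) {
      kept.push_back(c);
    }
  }
  return kept;
}
```

Sorting by weight before filtering means that when two candidates collide, the one expected to save more code is the one that survives.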
Alon Zakai
aee292be6e
[strings] Add a StringLifting pass (#7389)
This converts imported string constants into string.const, and imported
string instructions into string.* expressions. After this pass they are
represented using stringref and we can optimize them fully (e.g.
precomputing a string.concat of two constants). Typically a user would
later lower them back down using StringLowering.

This pass allows users to avoid emitting stringref directly, which means
they are emitting standard wasm which can run in VMs, leaving wasm-opt
entirely optional.

Also refactor a few shared constants with StringLowering into a helper file.

Left as TODOs: contents of the strings custom section, and casts (see
comments in source).

Fixes most of #7370
2025-03-25 12:46:44 -07:00
Alon Zakai
6a6e08057c
Release version 123 (#7394)
Given the big speedup in our official release binaries for Linux, this
seems useful to get to users quickly.
2025-03-25 11:13:08 -07:00
Alon Zakai
52a15b3e0e
[NFC] Fix help text for tools that emit binary (#7398)
Our tools all stated that if -o is not provided, we write to stdout by
default. But that is not true - we do nothing in that case. I believe the
rationale is that we write text to stdout by default in say wasm2js, but
for wasm-opt etc. we don't want to write binary to stdout (as that is
dangerous).
2025-03-25 11:12:29 -07:00
Thomas Lively
def4095995
[NFC] Use a lambda instead of a macro in gtest (#7395)
The length macro used in a type test in type-builder.cpp was causing
extremely long compile times in some compilers. Use a lambda instead to
fix it. This makes the error messages less useful when a test fails, but
under normal circumstances the test should not be failing, so this is a
good trade off.

Fixes #7383.
2025-03-24 20:48:44 -07:00
Thomas Lively
8958df4115
Validate descriptor declarations (#7392)
To ensure soundness, there are very particular rules about how described
and descriptor type declarations must relate to one another and their
supertypes. Implement and test these rules.
2025-03-24 11:06:01 -07:00
Thomas Lively
7508e81f0d
Parsing and binary writing for custom descriptors (#7387)
Implement text and binary parsing as well as binary writing for
`descriptor` and `describes` clauses, as specified in the
custom-descriptors proposal. Also simplify some neighboring code dealing
with shared types as a drive-by.
2025-03-21 17:06:43 -07:00
Thomas Lively
835a178ae4
Fix RefFunc type updating in I64ToI32Lowering (#7390)
We recently started updating RefFunc types in I64ToI32Lowering to ensure
that their types matched the updated types of their functions. But the
way we updated the types of RefFunc expressions and Functions were not
the same. When updating Functions that return i64, we replace the result
type with i32 and use a global to propagate the remaining bits to the
caller. Previously when updating RefFunc result types, we would instead
split i64s into pairs of i32s, depending on multivalue to lower the
type. Update the logic for updating RefFunc results to match the
existing logic for updating Functions.
2025-03-21 14:39:35 -07:00
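The Function lowering that #7390 above aligns RefFunc with can be sketched in plain C++. The names are hypothetical, and the choice to return the low half while the global carries the high half is an assumption; the commit message only says a global propagates the remaining bits:

```cpp
#include <cstdint>

// Hypothetical stand-in for the global that carries the remaining 32 bits.
static uint32_t highBits = 0;

// Lowered form of a function that originally returned an i64: the result
// type becomes i32 and the other half travels through the global.
// (Assumption: the low half is the one returned directly.)
uint32_t loweredReturn(uint64_t value) {
  highBits = static_cast<uint32_t>(value >> 32);
  return static_cast<uint32_t>(value);
}

// What a caller does after the call: combine the returned half with the
// half read back from the global.
uint64_t reassemble(uint32_t low) {
  return (static_cast<uint64_t>(highBits) << 32) | low;
}
```

A RefFunc whose signature was instead rewritten to return a pair of i32s would no longer match such a function, which is the mismatch the commit describes.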
Thomas Lively
cd3b26d6e2
Require RefFunc to have the proper type (#7376)
As a holdout from before GC was implemented, we previously allowed
RefFunc expressions to have type `funcref` rather than a specific
signature type matching that of the referenced function. Remove this
allowance and start requiring the types to be correct and precise to
eliminate the possibility of stale types inhibiting (or invalidating!)
optimizations.

Update various older passes to update the types of RefFuncs, including
those in tables, to keep their output passing validation. Also update
the kitchen sink example test to construct RefFunc expressions with the
correct type via the C API.
2025-03-21 10:03:53 -07:00
Derek Schuff
3f341e5019
Exclude the mimalloc files from the Binaryen install (#7386)
Since mimalloc is linked statically into the Binaryen tools, none of its
files need to be installed with Binaryen.

Also use CMAKE_SYSTEM_NAME instead of LINUX, as the latter was
introduced in CMake 3.25.
2025-03-20 16:42:31 -07:00
Sam Clegg
d98e3c46bc
[test] Use test-specific filename when running spec tests (#7384)
This means that the intermediate files don't conflict with each other
and you can inspect them by name after the test run.
2025-03-20 21:27:17 +00:00
Daniel Lehmann
adfdb1bebd
Do not pass C++-only flag to C compiler [NFC] (#7382)
And some minor drive-by fixes: gitignore Ninja in-tree build file,
update mimalloc to latest stable release.

This gets rid of warnings in the mimalloc cmake/make step. See
https://github.com/microsoft/mimalloc/issues/1031#issuecomment-2740351301
and https://github.com/microsoft/mimalloc/issues/1038.
2025-03-20 12:03:45 -07:00