IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
When we're running with sanitizers, sd-executor might pull in a
significant chunk of shared libraries on startup, that can cause a lot
of memory pressure and put us in the front when sd-oomd decides to go on
a killing spree. This is exacerbated further on Arch Linux when built
with gcc, as Arch ships unstripped gcc-libs so sd-executor pulls in over
30M of additional shared libs on startup:
~# lddtree build-san/systemd-executor
build-san/systemd-executor (interpreter => /lib64/ld-linux-x86-64.so.2)
libasan.so.8 => /usr/lib/libasan.so.8
libstdc++.so.6 => /usr/lib/libstdc++.so.6
libm.so.6 => /usr/lib/libm.so.6
libgcc_s.so.1 => /usr/lib/libgcc_s.so.1
libsystemd-core-255.so => /root/systemd/build-san/src/core/libsystemd-core-255.so
libaudit.so.1 => /usr/lib/libaudit.so.1
libcap-ng.so.0 => /usr/lib/libcap-ng.so.0
...
libseccomp.so.2 => /usr/lib/libseccomp.so.2
libubsan.so.1 => /usr/lib/libubsan.so.1
libc.so.6 => /usr/lib/libc.so.6
~# ls -Llh /usr/lib/libasan.so.8 /usr/lib/libstdc++.so.6 /usr/lib/libubsan.so.1
-rwxr-xr-x 1 root root 9.7M Feb 2 10:36 /usr/lib/libasan.so.8
-rwxr-xr-x 1 root root 21M Feb 2 10:36 /usr/lib/libstdc++.so.6
-rwxr-xr-x 1 root root 3.2M Feb 2 10:36 /usr/lib/libubsan.so.1
Sanitized libsystemd-core.so is also quite big:
~# ls -Llh /root/systemd/build-san/src/core/libsystemd-core-255.so /usr/lib/systemd/libsystemd-core-255.so
-rwxr-xr-x 1 root root 26M Feb 8 19:04 /root/systemd/build-san/src/core/libsystemd-core-255.so
-rwxr-xr-x 1 root root 5.9M Feb 7 12:03 /usr/lib/systemd/libsystemd-core-255.so
(cherry picked from commit 974fe6131f1fae5c31e18e0979a40d56a85c2c88)
Currently we spawn services by forking a child process, doing a bunch
of work, and then exec'ing the service executable.
There are some advantages to this approach:
- quick: we immediately have access to all the enourmous amount of
state simply by virtue of sharing the memory with the parent
- easy to refactor and add features
- part of the same binary, will never be out of sync
There are however significant drawbacks:
- doing work after fork and before exec is against glibc's supported
case for several APIs we call
- copy-on-write trap: anytime any memory is touched in either parent
or child, a copy of that page will be triggered
- memory footprint of the child process will be memory footprint of
PID1, but using the cgroup memory limits of the unit
The last issue is especially problematic on resource constrained
systems where hard memory caps are enforced and swap is not allowed.
As soon as PID1 is under load, with no page out due to no swap, and a
service with a low MemoryMax= tries to start, hilarity ensues.
Add a new systemd-executor binary, that is able to receive all the
required state via memfd, deserialize it, prepare the appropriate
data structures and call exec_child.
Use posix_spawn which uses CLONE_VM + CLONE_VFORK, to ensure there is
no copy-on-write (same address space will be used, and parent process
will be frozen, until exec).
The sd-executor binary is pinned by FD on startup, so that we can
guarantee there will be no incompatibilities during upgrades.
The test fails on my machine, running Debian stable, because
testsuite-55-testbloat.service just swaps and never goes over the
limit, so it's not killed. Use 'stress' instead which seems to be
able to overwhelm the swap too.
If we add a drop-in for init.scope (e.g.: to set some memory limit),
it will be loaded long after the cgroup has already been realized.
Do it again when creating the special unit.
Let's keep the debug logs in the journal, while logging only
testsute-*.sh stdout/stderr to the console (ba7abf7). This should make
the test output log a bit more readable and potentially the tests itself
a bit faster by avoiding console oversaturation.
Also, it should significantly reduce the size of artifacts kept by CIs.
Make oomctl a bit less likely to race with systemd-oomd receiving the
managed oom cgroup info by checking oomctl output in a loop with
timeout.
Fixes#21146
ASan is having a hard time to get its LD_PRELOAD= shenanigans straight
with all the shells flying around. Let's make it a bit easier by using
one of the nifty systemctl's features instead.
Compared to PID1 where systemd-oomd has to be the client to PID1
because PID1 is a more privileged process than systemd-oomd, systemd-oomd
is the more privileged process compared to a user manager so we have
user managers be the client whereas systemd-oomd is now the server.
The same varlink protocol is used between user managers and systemd-oomd
to deliver ManagedOOM property updates. systemd-oomd now sets up a varlink
server that user managers connect to to send ManagedOOM property updates.
We also add extra validation to make sure that non-root senders don't
send updates for cgroups they don't own.
The integration test was extended to repeat the chill/bloat test using
a user manager instead of PID1.
Pressure remains > 1% after a kill for some time and could cause
testchill to get killed. Bumping the limit from 1% to 20% should help
with this.
Fixes#20118
The commit 6f3ac0d51766b0b9101676cefe5c4ba81feba436 drops the prefix and
suffix in TAGS= property. But there exists several rules that have like
`TAGS=="*:tag:*"`. So, the property must be always prefixed and suffixed
with ":".
Fixes#17930.