IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Trivial performance boost by explicitly bypassing the implicit
locking of stdio.
This significantly affects common cases of `journalctl` usage:
Before:
# time ./journalctl -b -1 > /dev/null
real 0m26.628s
user 0m26.495s
sys 0m0.125s
# time ./journalctl -b -1 > /dev/null
real 0m27.069s
user 0m26.936s
sys 0m0.134s
# time ./journalctl -b -1 > /dev/null
real 0m26.727s
user 0m26.607s
sys 0m0.119s
After:
# time ./journalctl -b -1 > /dev/null
real 0m23.394s
user 0m23.244s
sys 0m0.142s
# time ./journalctl -b -1 > /dev/null
real 0m23.283s
user 0m23.160s
sys 0m0.121s
# time ./journalctl -b -1 > /dev/null
real 0m23.274s
user 0m23.125s
sys 0m0.144s
Fixes https://github.com/systemd/systemd/issues/6341
2d79a0bbb9 did that for TimeoutSec=,
89beff89ed did that for JobTimeoutSec=,
and 0004f698df did that for
x-systemd.device-timeout=. But after parsing x-systemd.device-timeout=xxx
we write it out as JobRunningTimeoutSec=xxx. Two options:
- write out JobRunningTimeoutSec=<a very big number>,
- change JobRunningTimeoutSec= to behave like the other options.
I think it would be confusing for JobRunningTimeoutSec= to have different
syntax then TimeoutSec= and JobTimeoutSec=, so this patch implements the
second option.
Fixes#6264, https://bugzilla.redhat.com/show_bug.cgi?id=1462378.
Currently we set 4096 as maximum for number of stream connections that
we accept. However maximum number of file descriptors that systemd is
willing to accept from us is just 1024. This means we can't retain all
stream connections that we accepted. Hence bump the limit of fds in a
unit file so that systemd holds open all stream fds while we are
restarted.
New limit is set to 4224 (4096 + 128).
If you specify "x-systemd.device-timeout" for an NFS mount
point, you get no warning and a meaningless device unit
dependency created.
Better to have a warning and no dependency.
I messed up when adding the definitions in 4278d1f531.
Unfortunately I didn't have the hardware at hand and went by
looking at the kernel headers.
(cherry picked from commit 53196fafcb7b24b45ed4f48ab894d00a24a6d871)
"Failed to add rule for system call access, ignoring: Numerical argument out of domain"
is confusing. Make that "... system call access() / 238".
(cherry picked from commit 977dc6ca5acb8069a2966ec63e7378576bc2ca51)
cgroup namespace wasn't useful for delegation because it allowed resource
control interface files (e.g. memory.high) to be written from inside the
namespace - this allowed the namespace parent's resource distribution to be
disturbed by its namespace-scoped children.
A new mount option, "nsdelegate", was added to cgroup v2 to address this issue.
The flag is meangingful only when mounting cgroup v2 in the init namespace and
makes a cgroup namespace a delegation boundary. The kernel feature is pending
for v4.13.
This should have been the default behavior on cgroup namespaces and this commit
makes systemd try "nsdelegate" first when trying to mount cgroup v2 and fall
back if the option is not supported.
Note that this has danger of breaking usages which depend on modifying the
parent's resource settings from the namespace root, which isn't a valid thing
to do, but such usages may still exist.
Introduces window_matches_fd() for the fd matching case in try_context(),
In find_mmap() we're already walking a list of windows by fd, checking
this is pointless work in a potentially hot loop with many windows.
Instead of context_detach_window() and a manual attach of the new
window, simply call context_attach_window() which performs the
detach first if appropriate.
generator_add_symlink() is extended to ignore EEXIST. This should be fine
for all existing callers.
There's a small difference in behaviour when adding symlinks in sysv-generator:
the message is more generic and does not include ", ignored". But creation of
symlinks shouldn't ever fail except if things are very wrong, so in practice
this shouldn't matter.
Test needed updating: os.path.exists(os.readlink(link)) only works if the link
is absolute (or if we are in the right directory). Let's just use
os.path.exists(link), which properly tests that the symlink target exists.
journal_file_move_to_object() can skip the second
journal_file_move_to() call if the first one already mapped a
sufficiently large area.
Now that mmap_cache_get() returns the size of the mapped area
when asked, ask for the size and only perform the second call if
the required size exceeds the mapped size instead of the object
header size.
This results in a nice performance boost in my testing, even with
a corpus of many small logs burning much CPU time elsewhere:
Before:
# time ./journalctl -b -1 --no-pager > /dev/null
real 0m16.330s
user 0m16.281s
sys 0m0.046s
# time ./journalctl -b -1 --no-pager > /dev/null
real 0m16.409s
user 0m16.358s
sys 0m0.048s
# time ./journalctl -b -1 --no-pager > /dev/null
real 0m16.625s
user 0m16.558s
sys 0m0.061s
After:
# time ./journalctl -b -1 --no-pager > /dev/null
real 0m15.311s
user 0m15.257s
sys 0m0.046s
# time ./journalctl -b -1 --no-pager > /dev/null
real 0m15.201s
user 0m15.135s
sys 0m0.062s
# time ./journalctl -b -1 --no-pager > /dev/null
real 0m15.170s
user 0m15.113s
sys 0m0.053s
If requested, return the actual mapping size to the caller in
addition to the address.
journal_file_move_to_object() often performs two successive
mmap_cache_get() calls via journal_file_move_to(); one to get the
object header, then another to get the entire object when it's
larger than the header's size.
If mmap_cache_get() returned the actual mapping's size, it's
probable that the second mmap_cache_get() could be skipped when
the established mapping already encompassed the desired size.
If an error is encountered in any of the Exec* lines, WorkingDirectory,
SELinuxContext, ApparmorProfile, SmackProcessLabel, Service (in .socket
units), User, or Group, refuse to load the unit. If the config stanza
has support, ignore the failure if '-' is present.
For those configuration directives, even if we started the unit, it's
pretty likely that it'll do something unexpected (like write files
in a wrong place, or with a wrong context, or run with wrong permissions,
etc). It seems better to refuse to start the unit and have the admin
clean up the configuration without giving the service a chance to mess
up stuff.
Note that all "security" options that restrict what the unit can do
(Capabilities, AmbientCapabilities, Restrict*, SystemCallFilter, Limit*,
PrivateDevices, Protect*, etc) are _not_ treated like this. Such options are
only supplementary, and are not always available depending on the architecture
and compilation options, so unit authors have to make sure that the service
runs correctly without them anyway.
Fixes#6237, #6277.
This has a long history; see see 5261ba9018
which originally introduced the behavior. Unfortunately that commit
doesn't include any rationale, but IIRC the basic issue is that
systemd wants to model the real mount state as units, and symlinks
make canonicalization much more difficult.
At the same time, on a RHEL6 system (upstart), one can make e.g. `/home` a
symlink, and things work as well as they always did; but one doesn't have
access to the sophistication of mount units (dependencies, introspection, etc.)
Supporting symlinks here will hence make it easier for people to do upgrades to
RHEL7 and beyond.
The `/home` as symlink case also appears prominently for OSTree; see
https://ostree.readthedocs.io/en/latest/manual/adapting-existing/
Further work has landed in the nspawn case for this; see e.g.
d944dc9553
A basic limitation with doing this in the fstab generator (and that I hit while
doing some testing) is that we obviously can't chase symlinks into mounts,
since the generator runs early before mounts. Or at least - doing so would
require multiple passes over the fstab data (as well as looking at existing
mount units), and potentially doing multi-phase generation. I'm not sure it's
worth doing that without a real world use case. For now, this will fix at least
the OSTree + `/home` <https://bugzilla.redhat.com/show_bug.cgi?id=1382873> case
mentioned above, and in general anyone who for whatever reason has symlinks in
their `/etc/fstab`.