IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Coverity is unhappy because we use "line" in the assert that checks
the return value. It doesn't matter much, but let's clean this up.
Also, let's not assume that /proc/cmdline contains anything.
CID #1400219.
When running in Fedora "mock", / is a tmpfs and /home is not mounted. The test
assumes that /home will be a tmpfs only and only if we can unshare. Obviously,
this does not hold in this case, because unsharing is not possible, but /home
is still a tmpfs. Let's just skip the test, since it's fully legitimate to
mount either or both of / and /home as tmpfs.
test_exec_ambientcapabilities: exec-ambientcapabilities-nobody.service: exit status 0, expected 1
Sometimes we get just the last line, for example from the failure summary,
so make it as useful as possible.
This adds some extra paranoia: when we recursively chown a directory for
use with DynamicUser=1 services we'll now drop suid/sgid from all files
we chown().
Of course, such files should not exist in the first place, and noone
should get access to those dirs who isn't root anyway, but let's better
be safe than sorry, and drop everything we come across.
Add even more suid/sgid protection to DynamicUser= envionments: the
state directories we bind mount from the host will now have the nosuid
flag set, to disable the effect of nosuid on them.
The host mounts it like that, nspawn hence should do too.
Moreover, mount the file system after doing CLONEW_NEWIPC so that it
actually reflects the right mqueues. Finally, mount it wthout
considering it fatal, since POSIX mqueue support is little used and it
should be fine not to support it in the kernel.
The function is otherwise generic enough to toggle other bind mount
flags beyond MS_RDONLY (for example: MS_NOSUID or MS_NODEV), hence let's
beef it up slightly to support that too.
Humans are susceptible to making orthographic errors sometimes. A
misspelled "%systemd_post caek.service" would go unnoticed due to all
output from systemctl being discarded if and when %post runs.
To alleviate this, cease hiding outputs. Then, to account for the
potential absence of systemd from the system, add file checks so as
to not generate a "command not found" error.
This adds support for user to set & get reboot parameter for reboot.
As callee would be next issuing Reboot call same policy checks are being used.
If unit file issuing the reboot action defines RebootArgument (or similar) that
setting takes precedence.
Previously both run() and run_container() would free 'fds'. Let's fix
that, and let run() free it but make run_container() already remove all
fds from it, because that's what we actually want to do.
Fixes: #12073
Commit d85515edcf changed logic how reboot is
executed. That commit changed behavior to use emergency action reboot code path
to perform the reboot.
This inadvertently broke rebooting with argument:
$ systemctl reboot custom-reason
Restore original behavior so that if reboot service unit similar to
systemd-reboot.service is executed it is possible to override reboot reason
with "systemctl reboot ARG".
When "systemctl reboot ARG" is executed ARG is placed in file
/run/systemd/reboot-param and reboot is issued using logind's Reboot
dbus-service.
If RebootArgument is specified in systemd-reboot.service it takes precedence
over what systemctl sets.
Fixes: #11828
Invoking %systemd_tmpfiles (in %post) without any arguments, while
possible, will cause systemd-tmpfiles to process the entire system
configuration, rather than just the newly installed configuration
files. In https://github.com/systemd/systemd/pull/12048, it was
established that processing everything constitutes unusual practice,
and should be flagged as a mistake at build time.
Furthermore, invoking %systemd_post without any arguments will cause
the underlying `systemctl preset` to outright return an error ("Too
few arguments") when run. This can be flagged during build time in
the same manner.
As I have found no ways to successfully nest %if clauses inside a
macro[1], I am helping myself by reusing the recursive variable
expansion technique pioneered in [2].
Now, when %systemd_post or %systemd_tmpfiles is incorrectly used,
rpm gives accurate line number reporting, too:
error: This macro requires some arguments
error: line 11: %{systemd_post}
error: This macro requires two arguments
error: line 13: %{tmpfiles_create_package meh more more}
[1] what has been tried: %{expand:%%if "%#" == 0 \\\
%%{error:you have given me %# args} \\\
%%endif}
[2] http://git.savannah.gnu.org/cgit/automake.git/commit/?id=e0bd4af16da88e4c2c61bde42675660eff7dff51
As general principle, we generally check command line args first, the
enviroment second, and external configuration and system state only later.
In case of the invocation ID, checking the keyring before the environment
was implemented as a poor-man's security measure. But this is not really
useful, since we're moving within the same security boundary. So let's just
do the expected thing, and check environment first.
Prompted by https://github.com/systemd/systemd/pull/11991#issuecomment-474647652.
This makes two changes:
1. Instead of resetting the configured service TTY each time after a
process exited, let's do so only when the service goes back to "dead"
state. This should be preferable in case the started processes leave
background child processes around that still reference the TTY.
2. chmod() and chown() the TTY at the same time. This should make it
safe to run "systemd-run -p DynamicUser=1 -p StandardInput=tty -p
TTYPath=/dev/tty8 /bin/bash" without leaving a TTY owned by a dynamic
user around.
Since switching to extract_first_word with no flags for parsing
unit names in 4c9565eea5, escape
characters will be stripped from escaped unit names such as
"mnt-persistent\x2dvolume.mount" resulting in the unit not being
configured as defined. Preserve escape characters again for
compatibility with existing preset definitions.
On arm64 with gcc-8.2.1-5.fc29.aarch64:
../src/test/test-fileio.c:645:29: warning: comparison is always false due to limited range of data type [-Wtype-limits]
assert_se(c == EOF || safe_fgetc(f, &c) == 1);
^~
Casting c to int is not enough, gcc is able to figure out that the original
type was unsigned and still warns. So let's just silence the warning like
in test-sizeof.c.
Fixes#12036.
../../../src/systemd/src/libsystemd/sd-bus/bus-objects.c: In function ‘add_object_vtable_internal’:
../../../src/systemd/src/basic/macro.h:423:19: error: duplicate case value
The thread launched in journal_file_set_offline() accesses a memory
mapped file, so it needs to handle SIGBUS. Leave SIGBUS unblocked on the
offlining thread so that it uses the same handler as the main thread.
The result of triggering SIGBUS in a thread where it's blocked is
undefined in Linux. The tested implementations were observed to cause
the default handler to run, taking down the whole journald process.
We can leave SIGBUS unblocked in multiple threads since it's handler is
thread-safe. If SIGBUS is sent to the journald process asynchronously
(i.e. with kill, sigqueue, or raise), either thread handling it will
result in the same behavior: it will install the default handler and
reraise the signal, killing the process.
Fixes: #12042
If we know that main pid is our child then it's unnecessary to watch all
other processes of a unit since in this case we will get SIGCHLD when the main
process will exit and will act upon accordingly.
So let's watch all processes only if the main process is not our child since in
this case we need to detect when the cgroup will become empty in order to
figure out when the service becomes dead. This is only needed by cgroupv1.
Some PIDs can remain in the watched list even though their processes have
exited since a long time. It can easily happen if the main process of a forking
service manages to spawn a child before the control process exits for example.
However when a pid is about to be mapped to a unit by calling unit_watch_pid(),
the caller usually knows if the pid should belong to this unit exclusively: if
we just forked() off a child, then we can be sure that its PID is otherwise
unused. In this case we take this opportunity to remove any stalled PIDs from
the watched process list.
If we learnt about a PID in any other form (for example via PID file, via
searching, MAINPID= and so on), then we can't assume anything.
It's a glibc-specific API, but supported on FreeBSD and musl too at
least, hence fairly common. This way we can reduce our calls to
realloc() as much as possible.
The loop around ttyname_r() makes it look like we use unbounded stack
allocations. We know that that paths have a maximum size, so let's simplify
the whole thing.
Replaces #12043.
It's probably unexpected if we do a recursive chown() when dynamic users
are used but not on static users.
hence, let's tweak the logic slightly, and recursively chown in both
cases, except when operating on the configuration directory.
Fixes: #11842
Let's reduce the chance of failure: if we can't apply the chmod/chown
requested, check if it's applied anyway, and if so, supress the error.
This is even race-free since we operate on an O_PATH fd anyway.
The function so far always returned -ECANCELLED, which is ignored in all
cases the function is invoked, except one: in unit_test_start_limit()
where -ECANCELLED is returned when the start limit is hit, which is part
of unit_start()'s protocol of return values.
Since the emergency_action() logic should be relatively generic and is
used in many places, let's drop the return value from it, since it's
constant anyway, and in alll cases useless. Instead, let's return it in
unit_test_start_limit(), where it's part of the protocol.
No change in behaviour.
Let's add a safety precaution: if the start condition checks for a unit
are tested too often and fail each time, let's rate limit this too.
This should add extra safety in case people define .path, .timer or
.automount units that trigger a service that as a conditoin that always
fails.
Just some renaming, no change in behaviour.
Background: I'd like to add more functions unit_test_xyz() that test
various things, hence let's streamline the naming a bit.
Introduced in 6d586a1371.
Reported by Felix Riemann in https://bugzilla.redhat.com/show_bug.cgi?id=1685286.
Reproducer:
for i in `seq 1 100`; do gdbus call --session -d org.freedesktop.systemd1 -m org.freedesktop.systemd1.Manager.StartUnit -o "/$(for x in `seq 0 28000`; do echo -n $x; done)" & done
socket_bind_to_ifindex() uses the the SO_BINDTOIFINDEX sockopt of kernel
5.0, with a fallback to SO_BINDTODEVICE on older kernels.
socket_bind_to_ifname() is a trivial wrapper around SO_BINDTODEVICE, the
only benefit of using it instead of SO_BINDTODEVICE directly is that it
determines the size of the interface name properly so that it also works
for unbinding. Moreover, it's an attempt to unify our invocations of the
sockopt with a size of strlen(ifname) rather than strlen(ifname)+1...
I had a test machine with ulimit -n set to 1073741816 through pam
("session required pam_limits.so set_all", which copies the limits from PID 1,
left over from testing of #10921).
test-execute would "hang" and then fail with a timeout when running
exec-inaccessiblepaths-proc.service. It turns out that the problem was in
close_all_fds(), which would go to the fallback path of doing close()
1073741813 times. Let's just fail if we hit this case. This only matters
for cases where both /proc is inaccessible, and the *soft* limit has been
raised.
(gdb) bt
#0 0x00007f7e2e73fdc8 in close () from target:/lib64/libc.so.6
#1 0x00007f7e2e42cdfd in close_nointr ()
from target:/home/zbyszek/src/systemd-work3/build-rawhide/src/shared/libsystemd-shared-241.so
#2 0x00007f7e2e42d525 in close_all_fds ()
from target:/home/zbyszek/src/systemd-work3/build-rawhide/src/shared/libsystemd-shared-241.so
#3 0x0000000000426e53 in exec_child ()
#4 0x0000000000429578 in exec_spawn ()
#5 0x00000000004ce1ab in service_spawn ()
#6 0x00000000004cff77 in service_enter_start ()
#7 0x00000000004d028f in service_enter_start_pre ()
#8 0x00000000004d16f2 in service_start ()
#9 0x00000000004568f4 in unit_start ()
#10 0x0000000000416987 in test ()
#11 0x0000000000417632 in test_exec_inaccessiblepaths ()
#12 0x0000000000419362 in run_tests ()
#13 0x0000000000419632 in main ()
When debugging failure in one of the cases, it's annoying to have to wade
through the output from all the other cases. Let's allow picking select
cases.
This is a pretty large patch, and adds support for OCI runtime bundles
to nspawn. A new switch --oci-bundle= is added that takes a path to an
OCI bundle. The JSON file included therein is read similar to a .nspawn
settings files, however with a different feature set.
Implementation-wise this mostly extends the pre-existing Settings object
to carry additional properties for OCI. However, OCI supports some
concepts .nspawn files did not support yet, which this patch also adds:
1. Support for "masking" files and directories. This functionatly is now
also available via the new --inaccesible= cmdline command, and
Inaccessible= in .nspawn files.
2. Support for mounting arbitrary file systems. (not exposed through
nspawn cmdline nor .nspawn files, because probably not a good idea)
3. Ability to configure the console settings for a container. This
functionality is now also available on the nspawn cmdline in the new
--console= switch (not added to .nspawn for now, as it is something
specific to the invocation really, not a property of the container)
4. Console width/height configuration. Not exposed through
.nspawn/cmdline, but this may be controlled through $COLUMNS and
$LINES like in most other UNIX tools.
5. UID/GID configuration by raw numbers. (not exposed in .nspawn and on
the cmdline, since containers likely have different user tables, and
the existing --user= switch appears to be the better option)
6. OCI hook commands (no exposed in .nspawn/cmdline, as very specific to
OCI)
7. Creation of additional devices nodes in /dev. Most likely not a good
idea, hence not exposed in .nspawn/cmdline. There's already --bind=
to achieve the same, which is the better alternative.
8. Explicit syscall filters. This is not a good idea, due to the skewed
arch support, hence not exposed through .nspawn/cmdline.
9. Configuration of some sysctls on a whitelist. Questionnable, not
supported in .nspawn/cmdline for now.
10. Configuration of all 5 types of capabilities. Not a useful concept,
since the kernel will reduce the caps on execve() anyway. Not
exposed through .nspawn/cmdline as this is not very useful hence.
Note that this only implements the OCI runtime logic itself. It does not
provide a runc-compatible command line tool. This is left for a later
PR. Only with that in place tools such as "buildah" can use the OCI
support in nspawn as drop-in replacement.
Currently still missing is OCI hook support, but it's already parsed and
everything, and should be easy to add. Other than that it's OCI is
implemented pretty comprehensively.
There's a list of incompatibilities in the nspawn-oci.c file. In a later
PR I'd like to convert this into proper markdown and add it to the
documentation directory.
Let's separate out the raw uid_t/gid_t handling from the username
handling. This is useful later on.
Also, let's use the right gid_t type for group types wherever
appropriate.
Everyone will be in trouble then (as quite widely caps are store in
64bit fields). But let's protect ourselves at least to the point that we
ignore all higher caps for now.
The kernel only allows dropping bounding caps as long as we have
CAP_SETPCAP. Hence, let's keep that before dropping the bounding caps,
and afterwards drop them too.
../src/shared/bootspec.c: In function ‘find_sections’:
../src/shared/bootspec.c:425:23: warning: comparison of integer expressions of different signedness: ‘ssize_t’ {aka ‘int’} and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
425 | if (n != size)
| ^~
When system manager is started first time or after switching root,
then the udev's device tag data do not exist yet.
So, let's not honor the enumeration results.
Fixes#11997.
The dbus external authentication takes as optional argument the UID the
sender wants to authenticate as. This uid is purely optional. The
AF_UNIX socket already conveys the same information through the
auxiliary socket data, so we really don't have to provide that
information.
Unfortunately, there is no way to send empty arguments, since they are
interpreted as "missing argument", which has a different meaning. The
SASL negotiation thus changes from:
AUTH EXTERNAL <uid>
NEGOTIATE_UNIX_FD (optional)
BEGIN
to:
AUTH EXTERNAL
DATA
NEGOTIATE_UNIX_FD (optional)
BEGIN
And thus the replies we expect as a client change from:
OK <server-id>
AGREE_UNIX_FD (optional)
to:
DATA
OK <server-id>
AGREE_UNIX_FD (optional)
Since the old sd-bus server implementation used the wrong reply for
"AUTH" requests that do not carry the arguments inlined, we decided to
make sd-bus clients accept this as well. Hence, sd-bus now allows
"OK <server-id>\r\n" replies instead of "DATA\r\n" replies.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
The correct way to reply to "AUTH <protocol>" without any payload is to
send "DATA" rather than "OK". The "DATA" reply triggers the client to
respond with the requested payload.
In fact, adding the data as hex-encoded argument like
"AUTH <protocol> <hex-data>" is an optimization that skips the "DATA"
roundtrip. The standard way to perform an authentication is to send the
"DATA" line.
This commit fixes sd-bus to properly send the "DATA" line. Surprisingly
no existing implementation depends on this, as they all pass the data
directly as argument to "AUTH". This will not work if we want to pass
an empty argument, though.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
Setting an access mode != 0666 is explicitly supported via -Dgroup-render-mode
In such a case, re-add the uaccess tag.
This is basically the same change that was done for /dev/kvm in
commit fa53e24130 and
ace5e3111c
and partially reverts the changes from
4e15a7343c
This makes L2TP.Local= support an empty string, 'auto', 'static', and
'dynamic'. When one of the values are specified, a local address is
automatically picked from the local interface of the tunnel.
When 'nomodeset' is specified, there's no DRM driver to take over from
efifb. This means no device will be marked as a seat master, so gdm will
never find a sufficiently active seat to start on.
I'm not aware of an especially good way to detect this through a proper
kernel API, so check for the word 'nomodeset' on the command line and
allow fbdev devices to be seat masters if found.
For https://bugzilla.redhat.com/show_bug.cgi?id=1683197.