1
1
mirror of https://github.com/systemd/systemd-stable.git synced 2025-01-21 18:03:41 +03:00

42 Commits

Author SHA1 Message Date
Topi Miettinen
75723d31a6 units: udev: partially emulate ProtectClock=
Drop CAP_SYS_TIME and CAP_WAKE_ALARM capabilities and block clock-related
system calls. Update TODO.
2022-09-26 11:40:28 +02:00
Yu Watanabe
f562abe296 unit: drop ProtectClock=yes from systemd-udevd.service
This partially reverts cabc1c6d7adae658a2966a4b02a6faabb803e92b.

The setting ProtectClock= implies DeviceAllow=, which is not suitable
for udevd. Although we are slowly removing cgropsv1 support, but
DeviceAllow= with cgroupsv1 is necessarily racy, and reloading PID1
during the early boot process may cause issues like #24668.

Let's disable ProtectClock= for udevd. And, if necessary, let's
explicitly drop CAP_SYS_TIME and CAP_WAKE_ALARM (and possibly others)
by using CapabilityBoundingSet= later.

Fixes #24668.
2022-09-16 03:41:29 +09:00
Yu Watanabe
a1f4fd3876 udev: run the main process, workers, and spawned commands in /udev subcgroup
And enable cgroup delegation for udevd.
Then, processes invoked through ExecReload= are assigned .control
subcgroup, and they are not killed by cg_kill().

Fixes #16867 and #22686.
2022-03-17 20:24:38 +09:00
Maciek Borzecki
0ddd608a6d units/systemd-udevd: allow bpf() syscall
Programs run by udev triggers may need to execute the bpf() syscall. Even more
so, since on a cgroup v2 system, the only way to set up device access filtering
is to install a BPF program on the cgroup in question and one way of passing
data to such program is through BPF maps, which can only be access using the
bpf() syscall. One such use case was identified in RHBZ#2025264 related to
snap-device-helper, and led to RHBZ#2027627 being filed.

Unfortunately there is no finer grained control over what gets passed in the
syscall, so just enable bpf() and leave fine grained mediation to other
security layers (eg. SELinux).

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2027627

Signed-off-by: Maciek Borzecki <maciek.borzecki@gmail.com>
2021-12-07 07:37:54 +01:00
Zbigniew Jędrzejewski-Szmek
059cc610b7 meson: use jinja2 for unit templates
We don't need two (and half) templating systems anymore, yay!

I'm keeping the changes minimal, to make the diff manageable. Some enhancements
due to a better templating system might be possible in the future.

For handling of '## ' — see the next commit.
2021-05-19 10:24:43 +09:00
Yu Watanabe
6671ff5bad unit: update comment about OOM score
Follow-up for 6b2229c6c60d0486f5eb9ed3088f9c780d7c0233.
2020-11-23 22:21:40 +01:00
Yu Watanabe
db9ecf0501 license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
Zbigniew Jędrzejewski-Szmek
41a7c3bf5d units: uppercase the description
https://github.com/systemd/systemd/pull/15982#pullrequestreview-422536495
2020-06-02 14:14:20 +02:00
Zbigniew Jędrzejewski-Szmek
d1109e12c0 udevd: update snippet string
Repeating the unit name in the description is not useful, and "manages devices"
is too cryptic.
2020-05-30 17:15:20 +02:00
Topi Miettinen
cabc1c6d7a units: add ProtectClock=yes
Add `ProtectClock=yes` to systemd units. Since it implies certain
`DeviceAllow=` rules, make sure that the units have `DeviceAllow=` rules so
they are still able to access other devices. Exclude timesyncd and timedated.
2020-04-07 15:37:14 +02:00
Zbigniew Jędrzejewski-Szmek
cdc6804b60 units: drop full paths for utilities in $PATH
This makes things a bit simpler and the build a bit faster, because we don't
have to rewrite files to do the trivial substitution. @rootbindir@ is always in
our internal $PATH that we use for non-absolute paths, so there should be no
functional change.
2020-01-20 16:50:16 +01:00
Zbigniew Jędrzejewski-Szmek
21d0dd5a89 meson: allow WatchdogSec= in services to be configured
As discussed on systemd-devel [1], in Fedora we get lots of abrt reports
about the watchdog firing [2], but 100% of them seem to be caused by resource
starvation in the machine, and never actual deadlocks in the services being
monitored. Killing the services not only does not improve anything, but it
makes the resource starvation worse, because the service needs cycles to restart,
and coredump processing is also fairly expensive. This adds a configuration option
to allow the value to be changed. If the setting is not set, there is no change.

My plan is to set it to some ridiculusly high value, maybe 1h, to catch cases
where a service is actually hanging.

[1] https://lists.freedesktop.org/archives/systemd-devel/2019-October/043618.html
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1300212
2019-10-25 17:20:24 +02:00
Yu Watanabe
ce8a4ef13d unit: add ExecReload= in systemd-udevd.service 2019-09-18 01:32:46 +09:00
Lennart Poettering
2ec71e439f journald: slightly bump OOM adjust for journald (#13366)
If logging disappears issues are hard to debug, hence let's give
journald a slight edge over other services when the OOM killer hits.

Here are the special adjustments we now make:

 systemd-coredump@.service.in OOMScoreAdjust=500
 systemd-journald.service.in  OOMScoreAdjust=-250
 systemd-udevd.service.in     OOMScoreAdjust=-1000

(i.e. the coredump processing is made more likely to be killed on OOM,
and udevd and journald are less likely to be killed)
2019-08-22 10:02:28 +02:00
Lennart Poettering
62aa29247c units: turn on RestrictSUIDSGID= in most of our long-running daemons 2019-04-02 16:56:48 +02:00
Topi Miettinen
99894b867f units: enable ProtectHostname=yes 2019-02-20 10:50:44 +02:00
Lennart Poettering
ee8f26180d units: switch from system call blacklist to whitelist
This is generally the safer approach, and is what container managers
(including nspawn) do, hence let's move to this too for our own
services. This is particularly useful as this this means the new
@system-service system call filter group will get serious real-life
testing quickly.

This also switches from firing SIGSYS on unexpected syscalls to
returning EPERM. This would have probably been a better default anyway,
but it's hard to change that these days. When whitelisting system calls
SIGSYS is highly problematic as system calls that are newly introduced
to Linux become minefields for services otherwise.

Note that this enables a system call filter for udev for the first time,
and will block @clock, @mount and @swap from it. Some downstream
distributions might want to revert this locally if they want to permit
unsafe operations on udev rules, but in general this shiuld be mostly
safe, as we already set MountFlags=shared for udevd, hence at least
@mount won't change anything.
2018-06-14 17:44:20 +02:00
Lennart Poettering
b2e8ae7380 units: switch udev service to use PrivateMounts=yes
Given that PrivateMounts=yes is the "successor" to MountFlags=slave in
unit files, let's make use of it for udevd.
2018-06-12 16:27:37 +02:00
Zbigniew Jędrzejewski-Szmek
a7df2d1e43 Add SPDX license headers to unit files 2017-11-19 19:08:15 +01:00
Lennart Poettering
0a9b166b43 units: prohibit all IP traffic on all our long-running services (#6921)
Let's lock things down further.
2017-10-04 14:16:28 +02:00
Lennart Poettering
bff8f2543b units: set LockPersonality= for all our long-running services (#6819)
Let's lock things down. Also, using it is the only way how to properly
test this to the fullest extent.
2017-09-14 19:45:40 +02:00
Alan Jenkins
1d422b153b units: order service(s) before udevd, not udev-trigger (coldplug)
Since hotplugs happen as soon as udevd is started, there is not much sense
in giving udev-trigger an After= dependency on any service.  The device
could be hotplugged before coldplug starts.

This is intended to avoid the race window where we create the hwdb with
the wrong selinux context (then fix it up afterwards).
https://github.com/systemd/systemd/issues/3458#issuecomment-322444107
2017-08-15 14:22:44 +01:00
Alan Jenkins
3533b49e74 units: Sockets= already implies Wants= and After= (systemd-udevd.service)
I grepped for other `After=` on a socket unit as well.  This was the only
instance.
2017-08-15 14:11:23 +01:00
Lennart Poettering
7f396e5f66 units: set SystemCallArchitectures=native on all our long-running services 2017-02-09 16:12:03 +01:00
Yu Watanabe
94f42fe3a6 units: systemd-udevd: add AF_INET and AF_INET6 to RestrictAddressFamilies= (#4296)
The udev builtin command `net_setup_link` requires AF_INET and AF_INET6.

Fixes #4293.
2016-10-06 15:40:53 +02:00
Lennart Poettering
0c28d51ac8 units: further lock down our long-running services
Let's make this an excercise in dogfooding: let's turn on more security
features for all our long-running services.

Specifically:

- Turn on RestrictRealtime=yes for all of them

- Turn on ProtectKernelTunables=yes and ProtectControlGroups=yes for most of
  them

- Turn on RestrictAddressFamilies= for all of them, but different sets of
  address families for each

Also, always order settings in the unit files, that the various sandboxing
features are close together.

Add a couple of missing, older settings for a numbre of unit files.

Note that this change turns off AF_INET/AF_INET6 from udevd, thus effectively
turning of networking from udev rule commands. Since this might break stuff
(that is already broken I'd argue) this is documented in NEWS.
2016-09-25 10:52:57 +02:00
Franck Bui
de2edc008a udev: bump TasksMax to inifinity (#3593)
udevd already limits its number of workers/children: the max number is actually
twice the number of CPUs the system is using.

(The limit can also be raised with udev.children-max= kernel command line
option BTW).

On some servers, this limit can easily exceed the maximum number of tasks that
systemd put on all services, which is 512 by default.

Since udevd has already its limitation logic, simply disable the static
limitation done by TasksMax.
2016-06-23 22:31:01 +02:00
Lennart Poettering
c2fc2c2560 units: increase watchdog timeout to 3min for all our services
Apparently, disk IO issues are more frequent than we hope, and 1min
waiting for disk IO happens, so let's increase the watchdog timeout a
bit, for all our services.

See #1353 for an example where this triggers.
2015-09-29 21:55:51 +02:00
Tom Gundersen
62f908b53c udevd: hook up watchdog support
We are already sending watchdog notification, this tells PID1 to actually listen for
them and restart udevd in case it gets stuck.
2015-05-29 18:52:13 +02:00
Lennart Poettering
658f26b828 units: set KillMode=mixed for our daemons that fork worker processes
The daemons should really have the time to kill the workers first,
before systemd does it, hence use KillMode=mixed for these daemons.

https://bugs.freedesktop.org/show_bug.cgi?id=90051
2015-04-24 16:14:46 +02:00
Lennart Poettering
d8f0930eec units: move After=systemd-hwdb-update.service dependency from udev to udev-trigger
Let's move the hwdb regeneration a bit later. Given that hwdb is
non-essential it should be OK to allow udev to run without it until we
do the full trigger.

http://lists.freedesktop.org/archives/systemd-devel/2015-April/030074.html
2015-04-03 14:27:16 +02:00
Zbigniew Jędrzejewski-Szmek
d99ce93383 units: there is no systemd-udev-hwdb-update.service 2015-03-14 23:03:21 -04:00
Lennart Poettering
ecde7065f7 units: rebuild /etc/passwd, the udev hwdb and the journal catalog files on boot
Only when necessary of course, nicely guarded with the new
ConditionNeedsUpdate= condition we added.
2014-06-13 13:26:32 +02:00
Kay Sievers
72543b361d remove ReadOnlySystem and ProtectedHome from udevd and logind
logind needs access to /run/user/, udevd fails during early boot
with these settings
2014-06-04 01:41:15 +02:00
Lennart Poettering
417116f234 core: add new ReadOnlySystem= and ProtectedHome= settings for service units
ReadOnlySystem= uses fs namespaces to mount /usr and /boot read-only for
a service.

ProtectedHome= uses fs namespaces to mount /home and /run/user
inaccessible or read-only for a service.

This patch also enables these settings for all our long-running services.

Together they should be good building block for a minimal service
sandbox, removing the ability for services to modify the operating
system or access the user's private data.
2014-06-03 23:57:51 +02:00
Lennart Poettering
c2c13f2df4 unit: turn off mount propagation for udevd
Keep mounts done by udev rules private to udevd. Also, document how
MountFlags= may be used for this.
2014-03-20 04:16:39 +01:00
Tom Gundersen
edeb68c53f static-nodes: move creation of static nodes from udevd to tmpfiles
As of kmod v14, it is possible to export the static node information from
/lib/modules/`uname -r`/modules.devname in tmpfiles.d(5) format.

Use this functionality to let systemd-tmpfilesd create the static device nodes
at boot, and drop the functionality from systemd-udevd.

As an effect of this we can move from systemd-udevd to systemd-tmpfiles-setup-dev:

 * the conditional CAP_MKNOD (replaced by checking if /sys is mounted rw)
 * ordering before local-fs-pre.target (see 89d09e1b5c65a2d97840f682e0932c8bb499f166)
2013-07-08 21:26:24 +02:00
Lennart Poettering
b0afe214c0 units: order all udev services before sysinit.target, too
Not that it would matter much, but let's make things a bit more
systematic: early boot services shall order themselves before
sysinit.target, and nothing else.
2013-03-25 21:29:09 +01:00
Frederic Crozat
89d09e1b5c udevd: ensure static nodes are created before local-fs mount
static nodes (like /dev/loop-control) are created when systemd-udevd
is started and needed to mount loopback devices. Therefore,
local-fs-pre.target should be only started after systemd-udevd is
started.
2013-03-23 15:17:39 +01:00
Kay Sievers
3bf3cd95ab udevd: sort into sysinit instead of basic target 2013-03-12 15:56:19 +01:00
Lennart Poettering
47ec118473 units: don't enforce a holdoff time for journald, logind, udevd
These services should be restarted as quickly as possible if they fail,
and the extra safety net of the holdoff time is not necessary.
2012-07-18 02:31:52 +02:00
Colin Guthrie
51dfddc2cc units: Rename systemd-udev.service to systemd-udevd.service
This naming convention is more inline with other systemd daemon
unit names (systemd-logind.service, systemd-localed.service etc)

The companion .socket units have also been renamed, however the
-trigger and -settle units keep their current name as these are
not directly related to daemon process itself.
2012-07-02 23:21:51 +02:00