1
0
mirror of https://github.com/systemd/systemd.git synced 2025-01-06 17:18:12 +03:00
Commit Graph

5256 Commits

Author SHA1 Message Date
Daan De Meyer
fa693fdc7e core: Add support for PrivateUsers=identity
This configures an indentity mapping similar to
systemd-nspawn --private-users=identity.
2024-09-09 18:31:01 +02:00
Lennart Poettering
7a3223f509
Merge pull request #34258 from yuwata/nspawn-volatile-u
nspawn: make --volatile work with -U
2024-09-09 17:11:11 +02:00
Yu Watanabe
ef32235db1
Merge pull request #34067 from LukeShu/lukeshu/nspawn-fuse
nspawn: enable FUSE in containers
2024-09-09 19:32:16 +09:00
Luke T. Shumaker
dc3223919f nspawn: enable FUSE in containers
Linux kernel v4.18 (2018-08-12) added user-namespace support to FUSE, and
bumped the FUSE version to 7.27 (see: da315f6e0398 (Merge tag
'fuse-update-4.18' of
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse, Linus Torvalds,
2018-06-07).  This means that on such kernels it is safe to enable FUSE in
nspawn containers.

In outer_child(), before calling copy_devnodes(), check the FUSE version to
decide whether enable (>=7.27) or disable (<7.27) FUSE in the container.  We
look at the FUSE version instead of the kernel version in order to enable FUSE
support on older-versioned kernels that may have the mentioned patchset
backported ([as requested by @poettering][1]).  However, I am not sure that
this is safe; user-namespace support is not a documented part of the FUSE
protocol, which is what FUSE_KERNEL_VERSION/FUSE_KERNEL_MINOR_VERSION are meant
to capture.  While the same patchset
 - added FUSE_ABORT_ERROR (which is all that the 7.27 version bump
   is documented as including),
 - bumped FUSE_KERNEL_MINOR_VERSION from 26 to 27, and
 - added user-namespace support
these 3 things are not inseparable; it is conceivable to me that a backport
could include the first 2 of those things and exclude the 3rd; perhaps it would
be safer to check the kernel version.

Do note that our get_fuse_version() function uses the fsopen() family of
syscalls, which were not added until Linux kernel v5.2 (2019-07-07); so if
nothing has been backported, then the minimum kernel version for FUSE-in-nspawn
is actually v5.2, not v4.18.

Pass whether or not to enable FUSE to copy_devnodes(); have copy_devnodes()
copy in /dev/fuse if enabled.

Pass whether or not to enable FUSE back over fd_outer_socket to run_container()
so that it can pass that to append_machine_properties() (via either
register_machine() or allocate_scope()); have append_machine_properties()
append "DeviceAllow=/dev/fuse rw" if enabled.

For testing, simply check that /dev/fuse can be opened for reading and writing,
but that actually reading from it fails with EPERM.  The test assumes that if
FUSE is supported (/dev/fuse exists), then the testsuite is running on a kernel
with FUSE >= 7.27; I am unsure how to go about writing a test that validates
that the version check disables FUSE on old kernels.

[1]: https://github.com/systemd/systemd/issues/17607#issuecomment-745418835

Closes #17607
2024-09-07 10:18:35 -06:00
Michal Sekletar
887a18b0d3 docs: use actual docs/HACKING.md URL 2024-09-07 12:14:42 +02:00
Luke T. Shumaker
93c15c6d43 test: add a testcase for unprivileged nspawn
Right now it mostly duplicates a test that already exists in
TEST-50-DISSECT.mountfsd.sh, but it serves as a template for more unprivileged
nspawn tests.
2024-09-06 18:33:50 -06:00
Lennart Poettering
fc8ddae76b pcrlock: be more careful when preparing credential name for pcrlock policy
The .cred suffix is stripped from a credential as it is imported from
the ESP, hence it should not be included in the credential name embedded
in the credential.

Fixes: #33497
2024-09-06 18:55:32 +02:00
Lennart Poettering
8e6587679b cryptenroll/cryptsetup: allow combined signed TPM2 PCR policy + pcrlock policy
So far you had to pick:

1. Use a signed PCR TPM2 policy to lock your disk to (i.e. UKI vendor
   blesses your setup via signature)
or
2. Use a pcrlock policy (i.e. local system blesses your setup via
   dynamic local policy stored in NV index)

It was not possible combine these two, because TPM2 access policies do
not allow the combination of PolicyAuthorize (used to implement #1
above) and PolicyAuthorizeNV (used to implement #2) in a single policy,
unless one is "further upstream" (and can simply remove the other from
the policy freely).

This is quite limiting of course, since we actually do want to enforce
on each TPM object that both the OS vendor policy and the local policy
must be fulfilled, without the chance for the vendor or the local system
to disable the other.

This patch addresses this: instead of trying to find a way to come up
with some adventurous scheme to combine both policy into one TPM2
policy, we simply shard the symmetric LUKS decryption key: one half we
protect via the signed PCR policy, and the other we protect via the
pcrlock policy. Only if both halves can be acquired the disk can be
decrypted.

This means:

1. we simply double the unlock key in length in case both policies shall
   be used.
2. We store two resulting TPM policy hashes in the LUKS token JSON, one
   for each policy
3. We store two sealed TPM policy key blobs in the LUKS token JSON, for
   both halves of the LUKS unlock key.

This patch keeps the "sharding" logic relatively generic (i.e. the low
level logic is actually fine with more than 2 shards), because I figure
sooner or later we might have to encode more shards, for example if we
add further TPM2-based access policies, for example when combining FIDO2
with TPM2, or implementing TOTP for this.
2024-09-06 15:55:28 +02:00
Yu Watanabe
48878074d6 test: add test cases for --volatile= with -U
For issue #34254.
2024-09-06 13:24:36 +09:00
Yu Watanabe
31a9aedf03 test: fix copy-and-paste error in comment 2024-09-06 13:10:19 +09:00
Yu Watanabe
a00006861b
Merge pull request #34261 from yuwata/repart-seed-random
repart: initialize seed earlier
2024-09-06 08:30:12 +09:00
Lennart Poettering
41902bacc3
Merge pull request #34256 from YHNdnzj/pid1-followup
core: follow-ups for recent PRs
2024-09-05 17:01:10 +02:00
Yu Watanabe
fe6049d021 test: fix indentation 2024-09-05 18:01:42 +09:00
Yu Watanabe
56d6ebd404 test: add test case for systemd-repart --seed=random
For issue #34257.
2024-09-05 18:01:42 +09:00
Yu Watanabe
c47f2a26b0 test: add test cases of "systemctl cat" for nonexistent units 2024-09-05 10:08:03 +09:00
Mike Yuan
7a9f0125bb
core: rename BindJournalSockets= to BindLogSockets=
Addresses https://github.com/systemd/systemd/pull/32487#issuecomment-2328465309
2024-09-04 21:44:25 +02:00
Daan De Meyer
2b9ced9072 network: Add support for mq qdisc 2024-09-04 14:56:40 +02:00
Daan De Meyer
3f14557ce0 network: Add support for multiq qdisc 2024-09-04 14:56:37 +02:00
Daan De Meyer
5064de1383
Merge pull request #34224 from yuwata/network-make-qdisc-reconfigurable
network: make qdisc reconfigurable
2024-09-04 12:07:16 +02:00
Mike Yuan
1a64b42c46
TEST-50-DISSECT: add explicit coverage for BindJournalSockets= 2024-09-03 21:04:52 +02:00
Mike Yuan
e2e6c23fdb
test: drop unneeded journal socket bind mounts
(where BindJournalSockets=yes is implied)
2024-09-03 21:04:52 +02:00
Daan De Meyer
27cacec939 repart: Add compression support
Now that mkfs.btrfs is adding support for compressing the generated
filesystem (https://github.com/kdave/btrfs-progs/pull/882), let's
add general support for specifying the compression algorithm and
compression level to use.

We opt to not parse the specified compression algorithm and instead
pass it on as is to the mkfs tool. This has a few benefits:

- We support every compression algorithm supported by every tool
  automatically.
- Users don't need to modify systemd-repart if a mkfs tool learns a
  new compression algorithm in the future
- We don't need to maintain a bunch of tables for filesystem to map
  from our generic compression algorithm enum to the filesystem specific
  names.

We don't add support for btrfs just yet until the corresponding PR
in btrfs-progs is merged.
2024-09-03 08:49:49 +02:00
Daan De Meyer
6b5d3d2556 TEST-58-REPART: Only skip part of testcase_minimize() that requires root 2024-09-03 08:48:34 +02:00
Daan De Meyer
d55d756c42 TEST-58-REPART: Always run TEST-58-REPART in virtual machine
Required for various tests in TEST-58-REPART.
2024-09-03 08:48:34 +02:00
Frantisek Sumsal
bd7a06dc31 test: don't install Python scripts from systemd-test RPM
The original regex didn't cover the `run-unit-tests.py` script that
made the old framework pull in Python into the test image, which in turn
allowed the new TEST-69-SHUTDOWN Python script to get executed in the
old framework's image, causing unexpected fails with latest Python on
Rawhide.
2024-09-02 19:26:57 +01:00
Luca Boccassi
1e2d1a7202 portable: ensure PORTABLE_FORCE_ATTACH works even when there is a leftover unit
Force means force, we skip checks with PID1 for existing units, but
then bail out with EEXIST if the files are actually there. Overwrite
everything instead.
2024-09-02 15:33:29 +01:00
Daan De Meyer
21d9eeb5e6 networkd: Replace existing objects instead of doing nothing if they exist
Currently, if for example a traffic control object already exist, networkd
will silently do nothing, even if the settings in the network file for the
traffic control object have changed. Let's instead replace the object if it
already exists so that new settings from the network file are applied as
expected.

Fixes #31226
2024-09-02 14:12:49 +09:00
Yu Watanabe
7876f3d63a test-network: use the same MTU bytes for veth interfaces
Hopefully fixes #34204.
2024-08-31 11:24:56 +01:00
Yu Watanabe
c5d5d76988 test: add test for GetUnitByPID() D-Bus method
For issue #34104.
2024-08-29 14:16:43 +01:00
Luca Boccassi
5162829ec8 core: do BindMount/MountImage operations in async control process
These operations might require slow I/O, and thus might block PID1's main
loop for an undeterminated amount of time. Instead of performing them
inline, fork a worker process and stash away the D-Bus message, and reply
once we get a SIGCHILD indicating they have completed. That way we don't
break compatibility and callers can continue to rely on the fact that when
they get the method reply the operation either succeeded or failed.

To keep backward compatibility, unlike reload control processes, these
are ran inside init.scope and not the target cgroup. Unlike ExecReload,
this is under our control and is not defined by the unit. This is necessary
because previously the operation also wasn't ran from the target cgroup,
so suddenly forking a copy-on-write copy of pid1 into the target cgroup
will make memory usage spike, and if there is a MemoryMax= or MemoryHigh=
set and the cgroup is already close to the limit, it will cause an OOM
kill, where previously it would have worked fine.
2024-08-29 12:48:55 +01:00
Luca Boccassi
1e17e48b96 test: mount ld.so.cache in minimal nspawn container if present
In some cases (SUSE Tumbleweed) this is needed as a library (libz) is
not in the default path, so it fails to run.
2024-08-29 07:27:16 +02:00
Daan De Meyer
7560a5393a test: Set show_status=error
The TEST-64-UDEV-STORAGE tests fail before we even start the test.
Let's set show_status=error to get more information when those failures
happen.
2024-08-28 19:20:56 +02:00
Adrian Vovk
88261bcf3b
Merge pull request #33570 from AdrianVovk/sysupdate-incomplete
sysupdate: Handle incomplete versions
2024-08-27 13:04:02 -04:00
Luca Boccassi
7d8bbfbe08 service: add 'debug' option to RestartMode=
One of the major pait points of managing fleets of headless nodes is
that when something fails at startup, unless debug level was already
enabled (which usually isn't, as it's a firehose), one needs to manually
enable it and pray the issue can be reproduced, which often is really
hard and time consuming, just to get extra info. Usually the extra log
messages are enough to triage an issue.

This new option makes it so that when a service fails and is restarted
due to Restart=, log level for that unit is set to debug, so that all
setup code in pid1 and sd-executor logs at debug level, and also a new
DEBUG_INVOCATION=1 env var is passed to the service itself, so that it
knows it should start with a higher log level. Once the unit succeeds
or reaches the rate limit the original level is restored.
2024-08-27 12:24:45 +01:00
Yu Watanabe
80e038221b test: add more test cases for resolvconf 2024-08-27 05:37:34 +09:00
Yu Watanabe
5dc74c6667 test-network: check one more rule we configure 2024-08-23 23:57:17 +09:00
Daan De Meyer
615226abd8 Revert "nspawn: Allow specifying custom init program"
I don't actually need this anymore since we're going with a
unit based approach for the containers stuff internally so
let's just revert it.

Fixes #34085

This reverts commit ce2291730d.
2024-08-22 22:20:42 +02:00
Adrian Vovk
e7416c9d42
sysupdate: Add tests for incomplete versions
To make sure we don't regress on #33339
2024-08-22 16:00:47 -04:00
Yu Watanabe
00ed8c6dfa
Merge pull request #34072 from yuwata/networkd-routing-policy-rule-follow-up
network/routing-policy-rule: follow up for recent change
2024-08-22 07:17:10 +09:00
Adrian Vovk
38d7b8d3ff
Merge pull request #32363 from CodethinkLabs/sysupdate-dbus
sysupdate: Implement dbus service
2024-08-21 15:35:34 -04:00
Yu Watanabe
cd2a1e2df9 test-network: also test routing policy rules are configured as expected after reconfiguration
For issue #34068.
2024-08-22 04:21:02 +09:00
Yu Watanabe
462be8c957 test-network: find routing policy rule by priority
We usually configure a test rule with a unique priority. Hence, finding
rule by priority reduces the lines of output, and we can debug easily.

Also print short comments on check. That's helpful when the check is
called several times.
2024-08-22 04:16:12 +09:00
Luca Boccassi
bdf75118ba
Merge pull request #34049 from yuwata/network-routing-policy-rule
network: further rework for routing policy rule
2024-08-21 12:46:37 +02:00
Tom Coldrick
b8b38e3da6
sysupdate: Add integration test for updatectl updates 2024-08-21 09:31:41 +01:00
Yu Watanabe
2656f44c3c
Merge pull request #34018 from yuwata/network-address-label
network: allow to configure IPv6 address label in networkd.conf
2024-08-21 02:05:22 +09:00
Daan De Meyer
c8e7cfeddc tests: Don't override QemuKvm= value if TEST_NO_KVM=0
Let's disable KVM if TEST_NO_KVM=1 is set but let's not specify anything
if it's not set so the QemuKvm= setting from mkosi.conf is used.
2024-08-21 01:52:09 +09:00
Yu Watanabe
085818569b test-network: add test for ManageForeignRoutingPolicyRules= 2024-08-20 21:02:31 +09:00
Yu Watanabe
49454d9ced test-network: add tests for Type=table, goto, and nop 2024-08-20 21:02:31 +09:00
Yu Watanabe
936dec4337 test-network: do not pass '[detached]' to 'ip rule del'
That indicates the interface name in 'iif' or 'oif' cannot be resolved
when 'ip rule' command is invoked. That's natural when networkd fail to
remove rule but the corresponding interface is already removed.
To make not the residual rules interfere subsequent test cases, let's
ignore the flag and actually remove unwanted rules.
2024-08-20 21:02:31 +09:00
Yu Watanabe
489671d225 network/address-label: allow to configure IPv6 address label in networkd.conf
Closes #23159.
2024-08-20 20:50:56 +09:00