systemd

mirror of https://github.com/systemd/systemd.git synced 2024-12-22 17:35:35 +03:00

Author	SHA1	Message	Date
Luca Boccassi	9e615fa3aa	core: add WantsMountsFor= This is the equivalent of RequiresMountsFor=, but adds Wants= instead of Requires=. It will be useful for example for the autogenerated systemd-cryptsetup units. Fixes https://github.com/systemd/systemd/issues/11646	2023-11-29 11:04:59 +00:00
Yu Watanabe	58cde42f65	core: rename MemoryZswapCurrent -> MemoryZSwapCurrent Follow-up for `26caa66867`.	2023-11-13 13:54:56 +01:00
Florian Schmaus	26caa66867	cgroup: add support for memory.zswap.current	2023-11-12 21:10:40 +01:00
Florian Schmaus	37533c9432	cgroup: add support for memory.swap.current In systemctl-show we only show current swap if ever swapped or non-zero. This reduces the noise on swapless systems, that would otherwise always show a swap value that never has the chance to become non-zero. It further reduces the noise for services that never swapped.	2023-11-11 12:16:29 +01:00
Florian Schmaus	aac3384e56	cgroup: add support for memory.swap.peak	2023-11-11 12:14:07 +01:00
Florian Schmaus	6c71db763c	cgroup: add support for memory.peak Linux's Control Group v2 interface exposes memory.peak, which contains the "max memory usage recorded for the cgroup and its descendants since the creation of the cgroup." This commit adds a new property "MemoryPeak" for units and makes "systemctl show" display this value if it is available. Fixes #29878. Signed-off-by: Florian Schmaus <flo@geekplace.eu>	2023-11-06 18:08:33 +01:00
Lennart Poettering	cde8cc946b	Merge pull request #29272 from enr0n/coredump-container coredump: support forwarding coredumps to containers	2023-10-16 16:13:16 +02:00
Luca Boccassi	7c83d42ef8	mount-util: use mount beneath to replace previous namespace mount Instead of mounting over, do an atomic swap using mount beneath, if available. This way assets can be mounted again and again (e.g.: updates) without leaking mounts.	2023-10-16 14:33:47 +01:00
Nick Rosbrook	cfc015f09e	man: document CoredumpReceive= setting	2023-10-13 15:28:50 -04:00
Mike Yuan	854eca4a95	core/execute: always set $USER and introduce SetLoginEnvironment= Before this commit, $USER, $HOME, $LOGNAME and $SHELL are only set when User= is set for the unit. For system service, this results in different behaviors depending on whether User=root is set. $USER always makes sense on its own, so let's set it unconditionally. Ideally $HOME should be set too, but it causes trouble when e.g. getty passes '-p' to login(1), which then doesn't override $HOME. $LOGNAME and $SHELL are more like "login environments", and are generally not suitable for system services. Therefore, a new option SetLoginEnvironment= is also added to control the latter three variables. Fixes #23438 Replaces #8227	2023-10-10 00:00:26 +08:00
Luca Boccassi	559214cbbd	pid1: add SurviveFinalKillSignal= to skip units on final sigterm/sigkill spree Add a new boolean for units, SurviveFinalKillSignal=yes/no. Units that set it will not have their process receive the final sigterm/sigkill in the shutdown phase. This is implemented by checking if a process is part of a cgroup marked with a user.survive_final_kill_signal xattr (or a trusted xattr if we can't set a user one, which were added only in kernel v5.7 and are not supported in CentOS 8).	2023-09-28 13:48:14 +01:00
Mike Yuan	6bd8340d11	man/org.freedesktop.systemd1: add version info for NFTSet Follow-up for `dc7d69b3c1`	2023-09-28 03:04:28 +08:00
Topi Miettinen	dc7d69b3c1	core: firewall integration of cgroups with NFTSet= New directive `NFTSet=` provides a method for integrating dynamic cgroup IDs into firewall rules with NFT sets. The benefit of using this setting is to be able to use a control group as a selector in firewall rules easily, and this in turn allows more fine-grained filtering. Also, NFT rules for cgroup matching use numeric cgroup IDs, which change every time a service is restarted, making them hard to use in a systemd environment. This option expects a whitespace-separated list of NFT set definitions. Each definition consists of a colon-separated tuple of source type (only "cgroup"), NFT address family (one of "arp", "bridge", "inet", "ip", "ip6", or "netdev"), table name and set name. The names of tables and sets must conform to lexical restrictions of NFT table names. The type of the element used in the NFT filter must be "cgroupsv2". When a control group for a unit is realized, the cgroup ID will be appended to the NFT sets and it will be removed when the control group is removed. systemd only inserts elements to (or removes from) the sets, so the related NFT rules, tables and sets must be prepared elsewhere in advance. Failures to manage the sets will be ignored. If the firewall rules are reinstalled so that the contents of NFT sets are destroyed, the command systemctl daemon-reload can be used to refill the sets. Example: ``` table inet filter { ... set timesyncd { type cgroupsv2 } chain ntp_output { socket cgroupv2 != @timesyncd counter drop accept } ... } ``` /etc/systemd/system/systemd-timesyncd.service.d/override.conf ``` [Service] NFTSet=cgroup:inet:filter:timesyncd ``` ``` $ sudo nft list set inet filter timesyncd table inet filter { set timesyncd { type cgroupsv2 elements = { "system.slice/systemd-timesyncd.service" } } } ```	2023-09-27 18:10:11 +00:00
Luca Boccassi	4c9a288154	man: document SystemState's possible values	2023-09-25 22:55:54 +01:00
Abderrahim Kitouni	d9d2d16aea	man: add version information for dbus interfaces These only go back to version 250 which is the first version to provide the export-dbus-interfaces build target.	2023-09-19 14:33:34 +01:00
Lennart Poettering	2bec84e7a5	core: add new "PollLimit" settings to .socket units This adds a new "PollLimit" pair of settings to .socket units, very similar to the existing "TriggerLimit" logic. The differences are: * PollLimit focusses on the polling on the sockets, and pauses that temporarily if a ratelimit on that is reached. TriggerLimit, on the other hand, focusses on the triggering effect of socket units, and stops triggering once the ratelimit is hit. * While the trigger limit being hit is an action that causes the socket unit to fail, the polling limit being reached will just temporarily disable polling on the socket fd, and it is resumed once the ratelimit interval is over. * When a socket unit operates on multiple socket fds (e.g. ListenStream= on both an IPv6 and an IPv4 address), the PollLimit will be specific to each fd, while the trigger limit is specific to the whole unit. Implementation-wise this is mostly a wrapper around sd-event's sd_event_source_set_ratelimit(), which exposes the desired behaviour directly. Usecase for all of this: socket services which, when overloaded with connections, should just slow down reception of them, but not fail persistently.	2023-09-18 18:55:19 +02:00
Michal Koutný	055665d596	dbus: Document org.freedesktop.systemd1.Service.MemoryAvailable property The value is an optimistic estimate, make it clear in the docs.	2023-09-09 10:42:38 +02:00
Abderrahim Kitouni	ec07c3c80b	man: add version info This tries to add information about when each option was added. It goes back to version 183. The version info is included from a separate file to allow generating it, which would allow more control on the formatting of the final output.	2023-08-29 14:07:24 +01:00
Luca Boccassi	b0d3095fd6	Drop split-usr and unmerged-usr support As previously announced, execute order 66: https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html The meson options split-usr, rootlibdir and rootprefix become no-ops that print a warning if they are set to anything other than the default values. We can remove them in a future release.	2023-07-28 19:34:03 +01:00
Luca Boccassi	3835b9aa4b	Revert "core: add IgnoreOnSoftReboot= unit option" The feature is not ready, postpone it This reverts commit `b80fc61e89`.	2023-07-22 23:27:27 +01:00
Luca Boccassi	b80fc61e89	core: add IgnoreOnSoftReboot= unit option As it says on the tin, configures the unit to survive a soft reboot. Currently all the following options have to be set by hand: Conflicts=reboot.target kexec.target poweroff.target halt.target Before=reboot.target kexec.target poweroff.target halt.target After=sysinit.target basic.target DefaultDependencies=no IgnoreOnIsolate=yes This is not very user friendly. If new default dependencies are added, or new shutdown/reboot types, they also have to be added manually. The new option is much simpler, easy to find, and does the right thing by default.	2023-07-21 18:05:41 +02:00
Luca Boccassi	b2deaaf01b	Merge pull request #27584 from rphibel/add-restartquick-option service: add new RestartMode option	2023-07-06 20:37:31 +01:00
Richard Phibel	e568fea9fc	service: add new RestartMode option When this option is set to direct, the service restarts without entering a failed state. Dependent units are not notified of transitory failure. This is useful for the following use case: We have a target with Requires=my-service, After=my-service. my-service.service is a oneshot service and has Restart=on-failure in its definition. my-service.service can get stuck for various reasons and time out, in which case it is restarted. Currently, when it fails the first time, the target fails, even though my-service is restarted. The behavior we're looking for is that until my-service is not restarted anymore, the target stays pending waiting for my-service.service to start successfully or fail without being restarted anymore.	2023-07-06 14:33:52 +02:00
Daniel P. Berrangé	1257274ad8	dbus: add 'ConfidentialVirtualization' property to manager object This property reports whether the system is running inside a confidential virtual machine. Related: https://github.com/systemd/systemd/issues/27604 Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2023-07-06 12:20:04 +01:00
Daan De Meyer	9c0c670125	core: Add RootEphemeral= setting This setting allows services to run in an ephemeral copy of the root directory or root image. To make sure the ephemeral copies are always cleaned up, we add a tmpfiles snippet to unconditionally clean up /var/lib/systemd/ephemeral. To prevent in use ephemeral copies from being cleaned up by tmpfiles, we use the newly added COPY_LOCK_BSD and BTRFS_SNAPSHOT_LOCK_BSD flags to take a BSD lock on the ephemeral copies which instruct tmpfiles to not touch those ephemeral copies as long as the BSD lock is held.	2023-06-21 12:48:46 +02:00
licunlong	a068eeac6f	core/dbus-manager: also show DefaultIOAccounting and DefaultIPAccounting fix: https://github.com/systemd/systemd/issues/28045	2023-06-19 09:57:11 +02:00
Lennart Poettering	e503019bc7	tree-wide: when in doubt use greek small letter mu rather than micro symbol Doesn't really matter since the two unicode symbols are supposedly equivalent, but let's better follow the unicode recommendations to prefer greek small letter mu, as per: https://www.unicode.org/reports/tr25	2023-06-14 10:23:56 +02:00
Daan De Meyer	bbfb25f4b9	creds: Add ImportCredential= ImportCredential= takes a credential name and searches for a matching credential in all the credential stores we know about. It supports globs which are expanded so that all matching credentials are loaded.	2023-06-08 14:09:18 +02:00
Stefan Roesch	85614c6e2f	add support for KSM This adds support for KSM (kernel samepage merging). It adds a new boolean parameter called MemoryKSM to enable the feature. The feature can only be enabled with newer kernels.	2023-06-05 11:22:43 +02:00
Lennart Poettering	4de665812a	man: document the soft reboot operation	2023-06-02 18:43:10 +02:00
Luca Boccassi	d936595672	manager: restrict Dump() to privileged callers or ratelimit Dump() methods can take quite some time due to the amount of data to serialize, so they can potentially stall the manager. Make them privileged, as they are debugging tools anyway. Use a new 'dump' capability for polkit, and the 'reload' capability for SELinux, as that's also non-destructive but slow. If the caller is not privileged, allow it but rate limited to 10 calls every 10 minutes.	2023-05-19 15:18:23 +01:00
Mike Yuan	e9f17fa8dd	core: rename RestartSecMax to RestartMaxDelaySec	2023-05-18 00:23:49 +08:00
Zbigniew Jędrzejewski-Szmek	8fb350049b	man: fixes for assorted issues reported by the manpage-l10n project Fixes #26761.	2023-05-17 12:25:01 +02:00
Miao Wang	4fad639a13	doc: remove legacy DefaultControlGroup from dbus properties DefaultControlGroup does not exist any more.	2023-05-08 22:23:00 +09:00
Lennart Poettering	a8b993dc11	core: add DelegateSubgroup= setting This implements a minimal subset of #24961, but in a lot more restrictive way: we only allow one level of subcgroup (as that's enough to address the no-processes in inner cgroups rule), and does not change anything about threaded cgroup logic or similar, or make any of this new behaviour mandatory. All this does is this: all non-control processes we invoke for a unit we'll invoke in a subgroup by the specified name. We'll later port all our current services that use cgroup delegation over to this, i.e. user@.service, systemd-nspawn@.service and systemd-udevd.service.	2023-04-27 12:18:32 +02:00
Lennart Poettering	b9c1883a9c	service: add ability to pin fd store Oftentimes it is useful to allow the per-service fd store to survive longer than for a restart. This is useful in various scenarios: 1. An fd to some security relevant object needs to be stashed somewhere, that should not be cleaned automatically, because the security enforcement would be dropped then. 2. A user namespace fd should be allocated on first invocation and be kept around until the user logs out (i.e. systemd --user ends), à la #16328 (This does not implement what #16318 asks for, but should solve the use-case discussed there.) 3. There's interest in allowing a concept of "userspace reboots" where the kernel stays running, and userspace is swapped out (i.e. all services exit, and the rootfs is transitioned into a new version of it) while keeping some select resources pinned, very similar to how we implement a switch root. Thus it is useful to allow services to exit, while leaving their fds around till the very end. This is exposed through a new FileDescriptorStorePreserve= setting that is closely modelled after RuntimeDirectoryPreserve= (in fact it reuses the same internal type), since we want similar behaviour in the end, and quite often they probably want to be used together.	2023-04-13 06:44:27 +02:00
Lennart Poettering	3af48a86d9	Merge pull request #25608 from poettering/dissect-moar dissect: add dissection policies	2023-04-12 13:46:08 +02:00
Colin Walters	4e1ac54e1c	tree-wide: A few more uses of "unmet" for conditions This is a followup to `413e8650b7` > tree-wide: Use "unmet" for condition checks, not "failed" Since I noticed when running `systemctl status` on a recent systemd still seeing `Condition: start condition failed` To recap the original rationale here for "unmet" is that it's normal for some units to be conditional, so the term "failure" here is too strong.	2023-04-11 12:36:53 +09:00
Lennart Poettering	84be0c710d	tree-wide: hook up image dissection policy logic everywhere	2023-04-05 20:45:30 +02:00
Mike Yuan	5171356eee	core: always calculate the next restart interval Follow-up for #26902 and #26971 Let's always calculate the next restart interval since that's more useful. For that, we add 1 to s->n_restarts unconditionally, and change RestartUSecCurrent property to RestartUSecNext.	2023-03-31 01:22:58 +01:00
Lennart Poettering	2ea24611b9	pid1: add DumpFileDescriptorStore() bus call that returns fdstore content info	2023-03-29 18:53:20 +02:00
Mike Yuan	57b33e0ce7	core/dbus-service: add RestartUSecCurrent property This new property shows how much time we actually wait before restarting.	2023-03-27 19:31:12 +08:00
Mike Yuan	be1adc27fc	core: add RestartSteps= and RestartSecMax= for exponentially increasing interval between restarts RestartSteps= accepts a positive integer as the number of steps to take to increase the interval between auto-restarts from RestartSec= to RestartSecMax=, or 0 to disable it. Closes #6129	2023-03-27 19:31:12 +08:00
Mike Yuan	19dff6914d	core: support overriding NOTIFYACCESS= through sd-notify during runtime Closes #25963	2023-03-22 06:33:12 +08:00
Lennart Poettering	6bb0084204	pid1: add unit file settings to control memory pressure logic	2023-03-01 09:43:23 +01:00
Yu Watanabe	60c5bd7759	tree-wide: fix typo	2023-02-22 14:46:19 +09:00
Lennart Poettering	a721cd0016	pid1: add a new D-Bus method for enqueuing POSIX signals with values to unit processes This augments the existing KillUnit() + Kill() methods with QueueSignalUnit() + QueueSignal(), which are what sigqueue() is to kill(). This is useful for sending our new SIGRTMIN+18 control signals to system services.	2023-02-17 09:55:35 +01:00
Luca Boccassi	53fda560dc	core: add support for Startup memory limits We support separate Startup configurations for CPU and I/O, so add it for memory too. Only cover cgroupsv2 settings.	2023-02-15 20:01:16 +00:00
Luca Boccassi	e0e7bc8223	core: add GetUnitByPIDFD method and use it in systemctl A pid can be recycled, but a pidfd is pinned. Add a new method that is safer as it takes a pidfd as input. Return not only the D-Bus object path, but also the unit id and the last recorded invocation id, as they are both useful (especially the id, as converting from a path object to a unit id from a script requires another round-trip via D-Bus). Note that the manager still tracks processes by pid, so theoretically this is not fully error-proof, but on the other hand the method response is synchronous and the manager is single-threaded, so once a call is being processed the unit database will not change anyway. Once the manager switches to use pidfds everywhere, this can be further hardened.	2023-01-18 10:58:46 +01:00
Lennart Poettering	3bd28bf721	pid1: add new Type=notify-reload service type Fixes: #6162	2023-01-10 18:28:38 +01:00
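
The `NFTSet=` value format described in the message of `dc7d69b3c1` (whitespace-separated definitions, each a colon-separated tuple of source type, address family, table name and set name) can be illustrated with a small parser. This is only a sketch in Python, not systemd's own parsing code: the names `parse_nftset`, `NFTSetEntry` and `NFT_FAMILIES` are hypothetical, and the lexical restrictions on table and set names mentioned in the commit message are not checked here beyond rejecting empty names.

```python
from typing import List, NamedTuple

# Address families as listed in the commit message for dc7d69b3c1.
NFT_FAMILIES = {"arp", "bridge", "inet", "ip", "ip6", "netdev"}


class NFTSetEntry(NamedTuple):
    """One parsed definition (hypothetical helper type, not a systemd API)."""
    source: str
    family: str
    table: str
    set_name: str


def parse_nftset(value: str) -> List[NFTSetEntry]:
    """Split an NFTSet= value into (source, family, table, set) entries."""
    entries = []
    for definition in value.split():  # definitions are whitespace-separated
        parts = definition.split(":")
        if len(parts) != 4:
            raise ValueError(f"expected 4 colon-separated fields: {definition!r}")
        source, family, table, set_name = parts
        if source != "cgroup":  # the commit message names "cgroup" as the only source type
            raise ValueError(f"unsupported source type: {source!r}")
        if family not in NFT_FAMILIES:
            raise ValueError(f"unknown NFT address family: {family!r}")
        if not table or not set_name:
            raise ValueError(f"empty table or set name: {definition!r}")
        entries.append(NFTSetEntry(source, family, table, set_name))
    return entries


if __name__ == "__main__":
    # The value used in the commit message's override.conf example.
    print(parse_nftset("cgroup:inet:filter:timesyncd"))
```

Applied to the value from the commit message's example, the sketch yields a single entry with family `inet`, table `filter` and set `timesyncd`.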

163 Commits