1
0
mirror of https://github.com/systemd/systemd.git synced 2025-01-06 17:18:12 +03:00
Commit Graph

52 Commits

Author SHA1 Message Date
Ryan Wilson
63d4c4271c cgroup: Add ManagedOOMMemoryPressureDurationSec= override setting for units
This will allow units (scopes/slices/services) to override the default
systemd-oomd setting DefaultMemoryPressureDurationSec=.

The semantics of ManagedOOMMemoryPressureDurationSec= are:
- If >= 1 second, overrides DefaultMemoryPressureDurationSec= from oomd.conf
- If is empty, uses DefaultMemoryPressureDurationSec= from oomd.conf
- Ignored if ManagedOOMMemoryPressure= is not "kill"
- Disallowed if < 1 second

Note the corresponding dbus property is DefaultMemoryPressureDurationUSec
which is in microseconds. This is consistent with other time-based
dbus properties.
2024-10-16 20:12:38 -07:00
Arthur Shau
cc0ab8c810 timer: introduce DeferReactivation setting
By default, in instances where timers are running on a realtime schedule,
if a service takes longer to run than the interval of a timer, the
service will immediately start again when the previous invocation finishes.
This is caused by the fact that the next elapse is calculated based on
the last trigger time, which, combined with the fact that the interval
is shorter than the runtime of the service, causes that elapse to be in
the past, which in turn means the timer will trigger as soon as the
service finishes running.

This behavior can be changed by enabling the new DeferReactivation setting,
which will cause the next calendar elapse to be calculated based on when
the trigger unit enters inactivity, rather than the last trigger time.

Thus, if a timer is on an realtime interval, the trigger will always
adhere to that specified interval.
E.g. if you have a timer that runs on a minutely interval, the setting
guarantees that triggers will happen at *:*:00 times, whereas by default
this may skew depending on how long the service runs.

Co-authored-by: Matteo Croce <teknoraver@meta.com>
2024-10-11 22:54:16 +02:00
Zbigniew Jędrzejewski-Szmek
8e3fee33af Revert "docs: use collections to structure the data"
This reverts commit 5e8ff010a1.

This broke all the URLs, we can't have that. (And actually, we probably don't
_want_ to make the change either. It's nicer to have all the pages in one
directory, so one doesn't have to figure out to which collection the page
belongs.)
2024-02-23 09:48:47 +01:00
hulkoba
5e8ff010a1
docs: use collections to structure the data 2024-02-22 10:11:54 +01:00
Nick Rosbrook
cfc015f09e man: document CoredumpReceive= setting 2023-10-13 15:28:50 -04:00
Quentin Deslandes
523ea1237a journal: log filtering options support in PID1
Define new unit parameter (LogFilterPatterns) to filter logs processed by
journald.

This option is used to store a regular expression which is carried from
PID1 to systemd-journald through a cgroup xattrs:
`user.journald_log_filter_patterns`.
2022-12-15 09:57:39 +00:00
Yu Watanabe
adc1b76c30 core: add missing dependency DBus properties
Follow-up for 0bc488c99a.

Also sort dependency properties to make them match the definition of
`enum UnitDependency` in basic/unit-def.h.

Fixes #22133.
2022-01-16 14:05:33 +00:00
Zbigniew Jędrzejewski-Szmek
e2de2d28f4
Merge pull request #20813 from unusual-thoughts/exittype_v2
Reintroduce ExitType
2021-11-08 15:06:37 +01:00
Henri Chain
596e447076 Reintroduce ExitType
This introduces `ExitType=main|cgroup` for services.
Similar to how `Type` specifies the launch of a service, `ExitType` is
concerned with how systemd determines that a service exited.

- If set to `main` (the current behavior), the service manager will consider
  the unit stopped when the main process exits.

- The `cgroup` exit type is meant for applications whose forking model is not
  known ahead of time and which might not have a specific main process.
  The service will stay running as long as at least one process in the cgroup
  is running. This is intended for transient or automatically generated
  services, such as graphical applications inside of a desktop environment.

Motivation for this is #16805. The original PR (#18782) was reverted (#20073)
after realizing that the exit status of "the last process in the cgroup" can't
reliably be known (#19385)

This version instead uses the main process exit status if there is one and just
listens to the cgroup empty event otherwise.

The advantages of a service with `ExitType=cgroup` over scopes are:
- Integrated logging / stdout redirection
- Avoids the race / synchronisation issue between launch and scope creation
- More extensive use of drop-ins and thus distro-level configuration:
  by moving from scopes to services we can have drop ins that will affect
  properties that can only be set during service creation,
  like `OOMPolicy` and security-related properties
- It makes systemd-xdg-autostart-generator usable by fixing [1], as obviously
  only services can be used in the generator, not scopes.

[1] https://bugs.kde.org/show_bug.cgi?id=433299
2021-11-08 10:15:23 +01:00
Daan De Meyer
51462135fb exec: Add TTYRows and TTYColumns properties to set TTY dimensions 2021-11-05 21:32:14 +00:00
Albert Brox
5918a93355 core: implement RuntimeMaxDeltaSec directive 2021-09-28 16:46:20 +02:00
Zbigniew Jędrzejewski-Szmek
0aff7b7584 docs: add spdx tags to all .md files
I have no idea if this is going to cause rendering problems, and it is fairly
hard to check. So let's just merge this, and if it github markdown processor
doesn't like it, revert.
2021-09-27 09:19:02 +02:00
Peter Morrow
c93a7d4ad3 docs: update docs with StartupAllowedCPUs and StartupAllowedMemoryNodes details
Signed-off-by: Peter Morrow <pemorrow@linux.microsoft.com>
2021-09-15 09:52:12 +01:00
Zbigniew Jędrzejewski-Szmek
abaf5edd08 Revert "Introduce ExitType"
This reverts commit cb0e818f7c.

After this was merged, some design and implementation issues were discovered,
see the discussion in #18782 and #19385. They certainly can be fixed, but so
far nobody has stepped up, and we're nearing a release. Hopefully, this feature
can be merged again after a rework.

Fixes #19345.
2021-06-30 21:56:47 +02:00
Uwe Kleine-König
cbcdcaaa0e Add support for conditions on the machines firmware
This allows to limit units to machines that run on a certain firmware
type. For device tree defined machines checking against the machine's
compatible is also possible.
2021-04-28 10:55:55 +02:00
Henri Chain
cb0e818f7c Introduce ExitType 2021-03-31 10:26:07 +02:00
Anita Zhang
4e806bfa9f oom: add unit file settings for oomd avoid/omit xattrs 2021-02-12 12:45:36 -08:00
Anita Zhang
0a9f93443b oom: rework *MemoryPressureLimit= properties to have 1/10000 precision
Requested in
https://github.com/systemd/systemd/pull/15206#discussion_r505506657,
preserve the full granularity for memory pressure limits (permyriad)
instead of capping out at percent.
2021-02-02 17:52:48 -08:00
Kristijan Gjoshev
acf24a1a84 timer: add new feature FixedRandomDelay=
FixedRandomDelay=yes will use
`siphash24(sd_id128_get_machine() || MANAGER_IS_SYSTEM(m) || getuid() || u->id)`,
where || is concatenation, instead of a random number to choose a value between
0 and RandomizedDelaySec= as the timer delay.
This essentially sets up a fixed, but seemingly random, offset for each timer
iteration rather than having a random offset recalculated each time it fires.

Closes #10355

Co-author: Anita Zhang <the.anitazha@gmail.com>
2020-11-05 10:59:33 -08:00
Anita Zhang
4d824a4e0b core: add ManagedOOM*= properties to configure systemd-oomd on the unit
This adds the hook ups so it can be read with the usual systemd
utilities. Used in later commits by sytemd-oomd.
2020-10-07 16:17:23 -07:00
Topi Miettinen
9df2cdd8ec exec: SystemCallLog= directive
With new directive SystemCallLog= it's possible to list system calls to be
logged. This can be used for auditing or temporarily when constructing system
call filters.

---
v5: drop intermediary, update HASHMAP_FOREACH_KEY() use
v4: skip useless debug messages, actually parse directive
v3: don't declare unused variables with old libseccomp
v2: fix build without seccomp or old libseccomp
2020-09-15 12:54:17 +03:00
Renaud Métrich
3e5f04bf64 socket: New option 'FlushPending' (boolean) to flush socket before entering listening state
Disabled by default. When Enabled, before listening on the socket, flush the content.
Applies when Accept=no only.
2020-09-01 17:20:23 +02:00
Lennart Poettering
4e39995371 core: introduce ProtectProc= and ProcSubset= to expose hidepid= and subset= procfs mount options
Kernel 5.8 gained a hidepid= implementation that is truly per procfs,
which allows us to mount a distinct once into every unit, with
individual hidepid= settings. Let's expose this via two new settings:
ProtectProc= (wrapping hidpid=) and ProcSubset= (wrapping subset=).

Replaces: #11670
2020-08-24 20:11:02 +02:00
Yu Watanabe
830ffbce1b doc: add recentry introduced transient settings
Also sort entries for service settings.
2020-07-01 10:38:08 +02:00
Lennart Poettering
a3d19f5d99 core: add new PassPacketInfo= socket unit property 2020-05-27 22:40:38 +02:00
Martin Hundebøll
75f4bd7fd0 man: document ReadWriteOnly property for mount units 2020-05-20 14:26:04 +02:00
Zbigniew Jędrzejewski-Szmek
ad21e542b2 manager: add CoredumpFilter= setting
Fixes #6685.
2020-04-09 14:08:48 +02:00
Kevin Kuehler
022d334561 man: doc: Document ProtectClock= 2020-01-27 11:21:36 -08:00
Lennart Poettering
b35ec8ded2 docs: uppercase all markdown document titles
For most we used uppercasing, but not for all. Let's stick to one rule,
and uppercase them all.
2020-01-14 10:14:11 +01:00
Tobias Bernard
b41a3f66c9 docs: make it pretty
Add custom Jekyll theme, logo, webfont and .gitignore

FIXME: the markdown files have some H1 headers which need to be replaced
with H2
2019-12-11 17:04:20 +01:00
Lennart Poettering
4cdca0af11 docs: place all our markdown docs in rough categories 2019-12-11 10:53:00 +01:00
Zbigniew Jędrzejewski-Szmek
b096d14c41 doc: update list of transient units
Doing this manually seem to work only so well, but it is indeed hard to generate
automatically. Let's add the stuff that is missing for now.

AddRef= is not a unit file setting, remove it from the list.
2019-11-27 13:56:29 +01:00
Zbigniew Jędrzejewski-Szmek
370f0dc81c doc: drop rhs from transient settings list
I don't know why these particular ones had them.
2019-11-27 11:04:36 +01:00
Anita Zhang
05d6628ad2
Merge pull request #14151 from mk-fg/fix-timer-dump-syntax-bug
core.timer: fix "systemd-analyze dump" and docs syntax inconsistencies wrt OnTimezoneChange=
2019-11-25 15:56:33 -08:00
Mike Kazantsev
0810e39628 core.timer: fix "systemd-analyze dump" and docs syntax inconsistencies wrt OnTimezoneChange= 2019-11-26 04:29:03 +05:00
Zbigniew Jędrzejewski-Szmek
a5f6f346d3
Merge pull request #13423 from pwithnall/12035-session-time-limits
Add `RuntimeMaxSec=` support to scope units (time-limited login sessions)
2019-10-28 14:57:00 +01:00
Philip Withnall
9ed7de605d scope: Support RuntimeMaxSec= directive in scope units
Just as `RuntimeMaxSec=` is supported for service units, add support for
it to scope units. This will gracefully kill a scope after the timeout
expires from the moment the scope enters the running state.

This could be used for time-limited login sessions, for example.

Signed-off-by: Philip Withnall <withnall@endlessm.com>

Fixes: #12035
2019-10-28 09:44:31 +01:00
Pavel Hrdina
047f5d63d7 cgroup: introduce support for cgroup v2 CPUSET controller
Introduce support for configuring cpus and mems for processes using
cgroup v2 CPUSET controller.  This allows users to limit which cpus
and memory NUMA nodes can be used by processes to better utilize
system resources.

The cgroup v2 interfaces to control it are cpuset.cpus and cpuset.mems
where the requested configuration is written.  However, it doesn't mean
that the requested configuration will be actually used as parent cgroup
may limit the cpus or mems as well.  In order to reflect the real
configuration cgroup v2 provides read-only files cpuset.cpus.effective
and cpuset.mems.effective which are exported to users as well.
2019-09-24 15:16:07 +02:00
Anita Zhang
31cd5f63ce core: ExecCondition= for services
Closes #10596
2019-07-17 11:35:02 +02:00
Chris Down
acdb4b5236 cgroup: Polish hierarchically aware protection docs a bit
I missed adding a section in `systemd.resource-control` about
DefaultMemoryMin in #12332.

Also, add a NEWS entry going over the general concept.
2019-05-08 12:06:32 +01:00
Anita Zhang
25cc30c4c8 core: support DisableControllers= for transient units 2019-04-22 11:52:08 -07:00
Jan Klötzke
dc653bf487 service: handle abort stops with dedicated timeout
When shooting down a service with SIGABRT the user might want to have a
much longer stop timeout than on regular stops/shutdowns. Especially in
the face of short stop timeouts the time might not be sufficient to
write huge core dumps before the service is killed.

This commit adds a dedicated (Default)TimeoutAbortSec= timer that is
used when stopping a service via SIGABRT. In all other cases the
existing TimeoutStopSec= is used. The timer value is unset by default
to skip the special handling and use TimeoutStopSec= for state
'stop-watchdog' to keep the old behaviour.

If the service is in state 'stop-watchdog' and the service should be
stopped explicitly we still go to 'stop-sigterm' and re-apply the usual
TimeoutStopSec= timeout.
2019-04-12 17:32:52 +02:00
Chris Down
c52db42b78 cgroup: Implement default propagation of MemoryLow with DefaultMemoryLow
In cgroup v2 we have protection tunables -- currently MemoryLow and
MemoryMin (there will be more in future for other resources, too). The
design of these protection tunables requires not only intermediate
cgroups to propagate protections, but also the units at the leaf of that
resource's operation to accept it (by setting MemoryLow or MemoryMin).

This makes sense from an low-level API design perspective, but it's a
good idea to also have a higher-level abstraction that can, by default,
propagate these resources to children recursively. In this patch, this
happens by having descendants set memory.low to N if their ancestor has
DefaultMemoryLow=N -- assuming they don't set a separate MemoryLow
value.

Any affected unit can opt out of this propagation by manually setting
`MemoryLow` to some value in its unit configuration. A unit can also
stop further propagation by setting `DefaultMemoryLow=` with no
argument. This removes further propagation in the subtree, but has no
effect on the unit itself (for that, use `MemoryLow=0`).

Our use case in production is simplifying the configuration of machines
which heavily rely on memory protection tunables, but currently require
tweaking a huge number of unit files to make that a reality. This
directive makes that significantly less fragile, and decreases the risk
of misconfiguration.

After this patch is merged, I will implement DefaultMemoryMin= using the
same principles.
2019-04-12 17:23:58 +02:00
Lennart Poettering
7445db6eb7 man: document the new RestrictSUIDSGID= setting 2019-04-02 16:56:48 +02:00
Lennart Poettering
efebb613c7 core: optionally, trigger .timer units on timezone and clock changes
Fixes: #6228
2019-04-02 08:20:10 +02:00
Filipe Brandenburger
10f2864111 core: add CPUQuotaPeriodSec=
This new setting allows configuration of CFS period on the CPU cgroup, instead
of using a hardcoded default of 100ms.

Tested:
- Legacy cgroup + Unified cgroup
- systemctl set-property
- systemctl show
- Confirmed that the cgroup settings (such as cpu.cfs_period_ns) were set
  appropriately, including updating the CPU quota (cpu.cfs_quota_ns) when
  CPUQuotaPeriodSec= is updated.
- Checked that clamping works properly when either period or (quota * period)
  are below the resolution of 1ms, or if period is above the max of 1s.
2019-02-14 11:04:42 -08:00
Filipe Brandenburger
c3e270f4ee docs: add a "front matter" snippet to our markdown pages
It turns out Jekyll (the engine behind GitHub Pages) requires that pages
include a "Front Matter" snippet of YAML at the top for proper rendering.

Omitting it will still render the pages, but including it opens up new
possibilities, such as using a {% for %} loop to generate index.md instead of
requiring a separate script.

I'm hoping this will also fix the issue with some of the pages (notably
CODE_OF_CONDUCT.html) not being available under systemd.io

Tested locally by rendering the website with Jekyll. Before this change, the
*.md files were kept unchanged (so not sure how that even works?!), after this
commit, proper *.html files were generated from it.
2019-01-02 14:16:34 -08:00
Lennart Poettering
7af67e9a8b core: allow to set exit status when using SuccessAction=/FailureAction=exit in units
This adds SuccessActionExitStatus= and FailureActionExitStatus= that may
be used to configure the exit status to propagate in when
SuccessAction=exit or FailureAction=exit is used.

When not specified let's also propagate the exit status of the main
process we fork off for the unit.
2018-11-27 09:44:40 +01:00
Zbigniew Jędrzejewski-Szmek
2640b77356 docs/TRANSIENT-SETTINGS: drop PermissionsStartOnly= from 2018-11-16 16:21:21 +01:00
Anita Zhang
90fc172e19 core: implement per unit journal rate limiting
Add LogRateLimitIntervalSec= and LogRateLimitBurst= options for
services. If provided, these values get passed to the journald
client context, and those values are used in the rate limiting
function in the journal over the the journald.conf values.

Part of #10230
2018-10-18 09:56:20 +02:00