mirror of https://github.com/systemd/systemd.git synced 2024-12-22 17:35:35 +03:00

man: beef up the description of systemd-oomd.service

The gist of the description is moved from systemd.resource-control
to systemd-oomd man page. Cross-references to OOMPolicy, memory.oom.group,
oomctl, ManagedOOMSwap and ManagedOOMMemoryPressure are added in all
places.

The descriptions are also more down-to-earth: instead of talking
about "taking action" let's just say "kill". We *might* add configuration
for different actions in the future, but we're not there yet, so let's
just describe what we do now.
Zbigniew Jędrzejewski-Szmek 2022-04-26 22:04:31 +02:00
parent c0a96b1b1d
commit 6f83ea60e9
3 changed files with 70 additions and 51 deletions

@ -29,23 +29,36 @@
<refsect1> <refsect1>
<title>Description</title> <title>Description</title>
<para><command>systemd-oomd</command> is a system service that uses cgroups-v2 and pressure stall information (PSI) <para><command>systemd-oomd</command> is a system service that uses cgroups-v2 and pressure stall
to monitor and take action on processes before an OOM occurs in kernel space.</para> information (PSI) to monitor and take corrective action before an OOM occurs in the kernel space.</para>
<para>You can enable monitoring and actions on units by setting <varname>ManagedOOMSwap=</varname> and/or <para>You can enable monitoring and actions on units by setting <varname>ManagedOOMSwap=</varname> and
<varname>ManagedOOMMemoryPressure=</varname> to the appropriate value. <command>systemd-oomd</command> will <varname>ManagedOOMMemoryPressure=</varname> in the unit configuration, see
periodically poll enabled units' cgroup data to detect when corrective action needs to occur. When an action needs <citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry>.
to happen, it will only be performed on the descendant cgroups of the enabled units. More precisely, only cgroups with <command>systemd-oomd</command> retrieves information about such units from <command>systemd</command>
<filename>memory.oom.group</filename> set to <constant>1</constant> and leaf cgroup nodes are eligible candidates. when it starts and watches for subsequent changes.</para>
Action will be taken recursively on all of the processes under the chosen candidate.</para>
<para>See <para>Cgroups of units with <varname>ManagedOOMSwap=</varname> or
<citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry> <varname>ManagedOOMMemoryPressure=</varname> set to <option>kill</option> will be monitored.
<command>systemd-oomd</command> periodically polls PSI statistics for the system and those cgroups to
decide when to take action. If the configured limits are exceeded, <command>systemd-oomd</command> will
select a cgroup to terminate, and send <constant>SIGKILL</constant> to all processes in it. Note that
only descendant cgroups are eligible candidates for killing; the unit with its property set to
<option>kill</option> is not a candidate (unless one of its ancestors set their property to
<option>kill</option>). Also only leaf cgroups and cgroups with <filename>memory.oom.group</filename> set
to <constant>1</constant> are eligible candidates; see <varname>OOMPolicy=</varname> in
<citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>.
</para>
<para><citerefentry><refentrytitle>oomctl</refentrytitle><manvolnum>1</manvolnum></citerefentry> can
be used to list monitored cgroups and pressure information.</para>
<para>See <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>
for more information about the configuration of this service.</para> for more information about the configuration of this service.</para>
</refsect1> </refsect1>
<refsect1> <refsect1>
<title>Setup Information</title> <title>System requirements and configuration</title>
<para>The system must be running systemd with a full unified cgroup hierarchy for the expected cgroups-v2 features. <para>The system must be running systemd with a full unified cgroup hierarchy for the expected cgroups-v2 features.
Furthermore, memory accounting must be turned on for all units monitored by <command>systemd-oomd</command>. Furthermore, memory accounting must be turned on for all units monitored by <command>systemd-oomd</command>.
@ -53,23 +66,25 @@
is set to <constant>true</constant> in is set to <constant>true</constant> in
<citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para> <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para>
<para>You will need a kernel compiled with PSI support. This is available in Linux 4.20 and above.</para> <para>The kernel must be compiled with PSI support. This is available in Linux 4.20 and above.</para>
<para>It is highly recommended for the system to have swap enabled for <command>systemd-oomd</command> to function <para>It is highly recommended for the system to have swap enabled for <command>systemd-oomd</command> to
optimally. With swap enabled, the system spends enough time swapping pages to let <command>systemd-oomd</command> react. function optimally. With swap enabled, the system spends enough time swapping pages to let
Without swap, the system enters a livelocked state much more quickly and may prevent <command>systemd-oomd</command> <command>systemd-oomd</command> react. Without swap, the system enters a livelocked state much more
from responding in a reasonable amount of time. See quickly and may prevent <command>systemd-oomd</command> from responding in a reasonable amount of
<ulink url="https://chrisdown.name/2018/01/02/in-defence-of-swap.html">"In defence of swap: common misconceptions"</ulink> time. See <ulink url="https://chrisdown.name/2018/01/02/in-defence-of-swap.html">"In defence of swap:
for more details on swap. Any swap-based actions on systems without swap will be ignored. While common misconceptions"</ulink> for more details on swap. Any swap-based actions on systems without swap
<command>systemd-oomd</command> can perform pressure-based actions on a system without swap, the pressure increases will be ignored. While <command>systemd-oomd</command> can perform pressure-based actions on such a
will be more abrupt and may require more tuning to get the desired thresholds and behavior.</para> system, the pressure increases will be more abrupt and may require more tuning to get the desired
thresholds and behavior.</para>
<para>Be aware that if you intend to enable monitoring and actions on <filename>user.slice</filename>, <para>Be aware that if you intend to enable monitoring and actions on <filename>user.slice</filename>,
<filename>user-$UID.slice</filename>, or their ancestor cgroups, it is highly recommended that your programs be <filename>user-$UID.slice</filename>, or their ancestor cgroups, it is highly recommended that your
managed by the systemd user manager to prevent running too many processes under the same session scope (and thus programs be managed by the systemd user manager to prevent running too many processes under the same
avoid a situation where memory intensive tasks trigger <command>systemd-oomd</command> to kill everything under the session scope (and thus avoid a situation where memory intensive tasks trigger
cgroup). If you're using a desktop environment like GNOME, it already spawns many session components with the <command>systemd-oomd</command> to kill everything under the cgroup). If you're using a desktop
systemd user manager.</para> environment like GNOME or KDE, it already spawns many session components with the systemd user manager.
</para>
</refsect1> </refsect1>
<refsect1> <refsect1>
@ -79,11 +94,11 @@
<filename>-.slice</filename>, and allowing all descendant cgroups to be eligible candidates may make the most <filename>-.slice</filename>, and allowing all descendant cgroups to be eligible candidates may make the most
sense.</para> sense.</para>
<para><varname>ManagedOOMMemoryPressure=</varname> tends to work better on the cgroups below the root slice <para><varname>ManagedOOMMemoryPressure=</varname> tends to work better on the cgroups below the root
<filename>-.slice</filename>. For units which tend to have processes that are less latency sensitive (e.g. slice. For units which tend to have processes that are less latency sensitive (e.g.
<filename>system.slice</filename>), a higher limit like the default of 60% may be acceptable, as those processes <filename>system.slice</filename>), a higher limit like the default of 60% may be acceptable, as those
can usually ride out slowdowns caused by lack of memory without serious consequences. However, something like processes can usually ride out slowdowns caused by lack of memory without serious consequences. However,
<filename>user@$UID.service</filename> may prefer a much lower value like 40%.</para> something like <filename>user@$UID.service</filename> may prefer a much lower value like 40%.</para>
</refsect1> </refsect1>
<refsect1> <refsect1>
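
For illustration only (not part of this commit): the recommendations in the hunk above could be expressed as unit drop-ins roughly like the sketch below. The drop-in file names are hypothetical. `ManagedOOMMemoryPressureLimit=` is assumed to be the per-unit limit option described in systemd.resource-control(5). The 40% value is the one suggested in the text, not a tested recommendation.

```ini
# /etc/systemd/system/-.slice.d/10-oomd-swap.conf  (hypothetical file name)
# Swap usage is a system-wide value, so swap-based kills are enabled on the root slice.
[Slice]
ManagedOOMSwap=kill

# /etc/systemd/system/user@.service.d/10-oomd-pressure.conf  (hypothetical file name)
# Latency-sensitive user sessions: monitor memory pressure and use a limit
# below the 60% default mentioned in the text.
[Service]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=40%
```

Memory accounting has to be enabled for the monitored units, as the requirements section states. After reloading the manager configuration, `oomctl` can be used to check which cgroups are monitored.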

@ -1108,24 +1108,24 @@ DeviceAllow=/dev/loop-control
<citerefentry><refentrytitle>systemd-oomd.service</refentrytitle><manvolnum>8</manvolnum></citerefentry> <citerefentry><refentrytitle>systemd-oomd.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>
will act on this unit's cgroups. Defaults to <option>auto</option>.</para> will act on this unit's cgroups. Defaults to <option>auto</option>.</para>
<para>When set to <option>kill</option>, <command>systemd-oomd</command> will actively monitor this unit's <para>When set to <option>kill</option>, the unit becomes a candidate for monitoring by
cgroup metrics to decide whether it needs to act. If the cgroup passes the limits set by <command>systemd-oomd</command>. If the cgroup passes the limits set by
<citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry> or its <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry> or
overrides, <command>systemd-oomd</command> will send a <constant>SIGKILL</constant> to all of the processes the unit configuration, <command>systemd-oomd</command> will select a descendant cgroup and send
under the chosen candidate cgroup. Note that only descendant cgroups can be eligible candidates for killing; <constant>SIGKILL</constant> to all of the processes under it. You can find more details on
the unit that set its property to <option>kill</option> is not a candidate (unless one of its ancestors set candidates and kill behavior at
their property to <option>kill</option>). You can find more details on candidates and kill behavior at
<citerefentry><refentrytitle>systemd-oomd.service</refentrytitle><manvolnum>8</manvolnum></citerefentry> <citerefentry><refentrytitle>systemd-oomd.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>
and <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>. Setting and
either of these properties to <option>kill</option> will also automatically acquire <citerefentry><refentrytitle>oomd.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para>
<varname>After=</varname> and <varname>Wants=</varname> dependencies on
<filename>systemd-oomd.service</filename> unless <varname>DefaultDependencies=no</varname>.
</para>
<para>When set to <option>auto</option>, <command>systemd-oomd</command> will not actively use this cgroup's <para>Setting either of these properties to <option>kill</option> will also result in
data for monitoring and detection. However, if an ancestor cgroup has one of these properties set to <varname>After=</varname> and <varname>Wants=</varname> dependencies on
<option>kill</option>, a unit with <option>auto</option> can still be an eligible candidate for <filename>systemd-oomd.service</filename> unless <varname>DefaultDependencies=no</varname>.</para>
<command>systemd-oomd</command> to act on.</para>
<para>When set to <option>auto</option>, <command>systemd-oomd</command> will not actively use this
cgroup's data for monitoring and detection. However, if an ancestor cgroup has one of these
properties set to <option>kill</option>, a unit with <option>auto</option> can still be a candidate
for <command>systemd-oomd</command> to terminate.</para>
</listitem> </listitem>
</varlistentry> </varlistentry>

@ -1130,8 +1130,12 @@
killed by the kernel's OOM killer this is logged but the service continues running. If set to killed by the kernel's OOM killer this is logged but the service continues running. If set to
<constant>stop</constant> the event is logged but the service is terminated cleanly by the service <constant>stop</constant> the event is logged but the service is terminated cleanly by the service
manager. If set to <constant>kill</constant> and one of the service's processes is killed by the OOM manager. If set to <constant>kill</constant> and one of the service's processes is killed by the OOM
killer the kernel is instructed to kill all remaining processes of the service, too. Defaults to the killer the kernel is instructed to kill all remaining processes of the service too, by setting the
setting <varname>DefaultOOMPolicy=</varname> in <filename>memory.oom.group</filename> attribute to <constant>1</constant>; also see <ulink
url="https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html">kernel documentation</ulink>.
</para>
<para>Defaults to the setting <varname>DefaultOOMPolicy=</varname> in
<citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry> <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>
is set to, except for services where <varname>Delegate=</varname> is turned on, where it defaults to is set to, except for services where <varname>Delegate=</varname> is turned on, where it defaults to
<constant>continue</constant>.</para> <constant>continue</constant>.</para>
@ -1142,9 +1146,9 @@
<citerefentry><refentrytitle>systemd.exec</refentrytitle><manvolnum>5</manvolnum></citerefentry> for <citerefentry><refentrytitle>systemd.exec</refentrytitle><manvolnum>5</manvolnum></citerefentry> for
details.</para> details.</para>
<para>This setting also applies to <command>systemd-oomd</command>, similar to kernel OOM kills <para>This setting also applies to <command>systemd-oomd</command>, similar to the kernel OOM kills
this setting determines the state of the service after <command>systemd-oomd</command> kills a cgroup associated this setting determines the state of the service after <command>systemd-oomd</command> kills a cgroup
with the service.</para></listitem> associated with the service.</para></listitem>
</varlistentry> </varlistentry>
</variablelist> </variablelist>
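
For illustration only (not part of this commit): a minimal drop-in that selects the `kill` policy described in the hunk above might look like the sketch below. The unit name `example.service` and the file name are hypothetical.

```ini
# /etc/systemd/system/example.service.d/10-oom-policy.conf  (hypothetical names)
[Service]
# If one process of the service is killed by the OOM killer, have the kernel
# kill the remaining processes too (memory.oom.group is set to 1 on the cgroup).
OOMPolicy=kill
```

Per the text above, the same setting also determines the state of the service after `systemd-oomd` kills a cgroup associated with it.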