Mirror of https://github.com/systemd/systemd-stable.git (synced 2025-03-11 04:58:19 +03:00)
cgroup: s/cgroups? ?v?([0-9])/cgroup v\1/gI
Nitpicky, but we've used a lot of random spacings and names in the past, but we're trying to be completely consistent on "cgroup vN" now. Generated by `fd -0 | xargs -0 -n1 sed -ri --follow-symlinks 's/cgroups? ?v?([0-9])/cgroup v\1/gI'`. I manually ignored places where it's not appropriate to replace (eg. "cgroup2" fstype and in src/shared/linux).
parent 788291d3b4
commit 4e1dfa45e9
NEWS (14 changed lines)
@@ -133,13 +133,13 @@ CHANGES WITH 240:

 * The new "MemoryMin=" unit file property may now be used to set the
 memory usage protection limit of processes invoked by the unit. This
-controls the cgroupsv2 memory.min attribute. Similarly, the new
+controls the cgroup v2 memory.min attribute. Similarly, the new
 "IODeviceLatencyTargetSec=" property has been added, wrapping the new
-cgroupsv2 io.latency cgroup property for configuring per-service I/O
+cgroup v2 io.latency cgroup property for configuring per-service I/O
 latency.

-* systemd now supports the cgroupsv2 devices BPF logic, as counterpart
-to the cgroupsv1 "devices" cgroup controller.
+* systemd now supports the cgroup v2 devices BPF logic, as counterpart
+to the cgroup v1 "devices" cgroup controller.

 * systemd-escape now is able to combine --unescape with --template. It
 also learnt a new option --instance for extracting and unescaping the
@@ -355,7 +355,7 @@ CHANGES WITH 240:

 * The JoinControllers= option in system.conf is no longer supported, as
 it didn't work correctly, is hard to support properly, is legacy (as
-the concept only exists on cgroupsv1) and apparently wasn't used.
+the concept only exists on cgroup v1) and apparently wasn't used.

 * Journal messages that are generated whenever a unit enters the failed
 state are now tagged with a unique MESSAGE_ID. Similarly, messages
@@ -992,7 +992,7 @@ CHANGES WITH 238:
 instance to migrate processes if it itself gets the request to
 migrate processes and the kernel refuses this due to access
 restrictions. Thanks to this "systemd-run --scope --user …" works
-again in pure cgroups v2 environments when invoked from the user
+again in pure cgroup v2 environments when invoked from the user
 session scope.

 * A new TemporaryFileSystem= setting can be used to mask out part of
@@ -2708,7 +2708,7 @@ CHANGES WITH 231:
 desired options.

 * systemd now supports the "memory" cgroup controller also on
-cgroupsv2.
+cgroup v2.

 * The systemd-cgtop tool now optionally takes a control group path as
 command line argument. If specified, the control group list shown is
TODO (2 changed lines)
@@ -58,7 +58,7 @@ Features:
 * when a socket unit is spawned with an AF_UNIX path in /var/run, complain and
 patch it to use /run instead

-* set memory.oom.group in cgroupsv2 for all leaf cgroups (kernel v4.19+)
+* set memory.oom.group in cgroup v2 for all leaf cgroups (kernel v4.19+)

 * add a new syscall group "@esoteric" for more esoteric stuff such as bpf() and
 usefaultd() and make systemd-analyze check for it.
@@ -17,7 +17,7 @@ container managers.

 Before you read on, please make sure you read the low-level [kernel
 documentation about
-cgroupsv2](https://www.kernel.org/doc/Documentation/cgroup-v2.txt). This
+cgroup v2](https://www.kernel.org/doc/Documentation/cgroup-v2.txt). This
 documentation then adds in the higher-level view from systemd.

 This document augments the existing documentation we already have:
@@ -34,8 +34,8 @@ wiki documentation into this very document, too.)
 ## Two Key Design Rules

 Much of the philosophy behind these concepts is based on a couple of basic
-design ideas of cgroupsv2 (which we however try to adapt as far as we can to
-cgroupsv1 too). Specifically two cgroupsv2 rules are the most relevant:
+design ideas of cgroup v2 (which we however try to adapt as far as we can to
+cgroup v1 too). Specifically two cgroup v2 rules are the most relevant:

 1. The **no-processes-in-inner-nodes** rule: this means that it's not permitted
 to have processes directly attached to a cgroup that also has child cgroups and
@@ -58,45 +58,45 @@ your container manager creates and manages cgroups in the system's root cgroup
 you violate rule #2, as the root cgroup is managed by systemd and hence off
 limits to everybody else.

-Note that rule #1 is generally enforced by the kernel if cgroupsv2 is used: as
+Note that rule #1 is generally enforced by the kernel if cgroup v2 is used: as
 soon as you add a process to a cgroup it is ensured the rule is not
-violated. On cgroupsv1 this rule didn't exist, and hence isn't enforced, even
+violated. On cgroup v1 this rule didn't exist, and hence isn't enforced, even
 though it's a good thing to follow it then too. Rule #2 is not enforced on
-either cgroupsv1 nor cgroupsv2 (this is UNIX after all, in the general case
+either cgroup v1 nor cgroup v2 (this is UNIX after all, in the general case
 root can do anything, modulo SELinux and friends), but if you ignore it you'll
 be in constant pain as various pieces of software will fight over cgroup
 ownership.

-Note that cgroupsv1 is currently the most deployed implementation, even though
+Note that cgroup v1 is currently the most deployed implementation, even though
 it's semantically broken in many ways, and in many cases doesn't actually do
-what people think it does. cgroupsv2 is where things are going, and most new
-kernel features in this area are only added to cgroupsv2, and not cgroupsv1
-anymore. For example cgroupsv2 provides proper cgroup-empty notifications, has
+what people think it does. cgroup v2 is where things are going, and most new
+kernel features in this area are only added to cgroup v2, and not cgroup v1
+anymore. For example cgroup v2 provides proper cgroup-empty notifications, has
 support for all kinds of per-cgroup BPF magic, supports secure delegation of
 cgroup trees to less privileged processes and so on, which all are not
-available on cgroupsv1.
+available on cgroup v1.

 ## Three Different Tree Setups 🌳

 systemd supports three different modes how cgroups are set up. Specifically:

-1. **Unified** — this is the simplest mode, and exposes a pure cgroupsv2
+1. **Unified** — this is the simplest mode, and exposes a pure cgroup v2
 logic. In this mode `/sys/fs/cgroup` is the only mounted cgroup API file system
 and all available controllers are exclusively exposed through it.

-2. **Legacy** — this is the traditional cgroupsv1 mode. In this mode the
+2. **Legacy** — this is the traditional cgroup v1 mode. In this mode the
 various controllers each get their own cgroup file system mounted to
 `/sys/fs/cgroup/<controller>/`. On top of that systemd manages its own cgroup
 hierarchy for managing purposes as `/sys/fs/cgroup/systemd/`.

 3. **Hybrid** — this is a hybrid between the unified and legacy mode. It's set
 up mostly like legacy, except that there's also an additional hierarchy
-`/sys/fs/cgroup/unified/` that contains the cgroupsv2 hierarchy. (Note that in
+`/sys/fs/cgroup/unified/` that contains the cgroup v2 hierarchy. (Note that in
 this mode the unified hierarchy won't have controllers attached, the
 controllers are all mounted as separate hierarchies as in legacy mode,
-i.e. `/sys/fs/cgroup/unified/` is purely and exclusively about core cgroupsv2
+i.e. `/sys/fs/cgroup/unified/` is purely and exclusively about core cgroup v2
 functionality and not about resource management.) In this mode compatibility
-with cgroupsv1 is retained while some cgroupsv2 features are available
+with cgroup v1 is retained while some cgroup v2 features are available
 too. This mode is a stopgap. Don't bother with this too much unless you have
 too much free time.

@@ -116,7 +116,7 @@ to talk of one specific cgroup and actually mean the same cgroup in all
 available controller hierarchies. E.g. if we talk about the cgroup `/foo/bar/`
 then we actually mean `/sys/fs/cgroup/cpu/foo/bar/` as well as
 `/sys/fs/cgroup/memory/foo/bar/`, `/sys/fs/cgroup/pids/foo/bar/`, and so on.
-Note that in cgroupsv2 the controller hierarchies aren't orthogonal, hence
+Note that in cgroup v2 the controller hierarchies aren't orthogonal, hence
 thinking about them as orthogonal won't help you in the long run anyway.

 If you wonder how to detect which of these three modes is currently used, use
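Note on the hunk above: its last context line breaks off before naming a detection method. The following is a minimal sketch of one possible check, assuming that `statfs()` on the two mount points described in the "Three Different Tree Setups" text is enough to tell the modes apart. `CgroupMode`, `classify_cgroup_mode()` and `detect_cgroup_mode()` are hypothetical names for illustration, not systemd API.

```c
#include <linux/magic.h>
#include <sys/vfs.h>

/* Fallback for kernel headers that predate cgroup v2. */
#ifndef CGROUP2_SUPER_MAGIC
#define CGROUP2_SUPER_MAGIC 0x63677270
#endif

/* Hypothetical result type for this sketch. */
typedef enum {
        CGROUP_MODE_UNKNOWN,
        CGROUP_MODE_LEGACY,
        CGROUP_MODE_HYBRID,
        CGROUP_MODE_UNIFIED,
} CgroupMode;

/* Pure classification step, kept separate so it can be checked without
 * real mounts. Pass 0 as unified_type if /sys/fs/cgroup/unified/ could
 * not be queried. */
CgroupMode classify_cgroup_mode(long root_type, long unified_type) {
        if (root_type == CGROUP2_SUPER_MAGIC)
                return CGROUP_MODE_UNIFIED;   /* cgroup v2 mounted on /sys/fs/cgroup */

        if (root_type == TMPFS_MAGIC)         /* tmpfs holding per-controller mounts */
                return unified_type == CGROUP2_SUPER_MAGIC
                        ? CGROUP_MODE_HYBRID
                        : CGROUP_MODE_LEGACY;

        return CGROUP_MODE_UNKNOWN;
}

/* Queries the live system. Assumes the mount points named in the text. */
CgroupMode detect_cgroup_mode(void) {
        struct statfs root, unified;
        long unified_type = 0;

        if (statfs("/sys/fs/cgroup/", &root) < 0)
                return CGROUP_MODE_UNKNOWN;

        if (statfs("/sys/fs/cgroup/unified/", &unified) == 0)
                unified_type = (long) unified.f_type;

        return classify_cgroup_mode((long) root.f_type, unified_type);
}
```

Splitting classification from the `statfs()` calls is a design choice of this sketch only; it keeps the decision table readable and testable on machines that run a different mode.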
@@ -168,7 +168,7 @@ cgroup `/foo.slice/foo-bar.slice/foo-bar-baz.slice/quux.service/`.
 By default systemd sets up four slice units:

 1. `-.slice` is the root slice. i.e. the parent of everything else. On the host
-system it maps directly to the top-level directory of cgroupsv2.
+system it maps directly to the top-level directory of cgroup v2.

 2. `system.slice` is where system services are by default placed, unless
 configured otherwise.
@@ -187,8 +187,8 @@ above are just the defaults.

 Container managers and suchlike often want to control cgroups directly using
 the raw kernel APIs. That's entirely fine and supported, as long as proper
-*delegation* is followed. Delegation is a concept we inherited from cgroupsv2,
-but we expose it on cgroupsv1 too. Delegation means that some parts of the
+*delegation* is followed. Delegation is a concept we inherited from cgroup v2,
+but we expose it on cgroup v1 too. Delegation means that some parts of the
 cgroup tree may be managed by different managers than others. As long as it is
 clear which manager manages which part of the tree each one can do within its
 sub-graph of the tree whatever it wants.
@@ -217,7 +217,7 @@ guarantees:
 hierarchy (in unified and hybrid mode) as well as on systemd's own private
 hierarchy (in legacy and hybrid mode). It won't pass ownership of the legacy
 controller hierarchies. Delegation to less privileges processes is not safe
-in cgroupsv1 (as a limitation of the kernel), hence systemd won't facilitate
+in cgroup v1 (as a limitation of the kernel), hence systemd won't facilitate
 access to it.

 3. Any BPF IP filter programs systemd installs will be installed with
@@ -322,19 +322,19 @@ to work on that, and widen your horizon a bit. You are welcome.
 systemd supports a number of controllers (but not all). Specifically, supported
 are:

-* on cgroupsv1: `cpu`, `cpuacct`, `blkio`, `memory`, `devices`, `pids`
-* on cgroupsv2: `cpu`, `io`, `memory`, `pids`
+* on cgroup v1: `cpu`, `cpuacct`, `blkio`, `memory`, `devices`, `pids`
+* on cgroup v2: `cpu`, `io`, `memory`, `pids`

-It is our intention to natively support all cgroupsv2 controllers as they are
-added to the kernel. However, regarding cgroupsv1: at this point we will not
+It is our intention to natively support all cgroup v2 controllers as they are
+added to the kernel. However, regarding cgroup v1: at this point we will not
 add support for any other controllers anymore. This means systemd currently
-does not and will never manage the following controllers on cgroupsv1:
+does not and will never manage the following controllers on cgroup v1:
 `freezer`, `cpuset`, `net_cls`, `perf_event`, `net_prio`, `hugetlb`. Why not?
 Depending on the case, either their API semantics or implementations aren't
-really usable, or it's very clear they have no future on cgroupsv2, and we
+really usable, or it's very clear they have no future on cgroup v2, and we
 won't add new code for stuff that clearly has no future.

-Effectively this means that all those mentioned cgroupsv1 controllers are up
+Effectively this means that all those mentioned cgroup v1 controllers are up
 for grabs: systemd won't manage them, and hence won't delegate them to your
 code (however, systemd will still mount their hierarchies, simply because it
 mounts all controller hierarchies it finds available in the kernel). If you
@@ -355,9 +355,9 @@ cgroups in them — from previous runs, and be extra careful with them as they
 might still carry settings that might not be valid anymore.

 Note a particular asymmetry here: if your systemd version doesn't support a
-specific controller on cgroupsv1 you can still make use of it for delegation,
+specific controller on cgroup v1 you can still make use of it for delegation,
 by directly fiddling with its hierarchy and replicating the cgroup tree there
-as necessary (as suggested above). However, on cgroupsv2 this is different:
+as necessary (as suggested above). However, on cgroup v2 this is different:
 separately mounted hierarchies are not available, and delegation has always to
 happen through systemd itself. This means: when you update your kernel and it
 adds a new, so far unseen controller, and you want to use it for delegation,
@@ -417,7 +417,7 @@ unified you (of course, I guess) need to provide only `/sys/fs/cgroup/` itself.
 arbitrary naming, you might need to escape some of the names (for example,
 you really don't want to create a cgroup named `tasks`, just because the
 user created a container by that name, because `tasks` after all is a magic
-attribute in cgroupsv1, and your `mkdir()` will hence fail with `EEXIST`. In
+attribute in cgroup v1, and your `mkdir()` will hence fail with `EEXIST`. In
 systemd we do escaping by prefixing names that might collide with a kernel
 attribute name with an underscore. You might want to do the same, but this
 is really up to you how you do it. Just do it, and be careful.
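Note on the hunk above: the underscore-prefix escaping it describes can be sketched roughly as follows. This is an illustration under stated assumptions, not systemd's own escaping code: `name_needs_escape()` and `escape_cgroup_name()` are hypothetical helpers, and the list of reserved names is deliberately short and incomplete.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical check: could this container name be mistaken for a kernel
 * attribute file inside a cgroup directory? The list is illustrative only. */
bool name_needs_escape(const char *name) {
        static const char *const reserved[] = {
                "tasks", "notify_on_release", "release_agent",
        };

        /* Escape names that already start with '_' as well, so that
         * unescaping (dropping one leading '_') stays unambiguous. */
        if (name[0] == '_')
                return true;

        /* Core attributes are named "cgroup.<something>". */
        if (strncmp(name, "cgroup.", 7) == 0)
                return true;

        for (size_t i = 0; i < sizeof(reserved) / sizeof(reserved[0]); i++)
                if (strcmp(name, reserved[i]) == 0)
                        return true;

        return false;
}

/* Writes the (possibly prefixed) name into buf.
 * Returns 0 on success, -1 if buf is too small. */
int escape_cgroup_name(const char *name, char *buf, size_t size) {
        int n = snprintf(buf, size, "%s%s", name_needs_escape(name) ? "_" : "", name);

        return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

With this sketch a container named `tasks` would be created as the cgroup `_tasks`, while an ordinary name such as `web` passes through unchanged. A real manager would also need to cover controller-specific attribute names such as `memory.*` or `cpu.*`.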
@@ -462,9 +462,9 @@ unified you (of course, I guess) need to provide only `/sys/fs/cgroup/` itself.
 to get the cgroup for a unit. The method `GetUnitByControlGroup()` may be
 used to get the unit for a cgroup.)

-6. ⚡ Think twice before delegating cgroupsv1 controllers to less privileged
+6. ⚡ Think twice before delegating cgroup v1 controllers to less privileged
 containers. It's not safe, you basically allow your containers to freeze the
-system with that and worse. Delegation is a strongpoint of cgroupsv2 though,
+system with that and worse. Delegation is a strongpoint of cgroup v2 though,
 and there it's safe to treat delegation boundaries as privilege boundaries.

 And that's it for now. If you have further questions, refer to the systemd
@@ -872,7 +872,7 @@ int cg_set_access(
 bool fatal;
 };

-/* cgroupsv1, aka legacy/non-unified */
+/* cgroup v1, aka legacy/non-unified */
 static const struct Attribute legacy_attributes[] = {
 { "cgroup.procs", true },
 { "tasks", false },
@@ -880,7 +880,7 @@ int cg_set_access(
 {},
 };

-/* cgroupsv2, aka unified */
+/* cgroup v2, aka unified */
 static const struct Attribute unified_attributes[] = {
 { "cgroup.procs", true },
 { "cgroup.subtree_control", true },
@@ -2039,7 +2039,7 @@ int cg_get_keyed_attribute(
 char **v;
 int r;

-/* Reads one or more fields of a cgroupsv2 keyed attribute file. The 'keys' parameter should be an strv with
+/* Reads one or more fields of a cgroup v2 keyed attribute file. The 'keys' parameter should be an strv with
 * all keys to retrieve. The 'ret_values' parameter should be passed as string size with the same number of
 * entries as 'keys'. On success each entry will be set to the value of the matching key.
 *
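Note on the hunk above: the comment concerns cgroup v2 keyed attribute files, which hold one `<key> <value>` pair per line (for example `memory.events`). The sketch below shows the lookup idea for a single key on an in-memory copy of such a file. `keyed_attribute_lookup()` is a hypothetical helper written for this note; it is not the body of `cg_get_keyed_attribute()`, which takes a list of keys and reads the file itself.

```c
#include <string.h>

/* Hypothetical helper: finds 'key' in the text of a flat-keyed attribute
 * file ("<key> <value>\n" per line) and copies its value into 'ret'.
 * Returns 0 on success, -1 if the key is absent or 'ret' is too small. */
int keyed_attribute_lookup(const char *contents, const char *key, char *ret, size_t size) {
        size_t klen = strlen(key);
        const char *line = contents;

        while (line && *line) {
                const char *eol = strchr(line, '\n');
                size_t llen = eol ? (size_t) (eol - line) : strlen(line);

                /* The key must be followed by a space, so that "oom" does
                 * not match the line "oom_kill 1". */
                if (llen > klen && strncmp(line, key, klen) == 0 && line[klen] == ' ') {
                        size_t vlen = llen - klen - 1;

                        if (vlen >= size)
                                return -1;

                        memcpy(ret, line + klen + 1, vlen);
                        ret[vlen] = '\0';
                        return 0;
                }

                line = eol ? eol + 1 : NULL;
        }

        return -1;
}
```

Given the text `"low 0\nhigh 3\nmax 12\n"`, looking up `max` stores `12` in the output buffer. A multi-key version would loop over the keys and fail if any of them is missing.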
@@ -2491,7 +2491,7 @@ int cg_kernel_controllers(Set **ret) {

 static thread_local CGroupUnified unified_cache = CGROUP_UNIFIED_UNKNOWN;

-/* The hybrid mode was initially implemented in v232 and simply mounted cgroup v2 on /sys/fs/cgroup/systemd. This
+/* The hybrid mode was initially implemented in v232 and simply mounted cgroup2 on /sys/fs/cgroup/systemd. This
 * unfortunately broke other tools (such as docker) which expected the v1 "name=systemd" hierarchy on
 * /sys/fs/cgroup/systemd. From v233 and on, the hybrid mode mountnbs v2 on /sys/fs/cgroup/unified and maintains
 * "name=systemd" hierarchy on /sys/fs/cgroup/systemd for compatibility with other tools.
@@ -2739,13 +2739,13 @@ bool cg_is_legacy_wanted(void) {
 if (wanted >= 0)
 return wanted;

-/* Check if we have cgroups2 already mounted. */
+/* Check if we have cgroup v2 already mounted. */
 if (cg_unified_flush() >= 0 &&
 unified_cache == CGROUP_UNIFIED_ALL)
 return (wanted = false);

 /* Otherwise, assume that at least partial legacy is wanted,
-* since cgroups2 should already be mounted at this point. */
+* since cgroup v2 should already be mounted at this point. */
 return (wanted = true);
 }

@@ -48,13 +48,13 @@ typedef enum CGroupMask {
 CGROUP_MASK_BPF_FIREWALL = CGROUP_CONTROLLER_TO_MASK(CGROUP_CONTROLLER_BPF_FIREWALL),
 CGROUP_MASK_BPF_DEVICES = CGROUP_CONTROLLER_TO_MASK(CGROUP_CONTROLLER_BPF_DEVICES),

-/* All real cgroupv1 controllers */
+/* All real cgroup v1 controllers */
 CGROUP_MASK_V1 = CGROUP_MASK_CPU|CGROUP_MASK_CPUACCT|CGROUP_MASK_BLKIO|CGROUP_MASK_MEMORY|CGROUP_MASK_DEVICES|CGROUP_MASK_PIDS,

-/* All real cgroupv2 controllers */
+/* All real cgroup v2 controllers */
 CGROUP_MASK_V2 = CGROUP_MASK_CPU|CGROUP_MASK_IO|CGROUP_MASK_MEMORY|CGROUP_MASK_PIDS,

-/* All cgroupv2 BPF pseudo-controllers */
+/* All cgroup v2 BPF pseudo-controllers */
 CGROUP_MASK_BPF = CGROUP_MASK_BPF_FIREWALL|CGROUP_MASK_BPF_DEVICES,

 _CGROUP_MASK_ALL = CGROUP_CONTROLLER_TO_MASK(_CGROUP_CONTROLLER_MAX) - 1
@@ -104,7 +104,7 @@ static const char *maybe_format_bytes(char *buf, size_t l, bool is_valid, uint64

 static bool is_root_cgroup(const char *path) {

-/* Returns true if the specified path belongs to the root cgroup. The root cgroup is special on cgroupsv2 as it
+/* Returns true if the specified path belongs to the root cgroup. The root cgroup is special on cgroup v2 as it
 * carries only very few attributes in order not to export multiple truth about system state as most
 * information is available elsewhere in /proc anyway. We need to be able to deal with that, and need to get
 * our data from different sources in that case.
@@ -881,7 +881,7 @@ static void cgroup_context_apply(
 /* In fully unified mode these attributes don't exist on the host cgroup root. On legacy the weights exist, but
 * setting the weight makes very little sense on the host root cgroup, as there are no other cgroups at this
 * level. The quota exists there too, but any attempt to write to it is refused with EINVAL. Inside of
-* containers we want to leave control of these to the container manager (and if cgroupsv2 delegation is used
+* containers we want to leave control of these to the container manager (and if cgroup v2 delegation is used
 * we couldn't even write to them if we wanted to). */
 if ((apply_mask & CGROUP_MASK_CPU) && !is_local_root) {

@@ -925,7 +925,7 @@ static void cgroup_context_apply(
 }
 }

-/* The 'io' controller attributes are not exported on the host's root cgroup (being a pure cgroupsv2
+/* The 'io' controller attributes are not exported on the host's root cgroup (being a pure cgroup v2
 * controller), and in case of containers we want to leave control of these attributes to the container manager
 * (and we couldn't access that stuff anyway, even if we tried if proper delegation is used). */
 if ((apply_mask & CGROUP_MASK_IO) && !is_local_root) {
@@ -1067,7 +1067,7 @@ static void cgroup_context_apply(

 /* In unified mode 'memory' attributes do not exist on the root cgroup. In legacy mode 'memory.limit_in_bytes'
 * exists on the root cgroup, but any writes to it are refused with EINVAL. And if we run in a container we
-* want to leave control to the container manager (and if proper cgroupsv2 delegation is used we couldn't even
+* want to leave control to the container manager (and if proper cgroup v2 delegation is used we couldn't even
 * write to this if we wanted to.) */
 if ((apply_mask & CGROUP_MASK_MEMORY) && !is_local_root) {

@@ -1109,7 +1109,7 @@ static void cgroup_context_apply(
 }
 }

-/* On cgroupsv2 we can apply BPF everywhere. On cgroupsv1 we apply it everywhere except for the root of
+/* On cgroup v2 we can apply BPF everywhere. On cgroup v1 we apply it everywhere except for the root of
 * containers, where we leave this to the manager */
 if ((apply_mask & (CGROUP_MASK_DEVICES | CGROUP_MASK_BPF_DEVICES)) &&
 (is_host_root || cg_all_unified() > 0 || !is_local_root)) {
@@ -1841,14 +1841,14 @@ static bool unit_has_mask_realized(
 /* Returns true if this unit is fully realized. We check four things:
 *
 * 1. Whether the cgroup was created at all
-* 2. Whether the cgroup was created in all the hierarchies we need it to be created in (in case of cgroupsv1)
-* 3. Whether the cgroup has all the right controllers enabled (in case of cgroupsv2)
+* 2. Whether the cgroup was created in all the hierarchies we need it to be created in (in case of cgroup v1)
+* 3. Whether the cgroup has all the right controllers enabled (in case of cgroup v2)
 * 4. Whether the invalidation mask is currently zero
 *
 * If you wonder why we mask the target realization and enable mask with CGROUP_MASK_V1/CGROUP_MASK_V2: note
-* that there are three sets of bitmasks: CGROUP_MASK_V1 (for real cgroupv1 controllers), CGROUP_MASK_V2 (for
-* real cgroupv2 controllers) and CGROUP_MASK_BPF (for BPF-based pseudo-controllers). Now, cgroup_realized_mask
-* is only matters for cgroupsv1 controllers, and cgroup_enabled_mask only used for cgroupsv2, and if they
+* that there are three sets of bitmasks: CGROUP_MASK_V1 (for real cgroup v1 controllers), CGROUP_MASK_V2 (for
+* real cgroup v2 controllers) and CGROUP_MASK_BPF (for BPF-based pseudo-controllers). Now, cgroup_realized_mask
+* is only matters for cgroup v1 controllers, and cgroup_enabled_mask only used for cgroup v2, and if they
 * differ in the others, we don't really care. (After all, the cgroup_enabled_mask tracks with controllers are
 * enabled through cgroup.subtree_control, and since the BPF pseudo-controllers don't show up there, they
 * simply don't matter. */
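Note on the hunk above: the four checks listed in the comment, together with the masking by the v1 and v2 controller sets, can be illustrated with a small self-contained sketch. The bit values, the `UnitState` struct and `mask_realized()` are assumptions made for this note; only the composition of the v1, v2 and BPF sets follows the `CGroupMask` definitions shown earlier in this diff.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Mask;

/* Illustrative bit assignments, not the real enum values. */
enum {
        MASK_CPU          = 1u << 0,
        MASK_CPUACCT      = 1u << 1,
        MASK_IO           = 1u << 2,
        MASK_BLKIO        = 1u << 3,
        MASK_MEMORY       = 1u << 4,
        MASK_DEVICES      = 1u << 5,
        MASK_PIDS         = 1u << 6,
        MASK_BPF_FIREWALL = 1u << 7,
        MASK_BPF_DEVICES  = 1u << 8,
};

#define MASK_V1  (MASK_CPU|MASK_CPUACCT|MASK_BLKIO|MASK_MEMORY|MASK_DEVICES|MASK_PIDS)
#define MASK_V2  (MASK_CPU|MASK_IO|MASK_MEMORY|MASK_PIDS)
#define MASK_BPF (MASK_BPF_FIREWALL|MASK_BPF_DEVICES)

/* Hypothetical stand-in for the cgroup fields of a unit. */
typedef struct {
        bool realized;         /* 1. was the cgroup created at all? */
        Mask realized_mask;    /* hierarchies the cgroup exists in (cgroup v1) */
        Mask enabled_mask;     /* controllers enabled for children (cgroup v2) */
        Mask invalidated_mask; /* controllers that need re-realization */
} UnitState;

/* Differences outside the v1 set are ignored for the realized mask and
 * differences outside the v2 set for the enabled mask, so bits from the
 * BPF pseudo-controllers never make a unit look unrealized. */
bool mask_realized(const UnitState *u, Mask target_mask, Mask enable_mask) {
        return u->realized &&
                ((u->realized_mask ^ target_mask) & MASK_V1) == 0 &&  /* 2. */
                ((u->enabled_mask ^ enable_mask) & MASK_V2) == 0 &&   /* 3. */
                u->invalidated_mask == 0;                             /* 4. */
}
```

For example, a unit realized with `MASK_CPU|MASK_MEMORY` still counts as realized when the target mask additionally carries `MASK_BPF_FIREWALL`, because that bit lies outside `MASK_V1`.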
@@ -3137,9 +3137,9 @@ static int exec_child(
 }
 }

-/* If delegation is enabled we'll pass ownership of the cgroup to the user of the new process. On cgroupsv1
+/* If delegation is enabled we'll pass ownership of the cgroup to the user of the new process. On cgroup v1
 * this is only about systemd's own hierarchy, i.e. not the controller hierarchies, simply because that's not
-* safe. On cgroupsv2 there's only one hierarchy anyway, and delegation is safe there, hence in that case only
+* safe. On cgroup v2 there's only one hierarchy anyway, and delegation is safe there, hence in that case only
 * touch a single hierarchy too. */
 if (params->cgroup_path && context->user && (params->flags & EXEC_CGROUP_DELEGATE)) {
 r = cg_set_access(SYSTEMD_CGROUP_CONTROLLER, params->cgroup_path, uid, gid);
@@ -248,8 +248,8 @@ typedef struct Unit {

 /* Counterparts in the cgroup filesystem */
 char *cgroup_path;
-CGroupMask cgroup_realized_mask; /* In which hierarchies does this unit's cgroup exist? (only relevant on cgroupsv1) */
-CGroupMask cgroup_enabled_mask; /* Which controllers are enabled (or more correctly: enabled for the children) for this unit's cgroup? (only relevant on cgroupsv2) */
+CGroupMask cgroup_realized_mask; /* In which hierarchies does this unit's cgroup exist? (only relevant on cgroup v1) */
+CGroupMask cgroup_enabled_mask; /* Which controllers are enabled (or more correctly: enabled for the children) for this unit's cgroup? (only relevant on cgroup v2) */
 CGroupMask cgroup_invalidated_mask; /* A mask specifiying controllers which shall be considered invalidated, and require re-realization */
 CGroupMask cgroup_members_mask; /* A cache for the controllers required by all children of this cgroup (only relevant for slice units) */
 int cgroup_inotify_wd;
@@ -257,7 +257,7 @@ static int client_context_read_cgroup(Server *s, ClientContext *c, const char *u

 /* We use the unit ID passed in as fallback if we have nothing cached yet and cg_pid_get_path_shifted()
 * failed or process is running in a root cgroup. Zombie processes are automatically migrated to root cgroup
-* on cgroupsv1 and we want to be able to map log messages from them too. */
+* on cgroup v1 and we want to be able to map log messages from them too. */
 if (unit_id && !c->unit) {
 c->unit = strdup(unit_id);
 if (c->unit)
@@ -33,7 +33,7 @@ if grep -q cgroup2 /proc/filesystems ; then
 # And now check again, "io" should have vanished
 grep -qv io /sys/fs/cgroup/system.slice/cgroup.controllers
 else
-echo "Skipping TEST-19-DELEGATE, as the kernel doesn't actually support cgroupsv2" >&2
+echo "Skipping TEST-19-DELEGATE, as the kernel doesn't actually support cgroup v2" >&2
 fi

 echo OK > /testok