mirror of https://github.com/systemd/systemd-stable.git synced 2025-01-11 05:17:44 +03:00

manager: prohibit clone3() in seccomp filters

RestrictNamespaces should block clone3() like flatpak:
a10f52a756

clone3() passes arguments in a structure referenced by a pointer, so we can't
filter on the flags as with clone(). Let's disallow the whole function call.
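To illustrate the point above, here is a minimal sketch (not part of the patch) that models what a seccomp filter gets to look at: only the raw syscall argument values. The names clone_args_sketch, filter_view, view_of_clone(), view_of_clone3(), arg0_has_bits() and SKETCH_CLONE_NEWNS are hypothetical and exist only for this illustration. The struct repeats the first eight fields of the kernel's struct clone_args so that the sketch builds without kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value of CLONE_NEWNS from <linux/sched.h>, repeated here so that the sketch
 * builds without kernel headers. */
#define SKETCH_CLONE_NEWNS UINT64_C(0x00020000)

/* Hypothetical local copy of the first (version 0) fields of the kernel's
 * struct clone_args. */
struct clone_args_sketch {
        uint64_t flags;
        uint64_t pidfd;
        uint64_t child_tid;
        uint64_t parent_tid;
        uint64_t exit_signal;
        uint64_t stack;
        uint64_t stack_size;
        uint64_t tls;
};

static_assert(sizeof(struct clone_args_sketch) == 64,
              "the version 0 layout is eight 64-bit fields");

/* Hypothetical model of what a filter may inspect: the six raw argument
 * values of the syscall, and nothing that they point to. */
struct filter_view {
        uint64_t args[6];
};

/* clone(): on most architectures the flags are the first argument, so the
 * filter sees them directly. */
static struct filter_view view_of_clone(uint64_t flags) {
        struct filter_view v = { .args = { flags } };
        return v;
}

/* clone3(): the first argument is a pointer to the struct and the second is
 * its size. The flags live in memory that the filter does not read. */
static struct filter_view view_of_clone3(const struct clone_args_sketch *a, size_t size) {
        struct filter_view v = { .args = { (uint64_t) (uintptr_t) a, (uint64_t) size } };
        return v;
}

/* The kind of masked test a filter rule can apply to the first argument. */
static int arg0_has_bits(const struct filter_view *v, uint64_t mask) {
        return (v->args[0] & mask) != 0;
}
```

For clone(), arg0_has_bits() with SKETCH_CLONE_NEWNS gives a meaningful answer. For clone3() the same test would only inspect bits of an address, which is why the whole call has to be refused.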

(cherry picked from commit 30193fe817)
Zbigniew Jędrzejewski-Szmek 2022-04-19 12:44:26 +02:00
parent 45335a3eed
commit 32e7c65372

@@ -1227,6 +1227,21 @@ int seccomp_restrict_namespaces(unsigned long retain) {
if (r < 0)
return r;
/* We cannot filter on individual flags to clone3(), and we need to disable the
* syscall altogether. ENOSYS is used instead of EPERM, so that glibc and other
* users shall fall back to clone(), as if on an older kernel.
*
* C.f. https://github.com/flatpak/flatpak/commit/a10f52a7565c549612c92b8e736a6698a53db330,
* https://github.com/moby/moby/issues/42680. */
r = seccomp_rule_add_exact(
seccomp,
SCMP_ACT_ERRNO(ENOSYS),
SCMP_SYS(clone3),
0);
if (r < 0)
log_debug_errno(r, "Failed to add clone3() rule for architecture %s, ignoring: %m", seccomp_arch_to_string(arch));
if ((retain & NAMESPACE_FLAGS_ALL) == 0)
/* If every single kind of namespace shall be prohibited, then let's block the whole setns() syscall
* altogether. */