IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
strstrafter() is like strstr() but returns a pointer to the first
character *after* the found substring, not on the substring itself.
Quite often this is what we actually want.
Inspired by #27267 I think it makes sense to add a helper for this,
to avoid the potentially fragile manual pointer increment afterwards.
The overflow check was hosed in two ways: overflows in C are undefined,
hence gcc was free to just optimize the whole thing away. We need to
catch overflows before we run into them, not after.
It checked for an overflow against size_t, but the field we need to
write this in is unsigned. i.e. typically 32bit rather than 64bit. Hence
check for the right maximum.
(The whole check is paranoia anyway, the kernel really shouldn't return
values that would induce an overflow, but you never know, the syscall
turned out to be problematic in so many other ways, hence let's stick to
this.)
The concept of a "mount" is a local one, hence there's no point in going
to the network to retrieve mnt_id or STATX_ATTR_MOUNT_ROOT. Hence set
AT_STATX_DONT_SYNC so that the call will not go to the network ever, and
risk deadlocking on that.
Just some extra safety.
When CHASE_MKDIR_0755 is specified without CHASE_NONEXISTENT and
CHASE_PARENT, then chase() succeeds only when the file specified by
the path already exists, and in that case, chase() does not create
any parent directories, and CHASE_MKDIR_0755 is meaningless.
Let's mention that CHASE_MKDIR_0755 needs to be specified with
CHASE_NONEXISTENT or CHASE_PARENT, and adds a assertion about that.
Enabling these options when not running as root requires a user
namespace, so implicitly enable PrivateUsers=.
This has a side effect as it changes which users are visible to the unit.
However until now these options did not work at all for user units, and
in practice just a handful of user units in Fedora, Debian and Ubuntu
mistakenly used them (and they have been all fixed since).
This fixes the long-standing confusing issue that the user and system
units take the same options but the behaviour is wildly (and sometimes
silently) different depending on which is which, with user units
requiring manually specifiying PrivateUsers= in order for sandboxing
options to actually work and not be silently ignored.
In scope_set_state(), the timer event source may be disabled depending
on the state. Currently, it will be disabled when the state is
SCOPE_RUNNING. This has the effect of new RuntimeMaxSec values being
ignored on coldplug.
Note that this issue is not currently present when scopes are started
because when scope_start() is called, scope_arm_timer() is called after
scope_set_state().
Confexts should not contain code, so mount confexts with noexec.
We cannot mount invidial extensions as noexec, as the overlay ignores
it and bypasses it, we need to use the flag on the whole overlay for
it to be effective.
But given there are legacy scripts still shipped in /etc, allow to
override it with --noexec=false.
When a unit is upheld and fails, and there are no state changes in
the upholder, it will not be retried, which is against what the
documentation suggests.
Requeue when the job finishes. Same for the other two queues.
Repart considers the start and end of the usable space to the first multiple
of grainsz (at least 4096 bytes). However the first usable LBA of a GPT
partition is at sector 34 (512 bytes sectors) which is not a multiple of 4096.
The backup GPT label at the end also takes up 33 sectors, meaning the last
usable LBA is at 34 sectors from the end, unlikely to be a 4096 multiple as
well.
This meant that the very first and last sectors were never discarded. However
more problematically if an existing partition started before the first
usable grainsz multiple its start didn't get taken into account as a valid
starting point and got its data discarded.
Signed-off-by: Sjoerd Simons <sjoerd@collabora.com>
Apparently CMSG_DATA() alignment is very much undefined. Which is quite
an ABI fuck-up, but we need to deal with this. CMSG_TYPED_DATA() already
checks alignment of the specified pointer. Let's also check matching
alignment of the underlying structures, which we already can do at
compile-time.
See: #27241
(This does not fix#27241, but should catch such errors already at
compile-time instead of runtime)
Just to match service_release_stdio_fd() and service_release_fd_store()
in the name, since they do similar things.
This follows the concept that we "release" resources, and this is all
generically wrapped in "service_release_resources()".
We already clear the various fds we keep from the release_resources()
handler, let's also destroy the runtime dir from there if this
preservation mode is selected.
This makes a minor semantic change: previously we'd keep a runtime
directory around if RuntimeDirectoryPreserve=restart is selected and at
least one JOB_START job was around. With this logic we'll keep it around
a tiny bit longer: as long as any job for the unit is around.
The file descriptors we keep in the fdstore might be basically anything,
let's clean it up with our asynchronous closing feature, to not
deadlock on close().
(Let's also do the same for stdin/stdout/stderr fds, since they might
point to network services these days.)
Now that we have a potentially pinned fdstore let's add a concept for
cleaning it explicitly on user requested. Let's expose this via
"systemctl clean", i.e. the same way as user directories are cleaned.
Oftentimes it is useful to allow the per-service fd store to survive
longer than for a restart. This is useful in various scenarios:
1. An fd to some security relevant object needs to be stashed somewhere,
that should not be cleaned automatically, because the security
enforcement would be dropped then.
2. A user namespace fd should be allocated on first invocation and be
kept around until the user logs out (i.e. systemd --user ends), á la
#16328 (This does not implement what #16318 asks for, but should
solve the use-case discussed there.)
3. There's interest in allow a concept of "userspace reboots" where the
kernel stays running, and userspace is swapped out (i.e. all services
exit, and the rootfs transitioned into a new version of it) while
keeping some select resources pinned, very similar to how we
implement a switch root. Thus it is useful to allow services to exit,
while leaving their fds around till the very end.
This is exposed through a new FileDescriptorStorePreserve= setting that
is closely modelled after RuntimeDirectoryPreserve= (in fact it reused
the same internal type), since we want similar behaviour in the end, and
quite often they probably want to be used together.
Let's normalize how we release service resources, i.e. the three types
of fds we maintain for each service:
1. the fdstore
2. the socket fd for per-connection socket activated services
3. stdin/stdout/stderr
The generic service_release_resources() hook now calls into
service_release_fd_store() + service_close_socket_fd()
service_release_stdio_fd() one after the other, releasing them all for
the generic "release_resources" infra of the unit lifecycle.
We do no longer close the socket fd from service_set_state(), moving
this exclusively into service_release_resources(), so that all fds are
closed the same way.
The per-unit-type release_resources() hook (most prominent use: to
release a service unit's fdstore once a unit is entirely dead and has no
jobs more) was currently invoked as part of unit_check_gc(), whose
primary purpose is to determine if a unit should be GC'ed. This was
always a bit ugly, as release_resources() changes state of the unit,
while unit_check_gc() is otherwise (and was before release_resources()
was added) a "passive" function that just checks for a couple of
conditions.
unit_check_gc() is called at various places, including when we wonder if
we should add a unit to the gc queue, and then again when we take it out
of the gc queue to dtermine whether to really gc it now. The fact that
these checks have side effects so far wasn't too problematic, as the
state changes (primarily: that services would empty their fdstores) were
relatively limited and scope.
A later patch in this series is supposed to extend the service state
engine with a separate state distinct from SERVICE_DEAD that is very
much like it but indicates that the service still has active resources
(specifically the fdstore). For cases like that the releasing of the
fdstore would result in state changes (as we'd then return to a classic
SERVICE_DEAD state). And this is where the fact that the
release_resources() is called as side-effect becomes problematic: it
would mean that unit state changes would instantly propagate to state
changes elsewhere, though we usually want this to be done through the
run queue for coalescing and avoidance of recursion.
Hence, let's clean this up: let's move the release_resources() logic
into a queue of its own, and then enqueue items into it from the general
state change notification handle in unit_notify().
Since da6053d0a7 this is a size_t, not an
unsigned. The difference doesn't matter on LE archs, but it matters on
BE (i.e. s390x), since we'll return entirely nonsensical data.
Let's fix that.
Follow-up-for: da6053d0a7
An embarassing bug introduced in 2018... That made me scratch my head
for way too long, as it made #27135 fail on s390x while it passed
everywhere else.
The verity fec_* parameters allows to use Forward Error Correction to
recover from corruption if hash verification fails.
This adds the options fec_device, fec_offset and fec_roots (sixth
argument) which are the equivalent of the options --fec-device,
--fec-offset and --fec-roots in the veritysetup world.
- fec-device=FILE
- fec-offset=BYTES
- fec-roots=UINT64
See `veritysetup(8)` for more details.