IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This implements a "storage target mode", similar to what MacOS provides
since a long time as "Target Disk Mode":
https://en.wikipedia.org/wiki/Target_Disk_Mode
This implementation is relatively simple:
1. a new generic target "storage-target-mode.target" is added, which
when booted into defines the target mode.
2. a small tool and service "systemd-storagetm.service" is added which
exposes a specific device or all devices as NVMe-TCP devices over the
network. NVMe-TCP appears to be hot shit right now how to expose
block devices over the network. And it's really simple to set up via
configs, hence our code is relatively short and neat.
The idea is that systemd-storagetm.target can be extended sooner or
later, for example to expose block devices also as USB mass storage
devices and similar, in case the system has "dual mode" USB controller
that can also work as device, not just as host. (And people could also
plug in sharing as NBD, iSCSI, whatever they want.)
How to use this? Boot into your system with a kernel cmdline of
"rd.systemd.unit=storage-target-mode.target ip=link-local", and you'll see on
screen the precise "nvme connect" command line to make the relevant
block devices available locally on some other machine. This all requires
that the target mode stuff is included in the initrd of course. And the
system will the stay in the initrd forever.
Why bother? Primarily three use-cases:
1. Debug a broken system: with very few dependencies during boot get
access to the raw block device of a broken machine.
2. Migrate from system to another system, by dd'ing the old to the new
directly.
3. Installing an OS remotely on some device (for example via Thunderbolt
networking)
(And there might be more, for example the ability to boot from a
laptop's disk on another system)
Limitations:
1. There's no authentication/encryption. Hence: use this on local links
only.
2. NVMe target mode on Linux supports r/w operation only. Ideally, we'd
have a read-only mode, for security reasons, and default to it.
Future love:
1. We should have another mode, where we simply expose the homed LUKS
home dirs like that.
2. Some lightweight hookup with plymouth, to display a (shortened)
version of the info we write to the console.
To test all this, just run:
mkosi --kernel-command-line-extra="rd.systemd.unit=storage-target-mode.target" qemu
This replaces the local implementation of a timeout file lock with our
new generic one.
Note that a comment in the old code claimed we couldn't use alarm()-like timeouts,
but htat's not entirely true: we can if we use SIGKILL, and thus know
for sure that the process will be dead in case the timer is hit before
we actually enter the file lock syscall. But we also know it will be
delivered if we hit after.
This is just like lock_generic(), but applies the lock with a timeout.
This requires jumping through some hoops by executing things in a child
process, so that we can abort if necessary via a timer. Linux after all
has no native way to take file locks with a timeout.
Sometimes it makes sense to hard kill a client if we die. Let's hence
add a third FORK_DEATHSIG flag for this purpose: FORK_DEATHSIG_SIGKILL.
To make things less confusing this also renames FORK_DEATHSIG to
FORK_DEATHSIG_SIGTERM to make clear it sends SIGTERM. We already had
FORK_DEATHSIG_SIGINT, hence this makes things nicely symmetric.
A bunch of users are switched over for FORK_DEATHSIG_SIGKILL where we
know it's safe to abort things abruptly. This should make some kernel
cases more robust, since we cannot get confused by signal masks or such.
While we are at it, also fix a bunch of bugs where we didn't take
FORK_DEATHSIG_SIGINT into account in safe_fork()
This is just like FORMAT_PROC_FD_PATH() but goes via the PID number
rather than the "self" symlink.
This is useful whenever we want to generate a path that is useful
outside of our local scope.
mkosi detects whether /dev/kvm is available and uses it if it is. But
some GHA hosts have it, but it's broken and not supported, so we need
to explicitly disable it.
varlink_dispatch() is a simple wrapper around json_dispatch() that
returns clean, standards-compliant InvalidParameter error back to
clients, if the specified JSON cannot be parsed properly.
For this json_dispatch() is extended to return the offending field's
name. Because it already has quite a few parameters, I then renamed
json_dispatch() to json_dispatch_full() and made json_dispatch() a
wrapper around it that passes the new argument as NULL. While doing so I
figured we should also get rid of the bad= argument in the short
wrapper, since it's only used in the OCI code.
To simplify the OCI code this adds a second wrapper oci_dispatch()
around json_dispatch_full(), that fills in bad= the way we want.
Net result: instead of one json_dispatch() call there are now:
1. json_dispatch_full() for the fully feature mother of all dispathers.
2. json_dispatch() for the simpler version that you want to use most of
the time.
3. varlink_dispatch() that generates nice Varlink errors
4. oci_dispatch() that does the OCI specific error handling
And that's all there is.
To avoid timeouts in oss-fuzz. The timeout reported in #29736 happened
with a ~500K test case, so with a conservative 128K limit we should
still be well within a range for any reasonable-ish generated input to
get through, while avoiding timeouts.
Resolves: #29736
Prior to this commit, if the target had been a symlink, we did nothing
with it. Let's try with fchmodat2() and skip gracefully if not supported.
Co-authored-by: Mike Yuan <me@yhndnzj.com>
Follow-up for 8b45281daa
and preparation for later commits.
Since libcs are more interested in the POSIX `fchmodat(3)`, they are
unlikely to provide a direct wrapper for this syscall. Thus, the headers
we examine to set `HAVE_*` are picked somewhat arbitrarily.
Also, hook up `try_fchmodat2()` in `test-seccomp.c`. (Also, correct that
function's prototype, despite the fact that mistake would not matter in
practice)
Co-authored-by: Mike Yuan <me@yhndnzj.com>
Currently the ID_NET_DRIVER is set in net_setup_link builtin.
But this is called pretty late in the udev processing chain.
Right now in some custom rules it was workarounded by calling ethtool
binary directly, which is ugly.
So let's split this code to a separate builtin.
If we use CHASE_PARENT on a path ending in ".." then things are a bit
weird, because we the last path we look at is actually the *parent* and not
the *child* of the preceeding path. Hence we cannot just return the 2nd
to last fd we look at. We have to correct it, by going *two* levels up,
to get to the actual parent, and make sure CHASE_PARENT does what it
should.
Example: for the path /a/b/c chase() with CHASE_PARENT will return
/a/b/c as path, and the fd returned points to /a/b. All good. But now,
for the path /a/b/c/.. chase() with CHASE_PARENT would previously return
/a/b as path (which is OK) but the fd would point to /a/b/c, which is
*not* the parent of /a/b, after all! To get to the actual parent of
/a/b we have to go *two* levels up to get to /a.
Very confusing. But that's what we here for, no?
@mrc0mmand ran into this in https://github.com/systemd/systemd/pull/28891#issuecomment-1782833722
This reverts the following two commits:
- "udev: decrease devlink priority for encrypted partitions"
c4521fc17b.
- "udev: decrease devlink priority for iso disks"
df1dccd255.
These commits are workarounds for issues caused by
331aa7aa15.
With the previous commit, these workarounds are not necessary anymore,
as partitions are always processed later than their whole disk, and
a decrypted volume is also processed later than its backing volume.
Several udev rules depends on the previous behavior, i.e. that udev
replaces the devlink with the newer device node when the priority is
equivalent. Let's relax the optimization done by
331aa7aa15.
Follow-up for 331aa7aa15.
Note, the offending commit drops O(N) of file reads per uevent, and this
commit does not change the computational order. So, hopefully the
performance impact of this change is small enough.
Fixes#28141.
If one sets the SystemMaxUse=64G by the current documentation would expect that each files size would be around 1/8 of this value (8G), althought if the SystemMaxFileSize is not explicit set, it has a max of 128M per file.