IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
We have /usr/lib/systemd/libsystemd-{shared,core}-nnn.so. With this
path the 'nnn' part can be changed to something different. The idea
is that during a package build this will be set to the package version.
This way during in-place upgrades with the same major version both
the new and old libraries can cooexit. This should fix the issue
when systemd programs are called during package upgrades and fail
to exec because the expect different symbols in the library they
are linked to.
This should fix https://bugzilla.redhat.com/show_bug.cgi?id=1906010.
The scheme is very similar to libsystemd-shared.so: instead of building a
static library, we build a shared library from the same objects and link the
two users to it. Both systemd and systemd-analyze consist mostly of the fairly
big code in libcore, so we save a bit on the installation:
(-0g, no strip)
-rwxr-xr-x 5238864 Dec 14 12:52 /var/tmp/inst1/usr/lib/systemd/systemd
-rwxr-xr-x 5399600 Dec 14 12:52 /var/tmp/inst1/usr/bin/systemd-analyze
-rwxr-xr-x 244912 Dec 14 13:17 /var/tmp/inst2/usr/lib/systemd/systemd
-rwxr-xr-x 461224 Dec 14 13:17 /var/tmp/inst2/usr/bin/systemd-analyze
-rwxr-xr-x 5271568 Dec 14 13:17 /var/tmp/inst2/usr/lib/systemd/libsystemd-core-250.so
(-0g, strip)
-rwxr-xr-x 2522080 Dec 14 13:19 /var/tmp/inst1/usr/lib/systemd/systemd
-rwxr-xr-x 2604160 Dec 14 13:19 /var/tmp/inst1/usr/bin/systemd-analyze
-rwxr-xr-x 113304 Dec 14 13:19 /var/tmp/inst2/usr/lib/systemd/systemd
-rwxr-xr-x 207656 Dec 14 13:19 /var/tmp/inst2/usr/bin/systemd-analyze
-rwxr-xr-x 2648520 Dec 14 13:19 /var/tmp/inst2/usr/lib/systemd/libsystemd-core-250.so
So for systemd itself we grow a bit (2522080 → 2648520+113304=2761824), but
overall we save. The most is saved on all the test files that link to libcore,
if they are installed, because there's 15 of them:
$ du -s /var/tmp/inst?
220096 /var/tmp/inst1
122960 /var/tmp/inst2
I also considered making systemd-analyze a symlink to /usr/lib/systemd/systemd
and turning systemd into a multicall binary. We did something like this with
udevd and udevadm. But that solution doesn't fit well in this case.
systemd-analyze has a bunch of functionality that is not used in systemd,
so the systemd binary would need to grow quite a bit. And we're likely to
add new types of verification or introspection features in analyze, and this
baggage would only grow. In addition, there are the test binaries which also
benefit from this.
This patch changes busctl capture to generate pcapng format
instead of the legacy pcap format files. It includes basic
meta-data in the file and still uses microsecond time
resolution. In future, more things can be added such as
high resolution timestams, statistics, etc.
PCAP Next Generation capture file format is what tshark uses
and is in process of being standardized in IETF. It is also
readable with libpcap.
$ capinfos /tmp/new.pcapng
File name: /tmp/new.pcapng
File type: Wireshark/... - pcapng
File encapsulation: D-Bus
File timestamp precision: microseconds (6)
Packet size limit: file hdr: (not set)
Packet size limit: inferred: 4096 bytes
Number of packets: 22
File size: 21kB
Data size: 20kB
Capture duration: 0.005694 seconds
First packet time: 2021-12-11 11:57:42.788374
Last packet time: 2021-12-11 11:57:42.794068
Data byte rate: 3,671kBps
Data bit rate: 29Mbps
Average packet size: 950.27 bytes
Average packet rate: 3,863 packets/s
SHA256: b85ed8b094af60c64aa6d9db4a91404e841736d36b9e662d707db9e4096148f1
RIPEMD160: 81f9bac7ec0ec5cd1d55ede136a5c90413894e3a
SHA1: 8400822ef724b934d6000f5b7604b9e6e91be011
Strict time order: True
Capture oper-sys: Linux 5.14.0-0.bpo.2-amd64
Capture application: systemd 250 (250-rc2-33-gdc79ae2+)
Number of interfaces in file: 1
Interface #0 info:
Encapsulation = D-Bus (146 - dbus)
Capture length = 4096
Time precision = microseconds (6)
Time ticks per second = 1000000
Number of stat entries = 0
Number of packets = 22
Previously, when -Ddns-over-tls=false, libopenssl was missing in the
dependency of resolved.
Also, this drops libgpg_error when it is not necessary.
Replaces #21878.
This adds /etc/locale.conf to the set of configuration files
populated by tmpfiles.d factory /etc handling.
In particular, the build-time locale configuration in systemd is
now wired to a /usr factory file, and installed to the system.
On boot, if other locale customization tools did not write
/etc/locale.conf on the system, the factory default file gets
copied to /etc by systemd-tmpfiles.
This is done in order to avoid skews between different system
components when no locale settings are configured. At that point,
systemd can safely act as the fallback owner of /etc/locale.conf.
afl-clang and hufzz-clang try to instrument the code and the
underlying compilers don't like it. It should probably be
fixed in both afl and honggfuzz eventually but until then
let's just use "raw" clang to build bpf-skeletons.
It's a follow-up to https://github.com/systemd/systemd/pull/21607
341890de86 made "bootctl install" create
ESP\MID, in preparation of cf73f65089 that
followed it and created 00-entry-directory.install to make ESP\MID\KVER
if ESP\MID existed ‒ this meant that "bootctl install" followed by
"kernel-install $(uname -r) /boot/vml*$(uname -r) /boot/ini*$(uname -r)"
actually installed the kernel correctly.
Later, 31e57550b5 reverted the first
commit, meaning, that now running those two commands first installs
sd-boot, but then does nothing. Everything appears to work right,
nothing errors out, but no changes are actually done. To the untrained
eye (all of them), even running with -v appears to work:
all the hooks are run, as is depmod, but, again, nothing happens.
This is horrible. Nothing in either manpage suggests what to do
(nor should it, really), but the user is left with a bootloader that
appears fully funxional, since nothing suggests a failure in the output,
but with an unbootable machine, /no way to boot it/, even if they drop
to an EFI shell, since the boot bundle isn't present on the ESP,
and no real recourse even if they boot into a recovery system,
apart from installing like GRUB or whatever.
00- is purely instrumentation for 90-,
and separating one from the other has led to downstream dissatisfaxion
(indeed, the last mentioned commit cited cited exactly that as the
reversion reason), while creating $ENTRY_DIR_ABS is only required
for bootloaders using the BLS, and shouldn't itself toggle anything.
To that end, introduce an /{e,l}/k/install.conf file that allows
overriding the detected layout, and detect it as "bls" if
$BOOT_ROOT/$MACHINE_ID ($ENTRY_DIR_ABS/..) exists, otherwise "other" ‒
if a user wishes to select a different bootloader,
like GRUB, they (or, indeed, the postinst script) can specify
layout=grub. This disables 90- and $ENTRY_DIR_ABS manipulation.
The way that the cryptsetup plugins were built was unnecessarilly complicated.
We would build three static libraries that would then be linked into dynamic
libraries. No need to do this.
While at it, let's use a convenience library to avoid compiling the shared code
more than once.
We want the output .so files to be located in the main build directory,
like with all consumable build artifacts, so we need to maintain the split
between src/cryptsetup/cryptsetup-token/meson.build and the main meson.build
file.
AFAICT, the build artifacts are the same: exported and undefined symbols are
identical. There is a tiny difference in size, but I think it might be caused
by a different build directory name.
Use a 'convenience library' to do the compilation once and then link the
objects into all the files that need it. Those files are small, so this probably
doesn't matter too much for speed, but has the advantage that we don't get the
same error four times if something goes wrong.
The library is conditionalized in the same way importd itself, because we
cannot build it without the deps.
Build option "link-boot-shared" to build a statically linked bootctl and
systemd-bless-boot by using
-Dlink-boot-shared=false
on systems with full systemd stack except bootctl and systemd-bless-boot,
such as CentOS/RHEL 9.
This is a soft disable. Passing `dbus-interfaces-dir` build option
will with path or 'yes' enable exports again even when cross
compiling. (maybe your environment will allow to execute
cross compiled binaries)
Currently, all the logic related to writing journal files lives in
journal-file.c which is part of libsystemd (sd-journal). Because it's
part of libsystemd, we can't depend on any code from src/shared.
To allow using code from src/shared when writing journal files, let's
gradually move the write related logic from journal-file.c to
journald-file.c in src/journal. This directory is not part of libsystemd
and as such can use code from src/shared.
We can safely remove any journal write related logic from libsystemd as
it's not used by any public APIs in libsystemd.
This commit introduces the new file along with the JournaldFile struct
which wraps an instance of JournalFile. The goal is to gradually move
more functions from journal-file.c and fields from JournalFile to
journald-file.c and JournaldFile respectively.
This commit also modifies all call sites that write journal files to
use JournaldFile instead of JournalFile. All sd-journal tests that
write journal files are moved to src/journal so they can make use of
journald-file.c.
Because the deferred closes logic is only used by journald, we move it
out of journal-file.c as well. In journal_file_open(), we would wait for
any remaining deferred closes for the file we're about to open to complete
before continuing if the file was not newly created. In journald_file_open(),
we call this logic unconditionally since it stands that if a file is newly
created, it can't have any outstanding deferred closes.
No changes in behavior are introduced aside from the earlier execution
of waiting for any deferred closes to complete when opening a new journal
file.
In 9cf75222f2 the conf.get() statements for `bpf-framework` and
`valgrind` were dropped, which causes the respective features to always
show as disabled (since they don't follow the "standard" naming scheme
with HAVE_/ENABLE_ prefixes).
It could work, but it doesn't make much sense. If we already have openssl as
the cryptolib that provides the necessary support, let's not bring in another
library. Disallowing this simplifies things and reduces our support matrix.
This allows resolved and importd to be built without libgcrypt.
Note that we now say either 'cryptographic library' or 'cryptolib'.
Co-authored-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
This is heavily based on Kevin Kuehler's work, but the logic is also
significantly changed: instead of a straighforward port to openssl, both
versions of the code are kept, and at compile time we pick one or the other.
The code is purposefully kept "dumb" — the idea is that the libgcrypt codepaths
are only temporary and will be removed after everybody upgrades to openssl 3.
Thus, a separate abstraction layer is not introduced. Instead, very simple
ifdefs are used to select one or the other. If we added an abstraction layer,
we'd have to remove it again afterwards, and it don't think it makes sense to
do that for a temporary solution.
Co-authored-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
# Conflicts:
# meson.build
meson-0.59.4-1.fc35.noarch says:
WARNING: You should add the boolean check kwarg to the run_command call.
It currently defaults to false,
but it will default to true in future releases of meson.
See also: https://github.com/mesonbuild/meson/issues/9300
When working on systemd, it's often useful to be able to comment out
a function to see how a build behaves without it. Currently, when doing
this with a static function that's only used once, the build fails because
the function then becomes unused. As such, Let's downgrade the unused
function error to a warning in local builds.
After reading https://simonbyrne.github.io/notes/fastmath/ I think we
should drop -ffast-math. The JSON code actually looks for NaN, so the
fact it becomes unreliable kinda sucks.
Moreover, we don't do any number crunching. We use floating point fields
only sporadical for trivial math. Hence the optimization is entirely
unnecessary.
Moving all of the gnu-efi detection into src/boot/efi/meson.build makes
more sense than having it partially split.
And thanks to subdir_done() we can simplify the code a lot.
Fixes: #21258
Getting the variable directly from pkg-config (without
adding the sysroot prefix) is prone to host contamination
when building in sysroots as the compiler starts looking for the
headers on the host in addition to the sysroot.
Signed-off-by: Alexander Kanavin <alex@linutronix.de>
This adds support for dm integrity targets and an associated
/etc/integritytab file which is required as the dm integrity device
super block doesn't include all of the required metadata to bring up
the device correctly. See integritytab man page for details.
glibc 2.30 (Aug 2019) added a wrapper for getdents64(). For older
versions let's define our own.
(This syscall exists since Linux 2.4, hence should be safe to use for
us)
In upstream, we have a linearly-growing list of net-naming-scheme defines;
we add a new one for every release where we make user-visible changes to the
naming scheme.
But the general idea was that downstream distributions could define their
own combinations (or even just their own names for existing combinations),
so provide stability for their users. So far this required patching of the
netif-naming-scheme.c and .h files to add the new lines.
With this patch, patching is not required:
$ meson configure build \
-Dextra-net-naming-schemes=gargoyle=v238+npar_ari+allow_rerenames,gargoyle2=gargoyle+nspawn_long_hash \
-Ddefault-net-naming-scheme=gargoyle2
or even
$ meson configure build \
-Dextra-net-naming-schemes=gargoyle=v238+npar_ari+allow_rerenames,gargoyle2=gargoyle+nspawn_long_hash,latest=v249 \
-Ddefault-net-naming-scheme=gargoyle2
The syntax is a comma-separated list of NAME=name+name+…
This syntax is a bit scary, but any typos result in compilation errors,
so I think it should be OK in practice.
With this approach, we don't allow users to define arbitrary combinations:
what is allowed is still defined at compilation time, so it's up to the
distribution maintainers to provide reasonable combinations. In this regard,
the only difference from status quo is that it's much easier to do (and harder
to do incorrectly, for example by forgetting to add a name to one of the
maps).
We used 'combo' type for the scheme list. For a while we forgot to add
new names, and recently aa0a23ec86 added v241, v243, v245, and v247.
I want to allow defining new values during configuration, which means
that we can't use meson to verify the list of options. So any value is
allowed, but then two tests are added: one that will fail compilation if some
invalid name is given (other than "latest"), and one that converts
DEFAULT_NET_NAMING_SCHEME to a NamingScheme pointer.
Before:
```
Compiling C object src/libsystemd-network/libsystemd-network.a.p/dhcp6-option.c.o
../src/libsystemd-network/dhcp6-option.c: In function ‘dhcp6_option_parse_ia’:
../src/libsystemd-network/dhcp6-option.c:633:70: warning: passing argument 3 of ‘dhcp6_option_parse’ makes pointer from integer without a cast [-Wint-conversion]
633 | r = dhcp6_option_parse(option_data, option_data_len, offset, &subopt, &subdata_len, &subdata);
| ^~~~~~
| |
| size_t {aka long unsigned int}
../src/libsystemd-network/dhcp6-option.c:358:25: note: expected ‘size_t *’ {aka ‘long unsigned int *’} but argument is of type ‘size_t’ {aka ‘long unsigned int’}
358 | size_t *offset,
| ~~~~~~~~^~~~~~
```
After:
```
../src/libsystemd-network/dhcp6-option.c: In function ‘dhcp6_option_parse_ia’:
../src/libsystemd-network/dhcp6-option.c:633:70: error: passing argument 3 of ‘dhcp6_option_parse’ makes pointer from integer without a cast [-Werror=int-conversion]
633 | r = dhcp6_option_parse(option_data, option_data_len, offset, &subopt, &subdata_len, &subdata);
| ^~~~~~
| |
| size_t {aka long unsigned int}
../src/libsystemd-network/dhcp6-option.c:358:25: note: expected ‘size_t *’ {aka ‘long unsigned int *’} but argument is of type ‘size_t’ {aka ‘long unsigned int’}
358 | size_t *offset,
| ~~~~~~~~^~~~~~
cc1: some warnings being treated as errors
```
Compilation would fail because we could have HAVE_SMACK_RUN_LABEL without
HAVE_SMACK. This doesn't make much sense, so let's just make -Dsmack=false
completely disable smack.
Also, the logic in smack-setup.c seems dubious: '#ifdef SMACK_RUN_LABEL'
would evaluate to true even if -Dsmack-run-label='' is used. I think
this was introduced in the conversion to meson:
8b197c3a8a added
AC_ARG_WITH(smack-run-label,
AS_HELP_STRING([--with-smack-run-label=STRING],
[run systemd --system with a specific SMACK label]),
[AC_DEFINE_UNQUOTED(SMACK_RUN_LABEL, ["$withval"], [Run with a smack label])],
[])
i.e. it really was undefined if not specified. And it was same
still in 72cdb3e783 when configure.ac
was dropped.
So let's use the single conditional HAVE_SMACK_RUN_LABEL everywhere.
Units are copied out via sendmsg datafd from images, but that means
the SELinux labels get lost in transit. Extract them and copy them over.
Given recvmsg cannot use multiple IOV transparently when the sizes are
variable, use a '\0' as a separator between the filename and the label.
upstream meson stopped allowing combining boolean with the plus
operator, and now requires using the logical and operator
reference:
43302d3296Fixes: #20632
Add support for systemd-pkcs11 based LUKS2 device activation
via libcryptsetup plugin. This make the feature (pkcs11 sealed
LUKS2 keyslot passphrase) usable from both systemd utilities
and cryptsetup cli.
The feature is configured via -Dlibcryptsetup-plugins combo
with default value set to 'auto'. It get's enabled automatically
when cryptsetup 2.4.0 or later is installed in build system.
Add support for systemd-fido2 based LUKS2 device activation
via libcryptsetup plugin. This make the feature (fido2 sealed
LUKS2 keyslot passphrase) usable from both systemd utilities
and cryptsetup cli.
The feature is configured via -Dlibcryptsetup-plugins combo
with default value set to 'auto'. It get's enabled automatically
when cryptsetup 2.4.0 or later is installed in build system.
The output is similar to our hand-crafted status message, but it's nice to use
the built-in functionality. After all, it was amended during development to
support our use case.
This undoes part of 4c890ad3cc: the
implementations of update-dbus-docs and update-man-rules are moved back to
man/meson.build, and alias_target() is used to keep the visible target names
unchanged.
The rules for man pages are reworked so that it's possible to invoke the
targets even if xstlproc is not available. After all, xsltproc is only needed
for the final formatted output, and not other processing.
Ubuntu Bionic 18.04 has 0.45, so it was below the previously required
minimum version already. Focal 20.04 has 0.53.2. Let's require that
and use various features that are available.
Otherwise the build sometimes fails in a racy way:
```
[274/1850] Compiling C object src/cryptsetup/cryptsetup-tokens/libcryptsetup-token-systemd-tpm2_static.a.p/cryptsetup-token-systemd-tpm2.c.o
FAILED: src/cryptsetup/cryptsetup-tokens/libcryptsetup-token-systemd-tpm2_static.a.p/cryptsetup-token-systemd-tpm2.c.o
cc -Isrc/cryptsetup/cryptsetup-tokens/libcryptsetup-token-systemd-tpm2_static.a.p (...) -c ../build/src/cryptsetup/cryptsetup-tokens/cryptsetup-token-systemd-tpm2.c
../build/src/cryptsetup/cryptsetup-tokens/cryptsetup-token-systemd-tpm2.c:12:10: fatal error: version.h: No such file or directory
12 | #include "version.h"
| ^~~~~~~~~~~
compilation terminated.
```
Follow-up to d1ae38d85a.
Add support for systemd-tpm2 based LUKS2 device activation
via libcryptsetup plugin. This make the feature (tpm2 sealed
LUKS2 keyslot passphrase) usable from both systemd utilities
and cryptsetup cli.
The feature is configured via -Dlibcryptsetup-plugins combo
with default value set to 'auto'. It get's enabled automatically
when cryptsetup 2.4.0 or later is installed in build system.
This closes an important gap: so far we would reexecute the system manager and
restart system services that were configured to do so, but we wouldn't do the
same for user managers or user services.
The scheme used for user managers is very similar to the system one, except
that there can be multiple user managers running, so we query the system
manager to get a list of them, and then tell each one to do the equivalent
operations: daemon-reload, disable --now, set-property Markers=+needs-restart,
reload-or-restart --marked.
The total time that can be spend on this is bounded: we execute the commands in
parallel over user managers and units, and additionally set SYSTEMD_BUS_TIMEOUT
to a lower value (15 s by default). User managers should not have too many
units running, and they should be able to do all those operations very
quickly (<< 1s). The final restart operation may take longer, but it's done
asynchronously, so we only wait for the queuing to happen.
The advantage of doing this synchronously is that we can wait for each step to
happen, and for example daemon-reloads can finish before we execute the service
restarts, etc. We can also order various steps wrt. to the phases in the rpm
transaction.
When this was initially proposed, we discussed a more relaxed scheme with bus
property notifications. Such an approach would be more complex because a bunch
of infrastructure would have to be added to system manager to propagate
appropriate notifications to the user managers, and then the user managers
would have to wait for them. Instead, now there is no new code in the managers,
all new functionality is contained in src/rpm/. The ability to call 'systemctl
--user user@' makes this approach very easy. Also, it would be very hard to
order the user manager steps and the rpm transaction steps.
Note: 'systemctl --user disable' is only called for a user managers that are
running. I don't see a nice way around this, and it shouldn't matter too much:
we'll just leave a dangling symlink in the case where the user enabled the
service manually.
A follow-up for https://bugzilla.redhat.com/show_bug.cgi?id=1792468 and
fa97d2fcf6.
Instead of embedding the commands to invoke directly in the macros,
let's use a helper script as indirection. This has a couple of advantages:
- the macro language is awkward, we need to suffix most commands by "|| :"
and "\", which is easy to get wrong. In the new scheme, the macro becomes
a single simple command.
- in the script we can use normal syntax highlighting, shellcheck, etc.
- it's also easier to test the invoked commands by invoking the helper
manually.
- most importantly, the logic is contained in the helper, i.e. we can
update systemd rpm and everything uses the new helper. Before, we would
have to rebuild all packages to update the macro definition.
This raises the question whether it makes sense to use the lua scriptlets when
the real work is done in a bash script. I think it's OK: we still have the
efficient lua scripts that do the short scripts, and we use a single shared
implementation in bash to do the more complex stuff.
The meson version is raised to 0.47 because that's needed for install_mode.
We were planning to raise the required version anyway…
This moves the /var/log/README content out of /var and into the
docs location, replacing the previous file with a symlink
created through a tmpfiles.d entry.
We disabled it in f73fb7b742 in response to an
apparent gcc bug. It seems that depending on the combination of optimization
options, gcc still ignores (void). But this seems to work fine with clang, so
let's re-enable the warning conditionally.
So, as it turns out AF_ALG is turned off in a lot of kernels/container
environments, including our CI. Hence, if we link against OpenSSL
anyway, let's just use that client side. It's also faster.
One of those days we should drop the khash code, and ust use OpenSSL,
once the licensing issues are resolved.
The general idea with users and groups created through sysusers is that an
appropriate number is picked when the allocation is made. The number that is
selected will be different on each system based on the order of creation of
users, installed packages, etc. Since system users and groups are not shared
between installations, this generally is not an issue. But it becomes a problem
for initrd: some file systems are shared between the initrd and the host (/run
and /dev are probably the only ones that matter). If the allocations are
different in the host and the initrd, and files survive switch-root, they will
have wrong ownership.
This makes the gids build-time-configurable for all groups and users where
state may survive the switch from initrd to the host.
In particular, all "hardware access" groups are like this: files in /dev will
be owned by them. Eventually the new udev would change ownership, but there
would be a momemnt where the files were owned by the wrong group. The
allocations are "soft-static" in the language of Fedora packaging guidelines:
the uid/gid will be used if possible, but we'll fall back to a different
one. TTY_GID is the exception, because the number is used directly.
Similarly, the possibility to configure "soft-static" uids is added for daemons
which may usefully run in the initramfs: systemd-network (lease information and
interface state is serialized to /run), systemd-resolve (stub files and
interface state), systemd-timesync (/run/systemd/timesync).
Journal files are owned by the group systemd-journal, and acls are granted
for wheel and adm.
systemd-oom and systemd-coredump are excluded from this patch: I assume that
oomd is not useful in the initrd, and coredump leaves no state (it only creates
a pipe in /run?).
The defaults are not changed: if nothing is configured, dynamic allocation will
be used. I looked at a Debian system, and the numbers are all different than
on Fedora.
For Fedora, see the list of uids and gids at https://pagure.io/setup/blob/master/f/uidgid.
In particular, systemd-network and systemd-resolve got soft-static numbers to
make it easy to transition from a non-host-specific initrd to a host system
already a few years back (https://bugzilla.redhat.com/show_bug.cgi?id=1102002).
I also requested static allocations for sgx, input, render in
https://pagure.io/packaging-committee/issue/1078,
https://pagure.io/setup/pull-request/27.
Debugging udev issues especially during the early boot is fairly
difficult. Currently, you need to enable (at least) debug logging and
start monitoring uevents, try to reproduce the issue and then analyze
and correlate two (usually) huge log files. This is not ideal.
This patch aims to provide much more focused debugging tool,
tracepoints. More often then not we tend to have at least the basic idea
about the issue we are trying to debug further, e.g. we know it is
storage related. Hence all of the debug data generated for network
devices is useless, adds clutter to the log files and generally
slows things down.
Using this set of tracepoints you can start asking very specific
questions related to event processing for given device or subsystem.
Tracepoints can be used with various tracing tools but I will provide
examples using bpftrace.
Another important aspect to consider is that using tracepoints you can
debug production systems. There is no need to install test packages with
added logging, no debuginfo packages, etc...
Example usage (you might be asking such questions during the debug session),
Q: How can I list all tracepoints?
A: bpftrace -l 'usdt:/usr/lib/systemd/systemd-udevd:udev:*'
Q: What are the arguments for each tracepoint?
A: Look at the code and search for use of DEVICE_TRACE_POINT macro.
Q: How many times we have executed external binary?
A: bpftrace -e 'usdt:/usr/lib/systemd/systemd-udevd:udev:spawn_exec { @cnt = count(); }'
Q: What binaries where executed while handling events for "dm-0" device?
A bpftrace -e 'usdt:/usr/lib/systemd/systemd-udevd:udev:spawn_exec / str(arg1) == "dm-0"/ { @cmds[str(arg4)] = count(); }'
Thanks to Thomas Weißschuh <thomas@t-8ch.de> for reviewing this patch
and contributions that allowed us to drop the dependency on dtrace tool
and made the resulting code much more concise.
In commit d895e10a a test was introduced to validate that prefix is a
child of rootprefix. However, it only works when rootprefix is "/".
Since the test is ignored when rootprefix is equal to prefix, this is
only noticed if specifying both -Drootprefix= and -Dprefix=, e.g.:
$ meson foo -Drootprefix=/foo -Dprefix=/foo/bar
meson.build:111:8: ERROR: Problem encountered: Prefix is not below
root prefix (now rootprefix=/foo prefix=/foo/bar)
On Debian, bpftool is installed in /usr/sbin, which is not in $PATH for
non-root users by default, so finding it fails.
Add a secondary, hard-coded '/usr/sbin/bpftool' after 'bpftool' so that
meson can find it.
https://packages.debian.org/sid/amd64/bpftool/filelist
This ensures that the fuzz test code is also built by default.
It also increases the test coverage a bit. Compiling the tests
*with* sanitizers is painfully slow, so this is not enabled. But
just compiling them sauté is hardly noticable. Running the tests
increases the test count and runtime:
622 tests, 26 s
to
922 tests, 35 s
I think this is acceptable.