IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
We'd call dns_resource_record_equal(), which calls dns_resource_key_equal()
internally, and then dns_resource_key_equal() a second time. Let's be
a bit smarter, and call dns_resource_key_equal() only once.
(before)
dns_resource_key_hash_func_count=514
dns_resource_key_compare_func_count=275
dns_resource_key_equal_count=62371
4.13s user 0.01s system 99% cpu 4.153 total
(after)
dns_resource_key_hash_func_count=514
dns_resource_key_compare_func_count=276
dns_resource_key_equal_count=31337
2.13s user 0.01s system 99% cpu 2.139 total
This doesn't necessarily make things faster, because we still spend more time
in dns_answer_add(), but it improves the compuational complexity of this part.
If we even make dns_resource_key_equal_faster, this will become worthwhile.
Run: systemctl show -a dbus.service | grep -E "SELinuxContext|AppArmorProfile|SmackProcessLabel"
Before patch:
SELinuxContext=[unprintable]
AppArmorProfile=[unprintable]
SmackProcessLabel=[unprintable]
After patch:
SELinuxContext=[""|"value of context"]
AppArmorProfile=[""|"value of context"]
SmackProcessLabel=[""|"value of context"]
Same as with the other users, any non-trivial use of the objects requires
use from a single thread only or external locking. Using atomic operations
just for reference counts is not useful.
The sd-hwdb objects cannot be used concurrently from two threads in any
meaningful way, because query and iteration operations modify the object.
Thus atomic reference counts are pointless.
We had atomic counters, but all other operations were non-serialized. This
means that concurrent access to the bus object was only safe if _all_ threads
were doing read-only access. Even sending of messages from threads would not be
possible, because after sending of the message we usually want to remove it
from the send queue in the bus object, which would race. Let's just kill this.
The subvol snapshot logic doesn't cover sub-mounts either, and it really
shouldn't in the general case, hence let's simply stop at submounts in
all cases, both in the main and in the fall-back codepath.
As discussed here:
https://github.com/systemd/systemd/pull/11243#pullrequestreview-209477230
Before this commit bus messages had a single reference count: when it
reached zero the message would be freed. This simple approach meant a
cyclic dependency was typically seen: a message that was enqueued in a
bus connection object would reference the bus connection object but also
itself be referenced by the bus connection object. So far out strategy
to avoid cases like this was: make sure to process the bus connection
regularly so that messages don#t stay queued, and at exit flush/close
the connection so that the message queued would be emptied, and thus the
cyclic dependencies resolved. Im many cases this isn't done properly
however.
With this change, let's address the issue more systematically: let's
break the reference cycle. Specifically, there are now two types of
references to a bus message:
1. A regular one, which keeps both the message and the bus object it is
associated with pinned.
2. A "queue" reference, which is weaker: it pins the message, but not
the bus object it is associated with.
The idea is then that regular user handling uses regular references, but
when a message is enqueued on its connection, then this takes a "queue"
reference instead. This then means that a queued message doesn't imply
the connection itself remains pinned, only regular references to the
connection or a message associated with it do. Thus, if we end up in the
situation where a user allocates a bus and a message and enqueues the
latter in the former and drops all refs to both, then this will detect
this case and free both.
Note that this scheme isn't perfect, it only covers references between
messages and the busses they are associated with. If OTOH a bus message
is enqueued on a different bus than it is associated with cyclic deps
cannot be recognized with this simple algorithm, and thus if you enqueue
a message associated with a bus A on a bus B, and another message
associated with bus B on a bus A, a cyclic ref will be in effect and not
be discovered. However, given that this is an exotic case (though one
that happens, consider systemd-bus-stdio-bridge), it should be OK not to
cover with this, and people have to explicit flush all queues on exit in
that case.
Note that this commit only establishes the separate reference counters
per message. A follow-up commit will start making use of this from the
bus connection object.
Don't try to be smart, don't bypass the ref counting logic if there's no
real reason to.
This matters if we want to tweak the ref counting logic later.
Let's make sure our own code follows coding style and initializes all
return values on all types of success (and leaves it uninitialized in
all types of failure).
Let's do this like we usually do and size arrays with size_t.
We already do this for the "allocated" counter correctly, and externally
we expose the queue sizes as uint64_t anyway, hence there's really no
point in usigned "unsigned" internally.
Previously, when we'd copy an individual file we'd synthesize a
user.crtime_usec xattr with the source's creation time if we can
determine it. As the creation/birth time was until recently not
queriable form userspace this effectively just propagated the same xattr
on the source to the same xattr on the destination. However, current
kernels now allow to query the birthtime using statx() and we do make
use of that now. Which means that suddenly we started synthesizing these
xattrs much more regularly.
Doing this actually does make sense, but only in very few cases:
not for the typical regular files we copy, but certainly when dealing
with disk images. Hence, let's keep this kind of propagation, but let's
make it a flag and default to off. Then turn it on whenever we deal with
disk images, and leave it off otherwise.
This is particularly relevant as overlayfs combining a real fs, and a
tmpfs on top will result in EOPNOTSUPP when it is attempted to open a
file with xattrs for writing, as tmpfs does not support xattrs, and
hence the copy-up cannot work. Hence, let's avoid synthesizing this
needlessly, to increase compat with overlayfs.
Previously, we'd refuse the combination, and claimed we'd imply it, but
actually didn't. Let's allow the combination and imply read-only from
--volatile=, because that's what's documented, what we claim we do, and
what makes sense.
We document and all our code assumes that LoaderDevicePartUUID is
initialized to the ESP's UUID. Let's hence not override the variable if
it is already set, in order to not confuse userspace if the kernel's EFI
image is run from a different partition than the ESP.
This matches behaviour for all other variables set by the EFI stub, in
particular the closely related LoaderImageIdentifier variable.
Our own variables are in the the "loader" GUID namespace, but our code
so far checked the "global" GUID namespace (i.e. EFI's own), before
setting the variables. Correct that, so that we always check the right
namespace for existing variables before we write them.
The specification always said so, let's actually implement this.
Unfortunately UEFI's own APIs don't allow us to search for partition
type GUID, hence we have to implement a minimal GPT parser ourselves.
In a follow-up commit we'd like to invoke config_entry_add_from_file()
on partitions that are not the ESP, let's prepare fpr that and allow
loaded_image_path to be passed as NULL.
This makes the code a bit simpler (after all the call is not interested
in the loaded image, just where it is found), and more like
config_load_entries() which takes the same arguments.
This also makes things easier for us later on, when we add support for
discovering images in $XBOOTLDR partitions.
Let's add a new function that checks whether some directory is the
top-level directory inside an fs, splitting out the code for this from
verify_esp().
While we are at it, let's slightly improve the code, so that we can
correctly work if we have no priviliges but the ESP is mounted
unaccessible: if we can't stat() the path "$ESP/.." then manually remove
the last component of $ESP and check that instead. Which is very similar
in behaviour, and hopefully good enough in the unprivileged case.
udev metadata access works unprivileged, which the blkid stuff doesn't
(as that needs raw device node access). Hence let's use udev if we lack
privs, and raw device access only if root.
This 'root' field contains the root path of the partition we found the
snippet in. The 'kernel', 'initrd', 'efi', … fields are relative to this
path.
This becomes particularly useful later when we add support for loading
snippets from both the ESP and XBOOTLDR, but already simplifies the code
for us a bit in systemctl.
Previously, we'd mount the ESP to /efi if that existed and was empty,
falling back to /boot if that existed and was empty.
With this change, the XBOOTLDR partition is mounted to /boot
unconditionally. And the EFI is mounted to /efi if that exists (but it
doesn't have to be empty — after all the name is very indicative of what
this is supposed to be), and to /boot as a fallback but only if it
exists and is empty (we insist on emptiness for that, since it might be
used differently than what we assume).
The net effect is that $BOOT should be reliably found under /boot, and
the ESP is either /efi or /boot.
(Note that this commit only is relevant for nspawn and suchlike, i.e.
the codepaths that mount an image without involving udev during boot.)
The boot loader spec supports two places to store boot loader
configuration: the ESP and a generic replacement for it in case the ESP
is not available or not suitable. Let's look for both.
This informs chase_symlinks that symlinks should be treated as if
the path given by --root= is the root of their file system.
With the parent commit, this allows tmpfiles to create files as the
root user under a prefix that may be owned by an unprivileged user.
In particular, this fixes the case where tmpfiles generates initial
files in a staging root directory for packaging under a directory
owned by the unprivileged packager user (e.g. in Gentoo).
When chase_symlinks is given a root path, it is assumed that all
processed symlinks are restricted under that path. It should not
be necessary to verify components of that prefix path since they
are not relevant to the symlinks.
This change skips unsafe UID transitions in this root prefix, i.e.
it now ignores when an unprivileged user's directory contains a
root-owned directory above the symlink root.
When systemd retrieve the time zone it read what is in the file
/usr/share/zoneinfo/zone.tab provided by the Time Zone Database.
According to the comments in zone.tab its content is for backward-
compatibility aid for older programs. New programs should use
zone1970.tab. This patch replaces zone.tab with zone1970.tab.
Apparently this happens IRL. Let's carefully deal with issues like this:
when we overrun, let's not go back to zero but instead leave the highest
cookie bit set. We use that as indication that we are in "overrun
territory", and then are particularly careful with checking cookies,
i.e. that they haven't been used for still outstanding replies yet. This
should retain the quick cookie generation behaviour we used to have, but
permits dealing with overruns.
Replaces: #11804Fixes: #11809
CID#996458. Coverity warns that we trust desc->bLength as read in
the input data to adjust our position in the buffer. This value could
be anything, leading to overflow. It's unlikely that the kernel feeds
us invalid data, but let's me more careful.
If any error is encountered, more logs are given.
Coverity says:
> Pointer to local outside scope (RETURN_LOCAL)9.
> use_invalid: Using dirs, which points to an out-of-scope temporary variable of type char const *[5].
And indeed, the switch statement forms a scope. Let's use an if to
avoid creating a scope.
Previously, if a .networ file contains invalid [Address] or [Route]
section, then the file is completely dropped. This makes networkd
just drops invalid sections.
The option cursor-file takes a filename as argument. If the file exists and
contains a valid cursor, this is used to start the output after this position.
At the end, the last cursor gets written to the file.
This allows for an easy implementation of a timer that regularly looks in the
journal for some messages.
journalctl --cursor-file err-cursor -b -p err
journalctl --cursor-file audit-cursor -t audit --grep DENIED
Or you might want to walk the journal in steps of 10 messages:
journalctl --cursor-file ./curs -n10 --since=today -t systemd
This test case is a bit silly, but it shows that our code is unprepared to
handle so many network servers, with quadratic complexity in various places.
I don't think there are any valid reasons to have hundres of NTP servers
configured, so let's just emit a warning and cut the list short.
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13354
Previously we logged even info message from libselinux as USER_AVC's to
audit. For example, setting SELinux to permissive mode generated
following audit message,
time->Tue Feb 26 11:29:29 2019
type=USER_AVC msg=audit(1551198569.423:334): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='avc: received setenforce notice (enforcing=0) exe="/usr/lib/systemd/systemd" sauid=0 hostname=? addr=? terminal=?'
This is unnecessary and wrong at the same time. First, kernel already
records audit event that SELinux was switched to permissive mode, also
the type of the message really shouldn't be USER_AVC.
Let's ignore SELINUX_WARNING and SELINUX_INFO and forward to audit only
USER_AVC's and errors as these two libselinux message types have clear
mapping to audit message types.
Previously, When the first link configures rules, it removes all saved
rules, which were configured by networkd previously, in the foreign rule
database, but the rules themselves are still in the database.
Thus, when the second or later link configures rules, it errnously
treats the rules already exist.
This is the root of issue #11280.
This removes rules from the foreign database when they are removed.
Fixes#11280.
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=11587.
We had a sample which was large enough that write(2) failed to push all the
data into the pipe, and an assert failed. The code could be changed to use
a loop, but then we'd need to interleave writes and sd_event_run (to process
the journal). I don't think the complexity is worth it — fuzzing works best
if the sample is not too huge anyway. So let's just reject samples above 64k,
and tell oss-fuzz about this limit.
When there are multiple ExecStop= statements, the next command would continue
to run even after TimeoutStopSec= is up and sends SIGTERM. This is because,
unless Type= is oneshot, the exit code/status would evaluate to SERVICE_SUCCESS
in service_sigchld_event()'s call to is_clean_exit(). This success indicates
following commands would continue running until the end of the list
is reached, or another timeout is hit and SIGKILL is sent.
Since long running processes should not be invoked in non-SERVICE_EXEC_START
commands, consider them for EXIT_CLEAN_COMMAND instead of EXIT_CLEAN_DAEMON.
Passing EXIT_CLEAN_COMMAND to is_clean_exit() evaluates the SIGTERM exit
code/status to failure and will stop execution after the first timeout is hit.
Fixes#11431
6f177c7dc0 caused key file errors to immediately fail, which would make it hard to correct an issue due to e.g. a crypttab typo or a damaged key file.
Closes#11723.
And before resolving NetDev names, check conditions in .network,
and if they do not match the system environment, drop the network
unit earlier.
Fixes#4211.
This isn't really necessary for the subsequent commit, but I expect that we'll
need to unpoison more often once we turn on msan in CI, so I think think this
change makes sense in the long run.
This effectively reverts 5971cb9de9 and
2b00a4e03d.
Usually, it is not necessary to assign addresses to bridge slaves,
but such functionality is supported by kernel. If users explicitly
request such configuration, networkd should support that.
User instance of systemd is optional feature and if user@.service
template is masked then administrator most likely doesn't want --user
instances of systemd for logged in users. We don't need to be verbose
about it.
This enables graphical capability for a video adapter of Parallels
virtualization platform (Parallels Desktop for Mac product) which is not
a DRM device at the moment.
This fixes GUI in Fedora 29 guest on Parallels Desktop where gdm now
strictly checks for CanGraphical property of a seat, see [1].
Should be noted that there's no in-kernel driver for Parallels video at
the moment so device matching is done by vid/pid.
[1] https://gitlab.gnome.org/GNOME/gdm/merge_requests/37
With multiple iterations, I found it hard to pick out the interesting bits in
the column of text. I tried plain highlighting first, but it doesn't seem
enough. But blue/yellow makes it easy to jump to the right iteration.
This was intended to be just a refactoring, but it also fixes a minor bug:
after printing "never", we would skip subsequent expressions:
$ systemd-analyze calendar --iterations=20 @0 @1
systemd-analyze calendar --iterations=20 @0 @1
Original form: @0
Normalized form: 1970-01-01 00:00:00 UTC
Next elapse: never
(the second expression was skipped).
Fixes#5659.
Currently, if Persistent=true and the machine is off at the scheduled time of the timer unit, the timer
will be triggered immediately at the next boot even if RandomizedDelaySec= is specified.
As a result, if multiple timers meet that condition, they will be triggered at the same time and too
much CPU/IO work makes boot slow down.
With this commit, if the scheduled time of the persistent timer has already elapsed at boot,
set the time when systemd first started as the scheduled time and RandomizedDelaySec= is applied to it.
We were already using OrderedSets in the manager object, but strvs in the
configuration parsing code. Using sets gives us better scaling when many
domains are used.
In oss-fuzz #13059 the attached reproducer takes approximately 30.5 s to be
parsed. Converting to sets makes this go down to 10s. This is not _vastly_
faster, but using sets seems like a nicer approach anyway. In particular, we
avoid the quadratic de-unification operation after each addition.
After debugging the issue with gdb, I found that the following change
94ddb08 "cgtop: Still try to get CPU statistics if controller-free"
has introduced a bug, which prevents process(..) method processing
memory and io controllers when cpu_accounting_is_cheap() is true.
The obvious fix is to move this branch to be the last one, keeping
the intended behavior of the above change, without having a negative
effect on the other controllers.
Fixes#11773 [systemd-cgtop no longer shows memory (and io) usage]
The second name was used in documentation, and the first in the code that
generated the unit. 'systemd-makefs' is the name we want, for example for
consistency with the systemd-makefs executable.
In principle this breaks compatibility, but in practice this is unlikely to be
noticeable. Each instance of the unit is created by writing out a full
definition, so the template was never defined. So the name could only be used
for ordering, and there is not reason to order things against this unit from
the outside: the ordering would rather be against the final mount unit.
Fixes#11769.
The current implementation copied the *complete* header to boot_params,
thus making the kernel ignore many of the fields.
As mentioned in the code comment for the sentinel variable in
bootparam.h a bootloader should only copy the setup_header, set some
fields in boot_params and zero out anything else.
This change makes systemd-boot (mostly) compliant with the Linux Boot
Protocol and the EFI Handover Protocol described in bootparam.h and
Documentation/boot.txt to fix various issues:
- Secure boot not being detected corretly by Linux (#11717)
- tboot error message / warning on boot (#11717)
- Strange purple text color when booting in qemu with OVMF
- Hopefully even more ...
This reverts commit b4f9f2a62f.
Revert this because a) the quiet bug is fixed in linux and b)
Documentation/boot.txt says "All other fields should be zero."
adds a fully safe way how apps can pin files into /tmp temporarily, excepting them from the tmpfiles aging algorithm, based on BSD file locks on dirs we descend into
Let services use a private UTS namespace. In addition, a seccomp filter is
installed on set{host,domain}name and a ro bind mounts on
/proc/sys/kernel/{host,domain}name.
Since commit 0722b35934, the root mountpoint is
unconditionnally turned to slave which breaks units that are using explicitly
MountFlags=shared (and no other options that would implicitly require a slave
root mountpoint).
Here is a test case:
$ systemctl cat test-shared-mount-flag.service
# /etc/systemd/system/test-shared-mount-flag.service
[Service]
Type=simple
ExecStartPre=/usr/bin/mkdir -p /mnt/tmp
ExecStart=/bin/sh -c "/usr/bin/mount -t tmpfs -o size=10M none /mnt/tmp && sleep infinity"
ExecStop=-/bin/sh -c "/usr/bin/umount /mnt/tmp"
MountFlags=shared
$ systemctl start test-shared-mount-flag.service
$ findmnt /mnt/tmp
$
Mount on /mnt/tmp is not visible from the host although MountFlags=shared was
used.
This patch fixes that and turns the root mountpoint to slave when it's really
required.
Some settings cannot set simultaneously. Let's warn and drop
incompatible settings.
Currently, it is not comprehensive. But this may be a good first step.
When the link goes down, DHCP client_receive_message*() functions return an
error and the related I/O source is removed from the main loop. With the
current implementation of systemd-networkd this doesn't matter because the DHCP
client is always stopped on carrier down and restarted on carrier up. However
it seems wrong to have the DHCP client crippled (because no packet can be
received anymore) once the link goes temporarily down.
Change the receive functions to ignore a ENETDOWN event so that the client will
be able to receive packets again after the link comes back.
Paths are limited to BUS_PATH_SIZE_MAX but the maximum size is anyway too big
to be allocated on the stack, so let's switch to the heap where there is a
clear way to understand if the allocation fails.
Even though the dbus specification does not enforce any length limit on the
path of a dbus message, having to analyze too long strings in PID1 may be
time-consuming and it may have security impacts.
In any case, the limit is set so high that real-life applications should not
have a problem with it.