IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
A distro (Fedora in particular) may want to enable oomd in a unstable
branch for testing, even though the package as a whole is compiled in release
mode. Let's emit a warning but otherwise allow this.
Let's make this easier for readers by grouping common subjects together.
Roughly: pid1 features, unit file changes, general syntax changes, kernel
options, general defaults, udevd features, networkd and .network/.netdev
features, networkctl, resolved, systemctl, systemd-run, journald, journalctl,
various other tools, low-level dbus and library stuff, documentation.
https://tools.ietf.org/html/draft-knodel-terminology-02https://lwn.net/Articles/823224/
This gets rid of most but not occasions of these loaded terms:
1. scsi_id and friends are something that is supposed to be removed from
our tree (see #7594)
2. The test suite defines an API used by the ubuntu CI. We can remove
this too later, but this needs to be done in sync with the ubuntu CI.
3. In some cases the terms are part of APIs we call or where we expose
concepts the kernel names the way it names them. (In particular all
remaining uses of the word "slave" in our codebase are like this,
it's used by the POSIX PTY layer, by the network subsystem, the mount
API and the block device subsystem). Getting rid of the term in these
contexts would mean doing some major fixes of the kernel ABI first.
Regarding the replacements: when whitelist/blacklist is used as noun we
replace with with allow list/deny list, and when used as verb with
allow-list/deny-list.
This is an attempt to clean up the POP3/SMTP/LPR/… DHCP lease server
data logic in networkd. This reduces code duplication and fixes a number
of bugs.
This removes any support for collecting POP3/SMPT/LPR servers acquired
via local DHCP client releases since noone uses that, and given how old
these protocols are I doubt this will change. It keeps support for
configuring them for the dhcp server however.
The differences between the DNS/NTP/SIP/POP3/SMTP/LPR configuration
logics are minimized.
This removes the relevant symbols from sd-network.h (which is an
internal API only at this point after all).
This is unfortunately not well test, given the old code for this had
barely any tests. But the new code should not perform worse at least,
and allow us to release, since it corrects some interfaces visible in
the .network configuration format.
Fixes: #15943
Six years ago we declared it obsolete and removed it from the docs
(c073a0c4a5) and added a note about it in
NEWS. Two years ago we add warning messages about it, indicating the
feature will be removed (41b283d0f1) and
mentioned it in NEWS again.
Let's now kill it for good.
With cgroup v2 the cgroup freezer is implemented as a cgroup
attribute called cgroup.freeze. cgroup can be frozen by writing "1"
to the file and kernel will send us a notification through
"cgroup.events" after the operation is finished and processes in the
cgroup entered quiescent state, i.e. they are not scheduled to
run. Writing "0" to the attribute file does the inverse and process
execution is resumed.
This commit exposes above low-level functionality through systemd's DBus
API. Each unit type must provide specialized implementation for these
methods, otherwise, we return an error. So far only service, scope, and
slice unit types provide the support. It is possible to check if a
given unit has the support using CanFreeze() DBus property.
Note that DBus API has a synchronous behavior and we dispatch the reply
to freeze/thaw requests only after the kernel has notified us that
requested operation was completed.
It's not that I think that "hostname" is vastly superior to "host name". Quite
the opposite — the difference is small, and in some context the two-word version
does fit better. But in the tree, there are ~200 occurrences of the first, and
>1600 of the other, and consistent spelling is more important than any particular
spelling choice.
Hiding the first column, which may contain bullet circles, with --no-legend
is undocumented and potentially unexpected. On the other hand, not printing
bullet circles with --plain is documented so hiding the column with that
switch is sensible.
The combination "--full --no-legend --no-pager --plain" is appropriate for
automated processing of systemctl output.
Right now the kernel will not dump anything that went through setuid or
setgid. But it is routine for daemons to do that, and it makes things hard to
debug.
systemd-coredump saves the coredump readable by the users the process was
running as. This should be enough to avoid information leakage. So let's also
tell the kernel to do the coredump.
For https://bugzilla.redhat.com/show_bug.cgi?id=1790972.
Both patterns are stored in the same file, so they are enabled or disabled
together. (Though suid_dumpable=2 is supposed to be safe even when writing to
plain files.)
This never made into a release, so we can change the name with impunity.
Suggested by Davide Pesavento.
I opted to add the "ing" ending. "Fair queuing" is the name of the general
concept and algorithm, and "Fair queue" is mostly used for the implementation
name.
This makes the naming more consistent: we now have
bootctl systemd-efi-options,
$SYSTEMD_EFI_OPTIONS
and the SystemdOptions EFI variable.
(SystemdEFIOptions would be redundant, because it is only used in the context
of efivars, and users don't interact with that name directly.)
bootctl is adjusted to use 2sp indentation, similarly to systemctl and other
programs.
Remove the prefix with the old name from 'bootctl systemd-efi-options' output,
since it's redundant and we don't want the old name anyway.
On some systems with lots of devices, device probing for certain drivers can
take a very long time. If systemd-udevd detects a timeout and kills the worker
running modprobe using SIGKILL, some devices will not be probed, or end up in
unusable state. The --event-timeout option can be used to modify the maximum
time spent in an uevent handler. But if systemd-udevd exits, it uses a
different timeout, hard-coded to 30s, and exits when this timeout expires,
causing all workers to be KILLed by systemd afterwards. In practice, this may
lead to workers being killed after significantly less time than specified with
the event-timeout. This is particularly significant during initrd processing:
systemd-udevd will be stopped by systemd when initrd-switch-root.target is
about to be isolated, which usually happens quickly after finding and mounting
the root FS.
If systemd-udevd is started by PID 1 (i.e. basically always), systemd will
kill both udevd and the workers after expiry of TimeoutStopSec. This is
actually better than the built-in udevd timeout, because it's more transparent
and configurable for users. This way users can avoid the mentioned boot problem
by simply increasing StopTimeoutSec= in systemd-udevd.service.
If udevd is not started by systemd (standalone), this is still an
improvement. udevd will kill hanging workers when the event timeout is
reached, which is configurable via the udev.event_timeout= kernel
command line parameter. Before this patch, udevd would simply exit with
workers still running, which would then become zombie processes.
With the timeout removed, the sd_event_now() assertion in manager_exit() can be
dropped.
Discussed in #13743, the -.service semantic conflicts with the
existing root mount and slice names, making this feature not
uniformly extensible to all types. Change the name to be
<type>.d instead.
Updating to this format also extends the top-level dropin to
unit types.
I think we can mention that systemd-resolved is able to validate IP
address certificates and prefer TLS 1.3 before TLS 1.2 now.
Also the `machinectl reboot` command actually works now.
Signed-off-by: Christian Rebischke <chris@nullday.de>
In the past, we asked people to open a security bug on one of the "big"
distros. This worked OK as far as getting bugs reported and notifying some
upstream developers went. But we always had trouble getting information to
all the appropriate parties, because each time a bug was reported, a big
thread was created, with a growing CC list. People who were not CCed early
enough were missing some information, etc.
To clean this up, we decided to create a private mailing list. The natural
place would be freedesktop.org, but unfortunately the request to create a
mailing list wasn't handled
(https://gitlab.freedesktop.org/freedesktop/freedesktop/issues/134). And even
if it was, at this point, if there was ever another administrative issue, it
seems likely it could take months to resolve. So instead, we asked for a list
to be created on the redhat mailservers.
Please consider the previous security issue reporting mechanisms rescinded, and
send any senstive bugs to systemd-security@redhat.com.
Current kernels with BFQ scheduler do not yet set their IO weight
through "io.weight" but through "io.bfq.weight" (using a slightly
different interface supporting only default weights, not per-device
weights). This commit enables "IOWeight=" to just to that.
This patch may be dropped at some time later.
Github-Link: https://github.com/systemd/systemd/issues/7057
Signed-off-by: Kai Krakow <kai@kaishome.de>
waitid(2) and the libc function signature calls this "exit status", and
uses "exit code" for something different. Let's stick to the same
nomenclature hence.
This makes ping(8) work without CAP_NET_ADMIN and CAP_NET_RAW because
those aren't effective inside rootless Podman containers.
It's quite useful when using OSTree based operating systems like Fedora
Silverblue, where development environments are often set up using
rootless Podman containers with helpers like Toolbox [1]. Not having
a basic network utility like ping(8) work inside the development
environment can be inconvenient.
See:
https://lwn.net/Articles/422330/http://man7.org/linux/man-pages/man7/icmp.7.htmlhttps://github.com/containers/libpod/issues/1550
The upper limit of the range of group identifiers is set to 2147483647,
which is 2^31-1. Values greater than that get rejected by the kernel
because of this definition in linux/include/net/ping.h:
#define GID_T_MAX (((gid_t)~0U) >> 1)
That's not so bad because values between 2^31 and 2^32-1 are reserved
on systemd-based systems anyway [2].
[1] https://github.com/debarshiray/toolbox
[2] https://systemd.io/UIDS-GIDS.html#summary
Change the resolved.conf Cache option to a tri-state "no, no-negative, yes" values.
If a lookup returns SERVFAIL systemd-resolved will cache the result for 30s (See 201d995),
however, there are several use cases on which this condition is not acceptable (See systemd#5552 comments)
and the only workaround would be to disable cache entirely or flush it , which isn't optimal.
This change adds the 'no-negative' option when set it avoids putting in cache
negative answers but still works the same heuristics for positive answers.
Signed-off-by: Jorge Niedbalski <jnr@metaklass.org>
Make possible to set NUMA allocation policy for manager. Manager's
policy is by default inherited to all forked off processes. However, it
is possible to override the policy on per-service basis. Currently we
support, these policies: default, prefer, bind, interleave, local.
See man 2 set_mempolicy for details on each policy.
Overall NUMA policy actually consists of two parts. Policy itself and
bitmask representing NUMA nodes where is policy effective. Node mask can
be specified using related option, NUMAMask. Default mask can be
overwritten on per-service level.
These make sense to be explicitly set at 0 (which has a different effect
than the default, since it can affect processing of `DefaultMemoryXXX`).
Without this, it's not easily possible to relinquish memory protection
for a subtree, which is not great.
Let's be safe, rather than sorry. This way DynamicUser=yes services can
neither take benefit of, nor create SUID/SGID binaries.
Given that DynamicUser= is a recent addition only we should be able to
get away with turning this on, even though this is strictly speaking a
binary compatibility breakage.
"BOOT" is misleading, because it sounds like this refers to /boot or $BOOT,
when in fact it refers to some subdirectory. Those variable names are purely
interal, so we can change them. $BOOT_DIR_ABS was used in NEWS, but it should
not be (because it is an internal detail), so the old NEWS entry is reworded to
use "entry directory".
Previously, this system call was included in @system-service since it is
a "getter" only, i.e. only queries information, and doesn't change
anything, and hence was considered not risky.
However, as it turns out, mincore() is actually security sensitive, see
the discussion here:
https://lwn.net/Articles/776034/
Hence, let's adjust the system call filter and drop mincore() from it.
This constitues a compatibility break to some level, however I presume
we can get away with this as the systemcall is pretty exotic. The fact
that it is pretty exotic is also reflected by the fact that the kernel
intends to majorly change behaviour of the system call soon (see the
linked LWN article)
These sysctls were added in Linux 4.19 (torvalds/linux@30aba6656f), and
we should enable them just like we enable the older hardlink/symlink
protection since v199. Implements #11414.
Nitpicky, but we've used a lot of random spacings and names in the past,
but we're trying to be completely consistent on "cgroup vN" now.
Generated by `fd -0 | xargs -0 -n1 sed -ri --follow-symlinks 's/cgroups? ?v?([0-9])/cgroup v\1/gI'`.
I manually ignored places where it's not appropriate to replace (eg.
"cgroup2" fstype and in src/shared/linux).
This reverts commit edda44605f.
The kernel explicitly supports resuming with a different kernel than the one
used before hibernation. If this is something that shouldn't be supported, the
place to change this is in the kernel. We shouldn't censor something that this
exclusively in the kernel's domain.
People might be using this to switch kernels without restaring programs, and
we'd break this functionality for them.
Also, even if resuming with a different kernel was a bad idea, we don't really
prevent that with this check, since most users have more than one kernel and
can freely pick a different one from the menu. So this only affected the corner
case where the kernel has been removed, but there is no reason to single it
out.
From WordNet (r) 3.0 (2006) [wn]:
time-out
n 1: a brief suspension of play; "each team has two time-outs left"
From The Free On-line Dictionary of Computing (18 March 2015) [foldoc]:
timeout
A period of time after which an error condition is raised if
some event has not occured. A common example is sending a
message. If the receiver does not acknowledge the message
within some preset timeout period, a transmission error is
assumed to have occured.
This switches the RFC3704 Reverse Path filtering from Strict mode to Loose
mode. The Strict mode breaks some pretty common and reasonable use cases,
such as keeping connections via one default route alive after another one
appears (e.g. plugging an Ethernet cable when connected via Wi-Fi).
The strict filter also makes it impossible for NetworkManager to do
connectivity check on a newly arriving default route (it starts with a
higher metric and is bumped lower if there's connectivity).
Kernel's default is 0 (no filter), but a Loose filter is good enough. The
few use cases where a Strict mode could make sense can easily override
this.
The distributions that don't care about the client use cases and prefer a
strict filter could just ship a custom configuration in
/usr/lib/sysctl.d/ to override this.
After discussions with kernel folks, a system with memcg really
shouldn't need extra hard limits on file descriptors anymore, as they
are properly accounted for by memcg anyway. Hence, let's bump these
values to their maximums.
This also adds a build time option to turn thiss off, to cover those
users who do not want to use memcg.
Back in 2012 the project was renamed, see the release notes for v 0.105
[https://cgit.freedesktop.org/polkit/tree/NEWS#n754]. Let's update our
documentation and comments to do the same. Referring to PolicyKit is confusing
to users because at the time the polkit api changed too, and we support the new
version. I updated NEWS too, since all the references to PolicyKit there were
added after the rename.
"PolicyKit" is unchanged in various URLs and method call names.
This is an additional synchronization point normally not needed. Hence,
let's make it passive, i.e. pull it in from the unit which wants to be
ordered before the update service rather than by the update service
itself.
We really should try to be as precise as possible here. Saying
"your interfaces might be renamed" scares the shit of out people,
for obvious reasons. This change only touches some niche cases
fortunately, let's make this clear.
I figure sooneror later we'll have more of these docs, hence let's give
them a clean place to be.
This leaves NEWS and README/README.md as well as the LICENSE texts in
the root directory of the project since that appears to be customary for
Free Software projects.
After discussions with @htejun it appears it's OK now to enable memory
accounting by default for all units without affecting system performance
too badly. facebook has made good experiences with deploying memory
accounting across their infrastructure.
This hence turns MemoryAccounting= from opt-in to opt-out, similar to
how TasksAccounting= is already handled. The other accounting options
remain off, their performance impact is too big still.
CHANGE OF BEHAVIOUR — with this commit "f" line's behaviour is altered
to match what the documentation says: if an "argument" string is
specified it is written to the file only when the file didn't exist
before. Previously, it would be appended to the file each time
systemd-tmpfiles was invoked — which is not a particularly useful
behaviour as the tool is not idempotent then and the indicated files
grow without bounds each time the tool is invoked.
I did some spelunking whether this change in behaviour would break
things, but afaics nothing relies on the previous O_APPEND behaviour of
this line type, hence I think it's relatively safe to make "f" lines
work the way the docs say, rather than adding a new modifier for it or
so.
Triggered by:
https://lists.freedesktop.org/archives/systemd-devel/2018-January/040171.html
With Type=notify services, EXTEND_TIMEOUT_USEC= messages will delay any startup/
runtime/shutdown timeouts.
A service that hasn't timed out, i.e, start time < TimeStartSec,
runtime < RuntimeMaxSec and stop time < TimeoutStopSec, may by sending
EXTEND_TIMEOUT_USEC=, allow the service to continue beyond the limit for
the execution phase (i.e TimeStartSec, RunTimeMaxSec and TimeoutStopSec).
EXTEND_TIMEOUT_USEC= must continue to be sent (in the same way as
WATCHDOG=1) within the time interval specified to continue to reprevent
the timeout from occuring.
Watchdog timeouts are also extended if a EXTEND_TIMEOUT_USEC is greater
than the remaining time on the watchdog counter.
Fixes#5868.
The code intentionally ignored unknown specifiers, treating them as text. This
needs to change because otherwise we can never add a new specifier in a backwards
compatible way. So just treat an unknown (potential) specifier as an error.
In principle this is a break of backwards compatibility, but the previous
behaviour was pretty much useless, since the expanded value could change every
time we add new specifiers, which we do all the time.
As a compromise for backwards compatibility, only fail on alphanumerical
characters. This should cover the most cases where an unescaped percent
character is used, like size=5% and such, which behave the same as before with
this patch. OTOH, this means that we will not be able to use non-alphanumerical
specifiers without breaking backwards compatibility again. I think that's an
acceptable compromise.
v2:
- add NEWS entry
v3:
- only fail on alphanumerical
This fixes various typos, removes some duplications, and adds a bit more
detail in the few places which are potential pitfalls for users.
Also change the way the paragraphs about new options begin, because having
a paragraph saying "Two new options have been added", and then bit lower
again "Two new options have been added" is confusing.
This creates a second private resolve.conf file which lists the stub resolver
and the resolved acquired search domains.
This runtime file should be used as a symlink target for /etc/resolv.conf such
that non-nss based applications can resolve search domains.
Fixes: #7009
This adds "systemd-resolve --reset-server-features" for explicitly
forgetting what we learnt. This might be useful for debugging
purposes, and to force systemd-resolved to restart its learning logic
for all DNS servers.
This removes the '@credentials' syscall set that was added in commit
v234-468-gcd0ddf6f75.
Most of these syscalls are so simple that we do not want to filter them.
They work on the current calling process, doing only read operations,
they do not have a deep kernel path.
The problem may only be in 'capget' syscall since it can query arbitrary
processes, and used to discover processes, however sending signal 0 to
arbitrary processes can be used to discover if a process exists or not.
It is unfortunate that Linux allows to query processes of different
users. Lets put it now in '@process' syscall set, and later we may add
it to a new '@basic-process' set that allows most basic process
operations.
Typically when DHCP server sets MTU it is a lower one. And a lower than usual
MTU is then thus required on said network to have operational networking. This
makes networkd's dhcp client to work in more similar way to other dhcp-clients
(e.g. isc-dhcp). In particular, in a cloud setting, without this default
instances have resulted in timing out talking to cloud metadata source and
failing to provision.
This does not change this default for the Annonymize code path.
They’re counterparts to the existing set-log-level and set-log-target
verbs, simply printing the current value to stdout. This makes it
slightly easier to temporarily change the log level and/or target and
then restore the old value(s).
This reverts commit 4f5e972279.
Building with gperf 3.0 works just fine; we had an autoconf check to
determine the correct data types, and this check was ported to meson.