IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
The original loop called fix_order() on each service immediately after
loading it, but fix_order() would reference other units which were not
loaded yet.
This resulted in bogus and unnecessary orderings based on the static
start priorities.
Therefore call load_sysv() for every init script when traversing them in
enumerate_sysv(). This ensures that all units are loaded when
fix_order() is called.
Bug-Debian: https://bugs.debian.org/771118
The list of provided facility names as specified via Provides: in the
LSB header was originally implemented by adding those facilities to the
Names= property via unit_add_name().
In commit 95ed3294c6 the internal SysV
support was replaced by a generator and support for parsing the Names=
option had been removed from the unit file parsing in v186.
As a result, Provides: for non-virtual facility was dropped when
introducing the sysv-generator.
Since quite a few SysV init scripts still use that functionality (at
least in distros like Debian which have a large body of SysV init
scripts), add back support by making those facility names available via
symlinks to the unit filename to ensure correct orderings between
SysV init scripts which use those facility names.
Bug-Debian: https://bugs.debian.org/774335
Our write pattern is quite awful for CoW file systems (btrfs...), as we
keep updating file parts in the beginning of the file. This results in
fragmented journal files. Hence: when rotating files, defragment them,
since at that point we know that no further write accesses will be made.
Making use of the fd storage capability of the previous commit, allow
restarting journald by serilizing stream state to /run, and pushing open
fds to PID 1.
With this change it is possible to send file descriptors to PID 1, via
sd_pid_notify_with_fds() which PID 1 will store individually for each
service, and pass via the usual fd passing logic on next invocation.
This is useful for enable daemon reload schemes where daemons serialize
their state to /run, push their fds into PID 1 and terminate, restoring
their state on next start from the data in /run and passed in from PID
1.
The fds are kept by PID 1 as long as no POLLHUP or POLLERR is seen on
them, and the service they belong to are either not dead or failed, or
have a job queued.
When systemd starts a service, it first opened /run/systemd/journal/stdout
socket, and only later switched to the right user.group (if they are
specified). Later on, journald looked at the credentials, and saw
root.root, because credentials are stored at the time the socket is
opened. As a result, all messages passed over _TRANSPORT=stdout were
logged with _UID=0, _GID=0.
Drop real uid and gid temporarily to fix the issue.
Let's unify the code that counts the running jobs a bit, in order to
make sure we are less likely to miss one.
This is related to this bug:
https://bugs.freedesktop.org/show_bug.cgi?id=87349
However, it probably won't fix it fully, and I cannot reproduce the issue.
The change also adds an explicit assert change when the counter is off.
Catch up with latest changes in kdbus.ko:
* Signals can be sent as unicast now, hence they need to be marked as
such with the KDBUS_MSG_SIGNAL in the message flags.
* Follow ioctl number change for KDBUS_CMD_FREE
When setting up a namespace, mount flags like noexec, nosuid and
nodev are cleared, so the mounts always have exec, suid and dev
flags enabled.
Copy source directory mount flags to target mount when remounting
the bind mounts.
We always should use the same checks when deciding whether swap support
and mounting of devices is supported. Hence, let's make
fstab-generator's logic more similar to the usual logic we follow:
a) Look for /proc/swaps and no container support before activating
swaps.
b) Look for /sys being writable befire supporting device mounts.
Regression introduced by ed757c0cb0
Mirror the implementation of columns(), since the fd_columns()
functions returns a negative integer for errors.
Also fix columns() to return the unsigned variable instead of the
signed intermediary (they're the same, but better to be explicit).
Since the file headers might be replaced by zeroed pages now due to
sigbus we should make sure we don't end up dividing by zero because we
don't check values read from journal file headers for changes.
This makes them robust regarding truncation. Ideally, we'd export this
as an API, but given how messy SIGBUS handling is, and the uncertain
ownership logic of signal handlers we should not do this (unless libc
one day invents a scheme how to sanely install SIGBUS handlers for
specific memory areas only). However, for now we can still make all our
own tools robust.
Note that external tools will only have read-access to the journal
anyway, where SIGBUS is much more unlikely, given that only writes are
subject to disk full problems.
Even though we use fallocate() it appears that file systems like btrfs
will trigger SIGBUS on certain low-disk-space situation. We should
handle that, hence catch the signal, add it to a list of invalidated
pages, and replace the page with an empty memory area. After each write
check if SIGBUS was triggered, and consider the write invalid if it was.
This should make journald a lot more robust with file systems where
fallocate() is not reliable, for example all CoW file systems
(btrfs...), where changing written data can fail with disk full errors.
https://bugzilla.redhat.com/show_bug.cgi?id=1045810
-n is only allowed for root. /etc/mtab is nowadays almost always a link to /proc/,
so in practice this does not really matter too much, but should allow .mount units
to work in --user mode.
https://bugs.freedesktop.org/show_bug.cgi?id=87602
Those values are based on a file we read from disk, so we should
verify everything we receive, and make sure everything we print
is sensible.
Also, print fractional seconds for TTL.
Fix a bunch of needless memzero() calls, a bunch of use-after-free
regarding _cleanup_free_ and drop unused variables.
Hint: Do NOT use _cleanup_free_ for temporary strappend() helpers that are
freed multiple times. All you safe is the last free() call, which is
really not worth the trouble resetting it to NULL all the time.
This partially reverts:
commit f131770b14
Author: Veres Lajos <vlajos@gmail.com>
Date: Mon Dec 29 09:45:58 2014 +0000
tree-wide: spelling fixes
The commit in question changed a binary file. I didn't look at the diff in
particular, so I have no idea what exactly was changed. However, the file
is generated and it looked highly suspiciuous. Therefore, I reverted that
part.
Note that this is generated by "make update-unifont" so really no reason
to touch at all.
If some sleep operation was not possible (e.g. because swap is missing),
we would try twice: once through logind, which would result in a clean error:
Failed to execute operation: Sleep verb not supported
and then second time by starting the appropriate unit directly, which is
more messy. If logind tells us that something is not possible (or already
in progress), report that to the user and quit. If logind is present and working
we should not try to work around it.
Loosely based on https://bugs.freedesktop.org/show_bug.cgi?id=87832.
ENOSYS is used to signify compiled-out functionality. Using it for
different kinds of error is misleading.
For BUS_ERROR_SLEEP_VERB_NOT_SUPPORTED, logind-action.c uses ENOTSUP
already, so changing it to ENOTSUP makes the dbus and action paths
behave the same.
This implements two new helpers, discussed on systemd-devel about 1 year
ago:
sd_bus_emit_object_added()
sd_bus_emit_object_removed()
Both calls are equivalent to their respective counterpart
sd_bus_emit_interfaces_{added/removed}(), but can figure out the list of
interfaces themselves, instead of requiring the caller to provide them.
Furthermore, both calls properly deal with builtin interfaces provided via
org.freedesktop.DBus.* and alike.
Both calls simply traverse a node and all its parent nodes to figure out a
list of all interfaces registered as vtable or fallback. It then appends
each of them, similar to the interfaces_{added/removed}() helpers.
Note that interfaces_{added/removed}() runs a parent traversal for *each*
passed interface. Therefore, it can simply bail out, once it found a
parent node that implements a given interface.
With object_{added/removed}() we cannot know the registered interfaces in
advance, thus, we cannot run one traversal per node. Instead, we run a
single traversal and remember all interfaces that we added. Therefore, a
child-interface overrides all conflicting parent-interfaces. We keep a
"Set *s" context to track those while climbing up the tree.
The kernel provides capabilities as a u32 array, sd-bus uses an u8 array.
This works fine on little-endian as both are encoded the same way.
However, this fails on big-endian if we do not perform sufficient
byte-swapping on each u32 entry.
This patch makes sd-bus use u32, too. We avoid changing any kernel
provided data so we can keep pointing into kdbus pool buffers which
contain u32 arrays.
The number of available caps can be read from
/proc/sys/kernel/cap_last_cap during runtime. Our helper cap_last_cap()
does that, so there's no reason to remember the size of any capability
cache. We can just pre-allocate arrays with a suitable size for all
available caps and reject any higher caps.
The kernel capability API uses u32 as base so make sure we do the same.
Note that this is specified by POSIX, so it's unlikely to change.
This macro calculates A / B but rounds up instead of down. We explicitly
do *NOT* use:
(A + B - 1) / A
as it suffers from an integer overflow, even though the passed values are
properly tested against overflow. Our test-cases show this behavior.
Instead, we use:
A / B + !!(A % B)
Note that on "Real CPUs" this does *NOT* result in two divisions. Instead,
instructions like idivl@x86 provide both, the quotient and the remainder.
Therefore, both algorithms should perform equally well (I didn't verify
this, though).
This reverts commit 206e7a5f7b.
We actually want to allow shutting down containers that use
RegisterMachine() rather than CreateMachine() to register their own
unit. It should be safe to do so, since the primary usecase for
RegisterMachine() are container managers that run only a single
container within their own unit, such as systemd-nspawn.
This file was introduced with linux-3.2, use it instead of probing for it
via prctl(PR_CAPBSET_READ).
For now, keep the old code for backwards compat. We can drop it once 3.2
is our lowest requirement.
The test-cap-list code is extended to verify cap_last_cap() is the same as
we'd get via prctl probing and /proc.
All we care about is that the kernel (pid==0) sent the message. Verifying the sender uid
seems to break when using userns.
Reported by Stéphane Graber.
We no longer configure the addresses on the loopback interface, but simply bring it up
and let the kernel do the rest. Also change the check to only check if the interface
is up, rather than checking for the IPv4 loopback address.
In test_raw_clone, make sure the cloned thread calls _exit() and in the parent
thread call waitpid(..., __WCLONE) to wait for the child thread to terminate,
otherwise there is a race condition where the child thread will log to the
console after the test process has already exited and the assertion from the
child thread might not be enforced.
The absence of this patch might also create problems for other tests that would
be added after this one, since potentially both parent and child would run
those tests as the child would continue running.
Tested by confirming that the logs from the child are printed before the test
terminates and that a false assertion in the child aborts the test with a core
dump.
[zj: also add check for the return value.]
It does not use any functions from libcap directly. The CAP_SYS_TIME constant
in use by this file comes from <linux/capability.h> imported through "missing.h".
Tested that "systemd-timedated" builds cleanly and works after this change.
It does not use any functions from libcap directly. The CAP_SYS_ADMIN constant
in use by this file comes from <linux/capability.h> imported through "missing.h".
Tested that "systemd-localed" builds cleanly and works after this change.
They do not use any functions from libcap directly. The CAP_SYS_ADMIN constant
in use by bus-objects.c comes from <linux/capability.h> imported through
"missing.h". The "missing.h" header is imported through "util.h" which gets
imported in "bus-util.h".
Tested that everything builds cleanly after this change.
They do not use any functions from libcap directly. The CAP_KILL constant in
use by these files comes from <linux/capability.h> imported through
"missing.h".
Tested that "systemd-machined" builds cleanly and works after this change.
It does not use any functions from libcap directly. The CAP_SYS_ADMIN constant
in use by this file comes from <linux/capability.h> imported through "missing.h".
Tested that "systemd-hostnamed" builds cleanly and works after this change.
It does not use any functions from libcap directly. The CAP_MKNOD constant in
use by this file comes from <linux/capability.h> imported through "missing.h".
Tested that "systemd-tmpfiles" builds cleanly and works after this change.
They do not use any functions from libcap directly. The CAP_* constants in use
through these files come from "missing.h" which will import <linux/capability.h>
and complement it with CAP_* constants not defined by the current kernel
headers. The "missing.h" header is imported through "util.h" which gets
imported in "logind.h".
Tested that "systemd-logind" builds cleanly and works after this change.
It does not use any functions from libcap directly. The CAP_* constants in use
through this file come from "missing.h" which will import <linux/capability.h>
and complement it with CAP_* constants not defined by the current kernel
headers.
Add an explicit import of our "capability.h" since it does use the function
capability_bounding_set_drop from that header file. Previously, that header was
implicitly imported through through "cap-list.h".
Tested that "systemd-nspawn" builds cleanly and works after this change.
The new test-cap-list introduced in commit 2822da4fb7 uses the included
table of capabilities. However, it uses cap_last_cap() which probes the kernel
for the last available capability. On an older kernel (e.g. 3.10 from RHEL 7)
that causes the test to fail with the following message:
Assertion '!capability_to_name(cap_last_cap()+1)' failed at src/test/test-cap-list.c:30, function main(). Aborting.
Fix it by exporting the size of the static table and using it in the test
instead of the dynamic one from the current kernel.
Tested by successfully running ./test-cap-list and the whole `make check` test
suite with this patch on a RHEL 7 host.
Pretty much everywhere else we use the generic term "machine" when
referring to containers in API, so let's do though in sd-bus too. In
particular, since the concept of a "container" exists in sd-bus too, but
as part of the marshalling system.
After all, pretty much all our tools include it, and it should hence be
shared.
Also move sysfs-show.h from core/ to login/, since it has no point to
exist in core.
sd_bus_creds_get_well_known_names() fails with -ENODATA in case the
message has no names attached, which is intended behavior if the
remote connection didn't own any names at the time of sending.
The function already deals with 'sender_names' being an empty strv,
so we can just continue in such cases.
Messages to destinations that are not currently owned by any bus connection
will cause kdbus related function to return with either -ENXIO or -ESRCH.
Such conditions should not make the proxyd terminate but send a sane
SD_BUS_ERROR_NAME_HAS_NO_OWNER error reply to the proxied connection.
I figure "pull-dck" is not a good name, given that one could certainly
read the verb in a way that might be funny for 16year-olds. ;-)
Also, don't hardcode the index URL to use, make it runtime and configure
time configurable instead.
We originally only supported escaping ucs2 encoded characters (as \uxxxx). This
only covers the BMP. Support escaping also utf16 surrogate pairs (on the form
\uxxxx\uyyyy) to cover all of unicode.
Sync kdbus.h with upstream changes:
* Two optional cancellation points where added for synchronously
blocking KDBUS_CMD_SEND commands: A sigmask to change the mask
of accepted signals before the task is put to sleep, and a
generic file descriptor that can be written to, in order to cancel
the command. Both methods are currently unused.
* The KDBUS_CMD_CANCEL ioctl was removed. sd-bus was never using
that command, so there's no change needed.
* Some kerneldoc fixes
* (potentially) public headers must reside in src/systemd/ (not in
src/libsystemd*)
* some private (not prefixed with sd_) functions moved from sd-lldp.h to
lldp-internal.h
* introduce lldp-util.h for the cleanup macro, as these should not be public
* rename the cleanup macro, we always name them _cleanup_foo_, never
_cleanup_sd_foo_
* mark some function arguments as 'const'
This adds a new bus call to machined that enumerates /var/lib/container
and returns all trees stored in it, distuingishing three types:
- GPT disk images, which are files suffixed with ".gpt"
- directory trees
- btrfs subvolumes
EOF is meaningless if the direction of iteration changes.
Move the EOF optimization under the direction check.
This fixes test-journal-interleaving for me.
Thanks to Filipe Brandenburger for telling me about the failure.
next_with_matches() is odd in that its "unit64_t *offset" parameter is
both input and output. In other it's purely for output.
The function is called from two places in next_beyond_location(). In
both of them "&cp" is used as the argument and in both cases cp is
guaranteed to equal f->current_offset.
Let's just have next_with_matches() ignore "*offset" on input and
operate with f->current_offset.
I did not investigate why it is, but it makes my usual benchmark run
reproducibly faster:
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m4.032s
user 0m3.896s
sys 0m0.135s
(Compare to preceding commit, where real was 4.4s.)
I accidentally broke the detection of duplicate entries in 7943f42275
"journal: optimize iteration by returning previously found candidate
entry".
When we have a known location of a candidate entry, we must not return
from next_beyond_location() immediately. We must go through the
duplicates detection to make sure the candidate differs from the
already iterated entry.
This fix slows down iteration a bit, but it's still faster than it
was before the rework.
$ time ./journalctl --since=2014-06-01 --until=2014-07-01 > /dev/null
real 0m4.448s
user 0m4.298s
sys 0m0.149s
(Compare with results from commit 7943f42275, where real was 5.3s before
the rework.)
This patch introduces LLDP support to networkd. it implements the
receiver side of the protocol.
The Link Layer Discovery Protocol (LLDP) is an industry-standard,
vendor-neutral method to allow networked devices to advertise
capabilities, identity, and other information onto a LAN. The Layer 2
protocol, detailed in IEEE 802.1AB-2005.LLDP allows network devices
that operate at the lower layers of a protocol stack (such as
Layer 2 bridges and switches) to learn some of the capabilities
and characteristics of LAN devices available to higher
layer protocols.
This adds a simply but powerful tool for downloading container images
from the most popular container solution used today. Use it like
this:
# systemd-import pull-dck mattdm/fedora
# systemd-nspawn -M fedora
This will donwload the layers for "mattdm/fedora", and make them
available locally as /var/lib/container/fedora.
The tool is pretty complete, as long as it's only about pulling down
images, or updating them. Pushing or searching is not supported yet.
The handling of the command name and other arguments is unified. This
simplifies things and should make them more predictable for users.
Incidentally, this makes ExecStart handling match the .desktop file
specification, apart for the requirment for an absolute path.
https://bugs.freedesktop.org/show_bug.cgi?id=86171
Commit a2a5291b3f changed the parser to reject unfinished quoted
strings. Unfortunately it introduced an error where a trailing
backslash would case an infinite loop. Of course this must fixed, but
the question is what to to instead. Allowing trailing backslashes and
treating them as normal characters would be one option, but this seems
suboptimal. First, there would be inconsistency between handling of
quoting and of backslashes. Second, a trailing backslash is most
likely an error, at it seems better to point it out to the user than
to try to continue.
Updated rules:
ExecStart=/bin/echo \\ → OK, prints a backslash
ExecStart=/bin/echo \ → error
ExecStart=/bin/echo "x → error
ExecStart=/bin/echo "x"y → error
This fixes 2 problems introduced by 6feeeab0bc:
1) If name_to_handle_at returns ENOSYS for the child, we'll wrongly
return -ENOSYS when it returns the same for the parent. Immediately
jump to the fallback logic when we get ENOSYS.
2) If name_to_handle_at returns EOPNOTSUPP for the child but suceeds
for the parent, we'll be comparing an uninitialized value (mount_id) to
an initialized value (mount_id_parent). Initialize the mount_id
variables to invalid mount_ids to avoid this.
This pulls out the hwdb managment from udevadm into an independent tool.
The old code is left in place for backwards compatibility, and easy of
testing, but all documentation is dropped to encourage use of the new
tool instead.
In next_beyond_location() when the JournalFile's location type is
LOCATION_SEEK, it means there's nothing to do, because we already have
the location of the candidate entry. Do an early return. Note that now
next_beyond_location() does not anymore guarantee on return that the
entry is mapped, but previous patches made sure the caller does not
care.
This optimization is at least as good as "journal: optimize iteration:
skip files that cannot improve current candidate entry" was.
Timing results on my workstation, using:
$ time ./journalctl -q --since=2014-06-01 --until=2014-07-01 > /dev/null
Before "Revert "journal: optimize iteration: skip files that cannot
improve current candidate entry":
real 0m5.349s
user 0m5.166s
sys 0m0.181s
Now:
real 0m3.901s
user 0m3.724s
sys 0m0.176s
If from a previous iteration we know we are at the end of a journal
file, don't bother looking into the file again. This is complicated by
the fact that the EOF does not have to be permanent (think of
"journalctl -f"). So we also check if the number of entries in the
journal file changed.
This optimization has a similar effect as "journal: optimize iteration:
skip whole files behind current location" had.
set_location() is called from real_journal_next() when a winning entry
has been picked from among the candidates in journal files.
The location type is always set to LOCATION_DISCRETE. No need to pass
it as a parameter.
The per-JournalFile location information is already updated at this
point. No need for having the direction and offset here.
In next_beyond_location() when we find a candidate entry in a journal
file, save its location information in struct JournalFile.
The purpose of remembering the locations of candidate entries is to be
able to save work in the next iteration. This patch does only the
remembering part.
LOCATION_SEEK means the location identifies a candidate entry.
When a winner is picked from among candidates, it becomes
LOCATION_DISCRETE.
LOCATION_TAIL here signifies we've iterated the file to the end (or the
beginning in the case of reversed direction).
fork() is not async-signal-safe and calling it from the signal handler
could result in a deadlock when at_fork() handlers are called. Using
the raw clone() syscall sidesteps that problem.
The tricky part is that raise() does not work, since getpid() does not
work. Add raw_getpid() to get the real pid, and use kill() instead of
raise().
https://bugs.freedesktop.org/show_bug.cgi?id=86604
If child supports, but the parent does not, or when the child does
not support, but the parent does, assume the child is a mount point.
Only if neither supports use the fallback.
dup3() allows setting O_CLOEXEC which we are not interested in. However,
it also fails if called with the same fd as input and output, which is
something we don't want. Hence use dup2().
Also, we need to explicitly turn off O_CLOEXEC for the fds, in case the
input fd was O_CLOEXEC and < 3.
[zj: When we lstat the target path, symlinks above the last component
will be followed by both stat and lstat. So when we look at the
parent, we should follow symlinks.]