1
0
mirror of https://github.com/systemd/systemd.git synced 2024-11-08 11:27:32 +03:00
Commit Graph

375 Commits

Author SHA1 Message Date
Lennart Poettering
d6797c920e namespace: beef up read-only bind mount logic
Instead of blindly creating another bind mount for read-only mounts,
check if there's already one we can use, and if so, use it. Also,
recursively mark all submounts read-only too. Also, ignore autofs mounts
when remounting read-only unless they are already triggered.
2014-06-06 14:37:40 +02:00
Djalal Harouni
e866af3acc nspawn: make nspawn robust to container failure
nspawn and the container child use eventfd to wait and notify each other
that they are ready so the container setup can be completed.

However in its current form the wait/notify event ignore errors that
may especially affect the child (container).

On errors the child will jump to the "child_fail" label and terminate
with _exit(EXIT_FAILURE) without notifying the parent. Since the eventfd
is created without the "EFD_NONBLOCK" flag, this leaves the parent
blocking on the eventfd_read() call. The container can also be killed
at any moment before execv() and the parent will not receive
notifications.

We can fix this by using cheap mechanisms, the new high level eventfd
API and handle SIGCHLD signals:

* Keep the cheap eventfd and EFD_NONBLOCK flag.

* Introduce eventfd states for parent and child to sync.
Child notifies parent with EVENTFD_CHILD_SUCCEEDED on success or
EVENTFD_CHILD_FAILED on failure and before _exit(). This prevents the
parent from waiting on an event that will never come.

* If the child is killed before execv() or before notifying the parent,
we install a NOP handler for SIGCHLD which will interrupt blocking calls
with EINTR. This gives a chance to the parent to call wait() and
terminate in main().

* If there are no errors, parent will block SIGCHLD, restore default
handler and notify child which will do execv(), then parent will pass
control to process_pty() to do its magic.

This was exposed in part by:
https://bugs.freedesktop.org/show_bug.cgi?id=76193

Reported-by: Tobias Hunger tobias.hunger@gmail.com
2014-05-25 11:23:35 +08:00
Djalal Harouni
113cea802d nspawn: move container wait logic into wait_for_container()
Move the container wait logic into its own wait_for_container() function
and add two status codes: CONTAINER_TERMINATED or CONTAINER_REBOOTED.
The status will be stored in its argument, this way we handle:
a) Return negative on failures.
b) Return zero on success and set the status to either
   CONTAINER_REBOOTED or CONTAINER_TERMINATED.

These status codes are used to terminate nspawn or loop again in case of
CONTAINER_REBOOTED.
2014-05-25 11:23:30 +08:00
Cristian Rodríguez
590b6b9188 Use %m instead of strerror(errno) where appropiate 2014-05-25 11:18:28 +08:00
Lennart Poettering
cdb2b9d05a nspawn: restore journal directory is empty check
This undoes part of commit e6a4a517be.

Instead of removing the error message about non-empty journal bind mount
directories, simply downgrade the message to a warning and proceed.
2014-05-22 15:21:01 +09:00
Djalal Harouni
e6a4a517be nspawn: allow to bind mount journal on top of a non empty container journal dentry
Currently if nspawn was called with --link-journal=host or
--link-journal=auto and the right /var/log/journal/machine-id/ exists
then the bind mount the subdirectory into the container might fail due
to the ~/mycontainer/var/log/journal/machine-id/ of the container not
being empty.

There is no reason to check if the container journal subdir is empty
since there will be a bind mount on top of it. The user asked for a bind
mount so give it.

Note: a next call with --link-journal=guest may fail due to the
/var/log/journal/machine-id/ on the host not being empty.

https://bugs.freedesktop.org/show_bug.cgi?id=76193

Reported-by: Tobias Hunger <tobias.hunger@gmail.com>
2014-05-22 09:55:23 +09:00
Nis Martensen
f1721625e7 fix spelling of privilege 2014-05-19 00:40:44 +09:00
Lennart Poettering
9f24adc288 nspawn: properly format container_uuid in UUID format
http://lists.freedesktop.org/archives/systemd-devel/2014-April/018971.html
2014-05-16 19:37:19 +02:00
Philip Lorenz
70f539ca14 nspawn: Fix erroneous OOM when building group list
change_uid_gid() never initialises sz which may cause greedy_realloc to
skip the initial buffer allocation.
2014-04-10 09:50:39 -04:00
Tom Gundersen
d8e538ecd9 sd-rtnl: rework rtnl type system
Use a static table with all the typing information, rather than repeated
switch statements. This should make it a lot simpler to add new types.

We need to keep all the type info to be able to create containers
without exposing their implementation details to the users of the library.

As a freebee we verify the types of appended/read attributes.

The API is extended to nicely deal with unions of container types.
2014-03-28 19:11:59 +01:00
Lennart Poettering
3d94f76c99 util: replace close_pipe() with new safe_close_pair()
safe_close_pair() is more like safe_close(), except that it handles
pairs of fds, and doesn't make and misleading allusion, as it works
similarly well for socketpairs() as for pipe()s...
2014-03-24 03:22:44 +01:00
Lennart Poettering
03e334a1c7 util: replace close_nointr_nofail() by a more useful safe_close()
safe_close() automatically becomes a NOP when a negative fd is passed,
and returns -1 unconditionally. This makes it easy to write lines like
this:

        fd = safe_close(fd);

Which will close an fd if it is open, and reset the fd variable
correctly.

By making use of this new scheme we can drop a > 200 lines of code that
was required to test for non-negative fds or to reset the closed fd
variable afterwards.
2014-03-18 19:31:34 +01:00
Tom Gundersen
039dd4afd6 nspawn: UP the host side of the veth pair after adding it to a bridge 2014-03-16 13:55:41 +01:00
Dave Reisner
7947952ede nspawn: remove unused variable 2014-03-13 21:56:07 -04:00
Brandon Philips
f418f31d50 nspawn: allow -EEXIST on mkdir_safe /home/${uid}
With systemd 211 nspawn attempts to create the home directory for the
given uid. However, if the home directory already exists then it will
fail. Don't error out on -EEXIST.
2014-03-14 02:25:56 +01:00
Tom Gundersen
01dde0611b nspawn: make host0's MAC address persistent
We still need to make sure that no two MAC addresses are the same, so we use
a logic similar to what is used in udev to generate MAC addresses, and base
it on a hash of the host's machine ID and thecontainer's name.
2014-03-13 17:47:33 +01:00
Lennart Poettering
727fd4fda5 nspawn: honour GPT partition flags when mounting file systems following the discoverable partitions spec 2014-03-13 01:33:33 +01:00
Mantas Mikulėnas
4de8292689 nspawn: fix argv[0] for getent 2014-03-11 17:45:20 +01:00
Lennart Poettering
a07f961e98 nspawn: allow using kdbus from nspawn containers 2014-03-11 17:43:41 +01:00
Lennart Poettering
8c4e25b73c nspawn: fix getent fallback 2014-03-11 03:08:54 +01:00
Lennart Poettering
0cb9fbcd44 nspawn: when resoliving UIDs/GIDs for "-u", do so in forked off /usr/bin/getent instead of in-process
When the container runs a different native architecture than the host we
shouldn't attempt to load the container's NSS modules with the host's
libc. Instead, resolve UID/GID by invoking /usr/bin/getent in the
container. The tool should be fairly universally available and allows us
to do resolving of the UID/GID with the container's libc in a parsable
format.

https://bugs.freedesktop.org/show_bug.cgi?id=75733
2014-03-11 02:41:13 +01:00
Lennart Poettering
d96c1ecf7b nspawn: make sure we don't try to mount the container block device in the child after the parent added us to the device cgroup 2014-03-11 01:01:38 +01:00
Lennart Poettering
eb0f0863f5 nspawn: don't try mknod() of /dev/console with the correct major/minor
We overmount /dev/console with an external pty anyway, hence there's no
point in using the real major/minor when we create the node to
overmount. Instead, use the one of /dev/null now.

This fixes a race against the cgroup device controller setup we are
using. In case /dev/console was create before the cgroup policy was
applied all was good, but if created in the opposite order the mknod()
would fail, since creating /dev/console is not allowed by it. Creating
/dev/null instances is however permitted, and hence use it.
2014-03-10 21:36:01 +01:00
Lennart Poettering
1b9e5b1263 nspawn: add --image= switch to boot GPT disk images that follow the Discoverable Partitions Specification 2014-03-10 20:35:52 +01:00
Tero Roponen
13e8ceb84e nspawn: fix detection of missing /proc/self/loginuid
Running 'systemd-nspawn -D /srv/Fedora/' gave me this error:
 Failed to read /proc/self/loginuid: No such file or directory

 Container Fedora failed with error code 1.

This patch fixes the problem.
2014-02-28 12:58:02 +01:00
Lennart Poettering
9875fd7875 nspawn: no need for duplicate checks against EEXIST 2014-02-26 02:19:28 +01:00
Lennart Poettering
c74e630d0c nspawn: add new switch --network-macvlan= to add a macvlan device to the container 2014-02-25 02:37:59 +01:00
Lennart Poettering
9457ac5b4e nspawn: make use of the devices cgroup controller by default 2014-02-24 03:38:58 +01:00
Lennart Poettering
08af0da269 nspawn: when adding a veth interface to a bridge, use the "vb-" rather than "ve-" interface name prefix
This way we can recognize the interfaces later on to apply different
host-side configuration to them.
2014-02-21 04:02:12 +01:00
Lennart Poettering
151b9b9662 api: in constructor function calls, always put the returned object pointer first (or second)
Previously the returned object of constructor functions where sometimes
returned as last, sometimes as first and sometimes as second parameter.
Let's clean this up a bit. Here are the new rules:

1. The object the new object is derived from is put first, if there is any

2. The object we are creating will be returned in the next arguments

3. This is followed by any additional arguments

Rationale:

For functions that operate on an object we always put that object first.
Constructors should probably not be too different in this regard. Also,
if the additional parameters might want to use varargs which suggests to
put them last.

Note that this new scheme only applies to constructor functions, not to
all other functions. We do give a lot of freedom for those.

Note that this commit only changes the order of the new functions we
added, for old ones we accept the wrong order and leave it like that.
2014-02-20 00:03:10 +01:00
Lennart Poettering
39883f622f make gcc shut up
If -flto is used then gcc will generate a lot more warnings than before,
among them a number of use-without-initialization warnings. Most of them
without are false positives, but let's make them go away, because it
doesn't really matter.
2014-02-19 17:53:50 +01:00
Lennart Poettering
ac45f971a1 core: add Personality= option for units to set the personality for spawned processes 2014-02-19 03:27:03 +01:00
Lennart Poettering
6afc95b736 nspawn: add new --personality= switch to make it easier to run 32bit containers on a 64bit host 2014-02-18 23:37:27 +01:00
Lennart Poettering
3302da4667 nspawn: x86 is special with its socketcall() semantics, be permissive in the seccomp setup 2014-02-18 22:27:46 +01:00
Lennart Poettering
e9642be2cc seccomp: add helper call to add all secondary archs to a seccomp filter
And make use of it where appropriate for executing services and for
nspawn.
2014-02-18 22:14:00 +01:00
Dave Reisner
f3d5485b80 nspawn: allow 32-bit chroots from 64-bit hosts
Arch Linux uses nspawn as a container for building packages and needs
to be able to start a 32bit chroot from a 64bit host. 24fb111207
disrupted this feature when seccomp handling was added.
2014-02-18 21:26:24 +01:00
Tom Gundersen
4fb7242cbb sd-rtnl-message: store reference to the bus in the message
This mimics the sd-bus api, as we may need it in the future.
2014-02-18 11:21:22 +01:00
Lennart Poettering
37c47eb709 nspawn: netns_fd can be removed now 2014-02-17 15:49:21 +01:00
Thomas Hindoe Paaboel Andersen
32457153f4 nspawn: typo fix in help 2014-02-16 22:15:24 +01:00
Tom Gundersen
ab046dde6f nspawn: add new --network-bridge= switch
This adds the host side of the veth link to the given bridge.

Also refactor the creation of the veth interfaces a bit to set it up
from the host rather than the container. This simplifies the addition
to the bridge, but otherwise the behavior is unchanged.
2014-02-16 21:40:28 +01:00
Tom Gundersen
818dc5e72a sd-rtnl: always include linux/rtnetlink.h 2014-02-15 12:14:45 +01:00
Tom Gundersen
ee3a6a51e5 sd-rtnl: message_open_container - don't take a 'size' argument
We can always know the size based on the type, so let's do this inside the library.
2014-02-15 12:14:45 +01:00
Lennart Poettering
262d10e6bd nspawn: if we don't find bash, try sh 2014-02-14 16:41:03 +01:00
Lennart Poettering
6b9132a9c4 nspawn: don't accept just any tree to execute
When invoked without -D in an arbitrary directory we should not try to
execute anything, make some validity checks first.
2014-02-14 16:35:18 +01:00
Lennart Poettering
24fb111207 nspawn: make socket(AF_NETLINK, *, NETLINK_AUDIT) fail with EAFNOTSUPPORT in containers
The kernel still doesn't support audit in containers, so let's make use
of seccomp and simply turn it off entirely. We can get rid of this big
as soon as the kernel is fixed again.
2014-02-13 20:30:02 +01:00
Lennart Poettering
69c79d3c32 nspawn: add new --network-veth switch to add a virtual ethernet link to the host 2014-02-13 18:47:53 +01:00
Lennart Poettering
7e2270246b nspawn: check with udev before we take possession of an interface 2014-02-13 14:38:02 +01:00
Lennart Poettering
b88eb17a7a nspawn: no need to subscribe to netlink messages if we just want to execute one operation 2014-02-13 14:08:16 +01:00
Lennart Poettering
a42c8b54b1 nspawn: --private-network should imply CAP_NET_ADMIN 2014-02-13 14:07:59 +01:00
Lennart Poettering
d595c5cc9e rtnl: rename constructors from the form sd_rtnl_xxx_yyy_new() to sd_rtnl_xxx_new_yyy()
So far we followed the rule to always indicate the "flavour" of
constructors after the "_new_" or "_open_" in the function name, so
let's keep things in sync here for rtnl and do the same.
2014-02-13 13:53:25 +01:00
Lennart Poettering
cf6a891173 rtnl: drop "sd_" prefix from cleanup macros
The "sd_" prefix is supposed to be used on exported symbols only, and
not in the middle of names. Let's drop it from the cleanup macros hence,
to make things simpler.

The bus cleanup macros don't carry the "sd_" either, so this brings the
APIs a bit nearer.
2014-02-13 03:44:14 +01:00
Lennart Poettering
aa28aefe61 nspawn: add new --network-interface= switch to move an existing interface into the container 2014-02-13 03:27:39 +01:00
Lennart Poettering
39ed67d146 nspawn: introduce --capability=all for retaining all capabilities 2014-02-13 02:45:11 +01:00
Lennart Poettering
db999e0f92 nspawn: newer kernels (>= 3.14) allow resetting the audit loginuid, make use of this 2014-02-12 03:02:09 +01:00
Lennart Poettering
89f7c8465c machined: optionally, allow registration of pre-existing units (scopes
or services) as machine with machined
2014-02-11 17:16:08 +01:00
Lennart Poettering
eb91eb187b nspawn: add --register=yes|no switch to optionally disable registration of the container with machined 2014-02-11 17:16:07 +01:00
Lennart Poettering
8a96d94e4c nspawn: add new --share-system switch to run a container without PID/UTS/IPC namespacing 2014-02-10 13:18:16 +01:00
Lennart Poettering
82adf6af7c nspawn,man: use a common vocabulary when referring to selinux security contexts
Let's always call the security labels the same way:

  SMACK: "Smack Label"
  SELINUX: "SELinux Security Context"

And the low-level encapsulation is called "seclabel". Now let's hope we
stick to this vocabulary in future, too, and don't mix "label"s and
"security contexts" and so on wildly.
2014-02-10 13:18:16 +01:00
Vincent Batts
fcf90586a2 nspawn: require /etc/os-release only for init
/etc/os-release is expected for the case for booting a full system, and
need not be required for thin container execution.
2014-02-10 11:57:53 +01:00
Lennart Poettering
ba978d7b32 nspawn: rename --file-label to --apifs-label since it's really just about the API file systems, nothing else 2014-02-07 19:29:28 +01:00
Tom Gundersen
5d63309cf5 nspawn: fix HAVE_SELINUX ifdef 2014-02-06 17:30:01 +01:00
Lennart Poettering
284c0b9176 nspawn: add --quiet switch for turning off any output noise 2014-02-06 00:43:14 +01:00
Lennart Poettering
1c03020cc4 nspawn: always use default bus 2014-02-05 23:06:34 +01:00
Lennart Poettering
d002827b03 nspawn: various fixes in selinux hookup
- As suggested, prefix argument variables with "arg_" how we do this
  usually.

- As suggested, don't involve memory allocations when storing command
  line arguments.

- Break --help text at 80 chars

- man: explain that this is about SELinux

- don't do unnecessary memory allocations when putting together mount
  option string
2014-02-04 22:56:07 +01:00
Dan Walsh
a8828ed938 Add SELinux support to systemd-nspawn
This patch adds to new options:

-Z PROCESS_LABEL

This specifies the process label to run on processes run within the container.

-L FILE_LABEL

The file label to assign to memory file systems created within the container.

For example if you wanted to wrap an container with SELinux sandbox labels, you could execute a command line the following

chcon system_u:object_r:svirt_sandbox_file_t:s0:c0,c1 -R /srv/container
systemd-nspawn -L system_u:object_r:svirt_sandbox_file_t:s0:c0,c1 -Z system_u:system_r:svirt_lxc_net_t:s0:c0,c1 -D /srv/container /bin/sh
2014-02-04 13:33:15 -08:00
Kay Sievers
486e99a387 bus: update kdbus.h (ABI break) 2014-02-01 17:21:36 +01:00
Lennart Poettering
40ddbdf85b nspawn: fix reboot event fd reuse 2014-01-29 20:58:50 +01:00
Lennart Poettering
7f112f50fe exec: introduce PrivateDevices= switch to provide services with a private /dev
Similar to PrivateNetwork=, PrivateTmp= introduce PrivateDevices= that
sets up a private /dev with only the API pseudo-devices like /dev/null,
/dev/zero, /dev/random, but not any physical devices in them.
2014-01-20 21:28:37 +01:00
Lennart Poettering
354bfd2b16 nspawn: do not invoke RegisterMachine on machined from inside the new PID namespace
On kdbus user credentials are not translated across PID namespaces, but
simply invalidated if sender and receiver namespaces don't match. This
makes it impossible to properly authenticate requests from different PID
namespaces (which is probably a good thing). Hence, register the machine
in the parent and not the client and properly synchronize this.
2014-01-09 08:46:23 +08:00
Shawn Landden
e10a55fd72 DEFAULT_PATH_SPLIT_USR macro 2013-12-20 23:14:21 -05:00
Lennart Poettering
f4889f656b nspawn: add new --setenv= switch to set an environment variable for the container to spawn 2013-12-13 16:37:16 +01:00
Zbigniew Jędrzejewski-Szmek
4d680aeea1 nspawn: complain and continue if machine has same id
If --link-journal=host or --link-journal=guest is used, this totally
cannot work and we exit with an error. If however --link-journal=auto
or --link-journal=no is used, just display a warning.

Having the same machine id can happen if booting from the same
filesystem as the host. Since other things mostly function correctly,
let's allow that.

https://bugs.freedesktop.org/show_bug.cgi?id=68369
2013-12-11 22:39:41 -05:00
Lennart Poettering
9e5548644f bus: connect directly via kdbus in sd_bus_open_system_container()
kdbus fortunately exposes the container's busses in the host fs, hence
we can access it directly instead of doing the namespacing dance.
2013-12-12 00:07:49 +01:00
Zbigniew Jędrzejewski-Szmek
2b6bf07dd2 Get rid of our reimplementation of basename
The only problem is that libgen.h #defines basename to point to it's
own broken implementation instead of the GNU one. This can be fixed
by #undefining basename.
2013-12-06 21:29:55 -05:00
Shawn Landden
2ed4e5e0b8 nspawn: fix buggy mount_binds, now works for bind-mounted files 2013-12-06 00:38:13 -05:00
Lennart Poettering
9bd37b40fa nspawn: set up a kdbus namespace when starting a container 2013-11-30 16:36:46 +01:00
Lennart Poettering
898d5c9137 nspawn: improve error message when we cannot resolve the root directory argument 2013-11-26 03:50:32 +01:00
Lennart Poettering
420c7379fb nspawn: add new --drop-capability= switch 2013-11-20 22:10:42 +01:00
Lennart Poettering
76b543756e bus: introduce concept of a default bus for each thread and make use of it everywhere
We want to emphasize bus connections as per-thread communication
primitives, hence introduce a concept of a per-thread default bus, and
make use of it everywhere.
2013-11-12 00:12:43 +01:00
Lennart Poettering
5b30bef856 bus: log message parsing errors everywhere with a generalized bus_log_parse_error() 2013-11-07 21:26:31 +01:00
Lennart Poettering
eb9da376d7 clients: unify how we invoke getopt_long()
Among other things this makes sure we always expose a --version command
and show it in the help texts.
2013-11-06 18:28:39 +01:00
Lennart Poettering
1f0cd86b3d nspawn: explicitly terminate machines when we exit nspawn
https://bugs.freedesktop.org/show_bug.cgi?id=68370
https://bugzilla.redhat.com/show_bug.cgi?id=988883
2013-11-06 02:31:35 +01:00
Djalal Harouni
b3451bed41 nspawn: log out of memory errors 2013-11-05 18:06:44 +01:00
Lennart Poettering
04d3927924 machinectl: add new command to spawn a getty inside a container 2013-10-31 01:43:38 +01:00
Lennart Poettering
4ba9328022 nspawn: split out pty forwaring logic into ptyfwd.c 2013-10-31 01:43:38 +01:00
Lennart Poettering
88212f7bd1 nspawn: only pass in slice setting if it is set 2013-10-30 18:40:21 +01:00
Lennart Poettering
40ca29a137 timedated: use libsystemd-bus instead of libdbus for bus communication
Among other things this also adds a few things necessary for the change:

- Considerably more powerful error returning APIs in libsystemd-bus

- Adapter for connecting an sd_bus to an sd_event

- As I reworked the PolicyKit logic to the new library I also made it
  asynchronous, so that PolicyKit requests of one user cannot block out
  another user anymore.

- We always use the macro names for common bus error. That way it is
  harder to mistype them since the compiler will notice
2013-10-16 06:15:02 +02:00
Zbigniew Jędrzejewski-Szmek
51d122af23 Introduce _cleanup_fdset_free_ 2013-10-13 17:56:54 -04:00
Lennart Poettering
51045322c4 nspawn: always copy /etc/resolv.conf rather than bind mount
We were already creating the file if it was missing, and this way
containers can reconfigure the file without running into problems.

This also makes resolv.conf handling more alike to handling of
/etc/localtime, which is also not a bind mount.
2013-10-02 19:45:12 +02:00
Dave Reisner
cecf24e7f0 fix grammatical error 2013-09-19 14:55:35 -04:00
Dave Reisner
d2421337f6 nspawn: be less liberal about creating bind mount destinations
Previously, if a file's bind mount destination didn't exist, nspawn
would blindly create a directory, and the subsequent bind mount would
fail. Examine the filetype of the source and ensure that, if the
destination does not exist, that it is created appropriately.

Also go one step further and ensure that the filetypes of the source
and destination match.
2013-09-19 14:48:43 -04:00
Zbigniew Jędrzejewski-Szmek
d182614649 nspawn: trivial simplification 2013-08-23 12:48:14 -04:00
Jesper Larsen
aea38d8047 nspawn: Reorder includes to fix compilation
Commit 2e996f4d4b added an include
of linux/netlink.h

This kernel header is not self contained in the linux 2.6 kernel
which breaks compilation with an unknown type sa_family_t

A workaround is to include linux/netlink.h after sys/socket.h
2013-07-19 08:25:50 -04:00
Lennart Poettering
6a4e0b1347 nspawn: use the corect method signature for CreateMachine() 2013-07-02 15:02:54 +02:00
Lennart Poettering
1ee306e124 machined: split out machine registration stuff from logind
Embedded folks don't need the machine registration stuff, hence it's
nice to make this optional. Also, I'd expect that machinectl will grow
additional commands quickly, for example to join existing containers and
suchlike, hence it's better keeping that separate from loginctl.
2013-07-02 03:47:23 +02:00
Zbigniew Jędrzejewski-Szmek
bd5a54582a nspawn: '-C' option has been removed
Fixup for 9444b1f "logind: add infrastructure to keep track of
machines, and move to slices."
2013-06-20 00:05:52 -04:00
Lennart Poettering
9444b1f20e logind: add infrastructure to keep track of machines, and move to slices
- This changes all logind cgroup objects to use slice objects rather
  than fixed croup locations.

- logind can now collect minimal information about running
  VMs/containers. As fixed cgroup locations can no longer be used we
  need an entity that keeps track of machine cgroups in whatever slice
  they might be located. Since logind already keeps track of users,
  sessions and seats this is a trivial addition.

- nspawn will now register with logind and pass various bits of metadata
  along. A new option "--slice=" has been added to place the container
  in a specific slice.

- loginctl gained commands to list, introspect and terminate machines.

- user.slice and machine.slice will now be pulled in by logind.service,
  since only logind.service requires this slice.
2013-06-20 03:49:59 +02:00
Dave Reisner
c2384970ff nspawn: only warn about audit when booting the container
The audit subsystem isn't relevant when nspawn is only being used as a
chroot.
2013-05-10 08:59:00 -04:00
Colin Walters
2e996f4d4b nspawn: Include netlink headers rather than using #ifdef
This is a better fix than e13e1fad8b for
failing to compile without audit that
77b6e19458 introduced.
2013-05-09 19:31:20 -04:00
Colin Walters
e13e1fad8b Fix previous commit for !HAVE_AUDIT 2013-05-09 18:37:26 -04:00
Lennart Poettering
77b6e19458 audit: since audit is apparently never going to be fixed for containers tell the user what's going on
Let's try to be helpful to the user and give him a hint what he can do
to make nspawn work with normal OS containers.

https://bugzilla.redhat.com/show_bug.cgi?id=893751
2013-05-10 00:17:36 +02:00
Lennart Poettering
e724b0639c hostname: only suppress setting of pretty hostname if it is non-equal to the static hostname and if the static hostname is set, too
https://bugzilla.redhat.com/show_bug.cgi?id=957814
2013-05-07 20:56:41 +02:00
Lennart Poettering
b00ad20fa0 build-sys: support builds without EAs again 2013-05-07 19:03:46 +02:00
Lennart Poettering
f8964235e6 nspawn: explain that we look for /etc/os-release in the container directory
https://bugs.freedesktop.org/show_bug.cgi?id=64014
2013-05-06 21:06:18 +02:00
Dave Reisner
a5f5f8a077 nspawn: inherit the exit status of container
If we get as far as successfully starting the container, nspawn should
inherit the exit status of the child container process as its own.
2013-05-02 10:41:03 -04:00
Zbigniew Jędrzejewski-Szmek
38158b920e cgls: add --machine/-M
cg_get_machine_path is modified to include the escaped machine name
+ ".nspawn" if the machine argument is nonnull.
2013-05-01 10:15:25 -04:00
Lennart Poettering
05947befce units: add an easy-to-use unit template file systemd-nspawn@.service for running containers as system services 2013-04-30 08:36:02 -03:00
Lennart Poettering
aa96c6cb44 id128: when taking user input for a 128bit ID, validate syntax
Also, always accept both our simple hexdump syntax and UUID syntax.
2013-04-30 08:36:01 -03:00
Evangelos Foutras
d7e011e5bf nspawn: add -M option to optstring
This was missed in commit 7027ff61a3 and
means that the --machine option would work but not its shorthand, -M.
2013-04-29 09:00:27 -04:00
Lennart Poettering
ae018d9bc9 cgroup: make sure all our cgroup objects have a suffix and are properly escaped
Session objects will now get the .session suffix, user objects the .user
suffix, nspawn containers the .nspawn suffix.

This also changes the user cgroups to be named after the numeric UID
rather than the username, since this allows us the parse these paths
standalone without requiring access to the cgroup file system.

This also changes the mapping of instanced units to cgroups. Instead of
mapping foo@bar.service to the cgroup path /user/foo@.service/bar we
will now map it to /user/foo@.service/foo@bar.service, in order to
ensure that all our objects are properly suffixed in the tree.
2013-04-22 23:14:12 -03:00
Lennart Poettering
aff38e74bd nspawn: suffix the nspawn cgroups with ".nspawn"
As discussed with Dan Berrange it's a good idea to suffix all objects in
the cgroup tree with ".something", so that when the system is
partitioned using a resource management tool we can drop objects of
different types into the same partition directory without generate
namespace conflicts.

We'l add this to the Pax Control Group document as soon as write access
to the fdo wiki is restored.
2013-04-22 23:14:12 -03:00
Lennart Poettering
dc2c75602d nspawn: always use cg_get_path() to determine fs path for a cgroup 2013-04-22 23:14:12 -03:00
Zbigniew Jędrzejewski-Szmek
a383724e42 systemd,nspawn: use extended attributes to store metadata
All attributes are stored as text, since root_directory is already
text, and it seems easier to have all of them in text format.

Attributes are written in the trusted. namespace, because the kernel
currently does not allow user. attributes on cgroups. This is a PITA,
and CAP_SYS_ADMIN is required to *read* the attributes. Alas.

A second pipe is opened for the child to signal the parent that the
cgroup hierarchy has been set up.
2013-04-21 21:43:43 -04:00
Zbigniew Jędrzejewski-Szmek
f333fbb1ef nspawn: create empty /etc/resolv.conf if necessary
nspawn will overmount resolv.conf if it exists. Since e.g.
default install with yum doesn't create /etc/resolv.conf,
a container created with yum will not have network. This
seems undesirable, and since we overmount the file anyway,
let's create it too.

Also, mounting a read-write /etc/resolv.conf in the container
is treated as a failure, since it makes it possible to
modify hosts /etc/resolv.conf from inside the container.
2013-04-18 19:38:28 -04:00
Harald Hoyer
7fd1b19bc9 move _cleanup_ attribute in front of the type
http://lists.freedesktop.org/archives/systemd-devel/2013-April/010510.html
2013-04-18 09:11:22 +02:00
Lennart Poettering
6606089752 path-util: unify code for detecting OS trees
This also makes sure we always detect an OS tree the same way, by
checking for /etc/os-release.
2013-04-16 05:47:04 +02:00
Lennart Poettering
7027ff61a3 nspawn: introduce the new /machine/ tree in the cgroup tree and move containers there
Containers will now carry a label (normally derived from the root
directory name, but configurable by the user), and the container's root
cgroup is /machine/<label>. This label is called "machine name", and can
cover both containers and VMs (as soon as libvirt also makes use of
/machine/).

libsystemd-login can be used to query the machine name from a process.

This patch also includes numerous clean-ups for the cgroup code.
2013-04-16 04:41:21 +02:00
Zbigniew Jędrzejewski-Szmek
b92bea5d2a Use initalization instead of explicit zeroing
Before, we would initialize many fields twice: first
by filling the structure with zeros, and then a second
time with the real values. We can let the compiler do
the job for us, avoiding one copy.

A downside of this patch is that text gets slightly
bigger. This is because all zero() calls are effectively
inlined:

$ size build/.libs/systemd
         text    data     bss     dec     hex filename
before 897737  107300    2560 1007597   f5fed build/.libs/systemd
after  897873  107300    2560 1007733   f6075 build/.libs/systemd

… actually less than 1‰.

A few asserts that the parameter is not null had to be removed. I
don't think this changes much, because first, it is quite unlikely
for the assert to fail, and second, an immediate SEGV is almost as
good as an assert.
2013-04-05 19:50:57 -04:00
Lennart Poettering
574d5f2dfc util: rename write_one_line_file() to write_string_file()
You can write much more than just one line with this call (and we
frequently do), so let's correct the naming.
2013-04-03 20:12:56 +02:00
Zbigniew Jędrzejewski-Szmek
10d18763ec nspawn, machine-id-setup: warn if read-only mount call fails
They are not crucial, but they shouldn't fail.
2013-03-31 14:32:48 -04:00
Lennart Poettering
9d60cb63d6 nspawn: don't make assumptions about the size of pid_t 2013-03-15 16:49:08 +01:00
Lennart Poettering
f2d88580b5 nspawn: create a separate devpts namespace for nspawn containers 2013-03-07 13:34:07 +01:00
Zbigniew Jędrzejewski-Szmek
5674767ec2 nspawn: environment would be truncated with TERM unset 2013-02-27 21:55:00 -05:00
Lennart Poettering
17fe052346 nspawn: add --bind= and --bind-ro= to bind mount host paths into the container 2013-02-25 20:08:07 +01:00
Michal Schmidt
1ddf879acf Revert "nspawn: catch config mistake of specifying -b and args"
This reverts commit cb96a2c69a.

It is not a mistake to pass args when -b is specified. They will simply
be passed on to the container's init.

The manpage needs fixing, that's true.
2013-02-25 18:39:16 +01:00
Zbigniew Jędrzejewski-Szmek
cb96a2c69a nspawn: catch config mistake of specifying -b and args 2013-02-24 14:11:11 +01:00
Zbigniew Jędrzejewski-Szmek
5659774c57 nspawn: fail if unable to close pipe 2013-02-14 15:26:33 -05:00
Zbigniew Jędrzejewski-Szmek
1fd961211d nspawn: print PID and show how to enter the namespace
systemd-nspawn will now print the PID of the child.
An example showing how to enter the container is added
to the man page.

Support for nsenter without an explicit command was
added in https://github.com/karelzak/util-linux/commit/5758069
(post v2.22.2). So this example requires both a new kernel
and the latest util-linux.
2013-02-14 10:40:45 -05:00
Harald Hoyer
a5c32cff1f honor SELinux labels, when creating and writing config files
Also split out some fileio functions to fileio.c and provide a SELinux
aware pendant in fileio-label.c

see https://bugzilla.redhat.com/show_bug.cgi?id=881577
2013-02-14 16:19:38 +01:00
Michal Schmidt
f2956e80c9 nspawn: assume stdout is always writable if it does not support epoll
stdout can be redirected to a regular file. Regular files don't support epoll.
nspawn failed with: "Failed to register fds in epoll: Operation not permitted".

If stdout does not support epoll, assume it's always writable.
2013-01-26 00:16:13 +01:00
Lennart Poettering
88d04e31ce nspawn: add audit caps to default set to keep
Due to the brokeness of much of the userspace audit code we cannot
really start too many systems without the audit caps set. To make nspawn
easier to use just add the audit caps by default.

To boot up containers successfully the kernel's auditing needs to be
turned off still (use "audit=0" on the kernel command line), but at
least no manual caps have to be passed anymore.

In the long run auditing will be fixed for containers and ve virtualized
properly at which time it should be safe to enable these caps anyway.
2013-01-18 18:23:20 +01:00
Zbigniew Jędrzejewski-Szmek
acbeb42770 nspawn: add --version 2013-01-11 16:03:49 -05:00
Lennart Poettering
57cb4adf4e nspawn: try to orderly shutdown container when receiving SIGTERM 2012-12-22 22:17:58 +01:00
Lennart Poettering
842f3b0fc9 nspawn: allow passing socket activation fds through nspawn 2012-12-22 22:17:58 +01:00
Lennart Poettering
51d88d1b4f nspawn: allow nspawn to be invoked without tty
This allows invoking nspawn containers as systemd services, to create a
minimal, light-weight OS container solution for servers.
2012-12-22 22:17:58 +01:00
Lennart Poettering
3c957acf86 nspawn: reset supplementary and main group id before entering nspawn 2012-11-22 00:45:22 +01:00
Zbigniew Jędrzejewski-Szmek
27407a01c6 nspawn: use automatic cleanup and provide debug info
The documentation for --link-journal is also reworded.
2012-10-02 14:56:26 +02:00
Lennart Poettering
963ddb917d log: fix repeated invocation of vsnprintf()/vaprintf() in log_struct()
https://bugs.freedesktop.org/show_bug.cgi?id=55213
2012-09-24 23:26:46 +02:00
Lennart Poettering
77e63fafa5 nspawn: document why we don't check resolv.conf mount errors 2012-09-21 16:55:56 +02:00
Lennart Poettering
d40361453b nspawn: we can't overmount /etc/localtime anymore since it's usually a symlink now
Create the right symlink if possible for /etc/localtime
2012-09-21 16:54:54 +02:00
Zbigniew Jędrzejewski-Szmek
89154bd4ac nspawn: fix memleak introduced with automatic cleanup
6b2d0e8 introduced a memleak instead of fixing one.
Fix both.
2012-09-16 16:33:20 +02:00
Zbigniew Jędrzejewski-Szmek
25ea79fe07 nspawn: use automatic cleanup for umask 2012-09-16 16:20:09 +02:00
Zbigniew Jędrzejewski-Szmek
ed8b7a3ee5 nspawn: _cleanup_free_ more 2012-09-16 16:20:09 +02:00
Zbigniew Jędrzejewski-Szmek
6b2d0e85dc nspawn: use automatic cleanup
This one actually clears up a (totally harmless) memleak.
2012-09-16 16:20:09 +02:00
Zbigniew Jędrzejewski-Szmek
ede89845a4 nspawn: mount tmpfs on /dev/shm
Most things seem to function fine without /dev/shm, but it is expected
to be there (quoting linux/Documentation/filesystems/tmpfs.txt:
glibc 2.2 and above expects tmpfs to be mounted at /dev/shm for POSIX
shared memory (shm_open, shm_unlink)).

Since /tmp/ is already mounted as tmpfs, it would be enough to mkdir
/tmp/shm and chmod it. Mounting it separately has the advantage that
it can be easily remounted to change the quota.
2012-09-16 16:20:09 +02:00
Lennart Poettering
d87be9b0af nspawn: handle poweroff/reboot nicely in containers 2012-09-05 16:23:41 -07:00
Lennart Poettering
3eabccc46c nspawn: don't provide /dev/rtc0 in the container
Since RTCs are hardware devices and are very much shared resources we
should avoid to provide them in each container.
2012-09-05 15:27:07 -07:00
Lennart Poettering
04bc4a3f47 nspawn: generate a new randomized boot ID for each container 2012-09-05 14:39:16 -07:00
Lennart Poettering
9c1c7f712d nspawn: if a file system comes pre-mounted, still do the read-only remounts 2012-09-05 14:16:41 -07:00
Lennart Poettering
014a9c777b nspawn: skip mounts if already mounted 2012-09-04 16:33:13 -07:00
Lennart Poettering
e65aec12ae nspawn: mount a clean instance of sysfs 2012-09-04 16:32:43 -07:00
Dave Reisner
4fc9982cb0 nspawn: add /dev FD symlinks in container setup
This creates /dev/fd, /dev/stdin, /dev/stdout, /dev/stderr, and
/dev/core as symlinks to /proc on container creation. Except for
/dev/core, these are needed for shells like bash to be fully functional.
2012-08-21 17:19:38 +02:00
Lennart Poettering
1e41be2015 nspawn,namespaces: make sure we recursively bind mount things in
We want to make sure that everything from the host is also visible in
the sandbox.
2012-08-13 16:25:03 +02:00
Lennart Poettering
b4c59701f8 nspawn: unset a few unnecessary params to mount() 2012-08-13 16:23:31 +02:00
Lennart Poettering
6f67a45d8e nspawn: inherit mounts from real root, don't propagate mounts to real root 2012-08-13 15:23:10 +02:00
Shawn Landden
0d0f0c50d3 log.h: new log_oom() -> int -ENOMEM, use it
also a number of minor fixups and bug fixes: spelling, oom errors
that didn't print errors, not properly forwarding error codes,
few more consistency issues, et cetera
2012-07-26 11:48:26 +02:00
Shawn Landden
669241a076 use "Out of memory." consistantly (or with "\n")
glibc/glib both use "out of memory" consistantly so maybe we should
consider that instead of this.

Eliminates one string out of a number of binaries. Also fixes extra newline
in udev/scsi_id
2012-07-25 11:23:57 +02:00
Lennart Poettering
db7feb7e9c nspawn: generate proper error messages in the child 2012-07-19 02:03:42 +02:00
Lennart Poettering
57fb9fb56d nspawn: introduce new --link-journal= switch to link container journals into host 2012-07-19 02:02:39 +02:00
Lennart Poettering
d05c5031ad unit: introduce %s specifier for the user shell 2012-07-16 12:34:54 +02:00
Lennart Poettering
5076f0ccfd nspawn: introduce new --capabilities= flag and make use of it in the nspawn test case 2012-06-28 14:05:16 +02:00
Kay Sievers
d2e54fae5c mkdir: append _label to all mkdir() calls that explicitly set the selinux context 2012-05-31 12:40:20 +02:00
Lennart Poettering
ec8927ca59 main: add configuration option to alter capability bounding set for PID 1
This also ensures that caps dropped from the bounding set are also
dropped from the inheritable set, to be extra-secure. Usually that should
change very little though as the inheritable set is empty for all our uses
anyway.
2012-05-24 04:00:56 +02:00
Kay Sievers
9eb977db5b util: split-out path-util.[ch] 2012-05-08 02:33:10 +02:00
Lennart Poettering
bc2f673ec2 nspawn: add --read-only switch 2012-04-25 15:11:20 +02:00
Lennart Poettering
2547bb414c nspawn: bind mount /etc/resolv.conf from the host by default 2012-04-25 15:08:00 +02:00
Lennart Poettering
144f0fc0c8 nspawn: add --uuid= switch to allow setting the machine id for the container 2012-04-22 14:48:21 +02:00
Lennart Poettering
0f0dbc46cc nspawn: add -b switch to automatically look for an init binary 2012-04-22 14:11:32 +02:00
Lennart Poettering
3a74cea5e4 nspawn: be more careful when initializing the hostname from the directory name 2012-04-22 01:01:22 +02:00
Lennart Poettering
f1e5dfe2c0 nspawn: make /dev/kmsg unavailable in the container, but allow access to /proc/kmsg 2012-04-22 00:32:53 +02:00
Kay Sievers
4d46fec56d remove MS_* which can not be combined with current kernel code
MS_BIND|MS_MOVE can not be combined:
  do_mount()
    else if (flags & MS_BIND)
      do_loopback(&path, dev_name, flags & MS_REC);
    [...]
    else if (flags & MS_MOVE)
      do_move_mount(&path, dev_name);

MS_REMOUNT|MS_UNBINDABLE can not be combined:
  do_mount()
    if (flags & MS_REMOUNT)
      do_remount(&path, flags & ~MS_REMOUNT, mnt_flags, data_page);
    [...]
    else if (flags & (MS_SHARED | MS_PRIVATE | MS_SLAVE | MS_UNBINDABLE))
      do_change_type(&path, flags);
2012-04-18 13:37:45 +02:00
Lennart Poettering
b562f5a57d build-sys: add stub makefiles to all subdirs to ease development with emacs 2012-04-13 21:37:59 +02:00
Lennart Poettering
9537eab070 nspawn: add missing include lines 2012-04-13 21:37:59 +02:00
Lennart Poettering
e58a12770c nspawn: fake /dev/kmsg and /proc/kmsg as fifo 2012-04-13 18:52:52 +02:00
Kay Sievers
dce818b390 move all tools to subdirs 2012-04-12 17:54:42 +02:00