systemd

mirror of https://github.com/systemd/systemd.git synced 2024-11-06 16:59:03 +03:00

Author	SHA1	Message	Date
Zbigniew Jędrzejewski-Szmek	dadd6ecfa5	Merge pull request #3728 from poettering/dynamic-users	2016-07-25 16:40:26 -04:00
Michael Olbrich	87d41d6244	automount: don't cancel mount/umount request on reload/reexec (#3670) All pending tokens are already serialized correctly and will be handled when the mount unit is done. Without this a 'daemon-reload' cancels all pending tokens. Any process waiting for the mount will continue with EHOSTDOWN. This can happen when the mount unit waits for its dependencies, e.g. network, devices, fsck, etc.	2016-07-25 20:04:02 +02:00
Michael Olbrich	2de0b9e913	transaction: don't cancel jobs for units with IgnoreOnIsolate=true (#3671) This is important if a job was queued for a unit but not yet started. Without this, the job will be canceled and is never executed even though IgnoreOnIsolate is set to 'true'.	2016-07-25 20:02:55 +02:00
Lennart Poettering	43eb109aa9	core: change ExecStart=! syntax to ExecStart=+ (#3797) As suggested by @mbiebl we already use the "!" special char in unit file assignments for negation, hence we should not use it in a different context for privileged execution. Let's use "+" instead.	2016-07-25 16:53:33 +02:00
Zbigniew Jędrzejewski-Szmek	31b14fdb6f	Merge pull request #3777 from poettering/id128-rework uuid/id128 code rework	2016-07-22 21:18:41 -04:00
Lennart Poettering	5052c4eadd	Merge pull request #3753 from poettering/tasks-max-scale Add support for relative TasksMax= specifications, and bump default for services	2016-07-22 17:40:12 +02:00
Alessandro Puccetti	0d9e799102	cgroup: whitelist inaccessible devices for "auto" and "closed" DevicePolicy. https://github.com/systemd/systemd/pull/3685 introduced /run/systemd/inaccessible/{chr,blk} to map inaccessible devices, this patch allows systemd running inside a nspawn container to create /run/systemd/inaccessible/{chr,blk}.	2016-07-22 16:08:31 +02:00
Lennart Poettering	409093fe10	nss: add new "nss-systemd" NSS module for mapping dynamic users With this NSS module all dynamic service users will be resolvable via NSS like any real user.	2016-07-22 15:53:45 +02:00
Lennart Poettering	6f3e79859d	core: enforce user/group name validity also when creating transient units	2016-07-22 15:53:45 +02:00
Lennart Poettering	29206d4619	core: add a concept of "dynamic" user ids, that are allocated as long as a service is running This adds a new boolean setting DynamicUser= to service files. If set, a new user will be allocated dynamically when the unit is started, and released when it is stopped. The user ID is allocated from the range 61184..65519. The user will not be added to /etc/passwd (but an NSS module to be added later should make it show up in getent passwd). For now, care should be taken that the service writes no files to disk, since this might result in files owned by UIDs that might get assigned dynamically to a different service later on. Later patches will tighten sandboxing in order to ensure that this cannot happen, except for a few selected directories. A simple way to test this is: systemd-run -p DynamicUser=1 /bin/sleep 99999	2016-07-22 15:53:45 +02:00
Lennart Poettering	66dccd8d85	core: be stricter when parsing User=/Group= fields Let's verify the validity of the syntax of the user/group names set.	2016-07-22 15:53:45 +02:00
Lennart Poettering	b3785cd5e6	core: check for overflow when handling scaled MemoryLimit= settings Just in case...	2016-07-22 15:33:13 +02:00
Harald Hoyer	2424b6bd71	macros.systemd.in: add %systemd_ordering (#3776) To remove the hard dependency on systemd for packages that function without a running systemd, the %systemd_ordering macro can be used to ensure ordering in the rpm transaction. %systemd_ordering makes sure the systemd rpm is installed prior to the package, so the %pre/%post scripts can execute the systemd parts. Installing systemd afterwards, though, does not result in the same outcome.	2016-07-22 09:33:13 -04:00
Lennart Poettering	79baeeb96d	core: change TasksMax= default for system services to 15% As it turns out, the old limit of 512 tasks per service is hit by too many applications, hence let's bump it a bit, and make it relative to the system's maximum number of PIDs. With this change the new default is 15%. At the kernel's default pids_max value of 32768 this translates to 4915. At machined's default TasksMax= setting of 16384 this translates to 2457. Why 15%? Because it sounds like a round number and is close enough to 4096 which I was going for, i.e. an eight-fold increase over the old 512. Summary: old default: 512 on the host, 512 in a container; new default: 4915 on the host, 2457 in a container.	2016-07-22 15:33:13 +02:00
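The figures quoted in the 79baeeb96d entry can be checked with a short sketch. It is written in Python rather than systemd's C, the helper name `scale_limit` is hypothetical, and rounding down is an assumption inferred from the numbers in the commit message, not taken from the systemd source.

```python
def scale_limit(maximum: int, percent: int) -> int:
    """Return `percent` percent of `maximum`, rounded down.

    Hypothetical helper for illustration only; not systemd's implementation.
    """
    if maximum < 0 or percent < 0:
        raise ValueError("maximum and percent must be non-negative")
    # Integer arithmetic keeps the result exact and rounds toward zero.
    return maximum * percent // 100


# 15% of the two maxima named in the commit message.
host_default = scale_limit(32768, 15)
container_default = scale_limit(16384, 15)
print(host_default, container_default)
```

Both results match the values in the commit's summary, which suggests the percentage is truncated rather than rounded to nearest.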
Lennart Poettering	84af7821b6	main: simplify things a bit by moving container check into fixup_environment()	2016-07-22 15:33:12 +02:00
Lennart Poettering	f7903e8db6	core: rename MemoryLimitByPhysicalMemory transient property to MemoryLimitScale That way, we can neatly keep this in line with the new TasksMaxScale= option. Note that we didn't release a version with MemoryLimitByPhysicalMemory= yet, hence this change should be unproblematic without breaking API.	2016-07-22 15:33:12 +02:00
Lennart Poettering	83f8e80857	core: support percentage specifications on TasksMax= This adds support for a TasksMax=40% syntax for specifying values relative to the system's configured maximum number of processes. This is useful in order to neatly subdivide the available room for tasks within containers.	2016-07-22 15:33:12 +02:00
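A minimal sketch of how a `TasksMax=40%`-style value might be resolved against a system-wide maximum. This is Python, not the parser systemd uses; the function name `resolve_tasks_max`, the 0..100 range check, and the rounding down are all assumptions made for illustration.

```python
def resolve_tasks_max(value: str, system_max: int) -> int:
    """Resolve an absolute or percentage task limit to an absolute count.

    Hypothetical helper; systemd's real parsing rules may differ.
    """
    text = value.strip()
    if text.endswith("%"):
        percent = int(text[:-1])
        # Assumption: only 0..100 is meaningful for a relative limit.
        if not 0 <= percent <= 100:
            raise ValueError(f"percentage out of range: {value!r}")
        return system_max * percent // 100
    count = int(text)
    if count < 0:
        raise ValueError(f"negative task limit: {value!r}")
    return count


print(resolve_tasks_max("40%", 32768))  # relative to the given maximum
print(resolve_tasks_max("512", 32768))  # absolute values pass through
```

Because the result depends on `system_max`, the same `40%` setting yields a smaller absolute limit inside a container whose own maximum is lower, which is the subdivision the commit message describes.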
Lennart Poettering	4b1afed01f	core: rework machine-id-setup.c to use the calls from id128-util.[ch] This allows us to delete quite a bit of code and make the whole thing a lot shorter.	2016-07-22 12:59:36 +02:00
Lennart Poettering	e042eab720	main: make sure set_machine_id() doesn't clobber arg_machine_id on failure	2016-07-22 12:59:36 +02:00
Lennart Poettering	15b1248a6b	machine-id-setup: port machine_id_commit() to new id128-util.c APIs	2016-07-22 12:59:36 +02:00
Lennart Poettering	910fd145f4	sd-id128: split UUID file read/write code into new id128-util.[ch] We currently have code to read and write files containing UUIDs at various places. Unify this in id128-util.[ch], and move some other stuff there too. The new files are located in src/libsystemd/sd-id128/ (instead of src/shared/), because they are actually the backend of sd_id128_get_machine() and sd_id128_get_boot(). In follow-up patches we can use this to reduce the code in nspawn and machine-id-setup by adopting the common implementation.	2016-07-22 12:59:36 +02:00
Martin Pitt	bf3dd08a81	Merge pull request #3762 from poettering/sigkill-log log about all processes we forcibly kill	2016-07-22 09:18:30 +02:00
Martin Pitt	5c3c778014	Merge pull request #3764 from poettering/assorted-stuff-2 Assorted fixes	2016-07-22 09:10:04 +02:00
Alessandro Puccetti	31d28eabc1	nspawn: enable major=0/minor=0 devices inside the container (#3773) https://github.com/systemd/systemd/pull/3685 introduced /run/systemd/inaccessible/{chr,blk} to map inaccessible devices, this patch allows systemd running inside a nspawn container to create /run/systemd/inaccessible/{chr,blk}.	2016-07-21 17:39:38 +02:00
Thomas H. P. Andersen	f8298f7be3	core: remove duplicate includes (#3771)	2016-07-21 10:52:07 +02:00
Topi Miettinen	176e51b710	namespace: fix wrong return value from mount(2) (#3758) Fix bug introduced by #3263: mount(2) return value is 0 or -1, not errno. Thanks to Evgeny Vereshchagin (@evverx) for reporting.	2016-07-20 17:43:21 +03:00
Lennart Poettering	33df919d5c	execute: make sure JoinsNamespaceOf= doesn't leak ns fds to executed processes	2016-07-20 14:53:15 +02:00
Lennart Poettering	fe048ce56a	namespace: add a (void) cast	2016-07-20 14:53:15 +02:00
Lennart Poettering	9ce9347880	core: normalize header inclusion in execute.h a bit We don't actually need any functionality from cgroup.h in execute.h, hence don't include that. However, we do need the Unit structure from unit.h, hence include that, and move it as late as possible, since it needs the definitions from execute.h.	2016-07-20 14:53:15 +02:00
Lennart Poettering	7a1ab780c4	execute: normalize connect_logger_as() parameters slightly All other functions in execute.c that need the unit id take a Unit* parameter as first argument. Let's change connect_logger_as() to follow a similar logic.	2016-07-20 14:53:15 +02:00
Lennart Poettering	3862e809d0	core: when a scope was abandoned, always log about processes we kill After all, if a unit is abandoned, all processes inside of it may be considered "left over" and are something we should better log about.	2016-07-20 14:35:15 +02:00
Lennart Poettering	f4b0fb236b	core: make sure RequestStop signal is sent directed This was accidentally left commented out for debugging purposes, let's fix that and make the signal directed again.	2016-07-20 14:35:15 +02:00
Lennart Poettering	1d98fef17d	core: when forcibly killing/aborting left-over unit processes log about it Let's log at LOG_NOTICE about any processes that we are going to SIGKILL/SIGABRT because clean termination of them didn't work. This turns the various boolean flag parameters to cg_kill(), cg_migrate() and related calls into a single binary flags parameter, simply because the function now gained even more parameters and the parameter list shouldn't get too long. Logging for killing processes is done either when the kill signal is SIGABRT or SIGKILL, or on explicit request if KILL_TERMINATE_AND_LOG instead of LOG_TERMINATE is passed. This isn't used yet in this patch, but is made use of in a later patch.	2016-07-20 14:35:15 +02:00
Lennart Poettering	5fd7cf6fe2	namespace: minor improvements We generally try to avoid strerror(), due to its thread-unsafety, let's do this here, too. Also, let's be a tiny bit more explanatory with the log messages, and let's shorten a few things.	2016-07-20 08:57:25 +02:00
Lennart Poettering	d724118e20	core: hide legacy bus properties We usually hide legacy bus properties from introspection. Let's do that for the InaccessibleDirectories= properties too. The properties stay accessible if requested, but they won't be listed anymore if people introspect the unit.	2016-07-20 08:55:50 +02:00
Alessandro Puccetti	2a624c36e6	doc,core: Read{Write,Only}Paths= and InaccessiblePaths= This patch renames Read{Write,Only}Directories= and InaccessibleDirectories= to Read{Write,Only}Paths= and InaccessiblePaths=, previous names are kept as aliases but they are not advertised in the documentation. Renamed variables: `read_write_dirs` --> `read_write_paths` `read_only_dirs` --> `read_only_paths` `inaccessible_dirs` --> `inaccessible_paths`	2016-07-19 17:22:02 +02:00
Alessandro Puccetti	c4b4170746	namespace: unify limit behavior on non-directory paths Despite the name, `Read{Write,Only}Directories=` already allows for regular file paths to be masked. This commit adds the same behavior to `InaccessibleDirectories=` and makes it explicit in the doc. This patch introduces `/run/systemd/inaccessible/{reg,dir,chr,blk,fifo,sock}` {file,device} nodes and mounts on the appropriate one the paths specified in `InaccessibleDirectories=`. Based on Luca's patch from https://github.com/systemd/systemd/pull/3327	2016-07-19 17:22:02 +02:00
Lukáš Nykrýn	ccc2c98e1b	manager: don't skip sigchld handler for main and control pid for services (#3738) During stop, when a service has one "regular" pid, one main pid and one control pid, and the sigchld for the regular one is processed first, unit_tidy_watch_pids will skip the main and control pid and not remove them from u->pids(). But then we skip the sigchld event because we already did one in the iteration and there are two pids in u->pids. v2: Use general unit_main_pid() and unit_control_pid() instead of reaching directly to service structure.	2016-07-16 15:04:13 -04:00
Zbigniew Jędrzejewski-Szmek	2ed968802c	tree-wide: get rid of selinux_context_t (#3732) `9eb9c93275` deprecated selinux_context_t. Replace with a simple char* everywhere. Alternative fix for #3719.	2016-07-15 18:44:02 +02:00
Zbigniew Jędrzejewski-Szmek	1071fd0823	macros: provide %_systemdgeneratordir and %_systemdusergeneratordir (#3672) ... as requested in https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/DJ7HDNRM5JGBSA4HL3UWW5ZGLQDJ6Y7M/. Adding the macro makes it marginally easier to create generators for outside projects. I opted for "generatordir" and "usergeneratordir" to match %unitdir and %userunitdir. OTOH, "_systemd" prefix makes it obvious that this is related to systemd. "%_generatordir" would be too generic of a name.	2016-07-15 09:35:49 +02:00
Lennart Poettering	2e79d1828a	shutdown: already sync IO before we enter the final killing spree This way, slow IO that journald has to wait for can't cause it to reach the killing spree timeout and be hit by SIGKILL in addition to SIGTERM.	2016-07-12 17:38:19 +02:00
Lennart Poettering	d450612953	shutdown: use 90s SIGKILL timeout There's really no reason to use 10s here, let's instead default to 90s like we do for everything else. The SIGKILL during the final killing spree is in most regards the fourth level of a safety net, after all: any normal service should have already been stopped during the normal service shutdown logic, first via SIGTERM and then SIGKILL, and then also via SIGTERM during the final killing spree before we send SIGKILL. And as a fourth level safety net it should only be required in exceptional cases, which means it's safe to raise the default timeout, as normal shutdowns should never be delayed by it. Note that journald excludes itself from the normal service shutdown, and relies on the final killing spree to terminate it (this is because it wants to cover the normal shutdown phase's complete logging). If the system's IO is excessively slow, then the 10s might not be enough for journald to sync everything to disk and logs might get lost during shutdown.	2016-07-12 17:32:30 +02:00
Michael Biebl	595bfe7df2	Various fixes for typos found by lintian (#3705)	2016-07-12 12:52:11 +02:00
Luca Bruno	391b81cd03	seccomp: only abort on syscall name resolution failures (#3701) seccomp_syscall_resolve_name() can return a mix of positive and negative (pseudo-) syscall numbers, while errors are signaled via __NR_SCMP_ERROR. This commit lets the syscall filter parser only abort on real parsing failures, letting libseccomp handle pseudo-syscall numbers on its own and allowing proper filtering of multiplexed syscalls.	2016-07-12 11:55:26 +02:00
Torstein Husebø	61233823aa	treewide: fix typos and remove accidental repetition of words	2016-07-11 16:18:43 +02:00
Evgeny Vereshchagin	224d3d8266	Merge pull request #3680 from joukewitteveen/pam-env Follow up on #3503 (pass service env vars to PAM sessions)	2016-07-08 17:33:12 +03:00
Jouke Witteveen	84eada2f7f	execute: Do not alter call-by-ref parameter on failure Prevent free from being called on (a part of) the call-by-reference variable env when setup_pam fails.	2016-07-08 09:42:48 +02:00
David Michael	4f952a3f07	core: queue loading transient units after setting their properties (#3676) The unit load queue can be processed in the middle of setting the unit's properties, so its load_state would no longer be UNIT_STUB for the check in bus_unit_set_properties(), which would cause it to incorrectly return an error.	2016-07-08 05:43:01 +02:00
Daniel Mack	78a4ee591a	cgroup: fix memory cgroup limit regression on kernel 3.10 (#3673) Commit `da4d897e` ("core: add cgroup memory controller support on the unified hierarchy (#3315)") changed the code in src/core/cgroup.c to always write the real numeric value from the cgroup parameters to the "memory.limit_in_bytes" attribute file. For parameters set to CGROUP_LIMIT_MAX, this results in the string "18446744073709551615" being written into that file, which is UINT64_MAX. Before that commit, CGROUP_LIMIT_MAX was special-cased to the string "-1". This causes a regression on CentOS 7, which is based on kernel 3.10, as the value is interpreted as signed 64 bit, and clamped to 0: [root@n54 ~]# echo 18446744073709551615 >/sys/fs/cgroup/memory/user.slice/memory.limit_in_bytes [root@n54 ~]# cat /sys/fs/cgroup/memory/user.slice/memory.limit_in_bytes 0 [root@n54 ~]# echo -1 >/sys/fs/cgroup/memory/user.slice/memory.limit_in_bytes [root@n54 ~]# cat /sys/fs/cgroup/memory/user.slice/memory.limit_in_bytes 9223372036854775807 Hence, all units that are subject to the limits enforced by the memory controller will crash immediately, even though they have no actual limit set. This happens for the user.slice, for instance: [ 453.577153] Hardware name: SeaMicro SM15000-64-CC-AA-1Ox1/AMD Server CRB, BIOS Estoc.3.72.19.0018 08/19/2014 [ 453.587024] ffff880810c56780 00000000aae9501f ffff880813d7fcd0 ffffffff816360fc [ 453.594544] ffff880813d7fd60 ffffffff8163109c ffff88080ffc5000 ffff880813d7fd28 [ 453.602120] ffffffff00000202 fffeefff00000000 0000000000000001 ffff880810c56c03 [ 453.609680] Call Trace: [ 453.612156] [<ffffffff816360fc>] dump_stack+0x19/0x1b [ 453.617324] [<ffffffff8163109c>] dump_header+0x8e/0x214 [ 453.622671] [<ffffffff8116d20e>] oom_kill_process+0x24e/0x3b0 [ 453.628559] [<ffffffff81088dae>] ? has_capability_noaudit+0x1e/0x30 [ 453.634969] [<ffffffff811d4155>] mem_cgroup_oom_synchronize+0x575/0x5a0 [ 453.641721] [<ffffffff811d3520>] ? mem_cgroup_charge_common+0xc0/0xc0 [ 453.648299] [<ffffffff8116da84>] pagefault_out_of_memory+0x14/0x90 [ 453.654621] [<ffffffff8162f4cc>] mm_fault_error+0x68/0x12b [ 453.660233] [<ffffffff81642012>] __do_page_fault+0x3e2/0x450 [ 453.666017] [<ffffffff816420a3>] do_page_fault+0x23/0x80 [ 453.671467] [<ffffffff8163e308>] page_fault+0x28/0x30 [ 453.676656] Task in /user.slice/user-0.slice/user@0.service killed as a result of limit of /user.slice/user-0.slice/user@0.service [ 453.688477] memory: usage 0kB, limit 0kB, failcnt 7 [ 453.693391] memory+swap: usage 0kB, limit 9007199254740991kB, failcnt 0 [ 453.700039] kmem: usage 0kB, limit 9007199254740991kB, failcnt 0 [ 453.706076] Memory cgroup stats for /user.slice/user-0.slice/user@0.service: cache:0KB rss:0KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB [ 453.725702] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name [ 453.733614] [ 2837] 0 2837 11950 899 23 0 0 (systemd) [ 453.741919] Memory cgroup out of memory: Kill process 2837 ((systemd)) score 1 or sacrifice child [ 453.750831] Killed process 2837 ((systemd)) total-vm:47800kB, anon-rss:3188kB, file-rss:408kB Fix this issue by special-casing the UINT64_MAX case again.	2016-07-07 19:29:35 -07:00
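The special-casing described in the 78a4ee591a entry can be sketched as follows. This is a Python illustration, not the code in src/core/cgroup.c; the function name `format_memory_limit` is hypothetical, and only the constant's value (UINT64_MAX) and the "-1" string come from the commit message.

```python
# UINT64_MAX, which the commit message says CGROUP_LIMIT_MAX corresponds to.
CGROUP_LIMIT_MAX = 2**64 - 1


def format_memory_limit(limit: int) -> str:
    """Return the string to write to memory.limit_in_bytes.

    Hypothetical helper: "no limit" is written as "-1" instead of the raw
    number, which per the commit message kernel 3.10 turns into a limit of 0.
    """
    if not 0 <= limit <= CGROUP_LIMIT_MAX:
        raise ValueError("limit must fit in an unsigned 64-bit integer")
    if limit == CGROUP_LIMIT_MAX:
        return "-1"
    return str(limit)


print(format_memory_limit(CGROUP_LIMIT_MAX))  # the "no limit" case
print(format_memory_limit(512 * 1024 * 1024))  # an ordinary 512 MiB limit
```

Only the single sentinel value is rewritten; every real limit is still written as its plain decimal value.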
Jouke Witteveen	1280503b7e	execute: Cleanup the environment early By cleaning up before setting up PAM we maintain control of overriding behavior in setting variables. Otherwise, pam_putenv is in control. This also makes sure we use a cleaned up environment in replacing variables in argv.	2016-07-07 14:15:50 +02:00

2611 Commits