When queueing a signal, we increment both the user's count of pending
signals (for RLIMIT_SIGPENDING tracking) and the refcount of the user
struct itself (because we keep a reference to the user in the signal
structure in order to correctly account for it when freeing).
That turns out to be fairly expensive, because both of them are atomic
updates, and particularly under extreme signal handling pressure on big
machines, you can get a lot of cache contention on the user struct.
That can then cause horrid cacheline ping-pong when you do these
multiple accesses.
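As a rough illustration, the old queueing and freeing pattern looked
something like this (a simplified sketch of the kernel/signal.c paths,
not the verbatim code):

    /* queueing: two atomic RMW ops on the same user_struct */
    user = get_uid(__task_cred(t)->user);   /* refcount increment */
    atomic_inc(&user->sigpending);          /* RLIMIT_SIGPENDING counter */
    ...
    q->user = user;

    /* dequeueing/freeing: two more atomic ops on that struct */
    atomic_dec(&q->user->sigpending);
    free_uid(q->user);                      /* refcount decrement */

Both atomics hit the same user struct, which is what makes the
contended case so expensive.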
So change the reference counting to only pin the user for the _first_
pending signal, and to unpin it when the last pending signal is
dequeued. That means that when a user sees a lot of concurrent signal
queuing - which is the only situation when this matters - the only
atomic access needed is generally the 'sigpending' count update.
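In code terms the new scheme looks roughly like this (again a
simplified sketch, assuming the usual get_uid()/free_uid() helpers):

    /* queueing: take the uid reference only on the 0 -> 1 transition */
    sigpending = atomic_inc_return(&user->sigpending);
    if (sigpending == 1)
        get_uid(user);

    /* dequeueing: drop it only when the count falls back to zero */
    if (atomic_dec_and_test(&q->user->sigpending))
        free_uid(q->user);

In the steady state of heavy signal traffic the count never passes
through zero, so only the single 'sigpending' atomic is needed per
signal.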
This was noticed because of a particularly odd timing artifact on a
dual-socket 96C/192T Cascade Lake platform: when you get into bad
contention, that machine for some reason behaves much worse when the
contention happens in the upper 32-byte half of the cacheline.
As a result, the kernel test robot will-it-scale 'signal1' benchmark had
an odd performance regression simply due to random alignment of the
'struct user_struct' (and pointed to a completely unrelated and
apparently nonsensical commit for the regression).
Avoiding the double increments (and decrements on the dequeueing side,
of course) makes for much less contention and hugely improved
performance on that will-it-scale microbenchmark.
Quoting Feng Tang:
"It makes a big difference, that the performance score is tripled! bump
from original 17000 to 54000. Also the gap between 5.0-rc6 and
5.0-rc6+Jiri's patch is reduced to around 2%"
[ The "2% gap" is the odd cacheline placement difference on that
  platform: in the extreme contention case, which half of the cacheline
  was hot made about a 5% difference, so with the reduced contention
  the odd timing artifact shrinks as well ]
It does help in the non-contended case too, but is not nearly as
noticeable.
Reported-and-tested-by: Feng Tang <feng.tang@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Philip Li <philip.li@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>