iv/linux

Linus Torvalds d8a4a55bdc signal: avoid double atomic counter increments for user accounting

[ Upstream commit fda31c50292a5062332fa0343c084bd9f46604d9 ]

When queueing a signal, we increment both the users count of pending
signals (for RLIMIT_SIGPENDING tracking) and we increment the refcount
of the user struct itself (because we keep a reference to the user in
the signal structure in order to correctly account for it when freeing).

That turns out to be fairly expensive, because both of them are atomic
updates, and particularly under extreme signal handling pressure on big
machines, you can get a lot of cache contention on the user struct.
That can then cause horrid cacheline ping-pong when you do these
multiple accesses.

So change the reference counting to only pin the user for the _first_
pending signal, and to unpin it when the last pending signal is
dequeued.  That means that when a user sees a lot of concurrent signal
queuing - which is the only situation when this matters - the only
atomic access needed is generally the 'sigpending' count update.
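
As a rough user-space illustration of the scheme described above (not the
kernel patch itself), the sketch below models it with C11 atomics.  The
names struct user_sketch, sketch_queue_signal() and sketch_dequeue_signal()
are hypothetical.  It assumes the caller already keeps the user alive while
queueing, and it leaves out RLIMIT_SIGPENDING checks and the actual freeing.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's struct user_struct. */
struct user_sketch {
	atomic_int refcount;	/* lifetime references to the user */
	atomic_int sigpending;	/* signals currently queued for the user */
};

/*
 * Queue side: always bump sigpending, but take a reference on the user
 * only for the first pending signal (0 -> 1 transition).
 */
static void sketch_queue_signal(struct user_sketch *u)
{
	if (atomic_fetch_add(&u->sigpending, 1) == 0)
		atomic_fetch_add(&u->refcount, 1);
}

/*
 * Dequeue side: always drop sigpending, but release the reference only
 * when the last pending signal goes away (1 -> 0 transition).
 */
static void sketch_dequeue_signal(struct user_sketch *u)
{
	int prev = atomic_fetch_sub(&u->sigpending, 1);

	assert(prev > 0);	/* dequeue without a matching queue is a bug */
	if (prev == 1)
		atomic_fetch_sub(&u->refcount, 1);
}
```

In this model, queueing three signals for a user whose refcount starts at 1
leaves refcount at 2 and sigpending at 3; only the first call touches
refcount.  Dequeueing all three brings refcount back to 1.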

This was noticed because of a particularly odd timing artifact on a
dual-socket 96C/192T Cascade Lake platform: when you get into bad
contention, it for some reason seems to be much worse on that machine when
the contention happens in the upper 32-byte half of the cacheline.

As a result, the kernel test robot will-it-scale 'signal1' benchmark had
an odd performance regression simply due to random alignment of the
'struct user_struct' (and pointed to a completely unrelated and
apparently nonsensical commit for the regression).
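
To make the "which half of the cacheline" point concrete, here is a small
sketch that classifies an address.  The 64-byte line size and the names
SKETCH_CACHELINE and sketch_in_upper_half() are assumptions made for
illustration, not anything taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CACHELINE 64	/* assumed cache line size in bytes */

static_assert(SKETCH_CACHELINE % 2 == 0, "line size must split into halves");

/*
 * Returns 1 if addr falls in the upper half of its cache line
 * (bytes 32..63 of a 64-byte line), 0 if it falls in the lower half.
 */
static int sketch_in_upper_half(uintptr_t addr)
{
	return (addr % SKETCH_CACHELINE) >= SKETCH_CACHELINE / 2;
}
```

Passing the address of a hot field (for example, the base address of an
allocated structure plus the field's offsetof()) shows how an unrelated
change that shifts the allocation by 32 bytes can move the field from one
half to the other.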

Avoiding the double increments (and decrements on the dequeueing side,
of course) makes for much less contention and hugely improved
performance on that will-it-scale microbenchmark.

Quoting Feng Tang:

 "It makes a big difference, that the performance score is tripled! bump
  from original 17000 to 54000. Also the gap between 5.0-rc6 and
  5.0-rc6+Jiri's patch is reduced to around 2%"

[ The "2% gap" is the odd cacheline placement difference on that
  platform: under the extreme contention case, the effect of which half
  of the cacheline was hot was 5%, so with the reduced contention the
  odd timing artifact is reduced too ]

It does help in the non-contended case too, but is not nearly as
noticeable.

Reported-and-tested-by: Feng Tang <feng.tang@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Huang, Ying <ying.huang@intel.com>
Cc: Philip Li <philip.li@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

2020-03-20 10:54:25 +01:00

arch

perf/amd/uncore: Replace manual sampling check with CAP_NO_INTERRUPT flag

2020-03-20 10:54:23 +01:00

block

block: don't use bio->bi_vcnt to figure out segment number

2020-01-27 14:46:19 +01:00

certs

Replace magic for trusting the secondary keyring with #define

2018-09-09 19:55:54 +02:00

crypto

crypto: api - Fix race condition in crypto_spawn_alg

2020-02-14 16:32:14 -05:00

Documentation

ACPI: watchdog: Allow disabling WDAT at boot

2020-03-20 10:54:23 +01:00

drivers

net: ks8851-ml: Fix IRQ handling and locking

2020-03-20 10:54:24 +01:00

firmware

Fix built-in early-load Intel microcode alignment

2020-01-23 08:20:30 +01:00

fs

gfs2_atomic_open(): fix O_EXCL|O_CREAT handling on cold dcache

2020-03-20 10:54:16 +01:00

include

cgroup: Iterate tasks that did not finish do_exit()

2020-03-20 10:54:14 +01:00

init

fork: fix some -Wmissing-prototypes warnings

2019-12-05 15:37:52 +01:00

ipc

Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()"

2020-02-28 16:36:12 +01:00

kernel

signal: avoid double atomic counter increments for user accounting

2020-03-20 10:54:25 +01:00

lib

lib/stackdepot.c: fix global out-of-bounds in stack_slabs

2020-02-28 16:36:13 +01:00

mm

net: memcg: late association of sock to memcg

2020-03-20 10:54:09 +01:00

net

mac80211: rx: avoid RCU list traversal under mutex

2020-03-20 10:54:24 +01:00

samples

samples/bpf: Don't try to remove user's homedir on clean

2020-02-14 16:32:13 -05:00

scripts

kconfig: fix broken dependency in randconfig-generated .config

2020-02-28 16:35:59 +01:00

security

selinux: ensure we cleanup the internal AVC counters on error in avc_update()

2020-02-28 16:36:09 +01:00

sound

ASoC: topology: Fix memleak in soc_tplg_manifest_load()

2020-03-11 18:03:09 +01:00

tools

ktest: Add timeout for ssh sync testing

2020-03-20 10:54:16 +01:00

usr

kbuild: clean compressed initramfs image

2019-10-07 18:55:14 +02:00

virt

KVM: Check for a bad hva before dropping into the ghc slow path

2020-03-11 18:02:53 +01:00

.cocciconfig

…

.get_maintainer.ignore

…

.gitattributes

.gitattributes: set git diff driver for C source code files

2016-10-07 18:46:30 -07:00

.gitignore

kbuild: rpm-pkg: keep spec file until make mrproper

2018-02-13 10:19:46 +01:00

.mailmap

.mailmap: Add Maciej W. Rozycki's Imagination e-mail address

2017-11-10 12:16:15 -08:00

COPYING

…

CREDITS

MAINTAINERS: update TPM driver infrastructure changes

2017-11-09 17:58:40 -08:00

Kbuild

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

2017-11-02 11:10:55 +01:00

Kconfig

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

2017-11-02 11:10:55 +01:00

MAINTAINERS

MAINTAINERS: Update drm/i915 bug filing URL

2020-02-28 16:36:12 +01:00

Makefile

Linux 4.14.173

2020-03-11 18:03:09 +01:00

README

README: add a new README file, pointing to the Documentation/

2016-10-24 08:12:35 -02:00

README

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please notice that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.

Languages

C 97.6%

Assembly 1%

Shell 0.5%

Python 0.3%

Makefile 0.3%