1089693 Commits

Author SHA1 Message Date
Heiko Carstens
4c25f0ff63 s390/entry: workaround llvm's IAS limitations
llvm's integrated assembler cannot handle immediate values which are
calculated with two local labels:

<instantiation>:3:13: error: invalid operand for instruction
 clgfi %r14,.Lsie_done - .Lsie_gmap

Workaround this by adding clang specific code which reads the specific
value from memory. Since this code is within the hot paths of the kernel
and adds an additional memory reference, keep the original code, and add
ifdef'ed code.

Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Link: https://lore.kernel.org/r/20220511120532.2228616-5-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-17 15:16:28 +02:00
Heiko Carstens
e6ed91fd07 s390/alternatives: remove padding generation code
clang fails to handle ".if" statements in inline assembly which are heavily
used in the alternatives code.

To work around this remove this code, and enforce that users of
alternatives must specify original and alternative instruction sequences
which have identical sizes. Add a compile time check with two ".org"
statements similar to arm64.

In result not only clang can handle this, but also quite a lot of code can
be removed.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1356
Link: https://lore.kernel.org/r/20220511120532.2228616-3-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-17 15:16:28 +02:00
Heiko Carstens
fad442d3ab s390/alternatives: provide identical sized orginal/alternative sequences
Explicitly provide identical sized original/alternative instruction
sequences. This way there is no need for the s390 specific alternatives
infrastructure to generate padding sequences.
The code which generates such sequences will be removed with a follow on
patch.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20220511120532.2228616-2-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-17 15:16:28 +02:00
Thomas Richter
c9311de716 s390/cpumf: add new extended counter set for IBM z16
Export the extended counter set counters of the IBM z16 via sysfs.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-16 10:58:33 +02:00
Heiko Carstens
63678eecec s390/preempt: disable __preempt_count_add() optimization for PROFILE_ALL_BRANCHES
gcc 12 does not (always) optimize away code that should only be generated
if parameters are constant and within in a certain range. This depends on
various obscure kernel config options, however in particular
PROFILE_ALL_BRANCHES can trigger this compile error:

In function ‘__atomic_add_const’,
    inlined from ‘__preempt_count_add.part.0’ at ./arch/s390/include/asm/preempt.h:50:3:
./arch/s390/include/asm/atomic_ops.h:80:9: error: impossible constraint in ‘asm’
   80 |         asm volatile(                                                   \
      |         ^~~

Workaround this by simply disabling the optimization for
PROFILE_ALL_BRANCHES, since the kernel will be so slow, that this
optimization won't matter at all.

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-11 14:40:58 +02:00
Sven Schnelle
5ace65ebb5 s390/stp: clock_delta should be signed
clock_delta is declared as unsigned long in various places. However,
the clock sync delta can be negative. This would add a huge positive
offset in clock_sync_global where clock_delta is added to clk.eitod
which is a 72 bit integer. Declare it as signed long to fix this.

Cc: stable@vger.kernel.org
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-11 14:40:57 +02:00
Sven Schnelle
03780c83c7 s390/stp: fix todoff size
The size of the TOD offset field in the stp info response is 64 bits.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-11 14:40:57 +02:00
Thomas Richter
39d62336f5 s390/pai: add support for cryptography counters
PMU device driver perf_pai_crypto supports Processor Activity
Instrumentation (PAI), available with IBM z16:
- maps a full page to lowcore address 0x1500.
- uses CR0 bit 13 to turn PAI crypto counting on and off.
- creates a sample with raw data on each context switch out when
  at context switch some mapped counters have a value of nonzero.
This device driver only supports CPU wide context, no task context
is allowed.

Support for counting:
- one or more counters can be specified using
  perf stat -e pai_crypto/xxx/
  where xxx stands for the counter event name. Multiple invocation
  of this command is possible. The counter names are listed in
  /sys/devices/pai_crypto/events directory.
- one special counters can be specified using
  perf stat -e pai_crypto/CRYPTO_ALL/
  which returns the sum of all incremented crypto counters.
- one event pai_crypto/CRYPTO_ALL/ is reserved for sampling.
  No multiple invocations are possible. The event collects data at
  context switch out and saves them in the ring buffer.

Add qpaci assembly instruction to query supported memory mapped crypto
counters. It returns the number of counters (no holes allowed in that
range).

The PAI crypto counter events are system wide and can not be executed
in parallel. Therefore some restrictions documented in function
paicrypt_busy apply.
In particular event CRYPTO_ALL for sampling must run exclusive.
Only counting events can run in parallel.

PAI crypto counter events can not be created when a CPU hot plug
add is processed. This means a CPU hot plug add does not get
the necessary PAI event to record PAI cryptography counter increments
on the newly added CPU. CPU hot plug remove removes the event and
terminates the counting of PAI counters immediately.

Co-developed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20220504062351.2954280-3-tmricht@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-09 11:50:01 +02:00
Sven Schnelle
6d97af487d entry: Rename arch_check_user_regs() to arch_enter_from_user_mode()
arch_check_user_regs() is used at the moment to verify that struct pt_regs
contains valid values when entering the kernel from userspace. s390 needs
a place in the generic entry code to modify a cpu data structure when
switching from userspace to kernel mode. As arch_check_user_regs() is
exactly this, rename it to arch_enter_from_user_mode().

When entering the kernel from userspace, arch_check_user_regs() is
used to verify that struct pt_regs contains valid values. Note that
the NMI codepath doesn't call this function. s390 needs a place in the
generic entry code to modify a cpu data structure when switching from
userspace to kernel mode. As arch_check_user_regs() is exactly this,
rename it to arch_enter_from_user_mode().

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20220504062351.2954280-2-tmricht@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-09 11:33:38 +02:00
Heiko Carstens
fcdc03f78d s390/compat: cleanup compat_linux.h header file
Remove various declarations from former s390 specific compat system
calls which have been removed with commit fef747bab3c0 ("s390: use
generic UID16 implementation"). While at it clean up the whole small
header file.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:16 +02:00
Heiko Carstens
29b06ad7e8 s390/entry: remove broken and not needed code
LLVM's integrated assembler reports the following error when compiling
entry.S:

<instantiation>:38:5: error: unknown token in expression
 tm %r8,0x0001 # coming from user space?

The correct instruction would have been tmhh instead of tm.
The current code is doing nothing, since (with gas) it get's
translated to a tm instruction which reads from real address 8, which
again contains always zero, and therefore the conditional code is
never executed.
Note that due to the missing displacement gas translates "%r8" into
"8(%r0)".

Also code inspection reveals that this conditional code is not needed.
Therefore remove it.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:15 +02:00
Heiko Carstens
f84d88ed3b s390/boot: convert parmarea to C
Convert parmarea to C, which makes it much easier to initialize it. No need
to keep offsets in assembler code in sync with struct parmarea anymore.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:15 +02:00
Heiko Carstens
834979c27f s390/boot: convert initial lowcore to C
Convert initial lowcore to C and use proper defines and structures to
initialize it. This should make the z/VM ipl procedure a bit less magic.

Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:15 +02:00
Heiko Carstens
67a9c428ef s390/ptrace: move short psw definitions to ptrace header file
The short psw definitions are contained in compat header files, however
short psws are not compat specific. Therefore move the definitions to
ptrace header file. This also gets rid of a compat header include in kvm
code.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:15 +02:00
Heiko Carstens
aceb06d1e8 s390/head: initialize all new psws
Initialize all new psws with disabled wait psws, except for the restart new
psw. This way every unexpected exception, svc, machine check, or interrupt
is handled properly.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:15 +02:00
Heiko Carstens
84f4e1dfb2 s390/boot: change initial program check handler to disabled wait psw
The program check handler of the kernel image points to
startup_pgm_check_handler. However an early program check which happens
while loading the kernel image will jump to potentially random code, since
the code of the program check handler is not yet loaded; leading to a
program check loop.

Therefore initialize it to a disabled wait psw and let the startup code set
the proper psw when everything is in memory.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:15 +02:00
Heiko Carstens
734757976e s390/head: adjust iplstart entry point
Move iplstart entry point to 0x200 again, instead of the middle of the ipl
code. This way even the comment describing the ccw program is correct
again.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:14 +02:00
Heiko Carstens
edd4a86673 s390/boot: get rid of startup archive
The final kernel image is created by linking decompressor object files with
a startup archive. The startup archive file however does not contain only
optional code and data which can be discarded if not referenced. It also
contains mandatory object data like head.o which must never be discarded,
even if not referenced.

Move the decompresser code and linker script to the boot directory and get
rid of the startup archive so everything is kept during link time.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:14 +02:00
Heiko Carstens
964bc5dbe6 s390/vx: remove comments from macros which break LLVM's IAS
LLVM's integrated assembler does not like comments within macros:

<instantiation>:3:19: error: too many positional arguments
        GR_NUM  b2, 1       /* Base register */
                            ^
Remove them, since they are obvious anyway.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:14 +02:00
Heiko Carstens
68a971acc9 s390/extable: prefer local labels in .set directives
Use local labels in .set directives to avoid potential compile errors
with LTO + clang. See commit 334865b2915c ("x86/extable: Prefer local
labels in .set directives") for further details.

Since s390 doesn't support LTO currently this doesn't fix a real bug
for now, but helps to avoid problems as soon as required pieces have
been added to llvm.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:14 +02:00
Heiko Carstens
f9a3099f79 s390/nospec: prefer local labels in .set directives
Use local labels in .set directives to avoid potential compile errors
with LTO + clang. See commit 334865b2915c ("x86/extable: Prefer local
labels in .set directives") for further details.

Since s390 doesn't support LTO currently this doesn't fix a real bug
for now, but helps to avoid problems as soon as required pieces have
been added to llvm.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:13 +02:00
Julia Lawall
108ab40fc1 s390/hypfs: fix typos in comments
Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Link: https://lore.kernel.org/r/20220430191122.8667-5-Julia.Lawall@inria.fr
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:13 +02:00
Julia Lawall
4b03b3ee60 s390/crypto: fix typos in comments
Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Link: https://lore.kernel.org/r/20220430191122.8667-2-Julia.Lawall@inria.fr
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:13 +02:00
Guilherme G. Piccoli
4ae46db99c s390/consoles: improve panic notifiers reliability
Currently many console drivers for s390 rely on panic/reboot notifiers
to invoke callbacks on these events. The panic() function disables local
IRQs, secondary CPUs and preemption, so callbacks invoked on panic are
effectively running in atomic context.

Happens that most of these console callbacks from s390 doesn't take the
proper care with regards to atomic context, like taking spinlocks that
might be taken in other function/CPU and hence will cause a lockup
situation.

The goal for this patch is to improve the notifiers reliability, acting
on 4 console drivers, as detailed below:

(1) con3215: changed a regular spinlock to the trylock alternative.

(2) con3270: also changed a regular spinlock to its trylock counterpart,
but here we also have another problem: raw3270_activate_view() takes a
different spinlock. So, we worked a helper to validate if this other lock
is safe to acquire, and if so, raw3270_activate_view() should be safe.

Notice though that there is a functional change here: it's now possible
to continue the notifier code [reaching con3270_wait_write() and
con3270_rebuild_update()] without executing raw3270_activate_view().

(3) sclp: a global lock is used heavily in the functions called from
the notifier, so we added a check here - if the lock is taken already,
we just bail-out, preventing the lockup.

(4) sclp_vt220: same as (3), a lock validation was added to prevent the
potential lockup problem.

Besides (1)-(4), we also removed useless void functions, adding the
code called from the notifier inside its own body, and changed the
priority of such notifiers to execute late, since they are "heavyweight"
for the panic environment, so we aim to reduce risks here.
Changed return values to NOTIFY_DONE as well, the standard one.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Link: https://lore.kernel.org/r/20220427224924.592546-14-gpiccoli@igalia.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:13 +02:00
Pingfan Liu
6260f6427c s390/irq: utilize RCU instead of irq_lock_sparse() in show_msi_interrupt()
As demonstrated by commit 74bdf7815dfb ("genirq: Speedup
show_interrupts()"), irq_desc can be accessed safely in RCU read section.

Hence here resorting to rcu read lock to get rid of irq_lock_sparse().

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Link: https://lore.kernel.org/r/20220422100212.22666-1-kernelfans@gmail.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-27 12:53:34 +02:00
Ilya Leoshkevich
9a07731702 s390: add KCSAN instrumentation to barriers and spinlocks
test_barrier fails on s390 because of the missing KCSAN instrumentation
for several synchronization primitives.

Add it to barriers by defining __mb(), __rmb(), __wmb(), __dma_rmb()
and __dma_wmb(), and letting the common code in asm-generic/barrier.h
do the rest.

Spinlocks require instrumentation only on the unlock path; notify KCSAN
that the CPU cannot move memory accesses outside of the spin lock. In
reality it also cannot move stores inside of it, but this is not
important and can be omitted.

Reported-by: Tobias Huschle <huschle@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:16 +02:00
Niklas Schnelle
34fb0e7034 s390/pci: add error record for CC 2 retries
Currently it is not detectable from within Linux when PCI instructions
are retried because of a busy condition. Detecting such conditions and
especially how long they lasted can however be quite useful in problem
determination. This patch enables this by adding an s390dbf error log
when a CC 2 is first encountered as well as after the retried
instruction.

Despite being unlikely it may be possible that these added debug
messages drown out important other messages so allow setting the debug
level in zpci_err_insn*() and set their level to 1 so they can be
filtered out if need be.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:15 +02:00
Niklas Schnelle
cde8833e40 s390/pci: add PCI access type and length to error records
Currently when a PCI instruction returns a non-zero condition code it
can be very hard to tell from the s390dbf logs what kind of instruction
was executed. In case of PCI memory I/O (MIO) instructions it is even
impossible to tell if we attempted a load, store or block store or how
large the access was because only the address is logged.

Improve this by adding an indicator byte for the instruction type to the
error record and also store the length of the access for MIO
instructions where this can not be deduced from the request.

We use the following indicator values:
 - 'l': PCI load
 - 's': PCI store
 - 'b': PCI store block
 - 'L': PCI load (MIO)
 - 'S': PCI store (MIO)
 - 'B': PCI store block (MIO)
 - 'M': MPCIFC
 - 'R': RPCIT

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:15 +02:00
Niklas Schnelle
723b5a9d2b s390/pci: don't log availability events as errors
Availability events are logged in s390dbf in s390dbf/pci_error/hex_ascii
even though they don't indicate an error condition.

They have also become redundant as commit 6526a597a2e85 ("s390/pci: add
simpler s390dbf traces for events") added an s390dbf/pci_msg/sprintf log
entry for availability events which contains all non reserved fields of
struct zpci_ccdf_avail. On the other hand the availability entries in
the error log make it easy to miss actual errors and may even overwrite
error entries if the message buffer wraps.

Thus simply remove the availability events from the error log thereby
establishing the rule that any content in s390dbf/pci_error indicates
some kind of error.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:15 +02:00
Niklas Schnelle
52c79e636a s390/pci: make better use of zpci_dbg() levels
While the zpci_dbg() macro offers a level parameter this is currently
largely unused. The only instance with higher importance than 3 is the
UID checking change debug message which is not actually more important
as the UID uniqueness guarantee is already exposed in sysfs so this
should rather be 3 as well.

On the other hand the "add ..." message which shows what devices are
visible at the lowest level is essential during problem determination.
By setting its level to 1, lowering the debug level can act as a filter
to only show the available functions.

On the error side the default level is set to 6 while all existing
messages are printed at level 0. This is inconsistent and means there is
no room for having messages be invisible on the default level so instead
set the default level to 3 like for errors matching the default for
debug messages.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:15 +02:00
Thomas Huth
d4b2945dc9 s390/vfio-ap: remove superfluous MODULE_DEVICE_TABLE declaration
The vfio_ap module tries to register for the vfio_ap bus - but that's
the interface that it provides itself, so this does not make much sense,
thus let's simply drop this statement now.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Link: https://lore.kernel.org/r/20220413094416.412114-1-thuth@redhat.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:15 +02:00
Sven Schnelle
41cd81abaf s390/vdso: add vdso randomization
Randomize the address of vdso if randomize_va_space is enabled.
Note that this keeps the vdso address on the same PMD as the stack
to avoid allocating an extra page table just for vdso.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:15 +02:00
Sven Schnelle
9e37a2e854 s390/vdso: map vdso above stack
In the current code vdso is mapped below the stack. This is
problematic when programs mapped to the top of the address space
are allocating a lot of memory, because the heap will clash with
the vdso. To avoid this map the vdso above the stack and move
STACK_TOP so that it all fits into three level paging.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:14 +02:00
Sven Schnelle
57761da4dc s390/vdso: move vdso mapping to its own function
This is a preparation patch for adding vdso randomization to s390.
It adds a function vdso_size(), which will be used later in calculating
the STACK_TOP value. It also moves the vdso mapping into a new function
vdso_map(), to keep the code similar to other architectures.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:14 +02:00
Sven Schnelle
f2f47d0ef7 s390/mmap: increase stack/mmap gap to 128MB
This basically reverts commit 9e78a13bfb16 ("[S390] reduce miminum
gap between stack and mmap_base"). 32MB is not enough space
between stack and mmap for some programs. Given that compat
task aren't common these days, lets revert back to 128MB.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:14 +02:00
Harald Freudenberger
2004b57cde s390/zcrypt: code cleanup
This patch tries to fix as much as possible of the
checkpatch.pl --strict findings:
  CHECK: Logical continuations should be on the previous line
  CHECK: No space is necessary after a cast
  CHECK: Alignment should match open parenthesis
  CHECK: 'useable' may be misspelled - perhaps 'usable'?
  WARNING: Possible repeated word: 'is'
  CHECK: spaces preferred around that '*' (ctx:VxV)
  CHECK: Comparison to NULL could be written "!msg"
  CHECK: Prefer kzalloc(sizeof(*zc)...) over kzalloc(sizeof(struct...)...)
  CHECK: Unnecessary parentheses around resp_type->work
  CHECK: Avoid CamelCase: <xcRB>

There is no functional change comming with this patch, only
code cleanup, renaming, whitespaces, indenting, ... but no
semantic change in any way. Also the API (zcrypt and pkey
header file) is semantically unchanged.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:14 +02:00
Harald Freudenberger
6acb086d9f s390/zcrypt: cleanup CPRB struct definitions
This patch does a little cleanup on the CPRBX struct
in zcrypt.h and the redundant CPRB struct definition in
zcrypt_msgtype6.c. Especially some of the misleading
fields from the CPRBX struct have been removed.

There is no semantic change coming with this patch.
The field names changed in the XCRB struct are only related
to reserved fields which should never been used.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:13 +02:00
Harald Freudenberger
d9b38e9d0f s390/ap: uevent on apmask/aqpmask change
This patch introduces user space notifications for changes
on the apmask or aqmask attributes. So it could be possible
to write a udev rule to load/unload the vfio_ap kernel module
based on changes of these masks.

On chance of the apmask or aqmask an AP change event will
be produced with an uevent environment variable showing
the new APMASK or AQMASK mask.

So a change on the apmask triggers an uvevent like this:

  KERNEL[490.160396] change   /devices/ap (ap)
  ACTION=change
  DEVPATH=/devices/ap
  SUBSYSTEM=ap
  APMASK=0xffffffdfffffffffffffffffffffffffffffffffffffffffffffffffffffffff
  SEQNUM=13367

and a change on the aqmask looks like this:

  KERNEL[283.217642] change   /devices/ap (ap)
  ACTION=change
  DEVPATH=/devices/ap
  SUBSYSTEM=ap
  AQMASK=0xfbffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
  SEQNUM=13348

Only real changes to the masks are processed - the old and
new masks are compared and no action is done if the values
are equal (and thus no uevent). The emit of the uevent is
the very last action done when a mask change is processed.
However, there is no guarantee that all unbind/bind actions
caused by the apmask/aqmask changes are completed when the
apmask/aqmask change uevent is received in userspace.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:13 +02:00
Haowen Bai
4da75a7fd0 s390/cio: simplify the calculation of variables
Fix the following coccicheck warnings:
./arch/s390/include/asm/scsw.h:695:47-49: WARNING
 !A || A && B is equivalent to !A || B

I apply a readable version just to get rid of a warning.

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Link: https://lore.kernel.org/r/1649297808-5048-1-git-send-email-baihaowen@meizu.com
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:13 +02:00
Alexander Gordeev
7714e16f79 s390/smp: sort out physical vs virtual CPU0 lowcore pointer
SPX instruction called from set_prefix() expects physical
address of the lowcore to be installed, but instead the
virtual address is passed.

Note: this does not fix a bug currently, since virtual and
physical addresses are identical.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:13 +02:00
Harald Freudenberger
28d3417a94 s390/zcrypt: add display of ASYM master key verification pattern
This patch extends the sysfs attribute mkvps for CCA cards
to show the states and master key verification patterns for
the old, current and new ASYM master key registers.

With this patch now all relevant master key verification
patterns related to a CCA HSM are available with the mkvps
sysfs attribute. This is a requirement for some exploiters
like the kubernetes cex plugin or initrd code needing to
verify the master key verification patterns on HSMs before
use.

A sample output:
  cat /sys/devices/ap/card04/04.0005/mkvps
  AES NEW: empty 0x0000000000000000
  AES CUR: valid 0xe9a49a58cd039bed
  AES OLD: valid 0x7d10d17bc8a409c4
  APKA NEW: empty 0x0000000000000000
  APKA CUR: valid 0x5f2f27aaa2d59b4a
  APKA OLD: valid 0x82a5e2cd5030d5ec
  ASYM NEW: empty 0x00000000000000000000000000000000
  ASYM CUR: valid 0x650c25a89c27e716d0e692b6c83f10e5
  ASYM OLD: valid 0xf8ae2acf8bfc57f0a0957c732c16078b

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jörg Schmidbauer <jschmidb@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:13 +02:00
Alexander Egorenkov
2ba24343bd s390/kexec: set end-of-ipl flag in last diag308 call
If the facility IPL-complete-control is present then the last diag308
call made by kexec shall set the end-of-ipl flag in the subcode register
to signal the hypervisor that this is the last diag308 call made by Linux.
Only the diag308 calls made during a regular kexec need to set
the end-of-ipl flag, in all other cases the hypervisor will ignore it.

Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:12 +02:00
Alexander Egorenkov
1b553839e1 s390/sclp: add detection of IPL-complete-control facility
The presence of the IPL-complete-control facility can be derived
from the hypervisor's SCLP info response.

Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:12 +02:00
Linus Torvalds
af2d861d4c Linux 5.18-rc4 v5.18-rc4 2022-04-24 14:51:22 -07:00
Linus Torvalds
42740a2ff5 - Fix a corner case when calculating sched runqueue variables
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmJlHhcACgkQEsHwGGHe
 VUrPyQ/9FE7zLj8euC4HJ4HPJwf7vkwaeRHz1T2gS+izChBI+QSo5Ipe5zFOKz55
 vYSfaYF0MIvVJtKSHMbnQf6f/2i+y5j0ozMjKEkHRZdYP26okPoj+M2effgbceiJ
 pOIZUsdr8SdBQv313icuUfsXIGfMv/xIw20OtIhVpOFQPB4ZbLASn6AhusZL7U6Z
 0BIcfGmmOwV6p4petOJVUXRcwkgfT812UOBLV71DEz9jzP8dXYGVvPV8ZnSYoVQW
 tm6rcmnpzsOqb3xnp7hqFHevyoIzBT31KVo4xnB80CtCoWB/tbEIPIbNjUPaREp0
 ezE8yXv6euob92+Uh5DH/+8oWuzlctKv1Pc98rFnrGGfW4ocDsr5ibsi9472Mkec
 s+waTwemZMGN3bQHH5QvjWxPGdGuPsqrNvgHbZRFGYGJcMoC+2F9p+vKOXK00fMF
 9ivhhuFqH8OVAFu3WUvvD8zO18tfnST7fQflQJNxZ/TqPumNc0+zLrpKDp+7ZE+r
 qgdvxvXO3ZRnPttiEP1/J+uKxQGNMuDEU8NcfdA7nOzEv9yPyKLLcwo2qu3IYgP0
 XuM3Gqt5/Cf38b+1ddR1LWai3KjxVTn7HV4G9YPdvYP296YcZlFjGOtzfNOfw905
 djGEFTFyGQuS7BEHKhD3OoDbegT4FvB+69k2ddy4Dut99WkDk84=
 =S7Ux
 -----END PGP SIGNATURE-----

Merge tag 'sched_urgent_for_v5.18_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Borislav Petkov:

 - Fix a corner case when calculating sched runqueue variables

That fix also removes a check for a zero divisor in the code, without
mentioning it.  Vincent clarified that it's ok after I whined about it:

  https://lore.kernel.org/all/CAKfTPtD2QEyZ6ADd5WrwETMOX0XOwJGnVddt7VHgfURdqgOS-Q@mail.gmail.com/

* tag 'sched_urgent_for_v5.18_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/pelt: Fix attach_entity_load_avg() corner case
2022-04-24 13:28:06 -07:00
Linus Torvalds
5206548f6e powerpc fixes for 5.18 #3
- Partly revert a change to our timer_interrupt() that caused lockups with high res
    timers disabled.
 
  - Fix a bug in KVM TCE handling that could corrupt kernel memory.
 
  - Two commits fixing Power9/Power10 perf alternative event selection.
 
 Thanks to: Alexey Kardashevskiy, Athira Rajeev, David Gibson, Frederic Barrat, Madhavan
 Srinivasan, Miguel Ojeda, Nicholas Piggin.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmJlSCATHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgM7CD/9KH+mjtSwF3hSdun/WxMcWawdNY24g
 f+eMI/vABVqN1RvmO3oC5Z1ruMUw4AxL7BMugAa/SlTjQXOyCuyHQP7vIe4ax3rZ
 4TMfsRm8W4xlgI4k9d9q/unrIHko2k1OhY/wvfGMFhFdG0LWt4qJDL5vbccG5CKb
 xikrutQ5+t8fNLtGH+fJVDeK9q2CU4inJRuyD4m3sfKnXygLI13l1GhcOebxN/p1
 W8qBIac+YJqeezbqiwLl4BC+yXAEDixvFpTh9NuvWdoJaQHdvrltYSLQxCFMIE4B
 dSp5EaBTXemalZ4F8fnGyKf4eTbtO9VIfWq3hicjfnJiFRodbFZOo7dnSpDrYlfJ
 EysGdmI2HxpmWC8DgQQFv+xwZxKW/ExvPiPYb49n+j/4hKJ724wTi9Z8r3XP5fkg
 lD/th40NDhe/sjCSPNWoK3l/UJb3gexd+Ict8iUp2fgNEq3FoJkTR4QlWGj6BeP3
 3pOBoqmWjSXR8tWNShvyK6mLn6fclD0IA7cwTIsZZVmqI+nNR4nR0fC2Ah66Rj+R
 EOof4kCBOcZ2getDyk+Hv97EFNbkDcIm6IE291Vp9hgilp0n2VnPbwwwEdexp6Jv
 KpsYCHosCchnHcu7P1VFFt9w46JFSN7/euq8YZe6znFua2qhV6AGeI7H/uA2X7yL
 KvuO+c+ORhrVKQ==
 =xieK
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Partly revert a change to our timer_interrupt() that caused lockups
   with high res timers disabled.

 - Fix a bug in KVM TCE handling that could corrupt kernel memory.

 - Two commits fixing Power9/Power10 perf alternative event selection.

Thanks to Alexey Kardashevskiy, Athira Rajeev, David Gibson, Frederic
Barrat, Madhavan Srinivasan, Miguel Ojeda, and Nicholas Piggin.

* tag 'powerpc-5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/perf: Fix 32bit compile
  powerpc/perf: Fix power10 event alternatives
  powerpc/perf: Fix power9 event alternatives
  KVM: PPC: Fix TCE handling for VFIO
  powerpc/time: Always set decrementer in timer_interrupt()
2022-04-24 12:11:20 -07:00
Linus Torvalds
f48ffef19d - Add Sapphire Rapids CPU support
- Fix a perf vmalloc-ed buffer mapping error (PERF_USE_VMALLOC in use)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmJlIAIACgkQEsHwGGHe
 VUrrqRAAjXX/9J6eQNAHNLHwNhAJYptDq/O2s9rkzOfittWzflfEKy/QeQqWT0Gp
 fIqo0tO+QAcj9h6PFYHL0tqAPSkTzCBAoOEbjatkBKYj1BIXETQbfkVvG/lfP5Hz
 sbtDJE2lk2mG8FHnTbQR4FJZHJR1fWsdxdDyJoFxQ/Ww8E3gB9BW8qgJAJZkqttv
 L3D8bFYd9LnTjrs5+lq7ZxyqIMNF14kfd07uxjpJW13TpuXIIQ+enz0bTGllJv/y
 uHIupgd2RHgAe3HFAkU9fWBs8qNJpqWIOZKiNNETIxOxcS9zUt+Th5hJeMonCLg7
 WAoS/Y1I5mu0qf2mxm6gGPbUJGHc/fsWO0+StA54a/MnG/MwGafYtiZv/7qYlYCE
 ia0UEuKZjCqYbNOBgrDnP8iWHFIaAtjAi4zBTRzTaIIv1+JthKTkRJ60NNArmQ+f
 vZyv0YvLLujJLBNShfSIWy77/6qap8I+nvvvbfUy2ylhm7eu3AgaTkq3kJS+1pnc
 NEOPhG1qVYITpu1vSkC8V5mpumgcAomnLxgZf8O/bfe+AQlBUyNDMgBZVI9sesyv
 5Wuh5O8obHKnvr6FJ67bUv9fOg62Qs2ywcBtQdmd+l/DmhyqGqPVfBKaeSLLoXcU
 lqP9Bp6iLq+WgSCqSUq8CPjoRYKduzzes6AZVdcSNLMFZ+P5+x8=
 =uiO0
 -----END PGP SIGNATURE-----

Merge tag 'perf_urgent_for_v5.18_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

 - Add Sapphire Rapids CPU support

 - Fix a perf vmalloc-ed buffer mapping error (PERF_USE_VMALLOC in use)

* tag 'perf_urgent_for_v5.18_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/cstate: Add SAPPHIRERAPIDS_X CPU support
  perf/core: Fix perf_mmap fail when CONFIG_PERF_USE_VMALLOC enabled
2022-04-24 12:01:16 -07:00
Linus Torvalds
b877ca4dc8 - Read the reported error count from the proper register on synopsys_edac
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmJlGhAACgkQEsHwGGHe
 VUqDeA//Ud2Y6x/OlrYSb1uetOTYXpt8orCh8rFTkkB+6XXs8kyfxgrn1JOzzYtU
 5whZxo1SvsqIDLToUtstcFhmK0uO5BtuK/pfy6qElTuVLI17yuyx359yKwAWcKIu
 5xAM3S9vtoqUw9YKeXiq61sjLVEJo2qFEIq8BvGwjuZ9DBkemlM34sCPu+tu7o2F
 DI5LqdkCGQGAMbXzljyHxcLgZS2bCSgYs7LzbYPe7KqtDzlwo+4ofT9r/E9r/6iD
 PaWjR34cvy+KyyxcDhdPzuWYjvkvuAOZHhtvQmPVBsw/diCZD4NLodj02/TyNN5u
 3P2ehe4KXLxAWFDdV2XrxjnQsWni5aJrti7HfFmKT1zadh1SBb1vun4sSe1+FXia
 ej+68xG4tvk05zRCjZgy9teCLLT2bejSkYcdRGv+M/DDZZI840Uq+ub/jGmBGG8P
 wpYqGixgWvCmD9lW/4jztHpWhpkn2sdyVk1iDrgjre+M3NE9pPO7yIh0MK2B7TBq
 ORVr6z1bAXVNHm9fXwpptpQz1tcK87hKzVMX63kuEpQLf7+XBiffoJgNHscl26/h
 gVGq24lFRotsGQlsZIXc6Bfu0u13mVxsNF7yYhlU30Tlgqk45cYA/x+btlSehfWB
 6j9/nx+A21Ocjx3s2LOIDQTSZCdzTt3KVBkwPBafDBAtO55XhEE=
 =gcLL
 -----END PGP SIGNATURE-----

Merge tag 'edac_urgent_for_v5.18_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC fix from Borislav Petkov:

 - Read the reported error count from the proper register on
   synopsys_edac

* tag 'edac_urgent_for_v5.18_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/synopsys: Read the error count from the correct register
2022-04-24 11:24:48 -07:00
Linus Torvalds
9becb68891 kvmalloc: use vmalloc_huge for vmalloc allocations
Since commit 559089e0a93d ("vmalloc: replace VM_NO_HUGE_VMAP with
VM_ALLOW_HUGE_VMAP"), the use of hugepage mappings for vmalloc is an
opt-in strategy, because it caused a number of problems that weren't
noticed until x86 enabled it too.

One of the issues was fixed by Nick Piggin in commit 3b8000ae185c
("mm/vmalloc: huge vmalloc backing pages should be split rather than
compound"), but I'm still worried about page protection issues, and
VM_FLUSH_RESET_PERMS in particular.

However, like the hash table allocation case (commit f2edd118d02d:
"page_alloc: use vmalloc_huge for large system hash"), the use of
kvmalloc() should be safe from any such games, since the returned
pointer might be a SLUB allocation, and as such no user should
reasonably be using it in any odd ways.

We also know that the allocations are fairly large, since it falls back
to the vmalloc case only when a kmalloc() fails.  So using a hugepage
mapping seems both safe and relevant.

This patch does show a weakness in the opt-in strategy: since the opt-in
flag is in the 'vm_flags', not the usual gfp_t allocation flags, very
few of the usual interfaces actually expose it.

That's not much of an issue in this case that already used one of the
fairly specialized low-level vmalloc interfaces for the allocation, but
for a lot of other vmalloc() users that might want to opt in, it's going
to be very inconvenient.

We'll either have to fix any compatibility problems, or expose it in the
gfp flags (__GFP_COMP would have made a lot of sense) to allow normal
vmalloc() users to use hugepage mappings.  That said, the cases that
really matter were probably already taken care of by the hash tabel
allocation.

Link: https://lore.kernel.org/all/20220415164413.2727220-1-song@kernel.org/
Link: https://lore.kernel.org/all/CAHk-=whao=iosX1s5Z4SF-ZGa-ebAukJoAdUJFk5SPwnofV+Vg@mail.gmail.com/
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Song Liu <songliubraving@fb.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-24 10:05:38 -07:00
Song Liu
f2edd118d0 page_alloc: use vmalloc_huge for large system hash
Use vmalloc_huge() in alloc_large_system_hash() so that large system
hash (>= PMD_SIZE) could benefit from huge pages.

Note that vmalloc_huge only allocates huge pages for systems with
HAVE_ARCH_HUGE_VMALLOC.

Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-24 10:00:54 -07:00