linux

iv/linux

Author	SHA1	Message	Date
David Hildenbrand	8043d26c46	s390/pgtable: cleanup description of swp pte layout Bit 52 and bit 55 don't have to be zero: they only trigger a translation-specifiation exception if the PTE is marked as valid, which is not the case for swap ptes. Document which bits are used for what, and which ones are unused. Link: https://lkml.kernel.org/r/20220329164329.208407-6-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Don Dutile <ddutile@redhat.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Liang Zhang <zhangliang5@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Nadav Amit <namit@vmware.com> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-05-09 18:20:46 -07:00
Thomas Richter	39d62336f5	s390/pai: add support for cryptography counters PMU device driver perf_pai_crypto supports Processor Activity Instrumentation (PAI), available with IBM z16: - maps a full page to lowcore address 0x1500. - uses CR0 bit 13 to turn PAI crypto counting on and off. - creates a sample with raw data on each context switch out when at context switch some mapped counters have a value of nonzero. This device driver only supports CPU wide context, no task context is allowed. Support for counting: - one or more counters can be specified using perf stat -e pai_crypto/xxx/ where xxx stands for the counter event name. Multiple invocation of this command is possible. The counter names are listed in /sys/devices/pai_crypto/events directory. - one special counters can be specified using perf stat -e pai_crypto/CRYPTO_ALL/ which returns the sum of all incremented crypto counters. - one event pai_crypto/CRYPTO_ALL/ is reserved for sampling. No multiple invocations are possible. The event collects data at context switch out and saves them in the ring buffer. Add qpaci assembly instruction to query supported memory mapped crypto counters. It returns the number of counters (no holes allowed in that range). The PAI crypto counter events are system wide and can not be executed in parallel. Therefore some restrictions documented in function paicrypt_busy apply. In particular event CRYPTO_ALL for sampling must run exclusive. Only counting events can run in parallel. PAI crypto counter events can not be created when a CPU hot plug add is processed. This means a CPU hot plug add does not get the necessary PAI event to record PAI cryptography counter increments on the newly added CPU. CPU hot plug remove removes the event and terminates the counting of PAI counters immediately. Co-developed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20220504062351.2954280-3-tmricht@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-09 11:50:01 +02:00
Sven Schnelle	6d97af487d	entry: Rename arch_check_user_regs() to arch_enter_from_user_mode() arch_check_user_regs() is used at the moment to verify that struct pt_regs contains valid values when entering the kernel from userspace. s390 needs a place in the generic entry code to modify a cpu data structure when switching from userspace to kernel mode. As arch_check_user_regs() is exactly this, rename it to arch_enter_from_user_mode(). When entering the kernel from userspace, arch_check_user_regs() is used to verify that struct pt_regs contains valid values. Note that the NMI codepath doesn't call this function. s390 needs a place in the generic entry code to modify a cpu data structure when switching from userspace to kernel mode. As arch_check_user_regs() is exactly this, rename it to arch_enter_from_user_mode(). Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20220504062351.2954280-2-tmricht@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-09 11:33:38 +02:00
Eric W. Biederman	5bd2e97c86	fork: Generalize PF_IO_WORKER handling Add fn and fn_arg members into struct kernel_clone_args and test for them in copy_thread (instead of testing for PF_KTHREAD \| PF_IO_WORKER). This allows any task that wants to be a user space task that only runs in kernel mode to use this functionality. The code on x86 is an exception and still retains a PF_KTHREAD test because x86 unlikely everything else handles kthreads slightly differently than user space tasks that start with a function. The functions that created tasks that start with a function have been updated to set ".fn" and ".fn_arg" instead of ".stack" and ".stack_size". These functions are fork_idle(), create_io_thread(), kernel_thread(), and user_mode_thread(). Link: https://lkml.kernel.org/r/20220506141512.516114-4-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-05-07 09:01:59 -05:00
Eric W. Biederman	c5febea095	fork: Pass struct kernel_clone_args into copy_thread With io_uring we have started supporting tasks that are for most purposes user space tasks that exclusively run code in kernel mode. The kernel task that exec's init and tasks that exec user mode helpers are also user mode tasks that just run kernel code until they call kernel execve. Pass kernel_clone_args into copy_thread so these oddball tasks can be supported more cleanly and easily. v2: Fix spelling of kenrel_clone_args on h8300 Link: https://lkml.kernel.org/r/20220506141512.516114-2-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-05-07 09:01:48 -05:00
Heiko Carstens	fcdc03f78d	s390/compat: cleanup compat_linux.h header file Remove various declarations from former s390 specific compat system calls which have been removed with commit fef747bab3c0 ("s390: use generic UID16 implementation"). While at it clean up the whole small header file. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:16 +02:00
Heiko Carstens	29b06ad7e8	s390/entry: remove broken and not needed code LLVM's integrated assembler reports the following error when compiling entry.S: <instantiation>:38:5: error: unknown token in expression tm %r8,0x0001 # coming from user space? The correct instruction would have been tmhh instead of tm. The current code is doing nothing, since (with gas) it get's translated to a tm instruction which reads from real address 8, which again contains always zero, and therefore the conditional code is never executed. Note that due to the missing displacement gas translates "%r8" into "8(%r0)". Also code inspection reveals that this conditional code is not needed. Therefore remove it. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:15 +02:00
Heiko Carstens	f84d88ed3b	s390/boot: convert parmarea to C Convert parmarea to C, which makes it much easier to initialize it. No need to keep offsets in assembler code in sync with struct parmarea anymore. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:15 +02:00
Heiko Carstens	834979c27f	s390/boot: convert initial lowcore to C Convert initial lowcore to C and use proper defines and structures to initialize it. This should make the z/VM ipl procedure a bit less magic. Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:15 +02:00
Heiko Carstens	67a9c428ef	s390/ptrace: move short psw definitions to ptrace header file The short psw definitions are contained in compat header files, however short psws are not compat specific. Therefore move the definitions to ptrace header file. This also gets rid of a compat header include in kvm code. Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:15 +02:00
Heiko Carstens	aceb06d1e8	s390/head: initialize all new psws Initialize all new psws with disabled wait psws, except for the restart new psw. This way every unexpected exception, svc, machine check, or interrupt is handled properly. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:15 +02:00
Heiko Carstens	84f4e1dfb2	s390/boot: change initial program check handler to disabled wait psw The program check handler of the kernel image points to startup_pgm_check_handler. However an early program check which happens while loading the kernel image will jump to potentially random code, since the code of the program check handler is not yet loaded; leading to a program check loop. Therefore initialize it to a disabled wait psw and let the startup code set the proper psw when everything is in memory. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:15 +02:00
Heiko Carstens	734757976e	s390/head: adjust iplstart entry point Move iplstart entry point to 0x200 again, instead of the middle of the ipl code. This way even the comment describing the ccw program is correct again. Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:14 +02:00
Heiko Carstens	edd4a86673	s390/boot: get rid of startup archive The final kernel image is created by linking decompressor object files with a startup archive. The startup archive file however does not contain only optional code and data which can be discarded if not referenced. It also contains mandatory object data like head.o which must never be discarded, even if not referenced. Move the decompresser code and linker script to the boot directory and get rid of the startup archive so everything is kept during link time. Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:14 +02:00
Heiko Carstens	964bc5dbe6	s390/vx: remove comments from macros which break LLVM's IAS LLVM's integrated assembler does not like comments within macros: <instantiation>:3:19: error: too many positional arguments GR_NUM b2, 1 /* Base register */ ^ Remove them, since they are obvious anyway. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:14 +02:00
Heiko Carstens	68a971acc9	s390/extable: prefer local labels in .set directives Use local labels in .set directives to avoid potential compile errors with LTO + clang. See commit 334865b2915c ("x86/extable: Prefer local labels in .set directives") for further details. Since s390 doesn't support LTO currently this doesn't fix a real bug for now, but helps to avoid problems as soon as required pieces have been added to llvm. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:14 +02:00
Heiko Carstens	f9a3099f79	s390/nospec: prefer local labels in .set directives Use local labels in .set directives to avoid potential compile errors with LTO + clang. See commit 334865b2915c ("x86/extable: Prefer local labels in .set directives") for further details. Since s390 doesn't support LTO currently this doesn't fix a real bug for now, but helps to avoid problems as soon as required pieces have been added to llvm. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:13 +02:00
Julia Lawall	108ab40fc1	s390/hypfs: fix typos in comments Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Link: https://lore.kernel.org/r/20220430191122.8667-5-Julia.Lawall@inria.fr Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:13 +02:00
Julia Lawall	4b03b3ee60	s390/crypto: fix typos in comments Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Link: https://lore.kernel.org/r/20220430191122.8667-2-Julia.Lawall@inria.fr Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-06 20:45:13 +02:00
Christian Borntraeger	a06afe8383	KVM: s390: vsie/gmap: reduce gmap_rmap overhead there are cases that trigger a 2nd shadow event for the same vmaddr/raddr combination. (prefix changes, reboots, some known races) This will increase memory usages and it will result in long latencies when cleaning up, e.g. on shutdown. To avoid cases with a list that has hundreds of identical raddrs we check existing entries at insert time. As this measurably reduces the list length this will be faster than traversing the list at shutdown time. In the long run several places will be optimized to create less entries and a shrinker might be necessary. Fixes: 4be130a08420 ("s390/mm: add shadow gmap support") Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20220429151526.1560-1-borntraeger@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-03 15:15:44 +02:00
Janis Schoetterl-Glausch	b5d1274409	KVM: s390: Fix lockdep issue in vm memop Issuing a memop on a protected vm does not make sense, neither is the memory readable/writable, nor does it make sense to check storage keys. This is why the ioctl will return -EINVAL when it detects the vm to be protected. However, in order to ensure that the vm cannot become protected during the memop, the kvm->lock would need to be taken for the duration of the ioctl. This is also required because kvm_s390_pv_is_protected asserts that the lock must be held. Instead, don't try to prevent this. If user space enables secure execution concurrently with a memop it must accecpt the possibility of the memop failing. Still check if the vm is currently protected, but without locking and consider it a heuristic. Fixes: ef11c9463ae0 ("KVM: s390: Add vm IOCTL for key checked guest absolute memory access") Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20220322153204.2637400-1-scgl@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-05-02 19:45:03 +02:00
Matthew Wilcox (Oracle)	5d8de293c2	vmcore: convert copy_oldmem_page() to take an iov_iter Patch series "Convert vmcore to use an iov_iter", v5. For some reason several people have been sending bad patches to fix compiler warnings in vmcore recently. Here's how it should be done. Compile-tested only on x86. As noted in the first patch, s390 should take this conversion a bit further, but I'm not inclined to do that work myself. This patch (of 3): Instead of passing in a 'buf' and 'userbuf' argument, pass in an iov_iter. s390 needs more work to pass the iov_iter down further, or refactor, but I'd be more comfortable if someone who can test on s390 did that work. It's more convenient to convert the whole of read_from_oldmem() to take an iov_iter at the same time, so rename it to read_from_oldmem_iter() and add a temporary read_from_oldmem() wrapper that creates an iov_iter. Link: https://lkml.kernel.org/r/20220408090636.560886-1-bhe@redhat.com Link: https://lkml.kernel.org/r/20220408090636.560886-2-bhe@redhat.com Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-04-29 14:37:59 -07:00
Pingfan Liu	6260f6427c	s390/irq: utilize RCU instead of irq_lock_sparse() in show_msi_interrupt() As demonstrated by commit 74bdf7815dfb ("genirq: Speedup show_interrupts()"), irq_desc can be accessed safely in RCU read section. Hence here resorting to rcu read lock to get rid of irq_lock_sparse(). Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Link: https://lore.kernel.org/r/20220422100212.22666-1-kernelfans@gmail.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-27 12:53:34 +02:00
Sven Schnelle	8b202ee218	s390: disable -Warray-bounds gcc-12 shows a lot of array bound warnings on s390. This is caused by the S390_lowcore macro which uses a hardcoded address of 0. Wrapping that with absolute_pointer() works, but gcc no longer knows that a 12 bit displacement is sufficient to access lowcore. So it emits instructions like 'lghi %r1,0; l %rx,xxx(%r1)' instead of a single load/store instruction. As s390 stores variables often read/written in lowcore, this is considered problematic. Therefore disable -Warray-bounds on s390 for gcc-12 for the time being, until there is a better solution. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Link: https://lore.kernel.org/r/yt9dzgkelelc.fsf@linux.ibm.com Link: https://lore.kernel.org/r/20220422134308.1613610-1-svens@linux.ibm.com Link: https://lore.kernel.org/r/20220425121742.3222133-1-svens@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-27 12:53:07 +02:00
Guo Ren	84a0c977ab	asm-generic: compat: Cleanup duplicate definitions There are 7 64bit architectures that support Linux COMPAT mode to run 32bit applications. A lot of definitions are duplicate: - COMPAT_USER_HZ - COMPAT_RLIM_INFINITY - COMPAT_OFF_T_MAX - __compat_uid_t, __compat_uid_t - compat_dev_t - compat_ipc_pid_t - struct compat_flock - struct compat_flock64 - struct compat_statfs - struct compat_ipc64_perm, compat_semid64_ds, compat_msqid64_ds, compat_shmid64_ds Cleanup duplicate definitions and merge them into asm-generic. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Helge Deller <deller@gmx.de> # parisc Link: https://lore.kernel.org/r/20220405071314.3225832-7-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-04-26 13:35:54 -07:00
Guo Ren	f18ed30db2	fs: stat: compat: Add __ARCH_WANT_COMPAT_STAT RISC-V doesn't neeed compat_stat, so using __ARCH_WANT_COMPAT_STAT to exclude unnecessary SYSCALL functions. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Helge Deller <deller@gmx.de> # parisc Link: https://lore.kernel.org/r/20220405071314.3225832-6-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-04-26 13:35:45 -07:00
Guo Ren	0cbed0ee1d	arch: Add SYSVIPC_COMPAT for all architectures The existing per-arch definitions are pretty much historic cruft. Move SYSVIPC_COMPAT into init/Kconfig. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Helge Deller <deller@gmx.de> # parisc Link: https://lore.kernel.org/r/20220405071314.3225832-5-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-04-26 13:35:37 -07:00
Christoph Hellwig	3ce0f2373f	compat: consolidate the compat_flock{,64} definition Provide a single common definition for the compat_flock and compat_flock64 structures using the same tricks as for the native variants. Another extra define is added for the packing required on x86. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Helge Deller <deller@gmx.de> # parisc Link: https://lore.kernel.org/r/20220405071314.3225832-4-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-04-26 13:35:28 -07:00
Christoph Hellwig	306f7cc1e9	uapi: always define F_GETLK64/F_SETLK64/F_SETLKW64 in fcntl.h The F_GETLK64/F_SETLK64/F_SETLKW64 fcntl opcodes are only implemented for the 32-bit syscall APIs, but are also needed for compat handling on 64-bit kernels. Consolidate them in unistd.h instead of definining the internal compat definitions in compat.h, which is rather error prone (e.g. parisc gets the values wrong currently). Note that before this change they were never visible to userspace due to the fact that CONFIG_64BIT is only set for kernel builds. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20220405071314.3225832-3-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-04-26 13:35:20 -07:00
Ilya Leoshkevich	9a07731702	s390: add KCSAN instrumentation to barriers and spinlocks test_barrier fails on s390 because of the missing KCSAN instrumentation for several synchronization primitives. Add it to barriers by defining __mb(), __rmb(), __wmb(), __dma_rmb() and __dma_wmb(), and letting the common code in asm-generic/barrier.h do the rest. Spinlocks require instrumentation only on the unlock path; notify KCSAN that the CPU cannot move memory accesses outside of the spin lock. In reality it also cannot move stores inside of it, but this is not important and can be omitted. Reported-by: Tobias Huschle <huschle@linux.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:16 +02:00
Niklas Schnelle	34fb0e7034	s390/pci: add error record for CC 2 retries Currently it is not detectable from within Linux when PCI instructions are retried because of a busy condition. Detecting such conditions and especially how long they lasted can however be quite useful in problem determination. This patch enables this by adding an s390dbf error log when a CC 2 is first encountered as well as after the retried instruction. Despite being unlikely it may be possible that these added debug messages drown out important other messages so allow setting the debug level in zpci_err_insn*() and set their level to 1 so they can be filtered out if need be. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:15 +02:00
Niklas Schnelle	cde8833e40	s390/pci: add PCI access type and length to error records Currently when a PCI instruction returns a non-zero condition code it can be very hard to tell from the s390dbf logs what kind of instruction was executed. In case of PCI memory I/O (MIO) instructions it is even impossible to tell if we attempted a load, store or block store or how large the access was because only the address is logged. Improve this by adding an indicator byte for the instruction type to the error record and also store the length of the access for MIO instructions where this can not be deduced from the request. We use the following indicator values: - 'l': PCI load - 's': PCI store - 'b': PCI store block - 'L': PCI load (MIO) - 'S': PCI store (MIO) - 'B': PCI store block (MIO) - 'M': MPCIFC - 'R': RPCIT Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:15 +02:00
Niklas Schnelle	723b5a9d2b	s390/pci: don't log availability events as errors Availability events are logged in s390dbf in s390dbf/pci_error/hex_ascii even though they don't indicate an error condition. They have also become redundant as commit 6526a597a2e85 ("s390/pci: add simpler s390dbf traces for events") added an s390dbf/pci_msg/sprintf log entry for availability events which contains all non reserved fields of struct zpci_ccdf_avail. On the other hand the availability entries in the error log make it easy to miss actual errors and may even overwrite error entries if the message buffer wraps. Thus simply remove the availability events from the error log thereby establishing the rule that any content in s390dbf/pci_error indicates some kind of error. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:15 +02:00
Niklas Schnelle	52c79e636a	s390/pci: make better use of zpci_dbg() levels While the zpci_dbg() macro offers a level parameter this is currently largely unused. The only instance with higher importance than 3 is the UID checking change debug message which is not actually more important as the UID uniqueness guarantee is already exposed in sysfs so this should rather be 3 as well. On the other hand the "add ..." message which shows what devices are visible at the lowest level is essential during problem determination. By setting its level to 1, lowering the debug level can act as a filter to only show the available functions. On the error side the default level is set to 6 while all existing messages are printed at level 0. This is inconsistent and means there is no room for having messages be invisible on the default level so instead set the default level to 3 like for errors matching the default for debug messages. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:15 +02:00
Sven Schnelle	41cd81abaf	s390/vdso: add vdso randomization Randomize the address of vdso if randomize_va_space is enabled. Note that this keeps the vdso address on the same PMD as the stack to avoid allocating an extra page table just for vdso. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:15 +02:00
Sven Schnelle	9e37a2e854	s390/vdso: map vdso above stack In the current code vdso is mapped below the stack. This is problematic when programs mapped to the top of the address space are allocating a lot of memory, because the heap will clash with the vdso. To avoid this map the vdso above the stack and move STACK_TOP so that it all fits into three level paging. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:14 +02:00
Sven Schnelle	57761da4dc	s390/vdso: move vdso mapping to its own function This is a preparation patch for adding vdso randomization to s390. It adds a function vdso_size(), which will be used later in calculating the STACK_TOP value. It also moves the vdso mapping into a new function vdso_map(), to keep the code similar to other architectures. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:14 +02:00
Sven Schnelle	f2f47d0ef7	s390/mmap: increase stack/mmap gap to 128MB This basically reverts commit 9e78a13bfb16 ("[S390] reduce miminum gap between stack and mmap_base"). 32MB is not enough space between stack and mmap for some programs. Given that compat task aren't common these days, lets revert back to 128MB. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:14 +02:00
Harald Freudenberger	2004b57cde	s390/zcrypt: code cleanup This patch tries to fix as much as possible of the checkpatch.pl --strict findings: CHECK: Logical continuations should be on the previous line CHECK: No space is necessary after a cast CHECK: Alignment should match open parenthesis CHECK: 'useable' may be misspelled - perhaps 'usable'? WARNING: Possible repeated word: 'is' CHECK: spaces preferred around that '' (ctx:VxV) CHECK: Comparison to NULL could be written "!msg" CHECK: Prefer kzalloc(sizeof(zc)...) over kzalloc(sizeof(struct...)...) CHECK: Unnecessary parentheses around resp_type->work CHECK: Avoid CamelCase: <xcRB> There is no functional change comming with this patch, only code cleanup, renaming, whitespaces, indenting, ... but no semantic change in any way. Also the API (zcrypt and pkey header file) is semantically unchanged. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:14 +02:00
Harald Freudenberger	6acb086d9f	s390/zcrypt: cleanup CPRB struct definitions This patch does a little cleanup on the CPRBX struct in zcrypt.h and the redundant CPRB struct definition in zcrypt_msgtype6.c. Especially some of the misleading fields from the CPRBX struct have been removed. There is no semantic change coming with this patch. The field names changed in the XCRB struct are only related to reserved fields which should never been used. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:13 +02:00
Haowen Bai	4da75a7fd0	s390/cio: simplify the calculation of variables Fix the following coccicheck warnings: ./arch/s390/include/asm/scsw.h:695:47-49: WARNING !A \|\| A && B is equivalent to !A \|\| B I apply a readable version just to get rid of a warning. Signed-off-by: Haowen Bai <baihaowen@meizu.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Link: https://lore.kernel.org/r/1649297808-5048-1-git-send-email-baihaowen@meizu.com Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:13 +02:00
Alexander Gordeev	7714e16f79	s390/smp: sort out physical vs virtual CPU0 lowcore pointer SPX instruction called from set_prefix() expects physical address of the lowcore to be installed, but instead the virtual address is passed. Note: this does not fix a bug currently, since virtual and physical addresses are identical. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:13 +02:00
Alexander Egorenkov	2ba24343bd	s390/kexec: set end-of-ipl flag in last diag308 call If the facility IPL-complete-control is present then the last diag308 call made by kexec shall set the end-of-ipl flag in the subcode register to signal the hypervisor that this is the last diag308 call made by Linux. Only the diag308 calls made during a regular kexec need to set the end-of-ipl flag, in all other cases the hypervisor will ignore it. Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:12 +02:00
Alexander Egorenkov	1b553839e1	s390/sclp: add detection of IPL-complete-control facility The presence of the IPL-complete-control facility can be derived from the hypervisor's SCLP info response. Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-25 13:54:12 +02:00
Linus Torvalds	bb4ce2c658	RISC-V: * Remove 's' & 'u' as valid ISA extension * Do not allow disabling the base extensions 'i'/'m'/'a'/'c' x86: * Fix NMI watchdog in guests on AMD * Fix for SEV cache incoherency issues * Don't re-acquire SRCU lock in complete_emulated_io() * Avoid NULL pointer deref if VM creation fails * Fix race conditions between APICv disabling and vCPU creation * Bugfixes for disabling of APICv * Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume selftests: * Do not use bitfields larger than 32-bits, they differ between GCC and clang -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJi3KUUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMhvQf/Yncfg3MkOvKsVxnCe7diKDTI/E2n wBGNIcL8r7L9oIltHL4Mh7JQTacHFQOZ9PQ30NO1p+pznZ03e8LR59IF1JpP7VOU sWrLZ5a4bIAEjOpA7Jxcee6hUBwewBauDgFLbb+YAI2lAahiH7jVfywDRife/c3k N2LjeA75K8UvMiDCfjxxxerFJK91zaqjWlUNF2OhtFp/5pnMfS+nli9Q8QS837pZ oUf+0Beb2RpSHan+wbYVU7X3ZLwtpR0M3w3uXOG+X3as56wDf26znXS02aSwa45x lfX+pqJfmb4vCJJDXt6avH27EVgTq0Vew+BhQHG3VLRO6uxZ+smX6qmsuw== =kvbw -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "The main and larger change here is a workaround for AMD's lack of cache coherency for encrypted-memory guests. I have another patch pending, but it's waiting for review from the architecture maintainers. RISC-V: - Remove 's' & 'u' as valid ISA extension - Do not allow disabling the base extensions 'i'/'m'/'a'/'c' x86: - Fix NMI watchdog in guests on AMD - Fix for SEV cache incoherency issues - Don't re-acquire SRCU lock in complete_emulated_io() - Avoid NULL pointer deref if VM creation fails - Fix race conditions between APICv disabling and vCPU creation - Bugfixes for disabling of APICv - Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume selftests: - Do not use bitfields larger than 32-bits, they differ between GCC and clang" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: selftests: introduce and use more page size-related constants kvm: selftests: do not use bitfields larger than 32-bits for PTEs KVM: SEV: add cache flush to solve SEV cache incoherency issues KVM: SVM: Flush when freeing encrypted pages even on SME_COHERENT CPUs KVM: SVM: Simplify and harden helper to flush SEV guest page(s) KVM: selftests: Silence compiler warning in the kvm_page_table_test KVM: x86/pmu: Update AMD PMC sample period to fix guest NMI-watchdog x86/kvm: Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume KVM: SPDX style and spelling fixes KVM: x86: Skip KVM_GUESTDBG_BLOCKIRQ APICv update if APICv is disabled KVM: x86: Pend KVM_REQ_APICV_UPDATE during vCPU creation to fix a race KVM: nVMX: Defer APICv updates while L2 is active until L1 is active KVM: x86: Tag APICv DISABLE inhibit, not ABSENT, if APICv is disabled KVM: Initialize debugfs_dentry when a VM is created to avoid NULL deref KVM: Add helpers to wrap vcpu->srcu_idx and yell if it's abused KVM: RISC-V: Use kvm_vcpu.srcu_idx, drop RISC-V's unnecessary copy KVM: x86: Don't re-acquire SRCU lock in complete_emulated_io() RISC-V: KVM: Restrict the extensions that can be disabled RISC-V: KVM: Remove 's' & 'u' as valid ISA extension	2022-04-22 17:58:36 -07:00
Sean Christopherson	2031f28768	KVM: Add helpers to wrap vcpu->srcu_idx and yell if it's abused Add wrappers to acquire/release KVM's SRCU lock when stashing the index in vcpu->src_idx, along with rudimentary detection of illegal usage, e.g. re-acquiring SRCU and thus overwriting vcpu->src_idx. Because the SRCU index is (currently) either 0 or 1, illegal nesting bugs can go unnoticed for quite some time and only cause problems when the nested lock happens to get a different index. Wrap the WARNs in PROVE_RCU=y, and make them ONCE, otherwise KVM will likely yell so loudly that it will bring the kernel to its knees. Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20220415004343.2203171-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-21 13:16:11 -04:00
Song Liu	559089e0a9	vmalloc: replace VM_NO_HUGE_VMAP with VM_ALLOW_HUGE_VMAP Huge page backed vmalloc memory could benefit performance in many cases. However, some users of vmalloc may not be ready to handle huge pages for various reasons: hardware constraints, potential pages split, etc. VM_NO_HUGE_VMAP was introduced to allow vmalloc users to opt-out huge pages. However, it is not easy to track down all the users that require the opt-out, as the allocation are passed different stacks and may cause issues in different layers. To address this issue, replace VM_NO_HUGE_VMAP with an opt-in flag, VM_ALLOW_HUGE_VMAP, so that users that benefit from huge pages could ask specificially. Also, remove vmalloc_no_huge() and add opt-in helper vmalloc_huge(). Fixes: fac54e2bfb5b ("x86/Kconfig: Select HAVE_ARCH_HUGE_VMALLOC with HAVE_ARCH_HUGE_VMAP") Link: https://lore.kernel.org/netdev/14444103-d51b-0fb3-ee63-c3f182f0b546@molgen.mpg.de/" Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org> Reviewed-by: Rik van Riel <riel@surriel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-19 12:08:57 -07:00
Christoph Hellwig	c6af2aa9ff	swiotlb: make the swiotlb_init interface more useful Pass a boolean flag to indicate if swiotlb needs to be enabled based on the addressing needs, and replace the verbose argument with a set of flags, including one to force enable bounce buffering. Note that this patch removes the possibility to force xen-swiotlb use with the swiotlb=force parameter on the command line on x86 (arm and arm64 never supported that), but this interface will be restored shortly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2022-04-18 07:21:11 +02:00
Sven Schnelle	c68c634293	s390: enable CONFIG_HARDENED_USERCOPY in debug_defconfig Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-12 11:56:08 +02:00
Sven Schnelle	30de14b188	s390: current_stack_pointer shouldn't be a function s390 defines current_stack_pointer as function while all other architectures use 'register unsigned long asm("<stackptr reg>"). This make codes like the following from check_stack_object() fail: if (IS_ENABLED(CONFIG_STACK_GROWSUP)) { if ((void )current_stack_pointer < obj + len) return BAD_STACK; } else { if (obj < (void )current_stack_pointer) return BAD_STACK; } because this would compare the address of current_stack_pointer() and not the stackpointer value. Reported-by: Karsten Graul <kgraul@linux.ibm.com> Fixes: 2792d84e6da5 ("usercopy: Check valid lifetime via stack depth") Cc: Kees Cook <keescook@chromium.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>	2022-04-12 11:56:08 +02:00

... 3 4 5 6 7 ...

8694 Commits