IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
There are many different ways that access_ok() is defined across
architectures, but in the end, they all just compare against the
user_addr_max() value or they accept anything.
Provide one definition that works for most architectures, checking
against TASK_SIZE_MAX for user processes or skipping the check inside
of uaccess_kernel() sections.
For architectures without CONFIG_SET_FS(), this should be the fastest
check, as it comes down to a single comparison of a pointer against a
compile-time constant, while the architecture specific versions tend to
do something more complex for historic reasons or get something wrong.
Type checking for __user annotations is handled inconsistently across
architectures, but this is easily simplified as well by using an inline
function that takes a 'const void __user *' argument. A handful of
callers need an extra __user annotation for this.
Some architectures had trick to use 33-bit or 65-bit arithmetic on the
addresses to calculate the overflow, however this simpler version uses
fewer registers, which means it can produce better object code in the
end despite needing a second (statically predicted) branch.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Mark Rutland <mark.rutland@arm.com> [arm64, asm-generic]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Stafford Horne <shorne@gmail.com>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Nine architectures are still missing __{get,put}_kernel_nofault:
alpha, ia64, microblaze, nds32, nios2, openrisc, sh, sparc32, xtensa.
Add a generic version that lets everything use the normal
copy_{from,to}_kernel_nofault() code based on these, removing the last
use of get_fs()/set_fs() from architecture-independent code.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
asm/shmbuf.h is currently excluded from the UAPI compile-test because of
the errors like follows:
HDRTEST usr/include/asm/shmbuf.h
In file included from ./usr/include/asm/shmbuf.h:6,
from <command-line>:
./usr/include/asm-generic/shmbuf.h:26:33: error: field ‘shm_perm’ has incomplete type
26 | struct ipc64_perm shm_perm; /* operation perms */
| ^~~~~~~~
./usr/include/asm-generic/shmbuf.h:27:9: error: unknown type name ‘size_t’
27 | size_t shm_segsz; /* size of segment (bytes) */
| ^~~~~~
./usr/include/asm-generic/shmbuf.h:40:9: error: unknown type name ‘__kernel_pid_t’
40 | __kernel_pid_t shm_cpid; /* pid of creator */
| ^~~~~~~~~~~~~~
./usr/include/asm-generic/shmbuf.h:41:9: error: unknown type name ‘__kernel_pid_t’
41 | __kernel_pid_t shm_lpid; /* pid of last operator */
| ^~~~~~~~~~~~~~
The errors can be fixed by replacing size_t with __kernel_size_t and by
including proper headers.
Then, remove the no-header-test entry from user/include/Makefile.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
linux/signal.h and asm/signal.h are currently excluded from the UAPI
compile-test because of the errors like follows:
HDRTEST usr/include/asm/signal.h
In file included from <command-line>:
./usr/include/asm/signal.h:103:9: error: unknown type name ‘size_t’
103 | size_t ss_size;
| ^~~~~~
The errors can be fixed by replacing size_t with __kernel_size_t.
Then, remove the no-header-test entries from user/include/Makefile.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
dereference_function_descriptor() and
dereference_kernel_function_descriptor() are identical on the
three architectures implementing them.
Make them common and put them out-of-line in kernel/extable.c
which is one of the users and has similar type of functions.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/449db09b2eba57f4ab05f80102a67d8675bc8bcd.1644928018.git.christophe.leroy@csgroup.eu
We have three architectures using function descriptors, each with its
own type and name.
Add a common typedef that can be used in generic code.
Also add a stub typedef for architecture without function descriptors,
to avoid a forest of #ifdefs.
It replaces the similar 'func_desc_t' previously defined in
arch/powerpc/kernel/module_64.c
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f1f91b142b3c1082bdc1586ce71c9bac1e75213c.1644928018.git.christophe.leroy@csgroup.eu
Replace HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR by a config option
named CONFIG_HAVE_FUNCTION_DESCRIPTORS and use it instead of
'dereference_function_descriptor' macro to know whether an
arch has function descriptors.
To limit churn in one of the following patches, use
an #ifdef/#else construct with empty first part
instead of an #ifndef in asm-generic/sections.h
On powerpc, make sure the config option matches the ABI used
by the compiler with a BUILD_BUG_ON() and add missing _CALL_ELF=2
when calling 'sparse' so that sparse sees the same piece of
code as GCC.
And include a helper to check whether an arch has function
descriptors or not : have_function_descriptors()
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4a0f11fb0ea74a3197bc44dd7ba25e53a24fd03d.1644928018.git.christophe.leroy@csgroup.eu
After commit 4b9d2a731c3d ("parisc: Switch user access functions
to signal errors in r29 instead of r8") bash suddenly started
to report those warnings after login:
-bash: cannot set terminal process group (-1): Bad file descriptor
-bash: no job control in this shell
It turned out, that a function call inside a put_user(), e.g.:
put_user(vt_do_kdgkbmode(console), (int __user *)arg);
clobbered the error register (r29) and thus the put_user() call itself
seem to have failed.
Rearrange the C-code to pre-calculate the intermediate value
and then do the put_user().
Additionally prefer the "+" constraint on pu_err and gu_err registers
to tell the compiler that those operands are both read and written by
the assembly instruction.
Reported-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 4b9d2a731c3d ("parisc: Switch user access functions to signal errors in r29 instead of r8")
Signed-off-by: Helge Deller <deller@gmx.de>
It happens quite often that people use the wrong compiler to build the
kernel:
make ARCH=parisc -> builds the 32-bit kernel
make ARCH=parisc64 -> builds the 64-bit kernel
This patch adds a sanity check which errors out with an instruction how
use the correct ARCH= option.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.15+
Add the SO_TXREHASH socket option to control hash rethink behavior per socket.
When default mode is set, sockets disable rehash at initialization and use
sysctl option when entering listen state. setsockopt() overrides default
behavior.
Signed-off-by: Akhmat Karakotov <hmukos@yandex-team.ru>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The empty unmap_page_from_agp() macro causes a warning when
building with 'make W=1' on a couple of architectures:
drivers/char/agp/generic.c: In function 'agp_generic_destroy_page':
drivers/char/agp/generic.c:1265:28: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
1265 | unmap_page_from_agp(page);
Change the definitions to a 'do { } while (0)' construct to
make these more reliable.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Helge Deller <deller@gmx.de>
-----BEGIN PGP SIGNATURE-----
iQHJBAABCgAzFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmHi+xgVHHl1cnkubm9y
b3ZAZ21haWwuY29tAAoJELFEgP06H77IxdoMAMf3E+L51Ys/4iAiyJQNVoT3aIBC
A8ZVOB9he1OA3o3wBNIRKmICHk+ovnfCWcXTr9fG/Ade2wJz88NAsGPQ1Phywb+s
iGlpySllFN72RT9ZqtJhLEzgoHHOL0CzTW07TN9GJy4gQA2h2G9CTP+OmsQdnVqE
m9Fn3PSlJ5lhzePlKfnln8rGZFgrriJakfEFPC79n/7an4+2Hvkb5rWigo7KQc4Z
9YNqYUcHWZFUgq80adxEb9LlbMXdD+Z/8fCjOrAatuwVkD4RDt6iKD0mFGjHXGL7
MZ9KRS8AfZXawmetk3jjtsV+/QkeS+Deuu7k0FoO0Th2QV7BGSDhsLXAS5By/MOC
nfSyHhnXHzCsBMyVNrJHmNhEZoN29+tRwI84JX9lWcf/OLANcCofnP6f2UIX7tZY
CAZAgVELp+0YQXdybrfzTQ8BT3TinjS/aZtCrYijRendI1GwUXcyl69vdOKqAHuk
5jy8k/xHyp+ZWu6v+PyAAAEGowY++qhL0fmszA==
=RKW4
-----END PGP SIGNATURE-----
Merge tag 'bitmap-5.17-rc1' of git://github.com/norov/linux
Pull bitmap updates from Yury Norov:
- introduce for_each_set_bitrange()
- use find_first_*_bit() instead of find_next_*_bit() where possible
- unify for_each_bit() macros
* tag 'bitmap-5.17-rc1' of git://github.com/norov/linux:
vsprintf: rework bitmap_list_string
lib: bitmap: add performance test for bitmap_print_to_pagebuf
bitmap: unify find_bit operations
mm/percpu: micro-optimize pcpu_is_populated()
Replace for_each_*_bit_from() with for_each_*_bit() where appropriate
find: micro-optimize for_each_{set,clear}_bit()
include/linux: move for_each_bit() macros from bitops.h to find.h
cpumask: replace cpumask_next_* with cpumask_first_* where appropriate
tools: sync tools/bitmap with mother linux
all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate
cpumask: use find_first_and_bit()
lib: add find_first_and_bit()
arch: remove GENERIC_FIND_FIRST_BIT entirely
include: move find.h from asm_generic to linux
bitops: move find_bit_*_le functions from le.h to find.h
bitops: protect find_first_{,zero}_bit properly
Fix a missing prototype warning noticed by the kernel test robot.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
find_bit API and bitmap API are closely related, but inclusion paths
are different - include/asm-generic and include/linux, correspondingly.
In the past it made a lot of troubles due to circular dependencies
and/or undefined symbols. Fix this by moving find.h under include/linux.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
This patch fixes the following build error for source file
drivers/scsi/pcmcia/sym53c500_cs.c:
In file included from ./include/linux/bug.h:5,
from ./include/linux/cpumask.h:14,
from ./include/linux/mm_types_task.h:14,
from ./include/linux/mm_types.h:5,
from ./include/linux/buildid.h:5,
from ./include/linux/module.h:14,
from drivers/scsi/pcmcia/sym53c500_cs.c:42:
drivers/scsi/pcmcia/sym53c500_cs.c: In function ‘SYM53C500_intr’:
./arch/parisc/include/asm/bug.h:28:2: error: expected expression before ‘do’
28 | do { \
| ^~
./arch/parisc/include/asm/io.h:276:20: note: in expansion of macro ‘BUG’
276 | #define outb(x, y) BUG()
| ^~~
drivers/scsi/pcmcia/sym53c500_cs.c:124:19: note: in expansion of macro ‘outb’
124 | #define REG0(x) (outb(C4_IMG, (x) + CONFIG4))
| ^~~~
drivers/scsi/pcmcia/sym53c500_cs.c:362:2: note: in expansion of macro ‘REG0’
362 | REG0(port_base);
| ^~~~
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Helge Deller <deller@gmx.de>
The definitions for pdc_toc_pim_11 and pdc_toc_pim_20 are wrong since they
include an entry for a hversion field which doesn't exist in the specification.
Fix this and clean up some whitespaces so that the whole file will be in
sync with it's copy in the SeaBIOS-hppa sources.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.16
The parisc architecture lacks general hardware support for compare and swap.
Particularly for userspace, it is difficult to implement software atomic
support. Page faults in critical regions can cause processes to sleep and
block the forward progress of other processes. Thus, it is essential that
page faults be disabled in critical regions. For performance reasons, we
also need to disable external interrupts in critical regions.
In order to do this, we need a mechanism to trigger COW breaks outside the
critical region. Fortunately, parisc has the "stbys,e" instruction. When
the leftmost byte of a word is addressed, this instruction triggers all
the exceptions of a normal store but it does not write to memory. Thus,
we can use it to trigger COW breaks outside the critical region without
modifying the data that is to be updated atomically.
COW breaks occur randomly. So even if we have priviously executed a "stbys,e"
instruction, we still need to disable pagefaults around the critical region.
If a fault occurs in the critical region, we return -EAGAIN. I had to add
a wrapper around _arch_futex_atomic_op_inuser() as I found in testing that
returning -EAGAIN caused problems for some processes even though it is
listed as a possible return value.
The patch implements the above. The code no longer attempts to sleep with
interrupts disabled and I haven't seen any stalls with the change.
I have attempted to merge common code and streamline the fast path. In the
futex code, we only compute the spinlock address once.
I eliminated some debug code in the original CAS routine that just made the
flow more complicated.
I don't clip the arguments when called from wide mode. As a result, the LWS
routines should work when called from 64-bit processes.
I defined TASK_PAGEFAULT_DISABLED offset for use in the lws_pagefault_disable
and lws_pagefault_enable macros.
Since we now disable interrupts on the gateway page where necessary, it
might be possible to allow processes to be scheduled when they are on the
gateway page.
Change has been tested on c8000 and rp3440. It improves glibc build and test
time by about 10%.
In v2, I removed the lws_atomic_xchg and and lws_atomic_store calls. I
also removed the bug fixes that were not directly related to this patch.
In v3, I removed the code to force interruptions from
arch_futex_atomic_op_inuser(). It is always called with page faults
disabled, so this code had no effect.
In v4, I fixed a typo in depi_safe line.
In v5, I moved the code to disable/enable page faults inside the spinlocks.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Use register r29 instead of register r8 to signal faults when accessing
user memory. In case of faults, the fixup routine will store -EFAULT in
this register.
This change saves up to 752 bytes on a 32bit kernel, partly because the
compiler doesn't need to save and restore the old r8 value on the stack.
bloat-o-meter results for usage with r29 register:
add/remove: 0/0 grow/shrink: 23/86 up/down: 228/-980 (-752)
bloat-o-meter results for usage with r28 register:
add/remove: 0/0 grow/shrink: 28/83 up/down: 296/-956 (-660)
Signed-off-by: Helge Deller <deller@gmx.de>
While working on the rewrite to the light-weight syscall and futex code, I
experimented with using a hash index based on the user physical address of
atomic variable. This exposed two problems with the lpa and lpa_user defines.
Because of the copy instruction, the pa argument needs to be an early clobber
argument. This prevents gcc from allocating the va and pa arguments to the same
register.
Secondly, the lpa instruction can cause a page fault so we need to catch
exceptions.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Fixes: 116d753308cf ("parisc: Use lpa instruction to load physical addresses in driver code")
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.2+
The depi instruction is similar to the extru instruction on 64-bit machines.
It leaves the most-significant 32 bits of the target register in an undefined
state. On 64-bit machines, the macro uses depdi to perform safe deposits in
the least-significant 32 bits.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
The address bits used to select the futex spinlock need to match those used in
the LWS code in syscall.S. The mask 0x3f8 only selects 7 bits. It should
select 8 bits.
This change fixes the glibc nptl/tst-cond24 and nptl/tst-cond25 tests.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Fixes: 53a42b6324b8 ("parisc: Switch to more fine grained lws locks")
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Helge Deller <deller@gmx.de>
- Fix compilation warnings on csky and sparc
- Rename multipage folios to large folios
- Rename AS_THP_SUPPORT and FS_THP_SUPPORT
- Add functions to zero portions of a folio
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmGem6wACgkQDpNsjXcp
gj7uvwgAjNqDWOVgwYU98daN6nKQQf5Vv35f0bzeKcKcHIOEWZ2+MUeXkI55h8TD
ss5L3O86sPtQmpKUQJJChZC4AhpIPRyjPA0JW6vYqXQd912M331WpGgFFyX5eI+3
OxfKLRULmopeWP1RjWmkWqlhYQHL5OLgAMC4VaBSfDHd1UMRf+F9JNm9qR7GCp9Q
Vb0qcmBMaQYt/K5sWRQyPUACVTF+27RLKAs+Om37NGekv1UqgOPMzi9nAyi9RjCi
rRY6oGupNgC+Y41jzlpaNoL71RPS92H769FBh/Fe4qu55VSPjfcN77qAnVhX5Ykn
4RhzZcEUoqlx9xG9xynk0mmbx2Bf4g==
=kvqM
-----END PGP SIGNATURE-----
Merge tag 'folio-5.16b' of git://git.infradead.org/users/willy/pagecache
Pull folio fixes from Matthew Wilcox:
"In the course of preparing the folio changes for iomap for next merge
window, we discovered some problems that would be nice to address now:
- Renaming multi-page folios to large folios.
mapping_multi_page_folio_support() is just a little too long, so we
settled on mapping_large_folio_support(). That meant renaming, eg
folio_test_multi() to folio_test_large().
Rename AS_THP_SUPPORT to match
- I hadn't included folio wrappers for zero_user_segments(), etc.
Also, multi-page^W^W large folio support is now independent of
CONFIG_TRANSPARENT_HUGEPAGE, so machines with HIGHMEM always need
to fall back to the out-of-line zero_user_segments().
Remove FS_THP_SUPPORT to match
- The build bots finally got round to telling me that I missed a
couple of architectures when adding flush_dcache_folio(). Christoph
suggested that we just add linux/cacheflush.h and not rely on
asm-generic/cacheflush.h"
* tag 'folio-5.16b' of git://git.infradead.org/users/willy/pagecache:
mm: Add functions to zero portions of a folio
fs: Rename AS_THP_SUPPORT and mapping_thp_support
fs: Remove FS_THP_SUPPORT
mm: Remove folio_test_single
mm: Rename folio_test_multi to folio_test_large
Add linux/cacheflush.h
The extru instruction leaves the most significant 32 bits of the
target register in an undefined state on PA 2.0 systems.
Provide a macro to safely use extru on 32- and 64-bit machines.
Suggested-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
This reverts commit e4f2006f1287e7ea17660490569cff323772dac4.
This patch shows problems with signal handling. Revert it for now.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v5.15
Building allmodconfig shows errors in the gpu/drm/msm snapdragon drivers,
because a COND() define is used there which conflicts with the COND() for
PA-RISC assembly. Although the snapdragon driver isn't relevant for parisc, it
is nevertheless compiled when CONFIG_COMPILE_TEST is defined.
Move the COND() define and other PA-RISC mnemonics inside the #ifdef
__ASSEMBLY__ part to avoid this conflict.
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: kernel test robot <lkp@intel.com>
Include stringify.h to avoid this build error:
arch/parisc/include/asm/jump_label.h: error: expected ':' before '__stringify'
arch/parisc/include/asm/jump_label.h: error: label 'l_yes' defined but not used [-Werror=unused-label]
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: kernel test robot <lkp@intel.com>
Many architectures do not include asm-generic/cacheflush.h, so turn
the includes on their head and add linux/cacheflush.h which includes
asm/cacheflush.h.
Move the flush_dcache_folio() declaration from asm-generic/cacheflush.h
to linux/cacheflush.h and change linux/highmem.h to include
linux/cacheflush.h instead of asm/cacheflush.h so that all necessary
places will see flush_dcache_folio().
More functions should have their default implementations moved in the
future, but those are for follow-on patches. This fixes csky, sparc and
sparc64 which were missed in the commit which added flush_dcache_folio().
Fixes: 08b0b0059bf1 ("mm: Add flush_dcache_folio()")
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
For years, there have been random segmentation faults in userspace on
SMP PA-RISC machines. It occurred to me that this might be a problem in
set_pte_at(). MIPS and some other architectures do cache flushes when
installing PTEs with the present bit set.
Here I have adapted the code in update_mmu_cache() to flush the kernel
mapping when the kernel flush is deferred, or when the kernel mapping
may alias with the user mapping. This simplifies calls to
update_mmu_cache().
I also changed the barrier in set_pte() from a compiler barrier to a
full memory barrier. I know this change is not sufficient to fix the
problem. It might not be needed.
I have had a few days of operation with 5.14.16 to 5.15.1 and haven't
seen any random segmentation faults on rp3440 or c8000 so far.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@kernel.org # 5.12+
In commit 2214c0e77259 ("parisc: Move thread_info into task struct")
PA-RISC gained support for THREAD_INFO_IN_TASK while changes were
already underway to keep the CPU field in thread_info rather than move
it into task_struct when THREAD_INFO_IN_TASK is enabled. The result is a
broken build for all PA-RISC configs that enable SMP.
So let's partially revert that commit, and get rid of the ugly hack to
get at the offset of task_struct::cpu without having to include
linux/sched.h, and put the CPU field back where it was before.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: bcf9033e5449 ("sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Helge Deller <deller@gmx.de>
I no longer think interrupts can be disabled in the futex and cmpxchg
operations because of COW breaks. This not ideal but I suspect it's the
best we can do.
For the cmpxchg operations in syscall.S, we rely on the code to not
schedule off the gateway page. For the futex, I added code to disable
preemption.
So far, I haven't seen the warnings with the attached change but the
change is only lightly tested.
Signed-off-by: Dave Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
- Remove socket skb caches
- Add a SO_RESERVE_MEM socket op to forward allocate buffer space
and avoid memory accounting overhead on each message sent
- Introduce managed neighbor entries - added by control plane and
resolved by the kernel for use in acceleration paths (BPF / XDP
right now, HW offload users will benefit as well)
- Make neighbor eviction on link down controllable by userspace
to work around WiFi networks with bad roaming implementations
- vrf: Rework interaction with netfilter/conntrack
- fq_codel: implement L4S style ce_threshold_ect1 marking
- sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
BPF:
- Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
as implemented in LLVM14
- Introduce bpf_get_branch_snapshot() to capture Last Branch Records
- Implement variadic trace_printk helper
- Add a new Bloomfilter map type
- Track <8-byte scalar spill and refill
- Access hw timestamp through BPF's __sk_buff
- Disallow unprivileged BPF by default
- Document BPF licensing
Netfilter:
- Introduce egress hook for looking at raw outgoing packets
- Allow matching on and modifying inner headers / payload data
- Add NFT_META_IFTYPE to match on the interface type either from
ingress or egress
Protocols:
- Multi-Path TCP:
- increase default max additional subflows to 2
- rework forward memory allocation
- add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
- MCTP flow support allowing lower layer drivers to configure msg
muxing as needed
- Automatic Multicast Tunneling (AMT) driver based on RFC7450
- HSR support the redbox supervision frames (IEC-62439-3:2018)
- Support for the ip6ip6 encapsulation of IOAM
- Netlink interface for CAN-FD's Transmitter Delay Compensation
- Support SMC-Rv2 eliminating the current same-subnet restriction,
by exploiting the UDP encapsulation feature of RoCE adapters
- TLS: add SM4 GCM/CCM crypto support
- Bluetooth: initial support for link quality and audio/codec
offload
Driver APIs:
- Add a batched interface for RX buffer allocation in AF_XDP
buffer pool
- ethtool: Add ability to control transceiver modules' power mode
- phy: Introduce supported interfaces bitmap to express MAC
capabilities and simplify PHY code
- Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
New drivers:
- WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
- Ethernet driver for ASIX AX88796C SPI device (x88796c)
Drivers:
- Broadcom PHYs
- support 72165, 7712 16nm PHYs
- support IDDQ-SR for additional power savings
- PHY support for QCA8081, QCA9561 PHYs
- NXP DPAA2: support for IRQ coalescing
- NXP Ethernet (enetc): support for software TCP segmentation
- Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
Gigabit-capable IP found on RZ/G2L SoC
- Intel 100G Ethernet
- support for eswitch offload of TC/OvS flow API, including
offload of GRE, VxLAN, Geneve tunneling
- support application device queues - ability to assign Rx and Tx
queues to application threads
- PTP and PPS (pulse-per-second) extensions
- Broadcom Ethernet (bnxt)
- devlink health reporting and device reload extensions
- Mellanox Ethernet (mlx5)
- offload macvlan interfaces
- support HW offload of TC rules involving OVS internal ports
- support HW-GRO and header/data split
- support application device queues
- Marvell OcteonTx2:
- add XDP support for PF
- add PTP support for VF
- Qualcomm Ethernet switch (qca8k): support for QCA8328
- Realtek Ethernet DSA switch (rtl8366rb)
- support bridge offload
- support STP, fast aging, disabling address learning
- support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
- Mellanox Ethernet/IB switch (mlxsw)
- multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
- offload root TBF qdisc as port shaper
- support multiple routing interface MAC address prefixes
- support for IP-in-IP with IPv6 underlay
- MediaTek WiFi (mt76)
- mt7921 - ASPM, 6GHz, SDIO and testmode support
- mt7915 - LED and TWT support
- Qualcomm WiFi (ath11k)
- include channel rx and tx time in survey dump statistics
- support for 80P80 and 160 MHz bandwidths
- support channel 2 in 6 GHz band
- spectral scan support for QCN9074
- support for rx decapsulation offload (data frames in 802.3
format)
- Qualcomm phone SoC WiFi (wcn36xx)
- enable Idle Mode Power Save (IMPS) to reduce power consumption
during idle
- Bluetooth driver support for MediaTek MT7922 and MT7921
- Enable support for AOSP Bluetooth extension in Qualcomm WCN399x
and Realtek 8822C/8852A
- Microsoft vNIC driver (mana)
- support hibernation and kexec
- Google vNIC driver (gve)
- support for jumbo frames
- implement Rx page reuse
Refactor:
- Make all writes to netdev->dev_addr go thru helpers, so that we
can add this address to the address rbtree and handle the updates
- Various TCP cleanups and optimizations including improvements
to CPU cache use
- Simplify the gnet_stats, Qdisc stats' handling and remove
qdisc->running sequence counter
- Driver changes and API updates to address devlink locking
deficiencies
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGAzX4ACgkQMUZtbf5S
IrvW3g//Q0ZLrOuHK9pZ8sCXMMhDj8qL6ajm0otMddHWA/+1UglwVBKFhsajfxOf
wJ/5LZis+XKLpLqKTU5chKVfn39HuDGe/D3l+egi01Gv5BW0+XzEhagfyR5tJX5z
wsGG5CXO/we/laVSzRiFtwwVEKHKN20YC+tIQwYOYP5Wy3q4G7qDsFhT7GqgsGCS
n74QUEAIB5Tz0ODWFqLtbsySzIurXrskibwt5T9bvAAlPw/lCU68mmG+NVJ7VddO
lBbNkLMOo8yW9Ci20H09SrYd4jZTmMARo9tsFO1tAvAMk7qpn0Wd8pnOYTjFFoMD
+qjiFSVMh7E0JGb8Y7NCvwaB99suAK5rfGP68Xwe62DfP7vYWEx4pZGxBP19F4ld
6Kn1ME33BX9rUF9tBecf0bdKfJUwB2Q2Xou/b9laG04bwiqsc9iG5FQq1C46lnLZ
QdzNiS1My4dJMczkWt66HF3Kx30ibwHfvKMIHjf4PqkzEatkv6Y6SBZ57KXL+Lde
0BQSFhbf0tm2Gf55etzrczLElI3uqHSFWUNZZ2Bt6WmzO1e6tpV9nAtRWF4C/dFg
QDpLJtOOOY65uq+qz09zoPfv2lem868SrCAuFrVn99bEpYjx/CGNFDeEI02l6jyr
84eUxd364UcbIk3fc+eTGdXHLQNVk30G0AHVBBxaWNIidwfqXeE=
=srde
-----END PGP SIGNATURE-----
Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Remove socket skb caches
- Add a SO_RESERVE_MEM socket op to forward allocate buffer space and
avoid memory accounting overhead on each message sent
- Introduce managed neighbor entries - added by control plane and
resolved by the kernel for use in acceleration paths (BPF / XDP
right now, HW offload users will benefit as well)
- Make neighbor eviction on link down controllable by userspace to
work around WiFi networks with bad roaming implementations
- vrf: Rework interaction with netfilter/conntrack
- fq_codel: implement L4S style ce_threshold_ect1 marking
- sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
BPF:
- Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
as implemented in LLVM14
- Introduce bpf_get_branch_snapshot() to capture Last Branch Records
- Implement variadic trace_printk helper
- Add a new Bloomfilter map type
- Track <8-byte scalar spill and refill
- Access hw timestamp through BPF's __sk_buff
- Disallow unprivileged BPF by default
- Document BPF licensing
Netfilter:
- Introduce egress hook for looking at raw outgoing packets
- Allow matching on and modifying inner headers / payload data
- Add NFT_META_IFTYPE to match on the interface type either from
ingress or egress
Protocols:
- Multi-Path TCP:
- increase default max additional subflows to 2
- rework forward memory allocation
- add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
- MCTP flow support allowing lower layer drivers to configure msg
muxing as needed
- Automatic Multicast Tunneling (AMT) driver based on RFC7450
- HSR support the redbox supervision frames (IEC-62439-3:2018)
- Support for the ip6ip6 encapsulation of IOAM
- Netlink interface for CAN-FD's Transmitter Delay Compensation
- Support SMC-Rv2 eliminating the current same-subnet restriction, by
exploiting the UDP encapsulation feature of RoCE adapters
- TLS: add SM4 GCM/CCM crypto support
- Bluetooth: initial support for link quality and audio/codec offload
Driver APIs:
- Add a batched interface for RX buffer allocation in AF_XDP buffer
pool
- ethtool: Add ability to control transceiver modules' power mode
- phy: Introduce supported interfaces bitmap to express MAC
capabilities and simplify PHY code
- Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
New drivers:
- WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
- Ethernet driver for ASIX AX88796C SPI device (x88796c)
Drivers:
- Broadcom PHYs
- support 72165, 7712 16nm PHYs
- support IDDQ-SR for additional power savings
- PHY support for QCA8081, QCA9561 PHYs
- NXP DPAA2: support for IRQ coalescing
- NXP Ethernet (enetc): support for software TCP segmentation
- Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
Gigabit-capable IP found on RZ/G2L SoC
- Intel 100G Ethernet
- support for eswitch offload of TC/OvS flow API, including
offload of GRE, VxLAN, Geneve tunneling
- support application device queues - ability to assign Rx and Tx
queues to application threads
- PTP and PPS (pulse-per-second) extensions
- Broadcom Ethernet (bnxt)
- devlink health reporting and device reload extensions
- Mellanox Ethernet (mlx5)
- offload macvlan interfaces
- support HW offload of TC rules involving OVS internal ports
- support HW-GRO and header/data split
- support application device queues
- Marvell OcteonTx2:
- add XDP support for PF
- add PTP support for VF
- Qualcomm Ethernet switch (qca8k): support for QCA8328
- Realtek Ethernet DSA switch (rtl8366rb)
- support bridge offload
- support STP, fast aging, disabling address learning
- support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
- Mellanox Ethernet/IB switch (mlxsw)
- multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
- offload root TBF qdisc as port shaper
- support multiple routing interface MAC address prefixes
- support for IP-in-IP with IPv6 underlay
- MediaTek WiFi (mt76)
- mt7921 - ASPM, 6GHz, SDIO and testmode support
- mt7915 - LED and TWT support
- Qualcomm WiFi (ath11k)
- include channel rx and tx time in survey dump statistics
- support for 80P80 and 160 MHz bandwidths
- support channel 2 in 6 GHz band
- spectral scan support for QCN9074
- support for rx decapsulation offload (data frames in 802.3
format)
- Qualcomm phone SoC WiFi (wcn36xx)
- enable Idle Mode Power Save (IMPS) to reduce power consumption
during idle
- Bluetooth driver support for MediaTek MT7922 and MT7921
- Enable support for AOSP Bluetooth extension in Qualcomm WCN399x and
Realtek 8822C/8852A
- Microsoft vNIC driver (mana)
- support hibernation and kexec
- Google vNIC driver (gve)
- support for jumbo frames
- implement Rx page reuse
Refactor:
- Make all writes to netdev->dev_addr go thru helpers, so that we can
add this address to the address rbtree and handle the updates
- Various TCP cleanups and optimizations including improvements to
CPU cache use
- Simplify the gnet_stats, Qdisc stats' handling and remove
qdisc->running sequence counter
- Driver changes and API updates to address devlink locking
deficiencies"
* tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2122 commits)
Revert "net: avoid double accounting for pure zerocopy skbs"
selftests: net: add arp_ndisc_evict_nocarrier
net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
net: arp: introduce arp_evict_nocarrier sysctl parameter
libbpf: Deprecate AF_XDP support
kbuild: Unify options for BTF generation for vmlinux and modules
selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
net: vmxnet3: remove multiple false checks in vmxnet3_ethtool.c
net: avoid double accounting for pure zerocopy skbs
tcp: rename sk_wmem_free_skb
netdevsim: fix uninit value in nsim_drv_configure_vfs()
selftests/bpf: Fix also no-alu32 strobemeta selftest
bpf: Add missing map_delete_elem method to bloom filter map
selftests/bpf: Add bloom map success test for userspace calls
bpf: Add alignment padding for "map_extra" + consolidate holes
bpf: Bloom filter map naming fixups
selftests/bpf: Add test cases for struct_ops prog
bpf: Add dummy BPF STRUCT_OPS for test purpose
...
Lots of new features and fixes:
* Added TOC (table of content) support, which is a debugging feature which is
either initiated by pressing the TOC button or via command in the BMC. If
pressed the Linux built-in KDB/KGDB will be called (Sven Schnelle)
* Fix CONFIG_PREEMPT (Sven)
* Fix unwinder on 64-bit kernels (Sven)
* Various kgdb fixes (Sven)
* Added KFENCE support (me)
* Switch to ARCH_STACKWALK implementation (me)
* Fix ptrace check on syscall return (me)
* Fix kernel crash with fixmaps on PA1.x machines (me)
* Move thread_info into task struct, aka CONFIG_THREAD_INFO_IN_TASK (me)
* Updated defconfigs
* Smaller cleanups, including Makefile cleanups (Masahiro Yamada),
use kthread_run() macro (Cai Huoqing), use swap() macro (Yihao Han).
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCYYALowAKCRD3ErUQojoP
X3o1APwK7wJBdFKAV2hMEouFNZLz2ZTkSQrCMhPTxRupkwJ71QD+JeXvyPLZBLIu
hlvi9mw9DKUKgCV+/Z65s8zjSHYC4wg=
=A4Ci
-----END PGP SIGNATURE-----
Merge tag 'for-5.16/parisc-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
"Lots of new features and fixes:
- Added TOC (table of content) support, which is a debugging feature
which is either initiated by pressing the TOC button or via command
in the BMC. If pressed the Linux built-in KDB/KGDB will be called
(Sven Schnelle)
- Fix CONFIG_PREEMPT (Sven)
- Fix unwinder on 64-bit kernels (Sven)
- Various kgdb fixes (Sven)
- Added KFENCE support (me)
- Switch to ARCH_STACKWALK implementation (me)
- Fix ptrace check on syscall return (me)
- Fix kernel crash with fixmaps on PA1.x machines (me)
- Move thread_info into task struct, aka CONFIG_THREAD_INFO_IN_TASK
(me)
- Updated defconfigs
- Smaller cleanups, including Makefile cleanups (Masahiro Yamada),
use kthread_run() macro (Cai Huoqing), use swap() macro (Yihao
Han)"
* tag 'for-5.16/parisc-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (36 commits)
parisc: Fix set_fixmap() on PA1.x CPUs
parisc: Use swap() to swap values in setup_bootmem()
parisc: Update defconfigs
parisc: decompressor: clean up Makefile
parisc: decompressor: remove repeated depenency of misc.o
parisc: Remove unused constants from asm-offsets.c
parisc/ftrace: use static key to enable/disable function graph tracer
parisc/ftrace: set function trace function
parisc: Make use of the helper macro kthread_run()
parisc: mark xchg functions notrace
parisc: enhance warning regarding usage of O_NONBLOCK
parisc: Drop ifdef __KERNEL__ from non-uapi kernel headers
parisc: Use PRIV_USER and PRIV_KERNEL in ptrace.h
parisc: Use PRIV_USER in syscall.S
parisc/kgdb: add kgdb_roundup() to make kgdb work with idle polling
parisc: Move thread_info into task struct
parisc: add support for TOC (transfer of control)
parisc/firmware: add functions to retrieve TOC data
parisc: add PIM TOC data structures
parisc: move virt_map macro to assembly.h
...
- Revert the printk format based wchan() symbol resolution as it can leak
the raw value in case that the symbol is not resolvable.
- Make wchan() more robust and work with all kind of unwinders by
enforcing that the task stays blocked while unwinding is in progress.
- Prevent sched_fork() from accessing an invalid sched_task_group
- Improve asymmetric packing logic
- Extend scheduler statistics to RT and DL scheduling classes and add
statistics for bandwith burst to the SCHED_FAIR class.
- Properly account SCHED_IDLE entities
- Prevent a potential deadlock when initial priority is assigned to a
newly created kthread. A recent change to plug a race between cpuset and
__sched_setscheduler() introduced a new lock dependency which is now
triggered. Break the lock dependency chain by moving the priority
assignment to the thread function.
- Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.
- Improve idle balancing in general and especially for NOHZ enabled
systems.
- Provide proper interfaces for live patching so it does not have to
fiddle with scheduler internals.
- Add cluster aware scheduling support.
- A small set of tweaks for RT (irqwork, wait_task_inactive(), various
scheduler options and delaying mmdrop)
- The usual small tweaks and improvements all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/OUkTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoR/5D/9ikdGNpKg9osNqJ3GjAmxsK6kVkB29
iFe2k8pIpWDToWQf/wQRGih4Yj3Cl49QSnZcPIibh2/12EB1qrrW6iSPJkInz8Ec
/1LS5/Vewn2OyoxyXZjdvGC5gTXEodSbIazASvX7nvdMeI4gsAsL5etzrMJirT/t
aymqvr7zovvywrwMTQJrGjUMo9l4ewE8tafMNNhRu1BHU1U4ojM9yvThyRAAcmp7
3Xy49A+Yq3IgrvYI4u8FMK5Zh08KaxSFjiLhePGm/bF+wSfYmWop2TP1jY05W2Uo
ti8hfbJMUoFRYuMxAiEldkItnc0wV4M9PtWZZ/x+B71bs65Y4Zjt9cW+rxJv2+m1
vzV31EsQwGnOti072dzWN4c/cZqngVXAjaNtErvDwJUr+Tw1ayv9KUvuodMQqZY6
mu68bFUO2kV9EMe1CBOv51Uy1RGHyLj3rlNqrkw+Xp5ISE9Ad2vhUEiRp5bQx5Ci
V/XFhGZkGUluh0vccrdFlNYZwhj8cZEzkOPCnPSeZ+bq8SyZE6xuHH/lTP1CJCOy
s800rW1huM+kgV+zRN8adDkGXibAk9N3RtVGnQXmuEy8gB9LZmQg+JeM2wsc9B+6
i0gdqZnsjNAfoK+BBAG4holxptSL8/eOJsFH8ZNIoxQ+iqooyPx9tFX7yXnRTBQj
d2qWG7UvoseT+g==
=fgtS
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Thomas Gleixner:
- Revert the printk format based wchan() symbol resolution as it can
leak the raw value in case that the symbol is not resolvable.
- Make wchan() more robust and work with all kind of unwinders by
enforcing that the task stays blocked while unwinding is in progress.
- Prevent sched_fork() from accessing an invalid sched_task_group
- Improve asymmetric packing logic
- Extend scheduler statistics to RT and DL scheduling classes and add
statistics for bandwith burst to the SCHED_FAIR class.
- Properly account SCHED_IDLE entities
- Prevent a potential deadlock when initial priority is assigned to a
newly created kthread. A recent change to plug a race between cpuset
and __sched_setscheduler() introduced a new lock dependency which is
now triggered. Break the lock dependency chain by moving the priority
assignment to the thread function.
- Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.
- Improve idle balancing in general and especially for NOHZ enabled
systems.
- Provide proper interfaces for live patching so it does not have to
fiddle with scheduler internals.
- Add cluster aware scheduling support.
- A small set of tweaks for RT (irqwork, wait_task_inactive(), various
scheduler options and delaying mmdrop)
- The usual small tweaks and improvements all over the place
* tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (69 commits)
sched/fair: Cleanup newidle_balance
sched/fair: Remove sysctl_sched_migration_cost condition
sched/fair: Wait before decaying max_newidle_lb_cost
sched/fair: Skip update_blocked_averages if we are defering load balance
sched/fair: Account update_blocked_averages in newidle_balance cost
x86: Fix __get_wchan() for !STACKTRACE
sched,x86: Fix L2 cache mask
sched/core: Remove rq_relock()
sched: Improve wake_up_all_idle_cpus() take #2
irq_work: Also rcuwait for !IRQ_WORK_HARD_IRQ on PREEMPT_RT
irq_work: Handle some irq_work in a per-CPU thread on PREEMPT_RT
irq_work: Allow irq_work_sync() to sleep if irq_work() no IRQ support.
sched/rt: Annotate the RT balancing logic irqwork as IRQ_WORK_HARD_IRQ
sched: Add cluster scheduler level for x86
sched: Add cluster scheduler level in core and related Kconfig for ARM64
topology: Represent clusters of CPUs within a die
sched: Disable -Wunused-but-set-variable
sched: Add wrapper for get_wchan() to keep task blocked
x86: Fix get_wchan() to support the ORC unwinder
proc: Use task_is_running() for wchan in /proc/$pid/stat
...
- Move futex code into kernel/futex/ and split up the kitchen sink into
seperate files to make integration of sys_futex_waitv() simpler.
- Add a new sys_futex_waitv() syscall which allows to wait on multiple
futexes. The main use case is emulating Windows' WaitForMultipleObjects
which allows Wine to improve the performance of Windows Games. Also
native Linux games can benefit from this interface as this is a common
wait pattern for this kind of applications.
- Add context to ww_mutex_trylock() to provide a path for i915 to rework
their eviction code step by step without making lockdep upset until the
final steps of rework are completed. It's also useful for regulator and
TTM to avoid dropping locks in the non contended path.
- Lockdep and might_sleep() cleanups and improvements
- A few improvements for the RT substitutions.
- The usual small improvements and cleanups.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/FTITHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoVNZD/9vIm3Bu1Coz8tbNXz58AiCYq9Y/vp5
mzFgSzz+VJTkW5Vh8jo5Uel4rCKZyt+rL276EoaRPzYl8KFtWDbpK3qd3PrXKqTX
At49JO4ttAMJUHIBQ6vblEkykmfEd9YPU1uSWk5roJ+s7Jmr5VWnu0FEWHP00As5
tWOca/TM0ei9kof26V2fl5aecTGII4i4Zsvy+LPsXtI+TnmP0gSBcGAS/5UnZTtJ
vQRWTR3ojoYvh5iTmNqbaURYoQLe2j8yscn1DSW1CABWVmP12eDWs+N7jRP4b5S9
73xOv5P7vpva41wxrK2ir5iNkpsLE97VL2JOHTW8nm7orblfiuxHLTCkTjEdd2pO
h8blI2IBizEB3JYn2BMkOAaZQOSjN8hd6Ye/b2B4AMEGWeXEoEv6eVy/orYKCluQ
XDqGn47Vce/SYmo5vfTB8VMt6nANx8PKvOP3IvjHInYEQBgiT6QrlUw3RRkXBp5s
clQkjYYwjAMVIXowcCrdhoKjMROzi6STShVwHwGL8MaZXqr8Vl6BUO9ckU0pY+4C
F000Hzwxi8lGEQ9k+P+BnYOEzH5osCty8lloKiQ/7ciX6T+CZHGJPGK/iY4YL8P5
C3CJWMsHCqST7DodNFJmdfZt99UfIMmEhshMDduU9AAH0tHCn8vOu0U6WvCtpyBp
BvHj68zteAtlYg==
=RZ4x
-----END PGP SIGNATURE-----
Merge tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Thomas Gleixner:
- Move futex code into kernel/futex/ and split up the kitchen sink into
seperate files to make integration of sys_futex_waitv() simpler.
- Add a new sys_futex_waitv() syscall which allows to wait on multiple
futexes.
The main use case is emulating Windows' WaitForMultipleObjects which
allows Wine to improve the performance of Windows Games. Also native
Linux games can benefit from this interface as this is a common wait
pattern for this kind of applications.
- Add context to ww_mutex_trylock() to provide a path for i915 to
rework their eviction code step by step without making lockdep upset
until the final steps of rework are completed. It's also useful for
regulator and TTM to avoid dropping locks in the non contended path.
- Lockdep and might_sleep() cleanups and improvements
- A few improvements for the RT substitutions.
- The usual small improvements and cleanups.
* tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
locking: Remove spin_lock_flags() etc
locking/rwsem: Fix comments about reader optimistic lock stealing conditions
locking: Remove rcu_read_{,un}lock() for preempt_{dis,en}able()
locking/rwsem: Disable preemption for spinning region
docs: futex: Fix kernel-doc references
futex: Fix PREEMPT_RT build
futex2: Documentation: Document sys_futex_waitv() uAPI
selftests: futex: Test sys_futex_waitv() wouldblock
selftests: futex: Test sys_futex_waitv() timeout
selftests: futex: Add sys_futex_waitv() test
futex,arm: Wire up sys_futex_waitv()
futex,x86: Wire up sys_futex_waitv()
futex: Implement sys_futex_waitv()
futex: Simplify double_lock_hb()
futex: Split out wait/wake
futex: Split out requeue
futex: Rename mark_wake_futex()
futex: Rename: match_futex()
futex: Rename: hb_waiter_{inc,dec,pending}()
futex: Split out PI futex
...
Instead of showing only the very first application which needs
recompile, show all of them, but print them only once.
Includes typo fix noticed by Colin Ian King.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
This implements the CONFIG_THREAD_INFO_IN_TASK option.
With this change:
- before thread_info was part of the stack and located at the beginning of the stack
- now the thread_info struct is moved and located inside the task_struct structure
- the stack is allocated and handled like the major other platforms
- drop the cpu field of thread_info and use instead the one in task_struct
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Almost all PA-RISC machines have either a button that
is labeled with 'TOC' or a BMC function to trigger a TOC.
TOC is a non-maskable interrupt that is sent to the processor.
This can be used for diagnostic purposes like obtaining a
stack trace/register dump or to enter KDB/KGDB.
As an example, on my c8000, TOC can be used with:
CONFIG_KGDB=y
CONFIG_KGDB_KDB=y
and the 'kgdboc=ttyS0,115200' appended to the command line.
Press ^[( on serial console, which will enter the BMC command line,
and enter 'TOC s':
root@(none):/# (
cli>TOC s
Sending TOC/INIT.
<Cpu3> 2800035d03e00000 0000000040c21ac8 CC_ERR_CHECK_TOC
<Cpu0> 2800035d00e00000 0000000040c21ad0 CC_ERR_CHECK_TOC
<Cpu2> 2800035d02e00000 0000000040c21ac8 CC_ERR_CHECK_TOC
<Cpu1> 2800035d01e00000 0000000040c21ad0 CC_ERR_CHECK_TOC
<Cpu3> 37000f7303e00000 2000000000000000 CC_ERR_CPU_CHECK_SUMMARY
<Cpu0> 37000f7300e00000 2000000000000000 CC_ERR_CPU_CHECK_SUMMARY
<Cpu2> 37000f7302e00000 2000000000000000 CC_ERR_CPU_CHECK_SUMMARY
<Cpu1> 37000f7301e00000 2000000000000000 CC_ERR_CPU_CHECK_SUMMARY
<Cpu3> 4300100803e00000 c0000000001d26cc CC_MC_BR_TO_OS_TOC
<Cpu0> 4300100800e00000 c0000000001d26cc CC_MC_BR_TO_OS_TOC
<Cpu2> 4300100802e00000 c0000000001d26cc CC_MC_BR_TO_OS_TOC
<Cpu1> 4300100801e00000 c0000000001d26cc CC_MC_BR_TO_OS_TOC
Entering kdb (current=0x00000000411cef80, pid 0) on processor 0 due to NonMaskable Interrupt @ 0x40c21ad0
[0]kdb>
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Add functions to retrieve TOC data from firmware both
for 1.1 and 2.0 PDC.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
These data structures describe the TOC data we get from firmware
when issuing a PDC_PIM_TOC request.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
This macro will also be used by the TOC code, so move it
into asm/assembly.h to avoid duplication.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
We will put the stack directly behind the task struct, so
make sure that we allocate it with an alignment of 64 bytes.
Signed-off-by: Helge Deller <deller@gmx.de>
When adding kfence support, we need to tell kfence_handle_page_fault()
if the interrupted assembler statement is a read or write operation.
Signed-off-by: Helge Deller <deller@gmx.de>
parisc, ia64 and powerpc32 are the only remaining architectures that
provide custom arch_{spin,read,write}_lock_flags() functions, which are
meant to re-enable interrupts while waiting for a spinlock.
However, none of these can actually run into this codepath, because
it is only called on architectures without CONFIG_GENERIC_LOCKBREAK,
or when CONFIG_DEBUG_LOCK_ALLOC is set without CONFIG_LOCKDEP, and none
of those combinations are possible on the three architectures.
Going back in the git history, it appears that arch/mn10300 may have
been able to run into this code path, but there is a good chance that
it never worked. On the architectures that still exist, it was
already impossible to hit back in 2008 after the introduction of
CONFIG_GENERIC_LOCKBREAK, and possibly earlier.
As this is all dead code, just remove it and the helper functions built
around it. For arch/ia64, the inline asm could be cleaned up, but
it seems safer to leave it untouched.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Helge Deller <deller@gmx.de> # parisc
Link: https://lore.kernel.org/r/20211022120058.1031690-1-arnd@kernel.org
This is a default implementation which calls flush_dcache_page() on
each page in the folio. If architectures can do better, they should
implement their own version of it.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Having a stable wchan means the process must be blocked and for it to
stay that way while performing stack unwinding.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm]
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Link: https://lkml.kernel.org/r/20211008111626.332092234@infradead.org