IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This reverts commit 53d862fac4a09b9c56cca0433fa9de5732fd05a1.
It turned out that flush_kernel_vmap_range() is being called with
interrupts disabled. There's no way to flush entire cache with
interrupts disabled.
Signed-off-by: Helge Deller <deller@gmx.de>
- Enforce kernel RO, and implement STRICT_MODULE_RWX for 603.
- Add support for livepatch to 32-bit.
- Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS.
- Merge vdso64 and vdso32 into a single directory.
- Fix build errors with newer binutils.
- Add support for UADDR64 relocations, which are emitted by some toolchains. This allows
powerpc to build with the latest lld.
- Fix (another) potential userspace r13 corruption in transactional memory handling.
- Cleanups of function descriptor handling & related fixes to LKDTM.
Thanks to: Abdul Haleem, Alexey Kardashevskiy, Anders Roxell, Aneesh Kumar K.V, Anton
Blanchard, Arnd Bergmann, Athira Rajeev, Bhaskar Chowdhury, Cédric Le Goater, Chen
Jingwen, Christophe JAILLET, Christophe Leroy, Corentin Labbe, Daniel Axtens, Daniel
Henrique Barboza, David Dai, Fabiano Rosas, Ganesh Goudar, Guo Zhengkui, Hangyu Hua, Haren
Myneni, Hari Bathini, Igor Zhbanov, Jakob Koschel, Jason Wang, Jeremy Kerr, Joachim
Wiberg, Jordan Niethe, Julia Lawall, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan
Srinivasan, Mamatha Inamdar, Maxime Bizon, Maxim Kiselev, Maxim Kochetkov, Michal
Suchanek, Nageswara R Sastry, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nour-eddine
Taleb, Paul Menzel, Ping Fang, Pratik R. Sampat, Randy Dunlap, Ritesh Harjani, Rohan
McLure, Russell Currey, Sachin Sant, Segher Boessenkool, Shivaprasad G Bhat, Sourabh Jain,
Thierry Reding, Tobias Waldekranz, Tyrel Datwyler, Vaibhav Jain, Vladimir Oltean, Wedson
Almeida Filho, YueHaibing.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmI9TtQTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgLp2D/0dwoliEJubRCfoawYUGhxTRZuo6ZYw
EQzprOiFA/MtrZyPfbrX/FwxeeetzQWysaw2r5JAuQwx5Jb7Od9dNIrVmueFEktC
hD4fkO8YT+QuOD3Xhp/rDQTImdw4fkeofIjnWIqEAtz0XGInmiRQKOnojVe/Po7f
72Yi1u80LxYBAnkN/Hhpmi/BsVmu0Nh3wELu+JZopQXjINj4RyD49ayCBSLbmiNc
uo7oYzJ0/WsZHNTpX9kAzzCq+XmI3dKZPyf2AOCvoRxJTmUPCRZF9QCwsmQFikiI
vZOdz4fI5e+C0aYJj8ODmWMrXiS+JUQdEShjGg9t9K6EN8idC8joKWpAuXjTA9KN
kRjzXX7AvjxaMEGbLe8gjU0PmEjY3eSzMOy15Oc/C0DRRswXRzrXdx2AF+/J6bQb
MWMM4aCKfcYs5/TENkEnV0xpbOCOy4ikHM1KZbxvVrShvjSlNIL9XTOnl/pNK5BJ
XSSI2mfnjKkbI1+l0KQ4NBXIRTo6HLpu5jwY3Xh97Tq7kaEfqDbO5p2P2HoOCiLa
ZjdzmpP99zM6wnqUSj+lyvjob7btyhoq6TKmPtxfKbR6OaSfRJ760BCJ5y15Y9Hc
rHey4Y/NL7LqsVYFZxi4/T6Ncq1hNeYr2Fiis4gH+/1zjr6Cd4othnvw3Slaxhst
AaHpN3pyx1QI6g==
=8r2c
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Livepatch support for 32-bit is probably the standout new feature,
otherwise mostly just lots of bits and pieces all over the board.
There's a series of commits cleaning up function descriptor handling,
which touches a few other arches as well as LKDTM. It has acks from
Arnd, Kees and Helge.
Summary:
- Enforce kernel RO, and implement STRICT_MODULE_RWX for 603.
- Add support for livepatch to 32-bit.
- Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS.
- Merge vdso64 and vdso32 into a single directory.
- Fix build errors with newer binutils.
- Add support for UADDR64 relocations, which are emitted by some
toolchains. This allows powerpc to build with the latest lld.
- Fix (another) potential userspace r13 corruption in transactional
memory handling.
- Cleanups of function descriptor handling & related fixes to LKDTM.
Thanks to Abdul Haleem, Alexey Kardashevskiy, Anders Roxell, Aneesh
Kumar K.V, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Bhaskar
Chowdhury, Cédric Le Goater, Chen Jingwen, Christophe JAILLET,
Christophe Leroy, Corentin Labbe, Daniel Axtens, Daniel Henrique
Barboza, David Dai, Fabiano Rosas, Ganesh Goudar, Guo Zhengkui, Hangyu
Hua, Haren Myneni, Hari Bathini, Igor Zhbanov, Jakob Koschel, Jason
Wang, Jeremy Kerr, Joachim Wiberg, Jordan Niethe, Julia Lawall, Kajol
Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mamatha Inamdar,
Maxime Bizon, Maxim Kiselev, Maxim Kochetkov, Michal Suchanek,
Nageswara R Sastry, Nathan Lynch, Naveen N. Rao, Nicholas Piggin,
Nour-eddine Taleb, Paul Menzel, Ping Fang, Pratik R. Sampat, Randy
Dunlap, Ritesh Harjani, Rohan McLure, Russell Currey, Sachin Sant,
Segher Boessenkool, Shivaprasad G Bhat, Sourabh Jain, Thierry Reding,
Tobias Waldekranz, Tyrel Datwyler, Vaibhav Jain, Vladimir Oltean,
Wedson Almeida Filho, and YueHaibing"
* tag 'powerpc-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits)
powerpc/pseries: Fix use after free in remove_phb_dynamic()
powerpc/time: improve decrementer clockevent processing
powerpc/time: Fix KVM host re-arming a timer beyond decrementer range
powerpc/tm: Fix more userspace r13 corruption
powerpc/xive: fix return value of __setup handler
powerpc/64: Add UADDR64 relocation support
powerpc: 8xx: fix a return value error in mpc8xx_pic_init
powerpc/ps3: remove unneeded semicolons
powerpc/64: Force inlining of prevent_user_access() and set_kuap()
powerpc/bitops: Force inlining of fls()
powerpc: declare unmodified attribute_group usages const
powerpc/spufs: Fix build warning when CONFIG_PROC_FS=n
powerpc/secvar: fix refcount leak in format_show()
powerpc/64e: Tie PPC_BOOK3E_64 to PPC_FSL_BOOK3E
powerpc: Move C prototypes out of asm-prototypes.h
powerpc/kexec: Declare kexec_paca static
powerpc/smp: Declare current_set static
powerpc: Cleanup asm-prototypes.c
powerpc/ftrace: Use STK_GOT in ftrace_mprofile.S
powerpc/ftrace: Regroup PPC64 specific operations in ftrace_mprofile.S
...
There are three sets of updates for 5.18 in the asm-generic tree:
- The set_fs()/get_fs() infrastructure gets removed for good. This
was already gone from all major architectures, but now we can
finally remove it everywhere, which loses some particularly
tricky and error-prone code.
There is a small merge conflict against a parisc cleanup, the
solution is to use their new version.
- The nds32 architecture ends its tenure in the Linux kernel. The
hardware is still used and the code is in reasonable shape, but
the mainline port is not actively maintained any more, as all
remaining users are thought to run vendor kernels that would never
be updated to a future release.
There are some obvious conflicts against changes to the removed
files.
- A series from Masahiro Yamada cleans up some of the uapi header
files to pass the compile-time checks.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmI69BsACgkQmmx57+YA
GNn/zA//f4d5VTT0ThhRxRWTu9BdThGHoB8TUcY7iOhbsWu0X/913NItRC3UeWNl
IdmisaXgVtirg1dcC2pWUmrcHdoWOCEGfK4+Zr2NhSWfuZDWvODHK9pGWk4WLnhe
cQgUNBvIuuAMryGtrOBwHPO4TpfCyy2ioeVP36ZfcsWXdDxTrqfaq/56mk3sxIP6
sUTk1UEjut9NG4C9xIIvcSU50R3l6LryQE/H9kyTLtaSvfvTOvprcVYCq0GPmSzo
DtQ1Wwa9zbJ+4EqoMiP5RrgQwWvOTg2iRByLU8ytwlX3e/SEF0uihvMv1FQbL8zG
G8RhGUOKQSEhaBfc3lIkm8GpOVPh0uHzB6zhn7daVmAWtazRD2Nu59BMjipa+ims
a8Z58iHH7jRAnKeEkVZqXKb1CEiUxaQx/IeVPzN4QlwMhDtwrI76LY7ZJ1zCqTGY
ENG0yRLav1XselYBslOYXGtOEWcY5EZPWqLyWbp4P9vz2g0Fe0gZxoIOvPmNQc89
QnfXpCt7vm/DGkyO255myu08GOLeMkisVqUIzLDB9avlym5mri7T7vk9abBa2YyO
CRpTL5gl1/qKPWuH1UI5mvhT+sbbBE2SUHSuy84btns39ZKKKynwCtdu+hSQkKLE
h9pV30Gf1cLTD4JAE0RWlUgOmbBLVp34loTOexQj4MrLM1noOnw=
=vtCN
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"There are three sets of updates for 5.18 in the asm-generic tree:
- The set_fs()/get_fs() infrastructure gets removed for good.
This was already gone from all major architectures, but now we can
finally remove it everywhere, which loses some particularly tricky
and error-prone code. There is a small merge conflict against a
parisc cleanup, the solution is to use their new version.
- The nds32 architecture ends its tenure in the Linux kernel.
The hardware is still used and the code is in reasonable shape, but
the mainline port is not actively maintained any more, as all
remaining users are thought to run vendor kernels that would never
be updated to a future release.
- A series from Masahiro Yamada cleans up some of the uapi header
files to pass the compile-time checks"
* tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (27 commits)
nds32: Remove the architecture
uaccess: remove CONFIG_SET_FS
ia64: remove CONFIG_SET_FS support
sh: remove CONFIG_SET_FS support
sparc64: remove CONFIG_SET_FS support
lib/test_lockup: fix kernel pointer check for separate address spaces
uaccess: generalize access_ok()
uaccess: fix type mismatch warnings from access_ok()
arm64: simplify access_ok()
m68k: fix access_ok for coldfire
MIPS: use simpler access_ok()
MIPS: Handle address errors for accesses above CPU max virtual user address
uaccess: add generic __{get,put}_kernel_nofault
nios2: drop access_ok() check from __put_user()
x86: use more conventional access_ok() definition
x86: remove __range_not_ok()
sparc64: add __{get,put}_kernel_nofault()
nds32: fix access_ok() checks in get/put_user
uaccess: fix nios2 and microblaze get_user_8()
sparc64: fix building assembly files
...
Cache move-in for virtual accesses is controlled by the TLB. Thus,
we must generally purge TLB entries before flushing. The flush routines
must use TLB entries that inhibit cache move-in.
V2: Load physical address prior to flushing TLB. In flush_cache_page,
flush TLB when flushing and purging.
V3: Don't flush when start equals end.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Avoid flushing caches in __flush_cache_page() and __purge_cache_page()
if the machine hasn't data or instruction caches - as e.g. in qemu.
Signed-off-by: Helge Deller <deller@gmx.de>
This patch changes the kprobe and kretprobe feature to use another
break instruction instead of relying on the hardware single-step
feature.
That way those kprobes now work in qemu as well, because in qemu we
don't emulate yet single-stepping.
Signed-off-by: Helge Deller <deller@gmx.de>
At least the qemu virtual machine does not provide D- and I-caches,
so skip triggering SMP irqs to flush caches on such machines.
Further optimize the caching code by using static branches and making
some functions static.
Signed-off-by: Helge Deller <deller@gmx.de>
In testing the "Fix non-access data TLB cache flush faults" change,
I noticed a significant improvement in glibc build and check times.
This led me to investigate the parisc_cache_flush_threshold setting.
It determines when we switch from line flushing to whole cache flushing.
It turned out that the parisc_cache_flush_threshold setting on
mako and mako2 machines (PA8800 and PA8900 processors) was way too
small. Adjusting this setting provided almost a factor two improvement
in the glibc build and check time.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Userspace is up to now limited to 32-bit, so it's sufficient to print
only 32-bit values when showing pointer addresses.
Signed-off-by: Helge Deller <deller@gmx.de>
Convert the inline assembly code to use the automatic EFAULT exception
handler. With that the fixup code can be dropped.
The other change is to allow double-word only when a 64-bit kernel is
used instead of depending on CONFIG_PA20.
Signed-off-by: Helge Deller <deller@gmx.de>
Add minimal vDSO support, which provides the signal trampoline helpers,
but none of the userspace syscall helpers like time wrappers.
The big benefit of this vDSO implementation is, that we now don't need
an executeable stack any longer. PA-RISC is one of the last
architectures where an executeable stack was needed in oder to implement
the signal trampolines by putting assembly instructions on the stack
which then gets executed. Instead the kernel will provide the relevant
code in the vDSO page and only put the pointers to the signal
information on the stack.
By dropping the need for executable stacks we avoid running into issues
with applications which want non executable stacks for security reasons.
Additionally, alternative stacks on memory areas without exec
permissions are supported too.
This code is based on an initial implementation by Randolph Chung from 2006:
https://lore.kernel.org/linux-parisc/4544A34A.6080700@tausq.org/
I did the porting and lifted the code to current code base. Dave fixed
the unwind code so that gdb and glibc are able to backtrace through the
code. An additional patch to gdb will be pushed upstream by Dave.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Dave Anglin <dave.anglin@bell.net>
Cc: Randolph Chung <randolph@tausq.org>
Signed-off-by: Helge Deller <deller@gmx.de>
With the latest cache fix for non-access faults and the support for
non-access faults (code 17) in handle_interruption, we can remove
the fast path emulation for fdc, fic, pdc, lpa, probe and probei
instructions.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Currently, the parisc kernel does not fully support non-access TLB
fault handling for probe instructions. In the fast path, we set the
target register to zero if it is not a shadowed register. The slow
path is not implemented, so we call do_page_fault. The architecture
indicates that non-access faults should not cause a page fault from
disk.
This change adds to code to provide non-access fault support for
probe instructions. It also modifies the handling of faults on
userspace so that if the address lies in a valid VMA and the access
type matches that for the VMA, the probe target register is set to
one. Otherwise, the target register is set to zero.
This was done to make probe instructions more useful for userspace.
Probe instructions are not very useful if they set the target register
to zero whenever a page is not present in memory. Nominally, the
purpose of the probe instruction is determine whether read or write
access to a given address is allowed.
This fixes a problem in function pointer comparison noticed in the
glibc testsuite (stdio-common/tst-vfprintf-user-type). The same
problem is likely in glibc (_dl_lookup_address).
V2 adds flush and lpa instruction support to handle_nadtlb_fault.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
When a page is not present, we get non-access data TLB faults from
the fdc and fic instructions in flush_user_dcache_range_asm and
flush_user_icache_range_asm. When these occur, the cache line is
not invalidated and potentially we get memory corruption. The
problem was hidden by the nullification of the flush instructions.
These faults also affect performance. With pa8800/pa8900 processors,
there will be 32 faults per 4 KB page since the cache line is 128
bytes. There will be more faults with earlier processors.
The problem is fixed by using flush_cache_pages(). It does the flush
using a tmp alias mapping.
The flush_cache_pages() call in flush_cache_range() flushed too
large a range.
V2: Remove unnecessary preempt_disable() and preempt_enable() calls.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
There are no remaining callers of set_fs(), so CONFIG_SET_FS
can be removed globally, along with the thread_info field and
any references to it.
This turns access_ok() into a cheaper check against TASK_SIZE_MAX.
As CONFIG_SET_FS is now gone, drop all remaining references to
set_fs()/get_fs(), mm_segment_t, user_addr_max() and uaccess_kernel().
Acked-by: Sam Ravnborg <sam@ravnborg.org> # for sparc32 changes
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Tested-by: Sergey Matyukevich <sergey.matyukevich@synopsys.com> # for arc changes
Acked-by: Stafford Horne <shorne@gmail.com> # [openrisc, asm-generic]
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fix 3 bugs:
a) emulate_stw() doesn't return the error code value, so faulting
instructions are not reported and aborted.
b) Tell emulate_ldw() to handle fldw_l as floating point instruction
c) Tell emulate_ldw() to handle ldw_m as integer instruction
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Usually the kernel provides fixup routines to emulate the fldd and fstd
floating-point instructions if they load or store 8-byte from/to a not
natuarally aligned memory location.
On a 32-bit kernel I noticed that those unaligned handlers didn't worked and
instead the application got a SEGV.
While checking the code I found two problems:
First, the OPCODE_FLDD_L and OPCODE_FSTD_L cases were ifdef'ed out by the
CONFIG_PA20 option, and as such those weren't built on a pure 32-bit kernel.
This is now fixed by moving the CONFIG_PA20 #ifdef to prevent the compilation
of OPCODE_LDD_L and OPCODE_FSTD_L only, and handling the fldd and fstd
instructions.
The second problem are two bugs in the 32-bit inline assembly code, where the
wrong registers where used. The calculation of the natural alignment used %2
(vall) instead of %3 (ior), and the first word was stored back to address %1
(valh) instead of %3 (ior).
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
dereference_function_descriptor() and
dereference_kernel_function_descriptor() are identical on the
three architectures implementing them.
Make them common and put them out-of-line in kernel/extable.c
which is one of the users and has similar type of functions.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/449db09b2eba57f4ab05f80102a67d8675bc8bcd.1644928018.git.christophe.leroy@csgroup.eu
- a memory leak fix in an error path in pdc_stable (Miaoqian Lin)
- two compiler warning fixes in the TOC code
- added autodetection for currently used console type (serial or graphics)
which inserts console=<type> if it's missing
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCYesyJAAKCRD3ErUQojoP
X4/fAQDSAarbWUqr3zWo3UU9iBtaCJwD85nWK44R+SSdWon7yQD/bF9YvLMbGnGR
lp8quJafFpgwUWJ9DV7PCzIroUDLCAo=
=o8u9
-----END PGP SIGNATURE-----
Merge tag 'for-5.17/parisc-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull more parisc architecture updates from Helge Deller:
"Fixes and enhancements:
- a memory leak fix in an error path in pdc_stable (Miaoqian Lin)
- two compiler warning fixes in the TOC code
- added autodetection for currently used console type (serial or
graphics) which inserts console=<type> if it's missing"
* tag 'for-5.17/parisc-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: pdc_stable: Fix memory leak in pdcs_register_pathentries
parisc: Fix missing prototype for 'toc_intr' warning in toc.c
parisc: Autodetect default output device and set console= kernel parameter
parisc: Use safer strscpy() in setup_cmdline()
parisc: Add visible flag to toc_stack variable
Fix a missing prototype warning noticed by the kernel test robot.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Usually palo (the PA-RISC boot loader) will check at boot time if the
machine/firmware was configured to use the serial line (ttyS0, SERIAL_x)
or the graphical display (tty0, graph) as default output device and add
the correct "console=ttyS0" or "console=tty0" Linux kernel parameter to
the kernel command line when starting the Linux kernel.
But the kernel could also have been started via the HP-UX boot loader
or directly in qemu, in which cases the console parameter is missing.
This patch fixes this problem by adding the correct console= parameter
if it's missing in the current kernel command line.
Signed-off-by: Helge Deller <deller@gmx.de>
Pull signal/exit/ptrace updates from Eric Biederman:
"This set of changes deletes some dead code, makes a lot of cleanups
which hopefully make the code easier to follow, and fixes bugs found
along the way.
The end-game which I have not yet reached yet is for fatal signals
that generate coredumps to be short-circuit deliverable from
complete_signal, for force_siginfo_to_task not to require changing
userspace configured signal delivery state, and for the ptrace stops
to always happen in locations where we can guarantee on all
architectures that the all of the registers are saved and available on
the stack.
Removal of profile_task_ext, profile_munmap, and profile_handoff_task
are the big successes for dead code removal this round.
A bunch of small bug fixes are included, as most of the issues
reported were small enough that they would not affect bisection so I
simply added the fixes and did not fold the fixes into the changes
they were fixing.
There was a bug that broke coredumps piped to systemd-coredump. I
dropped the change that caused that bug and replaced it entirely with
something much more restrained. Unfortunately that required some
rebasing.
Some successes after this set of changes: There are few enough calls
to do_exit to audit in a reasonable amount of time. The lifetime of
struct kthread now matches the lifetime of struct task, and the
pointer to struct kthread is no longer stored in set_child_tid. The
flag SIGNAL_GROUP_COREDUMP is removed. The field group_exit_task is
removed. Issues where task->exit_code was examined with
signal->group_exit_code should been examined were fixed.
There are several loosely related changes included because I am
cleaning up and if I don't include them they will probably get lost.
The original postings of these changes can be found at:
https://lkml.kernel.org/r/87a6ha4zsd.fsf@email.froward.int.ebiederm.org
https://lkml.kernel.org/r/87bl1kunjj.fsf@email.froward.int.ebiederm.org
https://lkml.kernel.org/r/87r19opkx1.fsf_-_@email.froward.int.ebiederm.org
I trimmed back the last set of changes to only the obviously correct
once. Simply because there was less time for review than I had hoped"
* 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (44 commits)
ptrace/m68k: Stop open coding ptrace_report_syscall
ptrace: Remove unused regs argument from ptrace_report_syscall
ptrace: Remove second setting of PT_SEIZED in ptrace_attach
taskstats: Cleanup the use of task->exit_code
exit: Use the correct exit_code in /proc/<pid>/stat
exit: Fix the exit_code for wait_task_zombie
exit: Coredumps reach do_group_exit
exit: Remove profile_handoff_task
exit: Remove profile_task_exit & profile_munmap
signal: clean up kernel-doc comments
signal: Remove the helper signal_group_exit
signal: Rename group_exit_task group_exec_task
coredump: Stop setting signal->group_exit_task
signal: Remove SIGNAL_GROUP_COREDUMP
signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process
signal: Make coredump handling explicit in complete_signal
signal: Have prepare_signal detect coredumps using signal->core_state
signal: Have the oom killer detect coredumps using signal->core_state
exit: Move force_uaccess back into do_exit
exit: Guarantee make_task_dead leaks the tsk when calling do_task_exit
...
Merge misc updates from Andrew Morton:
"146 patches.
Subsystems affected by this patch series: kthread, ia64, scripts,
ntfs, squashfs, ocfs2, vfs, and mm (slab-generic, slab, kmemleak,
dax, kasan, debug, pagecache, gup, shmem, frontswap, memremap,
memcg, selftests, pagemap, dma, vmalloc, memory-failure, hugetlb,
userfaultfd, vmscan, mempolicy, oom-kill, hugetlbfs, migration, thp,
ksm, page-poison, percpu, rmap, zswap, zram, cleanups, hmm, and
damon)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (146 commits)
mm/damon: hide kernel pointer from tracepoint event
mm/damon/vaddr: hide kernel pointer from damon_va_three_regions() failure log
mm/damon/vaddr: use pr_debug() for damon_va_three_regions() failure logging
mm/damon/dbgfs: remove an unnecessary variable
mm/damon: move the implementation of damon_insert_region to damon.h
mm/damon: add access checking for hugetlb pages
Docs/admin-guide/mm/damon/usage: update for schemes statistics
mm/damon/dbgfs: support all DAMOS stats
Docs/admin-guide/mm/damon/reclaim: document statistics parameters
mm/damon/reclaim: provide reclamation statistics
mm/damon/schemes: account how many times quota limit has exceeded
mm/damon/schemes: account scheme actions that successfully applied
mm/damon: remove a mistakenly added comment for a future feature
Docs/admin-guide/mm/damon/usage: update for kdamond_pid and (mk|rm)_contexts
Docs/admin-guide/mm/damon/usage: mention tracepoint at the beginning
Docs/admin-guide/mm/damon/usage: remove redundant information
Docs/admin-guide/mm/damon/usage: update for scheme quotas and watermarks
mm/damon: convert macro functions to static inline functions
mm/damon: modify damon_rand() macro to static inline function
mm/damon: move damon_rand() definition into damon.h
...
Add the visible flag to the toc_stack variable to make it visible for
assembly code and to avoid a sparse warning.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
No need to have an own hpmc_stack. Just re-use the toc_stack of the
monarch CPU as either a TOC or a HPMC will happen at the same time.
This reduces the kernel memory footprint by 16k.
Signed-off-by: Helge Deller <deller@gmx.de>
Before this patch, the TOC code used a pre-allocated stack of 16kb for
each possible CPU. That space overhead was the reason why the TOC
feature wasn't enabled by default for 32-bit kernels.
This patch rewrites the TOC code to use a per-cpu stack. That way we use
much less memory now and as such we enable the TOC feature by default on
all kernels.
Additionally the dump of the registers and the stacktrace wasn't
serialized, which led to multiple CPUs printing the stack backtrace at
once which rendered the output unreadable.
Now the backtraces are nicely serialized by a lock.
Signed-off-by: Helge Deller <deller@gmx.de>
Add a simplistic keyboard driver for usage of PDC I/O functions
with kgdb. This driver makes it possible to use KGDB with QEMU.
Signed-off-by: Helge Deller <deller@gmx.de>
This patch adds two new LWS routines - lws_atomic_xchg and lws_atomic_store.
These are simpler than the CAS routines. Currently, we use the CAS
routines for atomic stores. This is inefficient since it requires
both winning the spinlock and a successful CAS operation.
Change has been tested on c8000 and rp3440.
In v2, I moved the code to disble/enable page faults inside the spinlocks.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
The parisc architecture lacks general hardware support for compare and swap.
Particularly for userspace, it is difficult to implement software atomic
support. Page faults in critical regions can cause processes to sleep and
block the forward progress of other processes. Thus, it is essential that
page faults be disabled in critical regions. For performance reasons, we
also need to disable external interrupts in critical regions.
In order to do this, we need a mechanism to trigger COW breaks outside the
critical region. Fortunately, parisc has the "stbys,e" instruction. When
the leftmost byte of a word is addressed, this instruction triggers all
the exceptions of a normal store but it does not write to memory. Thus,
we can use it to trigger COW breaks outside the critical region without
modifying the data that is to be updated atomically.
COW breaks occur randomly. So even if we have priviously executed a "stbys,e"
instruction, we still need to disable pagefaults around the critical region.
If a fault occurs in the critical region, we return -EAGAIN. I had to add
a wrapper around _arch_futex_atomic_op_inuser() as I found in testing that
returning -EAGAIN caused problems for some processes even though it is
listed as a possible return value.
The patch implements the above. The code no longer attempts to sleep with
interrupts disabled and I haven't seen any stalls with the change.
I have attempted to merge common code and streamline the fast path. In the
futex code, we only compute the spinlock address once.
I eliminated some debug code in the original CAS routine that just made the
flow more complicated.
I don't clip the arguments when called from wide mode. As a result, the LWS
routines should work when called from 64-bit processes.
I defined TASK_PAGEFAULT_DISABLED offset for use in the lws_pagefault_disable
and lws_pagefault_enable macros.
Since we now disable interrupts on the gateway page where necessary, it
might be possible to allow processes to be scheduled when they are on the
gateway page.
Change has been tested on c8000 and rp3440. It improves glibc build and test
time by about 10%.
In v2, I removed the lws_atomic_xchg and and lws_atomic_store calls. I
also removed the bug fixes that were not directly related to this patch.
In v3, I removed the code to force interruptions from
arch_futex_atomic_op_inuser(). It is always called with page faults
disabled, so this code had no effect.
In v4, I fixed a typo in depi_safe line.
In v5, I moved the code to disable/enable page faults inside the spinlocks.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
In handle_interruption(), we call faulthandler_disabled() to check whether the
fault handler is not disabled. If the fault handler is disabled, we immediately
call do_page_fault(). It then calls faulthandler_disabled(). If disabled,
do_page_fault() attempts to fixup the exception by jumping to no_context:
no_context:
if (!user_mode(regs) && fixup_exception(regs)) {
return;
}
parisc_terminate("Bad Address (null pointer deref?)", regs, code, address);
Apart from the error messages, the two blocks of code perform the same
function.
We can avoid two calls to faulthandler_disabled() by a simple revision
to the code in handle_interruption().
Note: I didn't try to fix the formatting of this code block.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
The completer in the "or,ev %r1,%r30,%r30" instruction is reversed, so we are
not clipping the LWS number when we are called from a 32-bit process (W=0).
We need to nulify the following depdi instruction when the least-significant
bit of %r30 is 1.
If the %r20 register is not clipped, a user process could perform a LWS call
that would branch to an undefined location in the kernel and potentially crash
the machine.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.19+
Signed-off-by: Helge Deller <deller@gmx.de>
When a trap 7 (Instruction access rights) occurs, this means the CPU
couldn't execute an instruction due to missing execute permissions on
the memory region. In this case it seems the CPU didn't even fetched
the instruction from memory and thus did not store it in the cr19 (IIR)
register before calling the trap handler. So, the trap handler will find
some random old stale value in cr19.
This patch simply overwrites the stale IIR value with a constant magic
"bad food" value (0xbaadf00d), in the hope people don't start to try to
understand the various random IIR values in trap 7 dumps.
Noticed-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
There are two big uses of do_exit. The first is it's design use to be
the guts of the exit(2) system call. The second use is to terminate
a task after something catastrophic has happened like a NULL pointer
in kernel code.
Add a function make_task_dead that is initialy exactly the same as
do_exit to cover the cases where do_exit is called to handle
catastrophic failure. In time this can probably be reduced to just a
light wrapper around do_task_dead. For now keep it exactly the same so
that there will be no behavioral differences introducing this new
concept.
Replace all of the uses of do_exit that use it for catastraphic
task cleanup with make_task_dead to make it clear what the code
is doing.
As part of this rename rewind_stack_do_exit
rewind_stack_and_make_dead.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
In commit c8c3735997a3 ("parisc: Enhance detection of synchronous cr16
clocksources") I assumed that CPUs on the same physical core are syncronous.
While booting up the kernel on two different C8000 machines, one with a
dual-core PA8800 and one with a dual-core PA8900 CPU, this turned out to be
wrong. The symptom was that I saw a jump in the internal clocks printed to the
syslog and strange overall behaviour. On machines which have 4 cores (2
dual-cores) the problem isn't visible, because the current logic already marked
the cr16 clocksource unstable in this case.
This patch now marks the cr16 interval timers unstable if we have more than one
CPU in the system, and it fixes this issue.
Fixes: c8c3735997a3 ("parisc: Enhance detection of synchronous cr16 clocksources")
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v5.15+
This reverts commit 279917e27edc293eb645a25428c6ab3f3bca3f86.
With the CONFIG_HARDENED_USERCOPY option enabled, this patch triggers
kernel bugs at runtime:
usercopy: Kernel memory overwrite attempt detected to kernel text (offset 2084839, size 6)!
kernel BUG at mm/usercopy.c:99!
Backtrace:
IAOQ[0]: usercopy_abort+0xc4/0xe8
[<00000000406ed1c8>] __check_object_size+0x174/0x238
[<00000000407086d4>] copy_strings.isra.0+0x3e8/0x708
[<0000000040709a20>] do_execveat_common.isra.0+0x1bc/0x328
[<000000004070b760>] compat_sys_execve+0x7c/0xb8
[<0000000040303eb8>] syscall_exit+0x0/0x14
The problem is, that we have an init section of at least 2MB size which
starts at _stext and is freed after bootup.
If then later some kernel data is (temporarily) stored in this free
memory, check_kernel_text_object() will trigger a bug since the data
appears to be inside the kernel text (>=_stext) area:
if (overlaps(ptr, len, _stext, _etext))
usercopy_abort("kernel text");
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@kernel.org # 5.4+
The extru instruction leaves the most significant 32 bits of the target
register in an undefined state on PA 2.0 systems. If any of these bits
are nonzero, this will break the calculation of the lock pointer.
Fix by using extrd,u instruction via extru_safe macro on 64-bit kernels.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
This reverts commit e4f2006f1287e7ea17660490569cff323772dac4.
This patch shows problems with signal handling. Revert it for now.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v5.15
commit 8779e05ba8aa ("parisc: Fix ptrace check on syscall return")
fixed testing of TI_FLAGS. This uncovered a bug in the test mask.
syscall_restore_rfi is only used when the kernel needs to exit to
usespace with single or block stepping and the recovery counter
enabled. The test however used _TIF_SYSCALL_TRACE_MASK, which
includes a lot of bits that shouldn't be tested here.
Fix this by using TIF_SINGLESTEP and TIF_BLOCKSTEP directly.
I encountered this bug by enabling syscall tracepoints. Both in qemu and
on real hardware. As soon as i enabled the tracepoint (sys_exit_read,
but i guess it doesn't really matter which one), i got random page
faults in userspace almost immediately.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
For years, there have been random segmentation faults in userspace on
SMP PA-RISC machines. It occurred to me that this might be a problem in
set_pte_at(). MIPS and some other architectures do cache flushes when
installing PTEs with the present bit set.
Here I have adapted the code in update_mmu_cache() to flush the kernel
mapping when the kernel flush is deferred, or when the kernel mapping
may alias with the user mapping. This simplifies calls to
update_mmu_cache().
I also changed the barrier in set_pte() from a compiler barrier to a
full memory barrier. I know this change is not sufficient to fix the
problem. It might not be needed.
I have had a few days of operation with 5.14.16 to 5.15.1 and haven't
seen any random segmentation faults on rp3440 or c8000 so far.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@kernel.org # 5.12+