IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Jiri Slaby reported a futex state inconsistency resulting in -EINVAL during
a lock operation for a PI futex. It requires that the a lock process is
interrupted by a timeout or signal:
T1 Owns the futex in user space.
T2 Tries to acquire the futex in kernel (futex_lock_pi()). Allocates a
pi_state and attaches itself to it.
T2 Times out and removes its rt_waiter from the rt_mutex. Drops the
rtmutex lock and tries to acquire the hash bucket lock to remove
the futex_q. The lock is contended and T2 schedules out.
T1 Unlocks the futex (futex_unlock_pi()). Finds a futex_q but no
rt_waiter. Unlocks the futex (do_uncontended) and makes it available
to user space.
T3 Acquires the futex in user space.
T4 Tries to acquire the futex in kernel (futex_lock_pi()). Finds the
existing futex_q of T2 and tries to attach itself to the existing
pi_state. This (attach_to_pi_state()) fails with -EINVAL because uval
contains the TID of T3 but pi_state points to T1.
It's incorrect to unlock the futex and make it available for user space to
acquire as long as there is still an existing state attached to it in the
kernel.
T1 cannot hand over the futex to T2 because T2 already gave up and started
to clean up and is blocked on the hash bucket lock, so T2's futex_q with
the pi_state pointing to T1 is still queued.
T2 observes the futex_q, but ignores it as there is no waiter on the
corresponding rt_mutex and takes the uncontended path which allows the
subsequent caller of futex_lock_pi() (T4) to observe that stale state.
To prevent this the unlock path must dequeue all futex_q entries which
point to the same pi_state when there is no waiter on the rt mutex. This
requires obviously to make the dequeue conditional in the locking path to
prevent a double dequeue. With that it's guaranteed that user space cannot
observe an uncontended futex which has kernel state attached.
Fixes: fbeb558b0dd0d ("futex/pi: Fix recursive rt_mutex waiter state")
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20240118115451.0TkD_ZhB@linutronix.de
Closes: https://lore.kernel.org/all/4611bcf2-44d0-4c34-9b84-17406f881003@kernel.org
Xi reported that commit 5694289ce183 ("futex: Flag conversion") broke
glibc's robust futex tests.
This was narrowed down to the change of FLAGS_SHARED from 0x01 to
0x10, at which point Florian noted that handle_futex_death() has a
hardcoded flags argument of 1.
Change this to: FLAGS_SIZE_32 | FLAGS_SHARED, matching how
futex_to_flags() unconditionally sets FLAGS_SIZE_32 for all legacy
futex ops.
Reported-by: Xi Ruoyao <xry111@xry111.site>
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20231114201402.GA25315@noisy.programming.kicks-ass.net
Fixes: 5694289ce183 ("futex: Flag conversion")
Cc: <stable@vger.kernel.org>
On no-MMU, all futexes are treated as private because there is no need
to map a virtual address to physical to match the futex across
processes. This doesn't quite work though, because private futexes
include the current process's mm_struct as part of their key. This makes
it impossible for one process to wake up a shared futex being waited on
in another process.
Fix this bug by excluding the mm_struct from the key. With
a single address space, the futex address is already a unique key.
Fixes: 784bdf3bb694 ("futex: Assume all mappings are private on !MMU systems")
Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: André Almeida <andrealmeid@igalia.com>
Link: https://lore.kernel.org/r/20231019204548.1236437-2-ben.wolsieffer@hefring.com
The futex code already handles compound pages correctly, but using a folio
tells the compiler that there is already a reference to the head page and
it doesn't need to call compound_head() again.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230821142207.2537124-1-willy@infradead.org
HASH_SMALL only works when parameter numentries is 0. But the sole caller
futex_init() never calls alloc_large_system_hash() with numentries set to
0. So HASH_SMALL is obsolete and remove it.
Link: https://lkml.kernel.org/r/20230625021323.849147-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit ca16d5bee598 ("futex: Prevent robust futex exit race") addressed
two cases when tasks waiting on a robust non-PI futex remained blocked
despite the futex not being owned anymore:
* if the owner died after writing zero to the futex word, but before
waking up a waiter
* if a task waiting on the futex was woken up, but died before updating
the futex word (effectively swallowing the notification without acting
on it)
In the second case, the task could be woken up either by the previous
owner (after the futex word was reset to zero) or by the kernel (after
the OWNER_DIED bit was set and the TID part of the futex word was reset
to zero) if the previous owner died without the resetting the futex.
Because the referenced commit wakes up a potential waiter only if the
whole futex word is zero, the latter subcase remains unaddressed.
Fix this by looking only at the TID part of the futex when deciding
whether a wake up is needed.
Fixes: ca16d5bee598 ("futex: Prevent robust futex exit race")
Signed-off-by: Alexey Izbyshev <izbyshev@ispras.ru>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221111215439.248185-1-izbyshev@ispras.ru
invalidate_inode_page() is the only caller of invalidate_complete_page()
and inlining it reveals that the first check is unnecessary (because we
hold the page locked, and we just retrieved the mapping from the page).
Actually, it does make a difference, in that tail pages no longer fail
at this check, so it's now possible to remove a tail page from a mapping.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Pull signal/exit/ptrace updates from Eric Biederman:
"This set of changes deletes some dead code, makes a lot of cleanups
which hopefully make the code easier to follow, and fixes bugs found
along the way.
The end-game which I have not yet reached yet is for fatal signals
that generate coredumps to be short-circuit deliverable from
complete_signal, for force_siginfo_to_task not to require changing
userspace configured signal delivery state, and for the ptrace stops
to always happen in locations where we can guarantee on all
architectures that the all of the registers are saved and available on
the stack.
Removal of profile_task_ext, profile_munmap, and profile_handoff_task
are the big successes for dead code removal this round.
A bunch of small bug fixes are included, as most of the issues
reported were small enough that they would not affect bisection so I
simply added the fixes and did not fold the fixes into the changes
they were fixing.
There was a bug that broke coredumps piped to systemd-coredump. I
dropped the change that caused that bug and replaced it entirely with
something much more restrained. Unfortunately that required some
rebasing.
Some successes after this set of changes: There are few enough calls
to do_exit to audit in a reasonable amount of time. The lifetime of
struct kthread now matches the lifetime of struct task, and the
pointer to struct kthread is no longer stored in set_child_tid. The
flag SIGNAL_GROUP_COREDUMP is removed. The field group_exit_task is
removed. Issues where task->exit_code was examined with
signal->group_exit_code should been examined were fixed.
There are several loosely related changes included because I am
cleaning up and if I don't include them they will probably get lost.
The original postings of these changes can be found at:
https://lkml.kernel.org/r/87a6ha4zsd.fsf@email.froward.int.ebiederm.org
https://lkml.kernel.org/r/87bl1kunjj.fsf@email.froward.int.ebiederm.org
https://lkml.kernel.org/r/87r19opkx1.fsf_-_@email.froward.int.ebiederm.org
I trimmed back the last set of changes to only the obviously correct
once. Simply because there was less time for review than I had hoped"
* 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (44 commits)
ptrace/m68k: Stop open coding ptrace_report_syscall
ptrace: Remove unused regs argument from ptrace_report_syscall
ptrace: Remove second setting of PT_SEIZED in ptrace_attach
taskstats: Cleanup the use of task->exit_code
exit: Use the correct exit_code in /proc/<pid>/stat
exit: Fix the exit_code for wait_task_zombie
exit: Coredumps reach do_group_exit
exit: Remove profile_handoff_task
exit: Remove profile_task_exit & profile_munmap
signal: clean up kernel-doc comments
signal: Remove the helper signal_group_exit
signal: Rename group_exit_task group_exec_task
coredump: Stop setting signal->group_exit_task
signal: Remove SIGNAL_GROUP_COREDUMP
signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process
signal: Make coredump handling explicit in complete_signal
signal: Have prepare_signal detect coredumps using signal->core_state
signal: Have the oom killer detect coredumps using signal->core_state
exit: Move force_uaccess back into do_exit
exit: Guarantee make_task_dead leaks the tsk when calling do_task_exit
...
The beginning of do_exit has become cluttered and difficult to read as
it is filled with checks to handle things that can only happen when
the kernel is operating improperly.
Now that we have a dedicated function for cleaning up a task when the
kernel is operating improperly move the checks there.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Now that all architectures have a working futex implementation in any
configuration, remove the runtime detection code.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Vineet Gupta <vgupta@kernel.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lore.kernel.org/r/20211026100432.1730393-2-arnd@kernel.org
Move the wait/wake bits into their own file.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-15-andrealmeid@collabora.com
Move all the requeue bits into their own file.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-14-andrealmeid@collabora.com
In order to prepare introducing these symbols into the global
namespace; rename:
s/mark_wake_futex/futex_wake_mark/g
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-13-andrealmeid@collabora.com
In order to prepare introducing these symbols into the global
namespace; rename:
s/match_futex/futex_match/g
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-12-andrealmeid@collabora.com
In order to prepare introducing these symbols into the global
namespace; rename them:
s/hb_waiters_/futex_&/g
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-11-andrealmeid@collabora.com
Move the PI futex implementation into it's own file.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-10-andrealmeid@collabora.com
In order to prepare introducing these symbols into the global
namespace; rename them:
s/\<\([^_ ]*\)_futex_value_locked/futex_\1_value_locked/g
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-9-andrealmeid@collabora.com
In order to prepare introducing these symbols into the global
namespace; rename:
s/hash_futex/futex_hash/g
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-8-andrealmeid@collabora.com
In order to prepare introducing these symbols into the global
namespace; rename:
s/__unqueue_futex/__futex_unqueue/g
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-7-andrealmeid@collabora.com
In order to prepare introducing these symbols into the global
namespace; rename them:
s/queue_\(un\)*lock/futex_q_\1lock/g
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-6-andrealmeid@collabora.com
In order to prepare introducing these symbols into the global
namespace; rename them:
s/futex_wait_queue_me/futex_wait_queue/g
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-5-andrealmeid@collabora.com
In order to prepare introducing these symbols into the global
namespace; rename them:
s/\<\(__\)*\(un\)*queue_me/\1futex_\2queue/g
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-4-andrealmeid@collabora.com
Put the syscalls in their own little file.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-3-andrealmeid@collabora.com
In preparation for splitup..
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20210923171111.300673-2-andrealmeid@collabora.com