2019-05-19 15:08:55 +03:00
// SPDX-License-Identifier: GPL-2.0-only
2005-04-17 02:20:36 +04:00
/* Kernel thread helper functions.
 *   Copyright (C) 2004 IBM Corporation, Rusty Russell.
2020-06-11 04:41:59 +03:00
 *   Copyright (C) 2009 Red Hat, Inc.
2005-04-17 02:20:36 +04:00
 *
2007-05-09 13:34:32 +04:00
 * Creation is done via kthreadd, so that we get a clean environment
2005-04-17 02:20:36 +04:00
 * even if we're invoked from userspace (think modprobe, hotplug cpu,
 * etc.).
 */
2017-02-01 20:07:51 +03:00
# include <uapi/linux/sched/types.h>
2020-06-11 04:41:59 +03:00
# include <linux/mm.h>
# include <linux/mmu_context.h>
2005-04-17 02:20:36 +04:00
# include <linux/sched.h>
2020-06-11 04:41:59 +03:00
# include <linux/sched/mm.h>
2017-02-08 20:51:36 +03:00
# include <linux/sched/task.h>
2005-04-17 02:20:36 +04:00
# include <linux/kthread.h>
# include <linux/completion.h>
# include <linux/err.h>
2019-05-15 01:41:12 +03:00
# include <linux/cgroup.h>
cpuset,mm: update tasks' mems_allowed in time
Fix allocating page cache/slab object on the unallowed node when memory
spread is set by updating tasks' mems_allowed after its cpuset's mems is
changed.
In order to update tasks' mems_allowed in time, we must modify the memory
policy code, because the memory policy was originally applied in the
process's own context. After applying this patch, one task directly
manipulates another's mems_allowed, and we use alloc_lock in the
task_struct to protect mems_allowed and memory policy of the task.
But in the fast path, we don't use a lock to protect them, because adding a
lock may lead to a performance regression. Without a lock, however, the
task might see no nodes when the cpuset's mems_allowed is changed to some
non-overlapping set. To avoid this, we first set all newly allowed nodes,
then clear the newly disallowed ones.
[lee.schermerhorn@hp.com:
The rework of mpol_new() to extract the adjusting of the node mask to
apply cpuset and mpol flags "context" breaks set_mempolicy() and mbind()
with MPOL_PREFERRED and a NULL nodemask--i.e., explicit local
allocation. Fix this by adding the check for MPOL_PREFERRED and empty
node mask to mpol_new_mempolicy().
Remove the now unneeded 'nodes = NULL' from mpol_new().
Note that mpol_new_mempolicy() is always called with a non-NULL
'nodes' parameter now that it has been removed from mpol_new().
Therefore, we don't need to test nodes for NULL before testing it for
'empty'. However, just to be extra paranoid, add a VM_BUG_ON() to
verify this assumption.]
[lee.schermerhorn@hp.com:
I don't think the function name 'mpol_new_mempolicy' is descriptive
enough to differentiate it from mpol_new().
This function applies cpuset context, usually constraining nodes
to those allowed by the cpuset. However, when the 'RELATIVE_NODES' flag
is set, it also translates the nodes. So I settled on
'mpol_set_nodemask()', because the comment block for mpol_new() mentions
that we need to call this function to "set nodes".
Some additional minor line length, whitespace and typo cleanup.]
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Paul Menage <menage@google.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
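The "set all newly allowed nodes, then clear the newly disallowed ones" ordering described in the commit message above can be illustrated with a small userspace sketch. It is not the kernel's implementation: `nodemask_sketch_t` and `nodemask_update_sketch()` are hypothetical names, the mask is a plain 64-bit integer, and the update is single-threaded, so it only shows why the intermediate value is never empty.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a nodemask: one bit per NUMA node. */
typedef uint64_t nodemask_sketch_t;

/*
 * Move *mask to new_mask in two steps. The value stored after step 1 is
 * the union of the old and new masks, so a reader sampling between the
 * two steps still sees at least one allowed node.
 * If intermediate is non-NULL, the step-1 value is reported through it.
 */
static void nodemask_update_sketch(nodemask_sketch_t *mask,
				   nodemask_sketch_t new_mask,
				   nodemask_sketch_t *intermediate)
{
	*mask |= new_mask;		/* step 1: set all newly allowed nodes */
	if (intermediate)
		*intermediate = *mask;
	*mask &= new_mask;		/* step 2: clear newly disallowed nodes */
}
```

Assigning the new mask by first clearing the old bits would instead pass through an empty mask whenever the old and new sets do not overlap, which is the failure the commit message describes.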
2009-06-17 02:31:49 +04:00
# include <linux/cpuset.h>
2005-04-17 02:20:36 +04:00
# include <linux/unistd.h>
# include <linux/file.h>
2011-05-23 22:51:41 +04:00
# include <linux/export.h>
2006-03-23 14:00:24 +03:00
# include <linux/mutex.h>
2010-06-29 12:07:09 +04:00
# include <linux/slab.h>
# include <linux/freezer.h>
2012-10-11 05:28:25 +04:00
# include <linux/ptrace.h>
2013-05-01 02:27:21 +04:00
# include <linux/uaccess.h>
2019-03-06 02:42:58 +03:00
# include <linux/numa.h>
2020-05-27 17:29:09 +03:00
# include <linux/sched/isolation.h>
2009-04-15 03:39:12 +04:00
# include <trace/events/sched.h>
2005-04-17 02:20:36 +04:00
2020-06-11 04:41:59 +03:00
2007-05-09 13:34:32 +04:00
static DEFINE_SPINLOCK ( kthread_create_lock ) ;
static LIST_HEAD ( kthread_create_list ) ;
struct task_struct * kthreadd_task ;
2005-04-17 02:20:36 +04:00
struct kthread_create_info
{
2007-05-09 13:34:32 +04:00
/* Information passed to kthread() from kthreadd. */
2005-04-17 02:20:36 +04:00
int ( * threadfn ) ( void * data ) ;
void * data ;
2011-03-23 02:30:44 +03:00
int node ;
2005-04-17 02:20:36 +04:00
2007-05-09 13:34:32 +04:00
/* Result passed back to kthread_create() from kthreadd. */
2005-04-17 02:20:36 +04:00
struct task_struct * result ;
2013-11-13 03:06:45 +04:00
struct completion * done ;
2006-11-22 17:55:48 +03:00
2007-05-09 13:34:32 +04:00
struct list_head list ;
2005-04-17 02:20:36 +04:00
} ;
2009-06-18 03:27:45 +04:00
struct kthread {
2012-07-16 14:42:36 +04:00
unsigned long flags ;
unsigned int cpu ;
2021-12-03 20:42:49 +03:00
int result ;
2020-05-06 19:09:34 +03:00
int ( * threadfn ) ( void * ) ;
2010-06-29 12:07:09 +04:00
void * data ;
2012-07-16 14:42:36 +04:00
struct completion parked ;
2009-06-18 03:27:45 +04:00
struct completion exited ;
2017-09-26 21:02:12 +03:00
# ifdef CONFIG_BLK_CGROUP
2017-09-15 00:02:04 +03:00
struct cgroup_subsys_state * blkcg_css ;
# endif
kthread: dynamically allocate memory to store kthread's full name
When I was implementing a new per-cpu kthread cfs_migration, I found that
its comm "cfs_migration/%u" is truncated due to the limitation of
TASK_COMM_LEN. For example, the per-cpu threads on CPU10~19 all end up
with the same comm "cfs_migration/1", which will confuse the user.
This issue is not critical, because we can get the corresponding CPU
from the task's Cpus_allowed. But for kthreads corresponding to other
hardware devices, it is not easy to get the detailed device info from
task comm, for example,
jbd2/nvme0n1p2-
xfs-reclaim/sdf
Currently there are so many truncated kthreads:
rcu_tasks_kthre
rcu_tasks_rude_
rcu_tasks_trace
poll_mpt3sas0_s
ext4-rsv-conver
xfs-reclaim/sd{a, b, c, ...}
xfs-blockgc/sd{a, b, c, ...}
xfs-inodegc/sd{a, b, c, ...}
audit_send_repl
ecryptfs-kthrea
vfio-irqfd-clea
jbd2/nvme0n1p2-
...
We can shorten these names to work around this problem, but that may
not be applicable to all of the truncated kthreads. Take 'jbd2/nvme0n1p2-'
for example: it is a nice name, and it is not a good idea to shorten it.
One possible way to fix this issue is extending the task comm size, but
as task->comm is used in lots of places, that may cause some potential
buffer overflows. Another, more conservative approach is introducing a
new pointer to store the kthread's full name if it is truncated, which won't
introduce too much overhead as it is in the non-critical path. In the end
we decided to use the second approach. See also the discussion
in this thread:
https://lore.kernel.org/lkml/20211101060419.4682-1-laoar.shao@gmail.com/
After this change, the full name of these truncated kthreads will be
displayed via /proc/[pid]/comm:
rcu_tasks_kthread
rcu_tasks_rude_kthread
rcu_tasks_trace_kthread
poll_mpt3sas0_statu
ext4-rsv-conversion
xfs-reclaim/sdf1
xfs-blockgc/sdf1
xfs-inodegc/sdf1
audit_send_reply
ecryptfs-kthread
vfio-irqfd-cleanup
jbd2/nvme0n1p2-8
Link: https://lkml.kernel.org/r/20211120112850.46047-1-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 05:08:43 +03:00
/* To store the full name if task comm is truncated. */
char * full_name ;
2005-04-17 02:20:36 +04:00
} ;
2012-07-16 14:42:36 +04:00
enum KTHREAD_BITS {
KTHREAD_IS_PER_CPU = 0 ,
KTHREAD_SHOULD_STOP ,
KTHREAD_SHOULD_PARK ,
} ;
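The enum values above are bit numbers inside `kthread->flags`, not masks. A minimal userspace sketch of that convention follows; the `sketch_*` helpers are hypothetical, non-atomic stand-ins for the kernel's set_bit(), clear_bit() and test_bit(), which this file uses on the real flags word.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the bit numbers of enum KTHREAD_BITS; the names are illustrative. */
enum kthread_bits_sketch {
	SKETCH_IS_PER_CPU = 0,
	SKETCH_SHOULD_STOP,
	SKETCH_SHOULD_PARK,
};

/* Non-atomic stand-in for set_bit(): turn on bit number nr. */
static void sketch_set_bit(int nr, unsigned long *flags)
{
	*flags |= 1UL << nr;
}

/* Non-atomic stand-in for clear_bit(): turn off bit number nr. */
static void sketch_clear_bit(int nr, unsigned long *flags)
{
	*flags &= ~(1UL << nr);
}

/* Non-atomic stand-in for test_bit(): report whether bit number nr is set. */
static bool sketch_test_bit(int nr, const unsigned long *flags)
{
	return (*flags >> nr) & 1UL;
}
```

With this layout, requesting a park sets bit 2 (value 4) without disturbing the stop or per-cpu bits.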
2013-04-30 02:05:01 +04:00
static inline struct kthread *to_kthread(struct task_struct *k)
{
2016-11-29 20:50:57 +03:00
	WARN_ON(!(k->flags & PF_KTHREAD));
2021-12-23 07:10:09 +03:00
	return k->worker_private;
2013-04-30 02:05:01 +04:00
}
2021-04-20 11:18:17 +03:00
/*
 * Variant of to_kthread() that doesn't assume @p is a kthread.
 *
 * Per construction; when:
 *
2021-12-23 07:10:09 +03:00
 *   (p->flags & PF_KTHREAD) && p->worker_private
2021-04-20 11:18:17 +03:00
 *
 * the task is both a kthread and struct kthread is persistent. However
 * PF_KTHREAD on its own is not, kernel_thread() can exec() (See umh.c and
 * begin_new_exec()).
 */
static inline struct kthread *__to_kthread(struct task_struct *p)
{
2021-12-23 07:10:09 +03:00
	void *kthread = p->worker_private;
2021-04-20 11:18:17 +03:00
	if (kthread && !(p->flags & PF_KTHREAD))
		kthread = NULL;
	return kthread;
}
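The condition in the comment above requires both parts to hold before the private pointer is treated as a struct kthread. A userspace sketch of the same check is shown below; `struct sketch_task` and the flag value `SKETCH_PF_KTHREAD` are assumptions made for illustration and do not match the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PF_KTHREAD 0x1u	/* hypothetical flag value, illustration only */

/* Hypothetical cut-down task: just the two fields the check looks at. */
struct sketch_task {
	unsigned int flags;
	void *worker_private;
};

/*
 * Return the private pointer only when the task is flagged as a kthread;
 * a non-NULL pointer on a task without the flag is not trusted.
 */
static void *sketch_to_kthread_checked(const struct sketch_task *p)
{
	void *kthread = p->worker_private;

	if (kthread && !(p->flags & SKETCH_PF_KTHREAD))
		kthread = NULL;
	return kthread;
}
```

Callers such as kthread_func() and kthread_probe_data() rely on the NULL result to handle tasks that are not kthreads.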
2022-01-20 05:08:43 +03:00
void get_kthread_comm(char *buf, size_t buf_size, struct task_struct *tsk)
{
	struct kthread *kthread = to_kthread(tsk);
	if (!kthread || !kthread->full_name) {
		__get_task_comm(buf, buf_size, tsk);
		return;
	}
	strscpy_pad(buf, kthread->full_name, buf_size);
}
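The truncation that motivates full_name, and the fallback order used by get_kthread_comm(), can be reproduced in userspace. This is a sketch under stated assumptions: `SKETCH_COMM_LEN` is assumed to equal TASK_COMM_LEN (16), `struct sketch_kthread` is hypothetical, and full_name merely borrows the caller's string instead of being allocated as the kernel does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_COMM_LEN 16	/* assumed to match TASK_COMM_LEN */

/* Hypothetical cut-down kthread: a fixed-size comm plus an optional full name. */
struct sketch_kthread {
	char comm[SKETCH_COMM_LEN];
	const char *full_name;	/* NULL unless comm had to be truncated */
};

/* Store name in comm, truncating if needed, and remember it if it did not fit. */
static void sketch_set_name(struct sketch_kthread *k, const char *name)
{
	snprintf(k->comm, sizeof(k->comm), "%s", name);
	k->full_name = strlen(name) >= sizeof(k->comm) ? name : NULL;
}

/* Prefer the full name; fall back to comm. The output is always NUL-padded. */
static void sketch_get_comm(char *buf, size_t buf_size,
			    const struct sketch_kthread *k)
{
	const char *src = k->full_name ? k->full_name : k->comm;

	if (buf_size == 0)
		return;
	memset(buf, 0, buf_size);
	strncpy(buf, src, buf_size - 1);
}
```

With a 16-byte comm, "cfs_migration/10" is stored as "cfs_migration/1", which is the collision between CPU1 and CPU10~19 that the commit message describes; reading through the full name restores the distinction.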
2021-12-02 18:56:14 +03:00
bool set_kthread_struct(struct task_struct *p)
2021-05-10 18:10:23 +03:00
{
	struct kthread *kthread;
2021-12-02 18:56:14 +03:00
	if (WARN_ON_ONCE(to_kthread(p)))
		return false;
2021-05-10 18:10:23 +03:00
	kthread = kzalloc(sizeof(*kthread), GFP_KERNEL);
2021-12-02 18:56:14 +03:00
	if (!kthread)
		return false;
	init_completion(&kthread->exited);
	init_completion(&kthread->parked);
	p->vfork_done = &kthread->exited;
2021-12-23 07:10:09 +03:00
	p->worker_private = kthread;
2021-12-02 18:56:14 +03:00
	return true;
2021-05-10 18:10:23 +03:00
}
2016-11-29 20:50:57 +03:00
void free_kthread_struct(struct task_struct *k)
{
2017-09-15 00:02:04 +03:00
	struct kthread *kthread;
2016-11-29 20:50:57 +03:00
	/*
2021-12-02 18:56:14 +03:00
	 * Can be NULL if kzalloc() in set_kthread_struct() failed.
2016-11-29 20:50:57 +03:00
	 */
2017-09-15 00:02:04 +03:00
	kthread = to_kthread(k);
2022-01-20 05:08:43 +03:00
	if (!kthread)
		return;
2017-09-26 21:02:12 +03:00
#ifdef CONFIG_BLK_CGROUP
2022-01-20 05:08:43 +03:00
	WARN_ON_ONCE(kthread->blkcg_css);
2017-09-15 00:02:04 +03:00
#endif
2021-12-23 07:10:09 +03:00
	k->worker_private = NULL;
2022-01-20 05:08:43 +03:00
	kfree(kthread->full_name);
2017-09-15 00:02:04 +03:00
	kfree(kthread);
2016-11-29 20:50:57 +03:00
}
2006-06-25 16:49:19 +04:00
/**
 * kthread_should_stop - should this kthread return now?
 *
2007-02-10 12:45:59 +03:00
 * When someone calls kthread_stop() on your kthread, it will be woken
2006-06-25 16:49:19 +04:00
 * and this will return true.  You should then return, and your return
 * value will be passed through to kthread_stop().
 */
2012-07-16 14:42:36 +04:00
bool kthread_should_stop(void)
2005-04-17 02:20:36 +04:00
{
2012-07-16 14:42:36 +04:00
	return test_bit(KTHREAD_SHOULD_STOP, &to_kthread(current)->flags);
2005-04-17 02:20:36 +04:00
}
EXPORT_SYMBOL(kthread_should_stop);
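The contract described above (poll the stop flag, then return a value that is handed back to the stopper) can be sketched as a single-threaded userspace simulation. Everything here is hypothetical: `struct sketch_worker` stands in for the kthread, and the stop request that kthread_stop() would issue from another context is simulated inline after a fixed number of iterations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical worker state; should_stop stands in for KTHREAD_SHOULD_STOP. */
struct sketch_worker {
	bool should_stop;
	int items_done;
};

/* Stand-in for kthread_should_stop(): only reads the flag. */
static bool sketch_should_stop(const struct sketch_worker *w)
{
	return w->should_stop;
}

/*
 * Shape of a typical thread function: keep working until asked to stop,
 * then return a result for the stopper to collect.
 */
static int sketch_threadfn(struct sketch_worker *w, int stop_after)
{
	while (!sketch_should_stop(w)) {
		w->items_done++;		/* one unit of work */
		/* Simulated stopper: a real one would run in another context. */
		if (w->items_done == stop_after)
			w->should_stop = true;
	}
	return w->items_done;
}
```

The important property is that the loop condition is re-evaluated after every unit of work, so a stop request is noticed before the next unit begins.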
2019-01-29 02:46:24 +03:00
bool __kthread_should_park(struct task_struct *k)
{
	return test_bit(KTHREAD_SHOULD_PARK, &to_kthread(k)->flags);
}
EXPORT_SYMBOL_GPL(__kthread_should_park);
2012-07-16 14:42:36 +04:00
/**
* kthread_should_park - should this kthread park now ?
*
* When someone calls kthread_park ( ) on your kthread , it will be woken
* and this will return true . You should then do the necessary
* cleanup and call kthread_parkme ( )
*
* Similar to kthread_should_stop ( ) , but this keeps the thread alive
* and in a park position . kthread_unpark ( ) " restarts " the thread and
* calls the thread function again .
*/
bool kthread_should_park ( void )
{
2019-01-29 02:46:24 +03:00
return __kthread_should_park ( current ) ;
2012-07-16 14:42:36 +04:00
}
2015-08-07 01:46:45 +03:00
EXPORT_SYMBOL_GPL ( kthread_should_park ) ;
2012-07-16 14:42:36 +04:00
2011-11-22 00:32:23 +04:00
/**
* kthread_freezable_should_stop - should this freezable kthread return now ?
* @ was_frozen : optional out parameter , indicates whether % current was frozen
*
* kthread_should_stop ( ) for freezable kthreads , which will enter
* refrigerator if necessary . This function is safe from kthread_stop ( ) /
* freezer deadlock and freezable kthreads should use this function instead
* of calling try_to_freeze ( ) directly .
*/
bool kthread_freezable_should_stop ( bool * was_frozen )
{
bool frozen = false ;
might_sleep ( ) ;
if ( unlikely ( freezing ( current ) ) )
frozen = __refrigerator ( true ) ;
if ( was_frozen )
* was_frozen = frozen ;
return kthread_should_stop ( ) ;
}
EXPORT_SYMBOL_GPL ( kthread_freezable_should_stop ) ;
2020-05-06 19:09:34 +03:00
/**
* kthread_func - return the function specified on kthread creation
* @ task : kthread task in question
*
* Returns NULL if the task is not a kthread .
*/
void *kthread_func(struct task_struct *task)
{
2021-04-20 11:18:17 +03:00
	struct kthread *kthread = __to_kthread(task);
	if (kthread)
		return kthread->threadfn;
2020-05-06 19:09:34 +03:00
	return NULL;
}
EXPORT_SYMBOL_GPL(kthread_func);
2010-06-29 12:07:09 +04:00
/**
* kthread_data - return data value specified on kthread creation
* @ task : kthread task in question
*
* Return the data value specified when kthread @ task was created .
* The caller is responsible for ensuring the validity of @ task when
* calling this function .
*/
void *kthread_data(struct task_struct *task)
{
	return to_kthread(task)->data;
}
2020-05-06 19:09:34 +03:00
EXPORT_SYMBOL_GPL(kthread_data);
2010-06-29 12:07:09 +04:00
2013-05-01 02:27:21 +04:00
/**
2016-10-11 23:55:17 +03:00
 * kthread_probe_data - speculative version of kthread_data()
2013-05-01 02:27:21 +04:00
 * @task: possible kthread task in question
 *
 * @task could be a kthread task.  Return the data value specified when it
 * was created if accessible.  If @task isn't a kthread task or its data is
 * inaccessible for any reason, %NULL is returned.  This function requires
 * that @task itself is safe to dereference.
 */
2016-10-11 23:55:17 +03:00
void *kthread_probe_data(struct task_struct *task)
2013-05-01 02:27:21 +04:00
{
2021-04-20 11:18:17 +03:00
	struct kthread *kthread = __to_kthread(task);
2013-05-01 02:27:21 +04:00
	void *data = NULL;
2021-04-20 11:18:17 +03:00
	if (kthread)
		copy_from_kernel_nofault(&data, &kthread->data, sizeof(data));
2013-05-01 02:27:21 +04:00
	return data;
}
2012-07-16 14:42:36 +04:00
static void __kthread_parkme(struct kthread *self)
{
2018-04-30 15:50:22 +03:00
	for (;;) {
2018-06-07 12:45:49 +03:00
		/*
		 * TASK_PARKED is a special state; we must serialize against
		 * possible pending wakeups to avoid store-store collisions on
		 * task->state.
		 *
		 * Such a collision might possibly result in the task state
		 * changing from TASK_PARKED and us failing the
		 * wait_task_inactive() in kthread_park().
		 */
		set_special_state(TASK_PARKED);
2018-04-30 15:50:22 +03:00
		if (!test_bit(KTHREAD_SHOULD_PARK, &self->flags))
			break;
2018-06-07 12:45:49 +03:00
2020-03-06 10:01:33 +03:00
		/*
		 * Thread is going to call schedule(), do not preempt it,
		 * or the caller of kthread_park() may spend more time in
		 * wait_task_inactive().
		 */
		preempt_disable();
2018-06-07 11:55:56 +03:00
		complete(&self->parked);
2020-03-06 10:01:33 +03:00
		schedule_preempt_disabled();
		preempt_enable();
2012-07-16 14:42:36 +04:00
	}
	__set_current_state(TASK_RUNNING);
}
void kthread_parkme ( void )
{
__kthread_parkme ( to_kthread ( current ) ) ;
}
2015-08-07 01:46:45 +03:00
EXPORT_SYMBOL_GPL ( kthread_parkme ) ;
2012-07-16 14:42:36 +04:00
2021-11-22 19:27:36 +03:00
/**
 * kthread_exit - Cause the current kthread to return @result to kthread_stop().
 * @result: The integer value to return to kthread_stop().
 *
 * While kthread_exit can be called directly, it exists so that
 * functions which do some additional work in non-modular code such as
 * module_put_and_kthread_exit can be implemented.
 *
 * Does not return.
 */
void __noreturn kthread_exit(long result)
{
2021-12-03 20:42:49 +03:00
	struct kthread *kthread = to_kthread(current);
	kthread->result = result;
	do_exit(0);
2021-11-22 19:27:36 +03:00
}
2021-11-22 20:15:19 +03:00
/**
2021-12-14 20:25:01 +03:00
 * kthread_complete_and_exit - Exit the current kthread.
2021-11-22 20:15:19 +03:00
 * @comp: Completion to complete
 * @code: The integer value to return to kthread_stop().
 *
 * If present, complete @comp and then return @code to kthread_stop().
 *
 * A kernel thread whose module may be removed after the completion of
 * @comp can use this function to exit safely.
 *
 * Does not return.
 */
void __noreturn kthread_complete_and_exit(struct completion *comp, long code)
{
	if (comp)
		complete(comp);
	kthread_exit(code);
}
EXPORT_SYMBOL(kthread_complete_and_exit);
2005-04-17 02:20:36 +04:00
static int kthread(void *_create)
{
2020-11-10 14:38:47 +03:00
	static const struct sched_param param = { .sched_priority = 0 };
2009-06-18 03:27:45 +04:00
	/* Copy data: it's on kthread's stack */
2005-04-17 02:20:36 +04:00
	struct kthread_create_info *create = _create;
2009-06-18 03:27:45 +04:00
	int (*threadfn)(void *data) = create->threadfn;
	void *data = create->data;
2013-11-13 03:06:45 +04:00
	struct completion *done;
2016-11-29 20:50:57 +03:00
	struct kthread *self;
2009-06-18 03:27:45 +04:00
	int ret;
2005-04-17 02:20:36 +04:00
2021-05-10 18:10:23 +03:00
	self = to_kthread(current);
2005-04-17 02:20:36 +04:00
kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal
The comments in kernel/kthread.c create a feeling that only SIGKILL is
able to terminate the creation of kernel kthreads by
kthread_create()/_on_node()/_on_cpu() APIs.
In reality, wait_for_completion_killable() might be killed by any fatal
signal that does not have a custom handler:
(!siginmask(signr, SIG_KERNEL_IGNORE_MASK|SIG_KERNEL_STOP_MASK) && \
(t)->sighand->action[(signr)-1].sa.sa_handler == SIG_DFL)
static inline void signal_wake_up(struct task_struct *t, bool resume)
{
signal_wake_up_state(t, resume ? TASK_WAKEKILL : 0);
}
static void complete_signal(int sig, struct task_struct *p, enum pid_type type)
{
[...]
/*
* Found a killable thread. If the signal will be fatal,
* then start taking the whole group down immediately.
*/
if (sig_fatal(p, sig) ...) {
if (!sig_kernel_coredump(sig)) {
[...]
do {
task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
sigaddset(&t->pending.signal, SIGKILL);
signal_wake_up(t, 1);
} while_each_thread(p, t);
return;
}
}
}
Update the comments in kernel/kthread.c to make this more obvious.
The motivation for this change was debugging why a module initialization
failed. The module was being loaded from initrd. It "magically" failed
when systemd was switching to the real root. The cleanup operations sent
SIGTERM to various pending processes that were started from initrd.
Link: https://lkml.kernel.org/r/20220315102444.2380-1-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-03-15 13:24:44 +03:00
	/* Release the structure when the caller is killed by a fatal signal. */
2013-11-13 03:06:45 +04:00
	done = xchg(&create->done, NULL);
	if (!done) {
		kfree(create);
2021-11-22 19:27:36 +03:00
		kthread_exit(-EINTR);
2016-11-29 20:50:57 +03:00
	}
2020-05-06 19:09:34 +03:00
	self->threadfn = threadfn;
2016-11-29 20:50:57 +03:00
	self->data = data;
2020-11-10 14:38:47 +03:00
	/*
	 * The new thread inherited kthreadd's priority and CPU mask. Reset
	 * back to default in case they have been changed.
	 */
	sched_setscheduler_nocheck(current, SCHED_NORMAL, &param);
2022-02-07 18:59:06 +03:00
	set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_KTHREAD));
2020-11-10 14:38:47 +03:00
2005-04-17 02:20:36 +04:00
	/* OK, tell user we're spawned, wait for stop or wakeup */
2007-05-24 00:57:27 +04:00
	__set_current_state(TASK_UNINTERRUPTIBLE);
2009-04-09 19:50:35 +04:00
	create->result = current;
2020-03-06 10:01:33 +03:00
	/*
	 * Thread is going to call schedule(), do not preempt it,
	 * or the creator may spend more time in wait_task_inactive().
	 */
	preempt_disable();
2013-11-13 03:06:45 +04:00
	complete(done);
2020-03-06 10:01:33 +03:00
	schedule_preempt_disabled();
	preempt_enable();
2005-04-17 02:20:36 +04:00
2009-06-18 03:27:45 +04:00
	ret = -EINTR;
2016-11-29 20:50:57 +03:00
	if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) {
cgroup, kthread: close race window where new kthreads can be migrated to non-root cgroups
Creation of a kthread goes through a couple interlocked stages between
the kthread itself and its creator. Once the new kthread starts
running, it initializes itself and wakes up the creator. The creator
then can further configure the kthread and then let it start doing its
job by waking it up.
In this configuration-by-creator stage, the creator is the only one
that can wake it up but the kthread is visible to userland. When
altering the kthread's attributes from userland is allowed, this is
fine; however, for cases where CPU affinity is critical,
kthread_bind() is used to first disable affinity changes from userland
and then set the affinity. This also prevents the kthread from being
migrated into non-root cgroups as that can affect the CPU affinity and
many other things.
Unfortunately, the cgroup side of protection is racy. While the
PF_NO_SETAFFINITY flag prevents further migrations, userland can win
the race before the creator sets the flag with kthread_bind() and put
the kthread in a non-root cgroup, which can lead to all sorts of
problems including incorrect CPU affinity and starvation.
This bug got triggered by userland which periodically tries to migrate
all processes in the root cpuset cgroup to a non-root one. Per-cpu
workqueue workers got caught while being created and ended up with
incorrect CPU affinity, breaking concurrency management and sometimes
stalling workqueue execution.
This patch adds task->no_cgroup_migration which disallows the task to
be migrated by userland. kthreadd starts with the flag set making
every child kthread start in the root cgroup with migration
disallowed. The flag is cleared after the kthread finishes
initialization by which time PF_NO_SETAFFINITY is set if the kthread
should stay in the root cgroup.
It'd be better to wait for the initialization instead of failing but I
couldn't think of a way of implementing that without adding either a
new PF flag, or sleeping and retrying from waiting side. Even if
userland depends on changing cgroup membership of a kthread, it either
has to be synchronized with kthread_create() or periodically repeat,
so it's unlikely that this would break anything.
v2: Switch to a simpler implementation using a new task_struct bit
field suggested by Oleg.
Signed-off-by: Tejun Heo <tj@kernel.org>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-and-debugged-by: Chris Mason <clm@fb.com>
Cc: stable@vger.kernel.org # v4.3+ (we can't close the race on < v4.3)
Signed-off-by: Tejun Heo <tj@kernel.org>
2017-03-16 23:54:24 +03:00
		cgroup_kthread_ready();
2016-11-29 20:50:57 +03:00
		__kthread_parkme(self);
2012-07-16 14:42:36 +04:00
		ret = threadfn(data);
	}
2021-11-22 19:27:36 +03:00
	kthread_exit(ret);
2005-04-17 02:20:36 +04:00
}
2021-01-11 13:48:07 +03:00
/* called from kernel_clone() to get node information for about to be created task */
2011-03-23 02:30:44 +03:00
int tsk_fork_get_node(struct task_struct *tsk)
{
#ifdef CONFIG_NUMA
	if (tsk == kthreadd_task)
		return tsk->pref_node_fork;
#endif
2014-04-04 01:46:25 +04:00
	return NUMA_NO_NODE;
2011-03-23 02:30:44 +03:00
}
2007-05-09 13:34:32 +04:00
static void create_kthread(struct kthread_create_info *create)
2005-04-17 02:20:36 +04:00
{
	int pid;
2011-03-23 02:30:44 +03:00
#ifdef CONFIG_NUMA
	current->pref_node_fork = create->node;
#endif
2005-04-17 02:20:36 +04:00
	/* We want our own signal handler (we take no signals by default). */
	pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
2009-06-18 03:27:43 +04:00
	if (pid < 0) {
2022-03-15 13:24:44 +03:00
		/* Release the structure when caller killed by a fatal signal. */
2013-11-13 03:06:45 +04:00
		struct completion *done = xchg(&create->done, NULL);
		if (!done) {
			kfree(create);
			return;
		}
2005-04-17 02:20:36 +04:00
		create->result = ERR_PTR(pid);
2013-11-13 03:06:45 +04:00
		complete(done);
2009-06-18 03:27:43 +04:00
	}
2005-04-17 02:20:36 +04:00
}
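The `xchg(&create->done, NULL)` calls in kthread() and create_kthread() decide who frees the request structure: whichever side swaps the pointer to NULL first learns, from the old value, whether the other side is still present. The following user-space sketch illustrates that handoff with C11 atomics. It is an illustration under stated assumptions, not the kernel code: `struct request`, `waiter_abandon()` and `worker_finish()` are hypothetical names, and an atomic flag stands in for the completion.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kthread_create_info. */
struct request {
	_Atomic(atomic_int *) done;	/* non-NULL while the waiter still waits */
	int result;
};

/*
 * Waiter side, called when the wait was interrupted. Returns true if the
 * waiter got there first: the worker will find NULL and free the request.
 * Returns false if the worker already took the pointer, so the waiter must
 * keep waiting and remains the owner of the request.
 */
bool waiter_abandon(struct request *req)
{
	return atomic_exchange(&req->done, NULL) != NULL;
}

/*
 * Worker side. Returns true if the waiter was notified; returns false if
 * the waiter had already left, in which case the request is freed here.
 */
bool worker_finish(struct request *req, int result)
{
	atomic_int *done = atomic_exchange(&req->done, NULL);

	if (!done) {
		free(req);
		return false;
	}
	req->result = result;
	atomic_store(done, 1);	/* stands in for complete(done) */
	return true;
}
```

Because each side performs a single atomic exchange, exactly one of them observes the non-NULL pointer, so the request is freed once regardless of which side runs first.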
2016-12-13 03:40:39 +03:00
static __printf(4, 0)
struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
2016-10-11 23:55:27 +03:00
					     void *data, int node,
					     const char namefmt[],
					     va_list args)
2005-04-17 02:20:36 +04:00
{
2013-11-13 03:06:45 +04:00
	DECLARE_COMPLETION_ONSTACK(done);
	struct task_struct *task;
	struct kthread_create_info *create = kmalloc(sizeof(*create),
						     GFP_KERNEL);
	if (!create)
		return ERR_PTR(-ENOMEM);
	create->threadfn = threadfn;
	create->data = data;
	create->node = node;
	create->done = &done;
2007-05-09 13:34:32 +04:00
	spin_lock(&kthread_create_lock);
2013-11-13 03:06:45 +04:00
	list_add_tail(&create->list, &kthread_create_list);
2007-05-09 13:34:32 +04:00
	spin_unlock(&kthread_create_lock);
2008-04-29 11:59:23 +04:00
	wake_up_process(kthreadd_task);
2013-11-13 03:06:45 +04:00
	/*
	 * Wait for completion in killable state, for I might be chosen by
	 * the OOM killer while kthreadd is trying to allocate memory for
	 * new kernel thread.
	 */
	if (unlikely(wait_for_completion_killable(&done))) {
		/*
2022-03-15 13:24:44 +03:00
		 * If I was killed by a fatal signal before kthreadd (or new
		 * kernel thread) calls complete(), leave the cleanup of this
		 * structure to that thread.
2013-11-13 03:06:45 +04:00
		 */
		if (xchg(&create->done, NULL))
2014-06-05 03:05:36 +04:00
			return ERR_PTR(-EINTR);
2013-11-13 03:06:45 +04:00
		/*
		 * kthreadd (or new kernel thread) will call complete()
		 * shortly.
		 */
		wait_for_completion(&done);
	}
	task = create->result;
	if (!IS_ERR(task)) {
kthread, tracing: Don't expose half-written comm when creating kthreads
There is a window for racing when printing directly to task->comm,
allowing other threads to see a non-terminated string. The vsnprintf
function fills the buffer, counts the truncated chars, then finally
writes the \0 at the end.
creator                                 other
vsnprintf:
  fill (not terminated)
  count the rest                        trace_sched_waking(p):
  ...                                     memcpy(comm, p->comm, TASK_COMM_LEN)
  write \0
The consequences depend on how 'other' uses the string. In our case,
it was copied into the tracing system's saved cmdlines, a buffer of
adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be):
crash-arm64> x/1024s savedcmd->saved_cmdlines | grep 'evenk'
0xffffffd5b3818640: "irq/497-pwr_evenkworker/u16:12"
...and a strcpy out of there would cause stack corruption:
[224761.522292] Kernel panic - not syncing: stack-protector:
Kernel stack is corrupted in: ffffff9bf9783c78
crash-arm64> kbt | grep 'comm\|trace_print_context'
#6 0xffffff9bf9783c78 in trace_print_context+0x18c(+396)
comm (char [16]) = "irq/497-pwr_even"
crash-arm64> rd 0xffffffd4d0e17d14 8
ffffffd4d0e17d14: 2f71726900000000 5f7277702d373934 ....irq/497-pwr_
ffffffd4d0e17d24: 726f776b6e657665 3a3631752f72656b evenkworker/u16:
ffffffd4d0e17d34: f9780248ff003231 cede60e0ffffff9b 12..H.x......`..
ffffffd4d0e17d44: cede60c8ffffffd4 00000fffffffffd4 .....`..........
The workaround in e09e28671 (use strlcpy in __trace_find_cmdline) was
likely needed because of this same bug.
Solved by vsnprintf:ing to a local buffer, then using set_task_comm().
This way, there won't be a window where comm is not terminated.
Link: http://lkml.kernel.org/r/20180726071539.188015-1-snild@sony.com
Cc: stable@vger.kernel.org
Fixes: bc0c38d139ec7 ("ftrace: latency tracer infrastructure")
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Snild Dolkow <snild@sony.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-07-26 10:15:39 +03:00
		char name[TASK_COMM_LEN];
kthread: dynamically allocate memory to store kthread's full name
When I was implementing a new per-cpu kthread cfs_migration, I found the
comm of it "cfs_migration/%u" is truncated due to the limitation of
TASK_COMM_LEN. For example, the comm of the percpu thread on CPU10~19
all have the same name "cfs_migration/1", which will confuse the user.
This issue is not critical, because we can get the corresponding CPU
from the task's Cpus_allowed. But for kthreads corresponding to other
hardware devices, it is not easy to get the detailed device info from
task comm, for example,
jbd2/nvme0n1p2-
xfs-reclaim/sdf
Currently there are so many truncated kthreads:
rcu_tasks_kthre
rcu_tasks_rude_
rcu_tasks_trace
poll_mpt3sas0_s
ext4-rsv-conver
xfs-reclaim/sd{a, b, c, ...}
xfs-blockgc/sd{a, b, c, ...}
xfs-inodegc/sd{a, b, c, ...}
audit_send_repl
ecryptfs-kthrea
vfio-irqfd-clea
jbd2/nvme0n1p2-
...
We can shorten these names to work around this problem, but it may be
not applied to all of the truncated kthreads. Take 'jbd2/nvme0n1p2-'
for example, it is a nice name, and it is not a good idea to shorten it.
One possible way to fix this issue is extending the task comm size, but
as task->comm is used in lots of places, that may cause some potential
buffer overflows. Another more conservative approach is introducing a
new pointer to store kthread's full name if it is truncated, which won't
introduce too much overhead as it is in the non-critical path. In the end,
we decided to use the second approach. See also the discussion
in this thread:
https://lore.kernel.org/lkml/20211101060419.4682-1-laoar.shao@gmail.com/
After this change, the full name of these truncated kthreads will be
displayed via /proc/[pid]/comm:
rcu_tasks_kthread
rcu_tasks_rude_kthread
rcu_tasks_trace_kthread
poll_mpt3sas0_statu
ext4-rsv-conversion
xfs-reclaim/sdf1
xfs-blockgc/sdf1
xfs-inodegc/sdf1
audit_send_reply
ecryptfs-kthread
vfio-irqfd-cleanup
jbd2/nvme0n1p2-8
Link: https://lkml.kernel.org/r/20211120112850.46047-1-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 05:08:43 +03:00
		va_list aq;
		int len;
2009-04-09 19:50:36 +04:00
2018-07-26 10:15:39 +03:00
		/*
		 * task is already visible to other tasks, so updating
		 * COMM must be protected.
		 */
2022-01-20 05:08:43 +03:00
		va_copy(aq, args);
		len = vsnprintf(name, sizeof(name), namefmt, aq);
		va_end(aq);
		if (len >= TASK_COMM_LEN) {
			struct kthread *kthread = to_kthread(task);
			/* leave it truncated when out of memory. */
			kthread->full_name = kvasprintf(GFP_KERNEL, namefmt, args);
		}
2018-07-26 10:15:39 +03:00
		set_task_comm(task, name);
2005-04-17 02:20:36 +04:00
	}
2013-11-13 03:06:45 +04:00
	kfree(create);
	return task;
2005-04-17 02:20:36 +04:00
}
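The naming logic at the end of __kthread_create_on_node() combines two ideas from the commit messages: format into a local buffer so that a half-written name is never visible, and keep a separately allocated full name when the result does not fit in TASK_COMM_LEN bytes. A user-space sketch of that pattern follows. It is only an approximation under stated assumptions: `format_name()` and `COMM_LEN` are hypothetical, `malloc()` plus a second `vsnprintf()` stands in for kvasprintf(), and the final `memcpy()` stands in for set_task_comm().

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define COMM_LEN 16	/* assumed to match the kernel's TASK_COMM_LEN */

/*
 * Format a name into a local buffer, then publish it to comm in one copy.
 * Returns a heap copy of the untruncated name when it did not fit in
 * COMM_LEN bytes; returns NULL when it fit or when allocation failed.
 */
char *format_name(char comm[COMM_LEN], const char *fmt, ...)
{
	char name[COMM_LEN] = "";
	char *full = NULL;
	va_list args, aq;
	int len;

	va_start(args, fmt);
	va_copy(aq, args);	/* args is consumed again below */
	len = vsnprintf(name, sizeof(name), fmt, aq);
	va_end(aq);

	if (len >= COMM_LEN) {
		/* Leave the name truncated when out of memory. */
		full = malloc((size_t)len + 1);
		if (full)
			vsnprintf(full, (size_t)len + 1, fmt, args);
	}
	va_end(args);

	/* name is NUL-terminated here, so comm never holds a partial string. */
	memcpy(comm, name, sizeof(name));
	return full;
}
```

With the format "cfs_migration/%u" and CPU 12, the 16-character name does not fit, so comm holds the truncated "cfs_migration/1" while the returned string holds the full name, which is the situation the second commit message describes.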
2016-10-11 23:55:27 +03:00
/**
 * kthread_create_on_node - create a kthread.
 * @threadfn: the function to run until signal_pending(current).
 * @data: data ptr for @threadfn.
 * @node: task and thread structures for the thread are allocated on this node
 * @namefmt: printf-style name for the thread.
 *
 * Description: This helper function creates and names a kernel
 * thread. The thread will be stopped: use wake_up_process() to start
 * it. See also kthread_run(). The new thread has SCHED_NORMAL policy and
 * is affine to all CPUs.
 *
 * If thread is going to be bound on a particular cpu, give its node
 * in @node, to get NUMA affinity for kthread stack, or else give NUMA_NO_NODE.
 * When woken, the thread will run @threadfn() with @data as its
2021-10-20 20:43:58 +03:00
 * argument. @threadfn() can either return directly if it is a
2016-10-11 23:55:27 +03:00
 * standalone thread for which no one will call kthread_stop(), or
 * return when 'kthread_should_stop()' is true (which means
 * kthread_stop() has been called). The return value should be zero
 * or a negative error number; it will be passed to kthread_stop().
 *
 * Returns a task_struct or ERR_PTR(-ENOMEM) or ERR_PTR(-EINTR).
 */
struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
					   void *data, int node,
					   const char namefmt[],
					   ...)
{
	struct task_struct *task;
	va_list args;
	va_start(args, namefmt);
	task = __kthread_create_on_node(threadfn, data, node, namefmt, args);
	va_end(args);
	return task;
}
2011-03-23 02:30:44 +03:00
EXPORT_SYMBOL(kthread_create_on_node);
2005-04-17 02:20:36 +04:00
2021-06-11 11:28:17 +03:00
static void __kthread_bind_mask(struct task_struct *p, const struct cpumask *mask, unsigned int state)
2012-07-16 14:42:36 +04:00
{
2015-05-15 18:43:34 +03:00
	unsigned long flags;
2013-04-09 11:33:34 +04:00
	if (!wait_task_inactive(p, state)) {
		WARN_ON(1);
		return;
	}
2015-05-15 18:43:34 +03:00
2012-07-16 14:42:36 +04:00
	/* It's safe because the task is inactive. */
2015-05-15 18:43:34 +03:00
	raw_spin_lock_irqsave(&p->pi_lock, flags);
	do_set_cpus_allowed(p, mask);
2013-03-20 00:45:20 +04:00
	p->flags |= PF_NO_SETAFFINITY;
2015-05-15 18:43:34 +03:00
	raw_spin_unlock_irqrestore(&p->pi_lock, flags);
}
2021-06-11 11:28:17 +03:00
static void __kthread_bind(struct task_struct *p, unsigned int cpu, unsigned int state)
2015-05-15 18:43:34 +03:00
{
	__kthread_bind_mask(p, cpumask_of(cpu), state);
}
void kthread_bind_mask(struct task_struct *p, const struct cpumask *mask)
{
	__kthread_bind_mask(p, mask, TASK_UNINTERRUPTIBLE);
2012-07-16 14:42:36 +04:00
}
2009-12-16 20:04:39 +03:00
/**
 * kthread_bind - bind a just-created kthread to a cpu.
 * @p: thread created by kthread_create().
 * @cpu: cpu (might not be online, must be possible) for @p to run on.
 *
 * Description: This function is equivalent to set_cpus_allowed(),
 * except that @cpu doesn't need to be online, and the thread must be
 * stopped (i.e., just returned from kthread_create()).
 */
void kthread_bind(struct task_struct *p, unsigned int cpu)
{
2013-04-09 11:33:34 +04:00
	__kthread_bind(p, cpu, TASK_UNINTERRUPTIBLE);
2009-12-16 20:04:39 +03:00
}
EXPORT_SYMBOL(kthread_bind);
2012-07-16 14:42:36 +04:00
/**
 * kthread_create_on_cpu - Create a cpu bound kthread
 * @threadfn: the function to run until signal_pending(current).
 * @data: data ptr for @threadfn.
 * @cpu: The cpu on which the thread should be bound.
 * @namefmt: printf-style name for the thread. Format is restricted
 *	     to "name.*%u". Code fills in cpu number.
 *
 * Description: This helper function creates and names a kernel thread.
 */
struct task_struct *kthread_create_on_cpu(int (*threadfn)(void *data),
					  void *data, unsigned int cpu,
					  const char *namefmt)
{
	struct task_struct *p;
2014-10-10 02:26:18 +04:00
	p = kthread_create_on_node(threadfn, data, cpu_to_node(cpu), namefmt,
2012-07-16 14:42:36 +04:00
				   cpu);
	if (IS_ERR(p))
		return p;
2016-10-11 23:55:23 +03:00
	kthread_bind(p, cpu);
	/* CPU hotplug needs to bind once again when unparking the thread. */
2012-07-16 14:42:36 +04:00
	to_kthread(p)->cpu = cpu;
	return p;
}
2022-01-15 01:02:52 +03:00
EXPORT_SYMBOL(kthread_create_on_cpu);
2012-07-16 14:42:36 +04:00
2021-01-12 13:24:04 +03:00
void kthread_set_per_cpu(struct task_struct *k, int cpu)
{
	struct kthread *kthread = to_kthread(k);
	if (!kthread)
		return;
	WARN_ON_ONCE(!(k->flags & PF_NO_SETAFFINITY));
	if (cpu < 0) {
		clear_bit(KTHREAD_IS_PER_CPU, &kthread->flags);
		return;
	}
	kthread->cpu = cpu;
	set_bit(KTHREAD_IS_PER_CPU, &kthread->flags);
}
2021-04-20 11:18:17 +03:00
bool kthread_is_per_cpu(struct task_struct *p)
2021-01-12 13:24:04 +03:00
{
2021-04-20 11:18:17 +03:00
	struct kthread *kthread = __to_kthread(p);
2021-01-12 13:24:04 +03:00
	if (!kthread)
		return false;
	return test_bit(KTHREAD_IS_PER_CPU, &kthread->flags);
}
kthread: Don't use to_live_kthread() in kthread_[un]park()
Now that to_kthread() is always valid, change kthread_park() and
kthread_unpark() to use it and kill to_live_kthread().
The conversion of kthread_unpark() is trivial. If KTHREAD_IS_PARKED is set
then the task has called complete(&self->parked) and therefore the function
cannot race against a concurrent kthread_stop() and exit.
kthread_park() is more tricky, because its semantics are not well
defined. It returns -ENOSYS if the thread exited but this can never happen
and as Roman pointed out kthread_park() can obviously block forever if it
would race with the exiting kthread.
The usage of kthread_park() in cpuhp code (cpu.c, smpboot.c, stop_machine.c)
is fine. It can never see an exiting/exited kthread, smpboot_destroy_threads()
clears *ht->store, smpboot_park_thread() checks it is not NULL under the same
smpboot_threads_lock. cpuhp_threads and cpu_stop_threads never exit, so other
callers are fine too.
But it has two more users:
- watchdog_park_threads():
The code is actually correct, get_online_cpus() ensures that
kthread_park() can't race with itself (note that kthread_park() can't
handle this race correctly), but it should not use kthread_park()
directly.
- drivers/gpu/drm/amd/scheduler/gpu_scheduler.c should not use
kthread_park() either.
kthread_park() must not be called after amd_sched_fini() which does
kthread_stop(), otherwise even to_live_kthread() is not safe because
task_struct can be already freed and sched->thread can point to nowhere.
The usage of kthread_park/unpark should either be restricted to core code
which is properly protected against the exit race or made more robust so it
is safe to use it in drivers.
To catch eventual exit issues, add a WARN_ON(PF_EXITING) for now.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Chunming Zhou <David1.Zhou@amd.com>
Cc: Roman Pen <roman.penyaev@profitbricks.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20161129175107.GA5339@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-29 20:51:07 +03:00
/**
 * kthread_unpark - unpark a thread created by kthread_create().
 * @k: thread created by kthread_create().
 *
 * Sets kthread_should_park() for @k to return false, wakes it, and
 * waits for it to return. If the thread is marked percpu then it is
 * bound to the cpu again.
 */
void kthread_unpark(struct task_struct *k)
2013-04-09 11:33:34 +04:00
{
2016-11-29 20:51:07 +03:00
struct kthread * kthread = to_kthread ( k ) ;
2013-04-09 11:33:34 +04:00
/*
kthread, sched/wait: Fix kthread_parkme() completion issue
Even with the wait-loop fixed, there is a further issue with
kthread_parkme(). Upon hotplug, when we do takedown_cpu(),
smpboot_park_threads() can return before all those threads are in fact
blocked, due to the placement of the complete() in __kthread_parkme().
When that happens, sched_cpu_dying() -> migrate_tasks() can end up
migrating such a still runnable task onto another CPU.
Normally the task will have hit schedule() and gone to sleep by the
time we do kthread_unpark(), which will then do __kthread_bind() to
re-bind the task to the correct CPU.
However, when we lose the initial TASK_PARKED store to the concurrent
wakeup issue described previously, do the complete(), get migrated, it
is possible to either:
- observe kthread_unpark()'s clearing of SHOULD_PARK and terminate
the park and set TASK_RUNNING, or
- __kthread_bind()'s wait_task_inactive() to observe the competing
TASK_RUNNING store.
Either way the WARN() in __kthread_bind() will trigger and fail to
correctly set the CPU affinity.
Fix this by only issuing the complete() when the kthread has scheduled
out. This does away with all the icky 'still running' nonsense.
The alternative is to promote TASK_PARKED to a special state, this
guarantees wait_task_inactive() cannot observe a 'stale' TASK_RUNNING
and we'll end up doing the right thing, but this preserves the whole
icky business of potentially migrating the still runnable thing.
Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-01 19:14:45 +03:00
* Newly created kthread was parked when the CPU was offline .
* The binding was lost and we need to set it again .
2013-04-09 11:33:34 +04:00
*/
2018-05-01 19:14:45 +03:00
if ( test_bit ( KTHREAD_IS_PER_CPU , & kthread - > flags ) )
__kthread_bind ( k , kthread - > cpu , TASK_PARKED ) ;
clear_bit ( KTHREAD_SHOULD_PARK , & kthread - > flags ) ;
2018-06-07 12:45:49 +03:00
/*
* __kthread_parkme ( ) will either see ! SHOULD_PARK or get the wakeup .
*/
2018-05-01 19:14:45 +03:00
wake_up_state ( k , TASK_PARKED ) ;
2013-04-09 11:33:34 +04:00
}
2015-08-07 01:46:45 +03:00
EXPORT_SYMBOL_GPL ( kthread_unpark ) ;
2012-07-16 14:42:36 +04:00
/**
* kthread_park - park a thread created by kthread_create ( ) .
* @ k : thread created by kthread_create ( ) .
*
* Sets kthread_should_park ( ) for @ k to return true , wakes it , and
* waits for it to return . This can also be called after kthread_create ( )
* instead of calling wake_up_process ( ) : the thread will park without
* calling threadfn ( ) .
*
* Returns 0 if the thread is parked , - ENOSYS if the thread exited ,
* or - EBUSY if a park request is already pending for @ k .
* If called by the kthread itself just the park bit is set .
*/
int kthread_park ( struct task_struct * k )
{
2016-11-29 20:51:07 +03:00
struct kthread * kthread = to_kthread ( k ) ;
if ( WARN_ON ( k - > flags & PF_EXITING ) )
return - ENOSYS ;
2018-06-07 11:55:56 +03:00
if ( WARN_ON_ONCE ( test_bit ( KTHREAD_SHOULD_PARK , & kthread - > flags ) ) )
return - EBUSY ;
2018-05-01 19:14:45 +03:00
set_bit ( KTHREAD_SHOULD_PARK , & kthread - > flags ) ;
if ( k ! = current ) {
wake_up_process ( k ) ;
2018-06-07 12:45:49 +03:00
/*
* Wait for __kthread_parkme ( ) to complete ( ) , this means we
* _will_ have TASK_PARKED and are about to call schedule ( ) .
*/
2018-05-01 19:14:45 +03:00
wait_for_completion ( & kthread - > parked ) ;
2018-06-07 12:45:49 +03:00
/*
* Now wait for that schedule ( ) to complete and the task to
* get scheduled out .
*/
WARN_ON_ONCE ( ! wait_task_inactive ( k , TASK_PARKED ) ) ;
2012-07-16 14:42:36 +04:00
}
2016-11-29 20:51:07 +03:00
return 0 ;
2012-07-16 14:42:36 +04:00
}
2015-08-07 01:46:45 +03:00
EXPORT_SYMBOL_GPL ( kthread_park ) ;
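The park/unpark handshake documented above can be sketched in userspace. This is only a rough model under stated assumptions: it uses POSIX threads, a mutex and a condition variable in place of the kernel's flag bits, completion and scheduler states, and every name in it (model_thread, model_park, model_unpark, model_stop, model_demo) is hypothetical rather than a kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct kthread; not a kernel type. */
struct model_thread {
	pthread_t tid;
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool should_park;	/* models KTHREAD_SHOULD_PARK */
	bool should_stop;	/* models KTHREAD_SHOULD_STOP */
	bool parked;		/* models the "parked" completion */
};

static void *model_threadfn(void *arg)
{
	struct model_thread *t = arg;

	pthread_mutex_lock(&t->lock);
	while (!t->should_stop) {
		if (t->should_park) {
			/* Like __kthread_parkme(): report, then sleep until released. */
			t->parked = true;
			pthread_cond_broadcast(&t->cond);
			while (t->should_park && !t->should_stop)
				pthread_cond_wait(&t->cond, &t->lock);
			t->parked = false;
			continue;
		}
		/* "Work": drop the lock so park/stop requests can get in. */
		pthread_mutex_unlock(&t->lock);
		sched_yield();
		pthread_mutex_lock(&t->lock);
	}
	pthread_mutex_unlock(&t->lock);
	return NULL;
}

/* Request a park and wait for the acknowledgement; -EBUSY if one is pending. */
static int model_park(struct model_thread *t)
{
	int ret = 0;

	pthread_mutex_lock(&t->lock);
	if (t->should_park) {
		ret = -EBUSY;
	} else {
		t->should_park = true;
		pthread_cond_broadcast(&t->cond);
		while (!t->parked)
			pthread_cond_wait(&t->cond, &t->lock);
	}
	pthread_mutex_unlock(&t->lock);
	return ret;
}

/* Clear the park request and wake the thread; does not wait for it. */
static void model_unpark(struct model_thread *t)
{
	pthread_mutex_lock(&t->lock);
	t->should_park = false;
	pthread_cond_broadcast(&t->cond);
	pthread_mutex_unlock(&t->lock);
}

/* Ask the thread to exit and wait until it has done so. */
static void model_stop(struct model_thread *t)
{
	pthread_mutex_lock(&t->lock);
	t->should_stop = true;
	pthread_cond_broadcast(&t->cond);
	pthread_mutex_unlock(&t->lock);
	pthread_join(t->tid, NULL);
}

/* Returns 0 when the handshake behaves as the doc comment describes. */
static int model_demo(void)
{
	struct model_thread t = { .should_park = false };
	int first, second;
	bool was_parked;

	pthread_mutex_init(&t.lock, NULL);
	pthread_cond_init(&t.cond, NULL);
	if (pthread_create(&t.tid, NULL, model_threadfn, &t))
		return -1;
	first = model_park(&t);		/* blocks until the thread has parked */
	second = model_park(&t);	/* a park request is already pending */
	pthread_mutex_lock(&t.lock);
	was_parked = t.parked;
	pthread_mutex_unlock(&t.lock);
	model_unpark(&t);
	model_stop(&t);
	return (first == 0 && second == -EBUSY && was_parked) ? 0 : 1;
}
```

The model keeps the one asymmetry that matters to callers: model_park() blocks until the thread has acknowledged the request, while model_unpark() only clears the request and issues the wakeup. It does not model CPU rebinding or the wait_task_inactive() step.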
2012-07-16 14:42:36 +04:00
2006-06-25 16:49:19 +04:00
/**
* kthread_stop - stop a thread created by kthread_create ( ) .
* @ k : thread created by kthread_create ( ) .
*
* Sets kthread_should_stop ( ) for @ k to return true , wakes it , and
2009-06-19 04:51:13 +04:00
* waits for it to exit . This can also be called after kthread_create ( )
* instead of calling wake_up_process ( ) : the thread will exit without
* calling threadfn ( ) .
*
2021-11-22 19:27:36 +03:00
* If threadfn ( ) may call kthread_exit ( ) itself , the caller must ensure
2009-06-19 04:51:13 +04:00
* task_struct can ' t go away .
2006-06-25 16:49:19 +04:00
*
* Returns the result of threadfn ( ) , or % - EINTR if wake_up_process ( )
* was never called .
*/
2005-04-17 02:20:36 +04:00
int kthread_stop ( struct task_struct * k )
{
2013-04-30 02:05:12 +04:00
struct kthread * kthread ;
2005-04-17 02:20:36 +04:00
int ret ;
tracing, sched: LTTng instrumentation - scheduler
Instrument the scheduler activity (sched_switch, migration, wakeups,
wait for a task, signal delivery) and process/thread
creation/destruction (fork, exit, kthread stop). Actually, kthread
creation is not instrumented in this patch because it is architecture
dependent. This allows connecting tracers such as ftrace, which detects
scheduling latencies and good/bad scheduler decisions. Tools like LTTng can
export this scheduler information along with instrumentation of the rest
of the kernel activity to perform post-mortem analysis on the scheduler
activity.
About the performance impact of tracepoints (which is comparable to
markers), even without immediate values optimizations, tests done by
Hideo Aoki on ia64 show no regression. His test case was using hackbench
on a kernel where scheduler instrumentation (about 5 events in code
scheduler code) was added. See the "Tracepoints" patch header for
performance result detail.
Changelog :
- Change instrumentation location and parameter to match ftrace
instrumentation, previously done with kernel markers.
[ mingo@elte.hu: conflict resolutions ]
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: 'Peter Zijlstra' <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 20:16:17 +04:00
trace_sched_kthread_stop ( k ) ;
2013-04-30 02:05:12 +04:00
get_task_struct ( k ) ;
2016-11-29 20:51:03 +03:00
kthread = to_kthread ( k ) ;
set_bit ( KTHREAD_SHOULD_STOP , & kthread - > flags ) ;
2016-11-29 20:51:07 +03:00
kthread_unpark ( k ) ;
signal: break out of wait loops on kthread_stop()
I was recently surprised to learn that msleep_interruptible(),
wait_for_completion_interruptible_timeout(), and related functions
simply hung when I called kthread_stop() on kthreads using them. The
solution to fixing the case with msleep_interruptible() was more simply
to move to schedule_timeout_interruptible(). Why?
The reason is that msleep_interruptible(), and many functions just like
it, has a loop like this:
while (timeout && !signal_pending(current))
timeout = schedule_timeout_interruptible(timeout);
The call to kthread_stop() woke up the thread, so
schedule_timeout_interruptible() returned early, but because
signal_pending() returned false, it went back into another timeout,
which was never woken up.
This wait loop pattern is common to various pieces of code, and I
suspect that the subtle misuse in a kthread that caused a deadlock in
the code I looked at last week is also found elsewhere.
So this commit causes signal_pending() to return true when
kthread_stop() is called, by setting TIF_NOTIFY_SIGNAL.
The same also probably applies to the similar kthread_park()
functionality, but that can be addressed later, as its semantics are
slightly different.
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
v1: https://lkml.kernel.org/r/20220627120020.608117-1-Jason@zx2c4.com
v2: https://lkml.kernel.org/r/20220627145716.641185-1-Jason@zx2c4.com
v3: https://lkml.kernel.org/r/20220628161441.892925-1-Jason@zx2c4.com
v4: https://lkml.kernel.org/r/20220711202136.64458-1-Jason@zx2c4.com
v5: https://lkml.kernel.org/r/20220711232123.136330-1-Jason@zx2c4.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2022-07-12 02:21:23 +03:00
set_tsk_thread_flag ( k , TIF_NOTIFY_SIGNAL ) ;
2016-11-29 20:51:03 +03:00
wake_up_process ( k ) ;
wait_for_completion ( & kthread - > exited ) ;
2021-12-03 20:42:49 +03:00
ret = kthread - > result ;
2005-04-17 02:20:36 +04:00
put_task_struct ( k ) ;
2008-07-18 20:16:17 +04:00
2013-04-30 02:05:12 +04:00
trace_sched_kthread_stop_ret ( ret ) ;
2005-04-17 02:20:36 +04:00
return ret ;
}
2006-07-14 11:24:05 +04:00
EXPORT_SYMBOL ( kthread_stop ) ;
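The wait-loop hang described in the kthread_stop() changelog above can be illustrated with a small single-threaded simulation. This is a sketch under stated assumptions, not kernel code: the sim_* names and the tick counts are hypothetical, the "sleep" is just arithmetic, and the stop request is assumed to arrive one tick into the first sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names are kernel APIs. */
struct sim_task {
	bool notify_signal;	/* models TIF_NOTIFY_SIGNAL */
	bool stop_incoming;	/* a stop request will interrupt the next sleep */
	bool stop_sets_flag;	/* does the stopper also set the signal flag? */
};

static bool sim_signal_pending(const struct sim_task *t)
{
	return t->notify_signal;
}

/*
 * Models schedule_timeout_interruptible(): returns the remaining ticks.
 * An incoming stop request wakes the sleeper after one tick; otherwise
 * the whole timeout elapses.
 */
static long sim_schedule_timeout(struct sim_task *t, long timeout, long *slept)
{
	if (t->stop_incoming) {
		t->stop_incoming = false;
		t->notify_signal = t->stop_sets_flag;
		*slept += 1;
		return timeout - 1;
	}
	*slept += timeout;
	return 0;
}

/* The loop shape quoted in the changelog; returns the ticks actually slept. */
static long sim_msleep_interruptible(struct sim_task *t, long timeout)
{
	long slept = 0;

	while (timeout && !sim_signal_pending(t))
		timeout = sim_schedule_timeout(t, timeout, &slept);
	return slept;
}

/* Run a 1000-tick sleep that is interrupted by a stop request. */
static long sim_stop_during_sleep(bool stop_sets_flag)
{
	struct sim_task t = {
		.notify_signal = false,
		.stop_incoming = true,
		.stop_sets_flag = stop_sets_flag,
	};

	return sim_msleep_interruptible(&t, 1000);
}
```

With the flag left clear, the loop goes back to sleep for the remaining 999 ticks after the early wakeup, which is the behaviour the changelog reports as a hang. With the flag set by the stopper, the loop exits after the single tick it had already slept.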
2005-04-17 02:20:36 +04:00
2007-07-31 11:39:16 +04:00
int kthreadd ( void * unused )
2005-04-17 02:20:36 +04:00
{
2007-05-09 13:34:32 +04:00
struct task_struct * tsk = current ;
2005-04-17 02:20:36 +04:00
2007-07-31 11:39:16 +04:00
/* Set up a clean context for our children to inherit. */
2007-05-09 13:34:32 +04:00
set_task_comm ( tsk , " kthreadd " ) ;
2007-05-09 13:34:37 +04:00
ignore_signals ( tsk ) ;
2022-02-07 18:59:06 +03:00
set_cpus_allowed_ptr ( tsk , housekeeping_cpumask ( HK_TYPE_KTHREAD ) ) ;
2012-12-13 01:51:39 +04:00
set_mems_allowed ( node_states [ N_MEMORY ] ) ;
2007-05-09 13:34:32 +04:00
2011-11-23 21:28:17 +04:00
current - > flags | = PF_NOFREEZE ;
cgroup, kthread: close race window where new kthreads can be migrated to non-root cgroups
Creation of a kthread goes through a couple interlocked stages between
the kthread itself and its creator. Once the new kthread starts
running, it initializes itself and wakes up the creator. The creator
then can further configure the kthread and then let it start doing its
job by waking it up.
In this configuration-by-creator stage, the creator is the only one
that can wake it up but the kthread is visible to userland. When
altering the kthread's attributes from userland is allowed, this is
fine; however, for cases where CPU affinity is critical,
kthread_bind() is used to first disable affinity changes from userland
and then set the affinity. This also prevents the kthread from being
migrated into non-root cgroups as that can affect the CPU affinity and
many other things.
Unfortunately, the cgroup side of protection is racy. While the
PF_NO_SETAFFINITY flag prevents further migrations, userland can win
the race before the creator sets the flag with kthread_bind() and put
the kthread in a non-root cgroup, which can lead to all sorts of
problems including incorrect CPU affinity and starvation.
This bug got triggered by userland which periodically tries to migrate
all processes in the root cpuset cgroup to a non-root one. Per-cpu
workqueue workers got caught while being created and ended up with
incorrected CPU affinity breaking concurrency management and sometimes
stalling workqueue execution.
This patch adds task->no_cgroup_migration which disallows the task to
be migrated by userland. kthreadd starts with the flag set making
every child kthread start in the root cgroup with migration
disallowed. The flag is cleared after the kthread finishes
initialization by which time PF_NO_SETAFFINITY is set if the kthread
should stay in the root cgroup.
It'd be better to wait for the initialization instead of failing but I
couldn't think of a way of implementing that without adding either a
new PF flag, or sleeping and retrying from waiting side. Even if
userland depends on changing cgroup membership of a kthread, it either
has to be synchronized with kthread_create() or periodically repeat,
so it's unlikely that this would break anything.
v2: Switch to a simpler implementation using a new task_struct bit
field suggested by Oleg.
Signed-off-by: Tejun Heo <tj@kernel.org>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-and-debugged-by: Chris Mason <clm@fb.com>
Cc: stable@vger.kernel.org # v4.3+ (we can't close the race on < v4.3)
Signed-off-by: Tejun Heo <tj@kernel.org>
2017-03-16 23:54:24 +03:00
cgroup_init_kthreadd ( ) ;
2007-05-09 13:34:32 +04:00
for ( ; ; ) {
set_current_state ( TASK_INTERRUPTIBLE ) ;
if ( list_empty ( & kthread_create_list ) )
schedule ( ) ;
__set_current_state ( TASK_RUNNING ) ;
spin_lock ( & kthread_create_lock ) ;
while ( ! list_empty ( & kthread_create_list ) ) {
struct kthread_create_info * create ;
create = list_entry ( kthread_create_list . next ,
struct kthread_create_info , list ) ;
list_del_init ( & create - > list ) ;
spin_unlock ( & kthread_create_lock ) ;
create_kthread ( create ) ;
spin_lock ( & kthread_create_lock ) ;
}
spin_unlock ( & kthread_create_lock ) ;
}
return 0 ;
}
2010-06-29 12:07:09 +04:00
2016-10-11 23:55:20 +03:00
void __kthread_init_worker ( struct kthread_worker * worker ,
2010-12-22 12:27:53 +03:00
const char * name ,
struct lock_class_key * key )
{
2016-10-11 23:55:50 +03:00
memset ( worker , 0 , sizeof ( struct kthread_worker ) ) ;
2019-02-12 19:25:53 +03:00
raw_spin_lock_init ( & worker - > lock ) ;
2010-12-22 12:27:53 +03:00
lockdep_set_class_and_name ( & worker - > lock , key , name ) ;
INIT_LIST_HEAD ( & worker - > work_list ) ;
2016-10-11 23:55:40 +03:00
INIT_LIST_HEAD ( & worker - > delayed_work_list ) ;
2010-12-22 12:27:53 +03:00
}
2016-10-11 23:55:20 +03:00
EXPORT_SYMBOL_GPL ( __kthread_init_worker ) ;
2010-12-22 12:27:53 +03:00
2010-06-29 12:07:09 +04:00
/**
* kthread_worker_fn - kthread function to process kthread_worker
* @ worker_ptr : pointer to initialized kthread_worker
*
kthread: add kthread_create_worker*()
Kthread workers are currently created using the classic kthread API,
namely kthread_run(). kthread_worker_fn() is passed as the @threadfn
parameter.
This patch defines kthread_create_worker() and
kthread_create_worker_on_cpu() functions that hide implementation details.
They enforce using kthread_worker_fn() for the main thread. But I doubt
that there are any plans to create any alternative. In fact, I think that
we do not want any alternative main thread because it would be hard to
support consistency with the rest of the kthread worker API.
The naming and function of kthread_create_worker() is inspired by the
workqueues API like the rest of the kthread worker API.
The kthread_create_worker_on_cpu() variant is motivated by the original
kthread_create_on_cpu(). Note that we need to bind per-CPU kthread
workers as soon as they are created. This makes life easier, because
kthread_bind() cannot be used later on an already running worker.
This patch does _not_ convert existing kthread workers. The kthread
worker API needs more improvements first, e.g. a function to destroy the
worker.
IMPORTANT:
kthread_create_worker_on_cpu() allows any format for the worker
name, in contrast to kthread_create_on_cpu(). The good thing is that it
is more generic. The bad thing is that most users will need to pass the
cpu number in two parameters, e.g. kthread_create_worker_on_cpu(cpu,
"helper/%d", cpu).
To be honest, the main motivation was to avoid the need for an empty
va_list. The only legal way was to create a helper function that would be
called with an empty list. Other attempts caused compilation warnings or
even errors on different architectures.
There were also other alternatives, for example, using #define or
splitting __kthread_create_worker(). The used solution looked like the
least ugly.
Link: http://lkml.kernel.org/r/1470754545-17632-6-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 23:55:30 +03:00
 * This function implements the main loop of a kthread worker. It processes
 * work_list until it is stopped with kthread_stop(). It sleeps when the queue
 * is empty.
2010-06-29 12:07:09 +04:00
*
2016-10-11 23:55:30 +03:00
 * Works must not hold any locks or leave preemption or interrupts disabled
 * when they finish. A safe point for freezing is defined between the end of
 * one work and the start of the next.
2016-10-11 23:55:36 +03:00
 *
 * Also, a work must not be handled by more than one worker at the same time;
 * see also kthread_queue_work().
2010-06-29 12:07:09 +04:00
 */
int kthread_worker_fn(void *worker_ptr)
{
	struct kthread_worker *worker = worker_ptr;
	struct kthread_work *work;
2016-10-11 23:55:30 +03:00
	/*
	 * FIXME: Update the check and remove the assignment when all kthread
	 * worker users are created using kthread_create_worker*() functions.
	 */
	WARN_ON(worker->task && worker->task != current);
2010-06-29 12:07:09 +04:00
	worker->task = current;
2016-10-11 23:55:50 +03:00
	if (worker->flags & KTW_FREEZABLE)
		set_freezable();
2010-06-29 12:07:09 +04:00
repeat:
	set_current_state(TASK_INTERRUPTIBLE);	/* mb paired w/ kthread_stop */
	if (kthread_should_stop()) {
		__set_current_state(TASK_RUNNING);
2019-02-12 19:25:53 +03:00
		raw_spin_lock_irq(&worker->lock);
2010-06-29 12:07:09 +04:00
		worker->task = NULL;
2019-02-12 19:25:53 +03:00
		raw_spin_unlock_irq(&worker->lock);
2010-06-29 12:07:09 +04:00
		return 0;
	}
	work = NULL;
2019-02-12 19:25:53 +03:00
	raw_spin_lock_irq(&worker->lock);
2010-06-29 12:07:09 +04:00
	if (!list_empty(&worker->work_list)) {
		work = list_first_entry(&worker->work_list,
					struct kthread_work, node);
		list_del_init(&work->node);
	}
2012-07-20 00:52:53 +04:00
	worker->current_work = work;
2019-02-12 19:25:53 +03:00
	raw_spin_unlock_irq(&worker->lock);
2010-06-29 12:07:09 +04:00
	if (work) {
2020-12-15 06:03:14 +03:00
		kthread_work_func_t func = work->func;
2010-06-29 12:07:09 +04:00
		__set_current_state(TASK_RUNNING);
2020-12-15 06:03:14 +03:00
		trace_sched_kthread_work_execute_start(work);
2010-06-29 12:07:09 +04:00
		work->func(work);
2020-12-15 06:03:14 +03:00
		/*
		 * Avoid dereferencing work after this point. The trace
		 * event only cares about the address.
		 */
		trace_sched_kthread_work_execute_end(work, func);
2010-06-29 12:07:09 +04:00
	} else if (!freezing(current))
		schedule();
	try_to_freeze();
2017-09-01 02:15:23 +03:00
	cond_resched();
2010-06-29 12:07:09 +04:00
	goto repeat;
}
EXPORT_SYMBOL_GPL(kthread_worker_fn);
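The dequeue-and-run cycle of kthread_worker_fn() can be pictured with a small user-space model. This is only a sketch under stated assumptions: every demo_* name is hypothetical and not kernel API, the queue is a singly linked FIFO rather than a list_head, and the model is single-threaded, so the worker->lock, task-state, and freezer handling of the real function are deliberately left out.

```c
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these names are kernel API. */
struct demo_work;
typedef void (*demo_work_func_t)(struct demo_work *work);

struct demo_work {
	struct demo_work *next;		/* singly linked FIFO, not list_head */
	demo_work_func_t func;
	int *counter;			/* payload touched by the callback */
};

struct demo_worker {
	struct demo_work *head;
	struct demo_work *tail;
	struct demo_work *current_work;
	int should_stop;		/* models kthread_should_stop() */
};

/* Append a work item to the tail of the queue. */
static void demo_queue_work(struct demo_worker *worker, struct demo_work *work)
{
	work->next = NULL;
	if (worker->tail)
		worker->tail->next = work;
	else
		worker->head = work;
	worker->tail = work;
}

/* One pass of the loop: returns 1 if a work ran, 0 if the queue was empty. */
static int demo_worker_step(struct demo_worker *worker)
{
	struct demo_work *work = worker->head;

	if (work) {
		/* Detach the item before running it. */
		worker->head = work->next;
		if (!worker->head)
			worker->tail = NULL;
		work->next = NULL;
	}
	worker->current_work = work;

	if (!work)
		return 0;	/* the kernel loop would sleep at this point */

	work->func(work);	/* the callback may requeue or free the work */
	return 1;
}

/* Drain the queue until it is empty or a stop was requested. */
static int demo_worker_run(struct demo_worker *worker)
{
	int handled = 0;

	while (!worker->should_stop && demo_worker_step(worker))
		handled++;
	return handled;
}

/* Example callback: bump the counter the work item points at. */
static void demo_increment(struct demo_work *work)
{
	(*work->counter)++;
}
```

Queueing two items that use demo_increment() and calling demo_worker_run() handles both in FIFO order and leaves current_work set to NULL once the queue is empty, which mirrors how the kernel loop records the item it is about to run.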
2016-12-13 03:40:39 +03:00
static __printf(3, 0) struct kthread_worker *
2016-10-11 23:55:50 +03:00
__kthread_create_worker(int cpu, unsigned int flags,
			const char namefmt[], va_list args)
2016-10-11 23:55:30 +03:00
{
	struct kthread_worker *worker;
	struct task_struct *task;
2019-03-06 02:42:58 +03:00
	int node = NUMA_NO_NODE;
2016-10-11 23:55:30 +03:00
	worker = kzalloc(sizeof(*worker), GFP_KERNEL);
	if (!worker)
		return ERR_PTR(-ENOMEM);
	kthread_init_worker(worker);
2016-11-29 20:51:10 +03:00
	if (cpu >= 0)
		node = cpu_to_node(cpu);
2016-10-11 23:55:30 +03:00
2016-11-29 20:51:10 +03:00
	task = __kthread_create_on_node(kthread_worker_fn, worker,
					node, namefmt, args);
2016-10-11 23:55:30 +03:00
	if (IS_ERR(task))
		goto fail_task;
2016-11-29 20:51:10 +03:00
	if (cpu >= 0)
		kthread_bind(task, cpu);
2016-10-11 23:55:50 +03:00
	worker->flags = flags;
2016-10-11 23:55:30 +03:00
	worker->task = task;
	wake_up_process(task);
	return worker;
fail_task:
	kfree(worker);
	return ERR_CAST(task);
}
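The calling convention used by __kthread_create_worker() (a variadic front end that forwards its arguments as a va_list, plus failure reported through an error-encoded pointer) can be tried out in user space. The sketch below is an illustration under assumptions, not the kernel implementation: all demo_* names, the 16-byte name buffer, and the -EINVAL policy for over-long names are made up for the example, and the demo_err_ptr()/demo_is_err() helpers only imitate the idea behind ERR_PTR() and IS_ERR().

```c
#include <errno.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

/* User-space imitations of the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(). */
static inline void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int demo_is_err(const void *ptr)
{
	/* Error codes occupy the topmost DEMO_MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Hypothetical worker handle used only by this example. */
struct demo_named_worker {
	int cpu;
	char name[16];
};

/* Helper taking a va_list, in the role of __kthread_create_worker(). */
static struct demo_named_worker *
demo_vcreate_worker(int cpu, const char *namefmt, va_list args)
{
	struct demo_named_worker *worker;
	int len;

	worker = calloc(1, sizeof(*worker));
	if (!worker)
		return demo_err_ptr(-ENOMEM);

	len = vsnprintf(worker->name, sizeof(worker->name), namefmt, args);
	if (len < 0 || (size_t)len >= sizeof(worker->name)) {
		/* Demo-only policy: reject names that do not fit. */
		free(worker);
		return demo_err_ptr(-EINVAL);
	}
	worker->cpu = cpu;
	return worker;
}

/* Variadic front end, in the role of kthread_create_worker_on_cpu(). */
static struct demo_named_worker *
demo_create_worker_on_cpu(int cpu, const char *namefmt, ...)
{
	struct demo_named_worker *worker;
	va_list args;

	va_start(args, namefmt);
	worker = demo_vcreate_worker(cpu, namefmt, args);
	va_end(args);
	return worker;
}
```

A caller would write demo_create_worker_on_cpu(cpu, "helper/%d", cpu), check the result with demo_is_err() before dereferencing it, and read the error code with demo_ptr_err() on failure. Note that the CPU number is passed twice, once for placement and once for the name format.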
/**
 * kthread_create_worker - create a kthread worker
2016-10-11 23:55:50 +03:00
 * @flags: flags modifying the default behavior of the worker
2016-10-11 23:55:30 +03:00
 * @namefmt: printf-style name for the kthread worker (task).
 *
 * Returns a pointer to the allocated worker on success, ERR_PTR(-ENOMEM)
 * when the needed structures could not get allocated, and ERR_PTR(-EINTR)
kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal
The comments in kernel/kthread.c create a feeling that only SIGKILL is
able to terminate the creation of kernel kthreads by
kthread_create()/_on_node()/_on_cpu() APIs.
In reality, wait_for_completion_killable() might be killed by any fatal
signal that does not have a custom handler:
(!siginmask(signr, SIG_KERNEL_IGNORE_MASK|SIG_KERNEL_STOP_MASK) && \
(t)->sighand->action[(signr)-1].sa.sa_handler == SIG_DFL)
static inline void signal_wake_up(struct task_struct *t, bool resume)
{
signal_wake_up_state(t, resume ? TASK_WAKEKILL : 0);
}
static void complete_signal(int sig, struct task_struct *p, enum pid_type type)
{
[...]
/*
* Found a killable thread. If the signal will be fatal,
* then start taking the whole group down immediately.
*/
if (sig_fatal(p, sig) ...) {
if (!sig_kernel_coredump(sig)) {
[...]
do {
task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
sigaddset(&t->pending.signal, SIGKILL);
signal_wake_up(t, 1);
} while_each_thread(p, t);
return;
}
}
}
Update the comments in kernel/kthread.c to make this more obvious.
The motivation for this change was debugging why a module initialization
failed. The module was being loaded from initrd. It "magically" failed
when systemd was switching to the real root. The clean up operations sent
SIGTERM to various pending processed that were started from initrd.
Link: https://lkml.kernel.org/r/20220315102444.2380-1-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-03-15 13:24:44 +03:00
 * when the caller was killed by a fatal signal.
2016-10-11 23:55:30 +03:00
 */
struct kthread_worker *
2016-10-11 23:55:50 +03:00
kthread_create_worker(unsigned int flags, const char namefmt[], ...)
kthread: add kthread_create_worker*()
Kthread workers are currently created using the classic kthread API,
namely kthread_run(). kthread_worker_fn() is passed as the @threadfn
parameter.
This patch defines kthread_create_worker() and
kthread_create_worker_on_cpu() functions that hide implementation details.
They enforce using kthread_worker_fn() for the main thread. But I doubt
that there are any plans to create any alternative. In fact, I think that
we do not want any alternative main thread because it would be hard to
support consistency with the rest of the kthread worker API.
The naming and function of kthread_create_worker() is inspired by the
workqueues API like the rest of the kthread worker API.
The kthread_create_worker_on_cpu() variant is motivated by the original
kthread_create_on_cpu(). Note that we need to bind per-CPU kthread
workers already when they are created. It makes the life easier.
kthread_bind() could not be used later for an already running worker.
This patch does _not_ convert existing kthread workers. The kthread
worker API need more improvements first, e.g. a function to destroy the
worker.
IMPORTANT:
kthread_create_worker_on_cpu() allows to use any format of the worker
name, in compare with kthread_create_on_cpu(). The good thing is that it
is more generic. The bad thing is that most users will need to pass the
cpu number in two parameters, e.g. kthread_create_worker_on_cpu(cpu,
"helper/%d", cpu).
To be honest, the main motivation was to avoid the need for an empty
va_list. The only legal way was to create a helper function that would be
called with an empty list. Other attempts caused compilation warnings or
even errors on different architectures.
There were also other alternatives, for example, using #define or
splitting __kthread_create_worker(). The chosen solution looked like the
least ugly one.
Link: http://lkml.kernel.org/r/1470754545-17632-6-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 23:55:30 +03:00
{
struct kthread_worker * worker ;
va_list args ;
va_start ( args , namefmt ) ;
2016-10-11 23:55:50 +03:00
worker = __kthread_create_worker ( - 1 , flags , namefmt , args ) ;
2016-10-11 23:55:30 +03:00
va_end ( args ) ;
return worker ;
}
EXPORT_SYMBOL ( kthread_create_worker ) ;
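The wrapper above only collects its variadic arguments into a va_list and hands them to a helper that does the real work. A minimal user-space sketch of that forwarding pattern follows. The names format_worker_name() and vformat_worker_name() are hypothetical stand-ins and not part of the kernel API, and vsnprintf() is used here only to make the effect of the name format visible.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for __kthread_create_worker(): it accepts an
 * already started va_list, so several variadic front ends can share it.
 */
static int vformat_worker_name(char *buf, size_t len,
			       const char *namefmt, va_list args)
{
	assert(buf && len > 0);
	return vsnprintf(buf, len, namefmt, args);
}

/*
 * Variadic front end, shaped like kthread_create_worker(): start the
 * va_list, forward it to the helper, then end it before returning.
 */
int format_worker_name(char *buf, size_t len, const char *namefmt, ...)
{
	va_list args;
	int ret;

	va_start(args, namefmt);
	ret = vformat_worker_name(buf, len, namefmt, args);
	va_end(args);

	return ret;
}
```

Keeping the va_list version separate is what lets both kthread_create_worker() and kthread_create_worker_on_cpu() share one implementation without needing an empty va_list.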
/**
* kthread_create_worker_on_cpu - create a kthread worker and bind it
2020-10-16 06:10:28 +03:00
* to a given CPU and the associated NUMA node .
2016-10-11 23:55:30 +03:00
* @ cpu : CPU number
2016-10-11 23:55:50 +03:00
* @ flags : flags modifying the default behavior of the worker
2016-10-11 23:55:30 +03:00
* @ namefmt : printf - style name for the kthread worker ( task ) .
*
* Use a valid CPU number if you want to bind the kthread worker
* to the given CPU and the associated NUMA node .
*
* A good practice is to add the cpu number also into the worker name .
* For example , use kthread_create_worker_on_cpu ( cpu , " helper/%d " , cpu ) .
*
2020-12-15 06:03:18 +03:00
* CPU hotplug :
* The kthread worker API is simple and generic . It just provides a way
* to create , use , and destroy workers .
*
* It is up to the API user how to handle CPU hotplug . They have to decide
* how to handle pending work items , prevent queuing new ones , and
* restore the functionality when the CPU goes off and on . There are a
* few catches :
*
* - CPU affinity gets lost when the worker is scheduled on an offline CPU .
*
* - The worker might not exist if the CPU was offline when the user
* created the workers .
*
* Good practice is to implement two CPU hotplug callbacks and to
* destroy / create the worker when the CPU goes down / up .
*
* Return :
* The pointer to the allocated worker on success , ERR_PTR ( - ENOMEM )
2016-10-11 23:55:30 +03:00
* when the needed structures could not get allocated , and ERR_PTR ( - EINTR )
kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal
The comments in kernel/kthread.c create a feeling that only SIGKILL is
able to terminate the creation of kernel kthreads by
kthread_create()/_on_node()/_on_cpu() APIs.
In reality, wait_for_completion_killable() might be killed by any fatal
signal that does not have a custom handler:
(!siginmask(signr, SIG_KERNEL_IGNORE_MASK|SIG_KERNEL_STOP_MASK) && \
(t)->sighand->action[(signr)-1].sa.sa_handler == SIG_DFL)
static inline void signal_wake_up(struct task_struct *t, bool resume)
{
signal_wake_up_state(t, resume ? TASK_WAKEKILL : 0);
}
static void complete_signal(int sig, struct task_struct *p, enum pid_type type)
{
[...]
/*
* Found a killable thread. If the signal will be fatal,
* then start taking the whole group down immediately.
*/
if (sig_fatal(p, sig) ...) {
if (!sig_kernel_coredump(sig)) {
[...]
do {
task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
sigaddset(&t->pending.signal, SIGKILL);
signal_wake_up(t, 1);
} while_each_thread(p, t);
return;
}
}
}
Update the comments in kernel/kthread.c to make this more obvious.
The motivation for this change was debugging why a module initialization
failed. The module was being loaded from initrd. It "magically" failed
when systemd was switching to the real root. The cleanup operations sent
SIGTERM to various pending processes that were started from initrd.
Link: https://lkml.kernel.org/r/20220315102444.2380-1-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-03-15 13:24:44 +03:00
* when the caller was killed by a fatal signal .
2016-10-11 23:55:30 +03:00
*/
struct kthread_worker *
2016-10-11 23:55:50 +03:00
kthread_create_worker_on_cpu ( int cpu , unsigned int flags ,
const char namefmt [ ] , . . . )
2016-10-11 23:55:30 +03:00
{
struct kthread_worker * worker ;
va_list args ;
va_start ( args , namefmt ) ;
2016-10-11 23:55:50 +03:00
worker = __kthread_create_worker ( cpu , flags , namefmt , args ) ;
2016-10-11 23:55:30 +03:00
va_end ( args ) ;
return worker ;
}
EXPORT_SYMBOL ( kthread_create_worker_on_cpu ) ;
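The Return section above says that failures are reported as ERR_PTR ( - ENOMEM ) or ERR_PTR ( - EINTR ) rather than as NULL. The sketch below is a simplified user-space model of that error-pointer convention, written only to show how a caller can tell a valid worker pointer from an encoded errno. The lower-case helpers err_ptr(), is_err(), ptr_err() and worker_create_status() are assumptions made for illustration; the real kernel macros live in <linux/err.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Largest errno value that can be encoded in a pointer (as in the kernel). */
#define MAX_ERRNO 4095

/* Simplified model of ERR_PTR(): encode a negative errno as a pointer. */
void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Simplified model of IS_ERR(): the top MAX_ERRNO addresses are errors. */
bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Simplified model of PTR_ERR(): recover the negative errno. */
long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/*
 * Hypothetical caller-side check: returns 0 for a usable worker pointer
 * and the negative errno otherwise. A NULL test alone would miss errors.
 */
long worker_create_status(const void *worker)
{
	return is_err(worker) ? ptr_err(worker) : 0;
}
```

In kernel code the same check would use IS_ERR() and PTR_ERR() on the value returned by kthread_create_worker_on_cpu().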
kthread: allow to cancel kthread work
We are going to use kthread workers more widely and sometimes we will need
to make sure that the work is neither pending nor running.
This patch implements cancel_*_sync() operations as inspired by
workqueues. Well, we are synchronized against the other operations via
the worker lock, we use del_timer_sync() and a counter to count parallel
cancel operations. Therefore the implementation might be easier.
First, we check if a worker is assigned. If not, the work has never been
queued after it was initialized.
Second, we take the worker lock. It must be the right one. The work must
not be assigned to another worker unless it is initialized in between.
Third, we try to cancel the timer when it exists. The timer is deleted
synchronously to make sure that the timer call back is not running. We
need to temporarily release the worker->lock to avoid a possible deadlock
with the callback. In the meantime, we set work->canceling counter to
avoid any queuing.
Fourth, we try to remove the work from a worker list. It might be
the list of either normal or delayed works.
Fifth, if the work is running, we call kthread_flush_work(). It might
take an arbitrary time. We need to release the worker->lock again. In the
meantime, we again block any queuing by the canceling counter.
As already mentioned, the check for a pending kthread work is done under a
lock. In comparison with workqueues, we do not need to fight for a single
PENDING bit to block other operations. Therefore we do not suffer from
the thundering herd problem and all parallel canceling jobs might use
kthread_flush_work(). Any queuing is blocked until the counter gets zero.
Link: http://lkml.kernel.org/r/1470754545-17632-10-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 23:55:43 +03:00
/*
* Returns true when the work could not be queued at the moment .
* It happens when it is already pending in a worker list
* or when it is being cancelled .
*/
static inline bool queuing_blocked ( struct kthread_worker * worker ,
struct kthread_work * work )
{
lockdep_assert_held ( & worker - > lock ) ;
return ! list_empty ( & work - > node ) | | work - > canceling ;
}
2016-10-11 23:55:36 +03:00
static void kthread_insert_work_sanity_check ( struct kthread_worker * worker ,
struct kthread_work * work )
{
lockdep_assert_held ( & worker - > lock ) ;
WARN_ON_ONCE ( ! list_empty ( & work - > node ) ) ;
/* Do not use a work with >1 worker, see kthread_queue_work() */
WARN_ON_ONCE ( work - > worker & & work - > worker ! = worker ) ;
}
2012-07-20 00:52:53 +04:00
/* insert @work before @pos in @worker */
2016-10-11 23:55:20 +03:00
static void kthread_insert_work ( struct kthread_worker * worker ,
2016-10-11 23:55:36 +03:00
struct kthread_work * work ,
struct list_head * pos )
2012-07-20 00:52:53 +04:00
{
2016-10-11 23:55:36 +03:00
kthread_insert_work_sanity_check ( worker , work ) ;
2012-07-20 00:52:53 +04:00
2020-12-15 06:03:14 +03:00
trace_sched_kthread_work_queue_work ( worker , work ) ;
2012-07-20 00:52:53 +04:00
list_add_tail ( & work - > node , pos ) ;
2012-07-20 00:52:53 +04:00
work - > worker = worker ;
2014-07-26 08:03:59 +04:00
if ( ! worker - > current_work & & likely ( worker - > task ) )
2012-07-20 00:52:53 +04:00
wake_up_process ( worker - > task ) ;
}
2010-06-29 12:07:09 +04:00
/**
2016-10-11 23:55:20 +03:00
* kthread_queue_work - queue a kthread_work
2010-06-29 12:07:09 +04:00
* @ worker : target kthread_worker
* @ work : kthread_work to queue
*
* Queue @ work to work processor @ worker for async execution . @ worker
* must have been created with kthread_create_worker ( ) . Returns % true
* if @ work was successfully queued , % false if it was already pending .
2016-10-11 23:55:36 +03:00
*
* Reinitialize the work if it needs to be used by another worker .
* For example , when the worker was stopped and started again .
2010-06-29 12:07:09 +04:00
*/
2016-10-11 23:55:20 +03:00
bool kthread_queue_work ( struct kthread_worker * worker ,
2010-06-29 12:07:09 +04:00
struct kthread_work * work )
{
bool ret = false ;
unsigned long flags ;
2019-02-12 19:25:53 +03:00
raw_spin_lock_irqsave ( & worker - > lock , flags ) ;
2016-10-11 23:55:43 +03:00
if ( ! queuing_blocked ( worker , work ) ) {
2016-10-11 23:55:20 +03:00
kthread_insert_work ( worker , work , & worker - > work_list ) ;
2010-06-29 12:07:09 +04:00
ret = true ;
}
2019-02-12 19:25:53 +03:00
raw_spin_unlock_irqrestore ( & worker - > lock , flags ) ;
2010-06-29 12:07:09 +04:00
return ret ;
}
2016-10-11 23:55:20 +03:00
EXPORT_SYMBOL_GPL ( kthread_queue_work ) ;
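kthread_queue_work() succeeds only when queuing_blocked() reports that the work is neither on a list nor being cancelled. The following user-space model sketches that decision. Every type and function name ending in _model is hypothetical, the lock is reduced to a boolean flag, and the work list is reduced to a counter, so this shows the control flow only and not the kernel data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct work_model {
	bool pending;	/* models !list_empty(&work->node) */
	int canceling;	/* models the work->canceling counter */
};

struct worker_model {
	bool locked;	/* models worker->lock being held */
	size_t queued;	/* number of works on the work list */
};

/* Models queuing_blocked(): must be called with the lock held. */
static bool queuing_blocked_model(const struct worker_model *worker,
				  const struct work_model *work)
{
	assert(worker->locked);	/* stands in for lockdep_assert_held() */
	return work->pending || work->canceling;
}

/* Models kthread_queue_work(): true if queued, false if blocked. */
bool queue_work_model(struct worker_model *worker, struct work_model *work)
{
	bool ret = false;

	worker->locked = true;	/* raw_spin_lock_irqsave() in the kernel */
	if (!queuing_blocked_model(worker, work)) {
		work->pending = true;
		worker->queued++;
		ret = true;
	}
	worker->locked = false;	/* raw_spin_unlock_irqrestore() */

	return ret;
}
```

Queuing the same work twice therefore returns true once and false afterwards, and a work with a non-zero canceling counter is never queued.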
2010-06-29 12:07:09 +04:00
2016-10-11 23:55:40 +03:00
/**
* kthread_delayed_work_timer_fn - callback that queues the associated kthread
* delayed work when the timer expires .
2017-10-05 02:27:06 +03:00
* @ t : pointer to the expired timer
2016-10-11 23:55:40 +03:00
*
* The format of the function is defined by struct timer_list .
* It should have been called from irqsafe timer with irq already off .
*/
2017-10-05 02:27:06 +03:00
void kthread_delayed_work_timer_fn ( struct timer_list * t )
2016-10-11 23:55:40 +03:00
{
2017-10-05 02:27:06 +03:00
struct kthread_delayed_work * dwork = from_timer ( dwork , t , timer ) ;
2016-10-11 23:55:40 +03:00
struct kthread_work * work = & dwork - > work ;
struct kthread_worker * worker = work - > worker ;
2019-02-12 19:25:54 +03:00
unsigned long flags ;
2016-10-11 23:55:40 +03:00
/*
* This might happen when a pending work is reinitialized .
* It means that it is used in a wrong way .
*/
if ( WARN_ON_ONCE ( ! worker ) )
return ;
2019-02-12 19:25:54 +03:00
raw_spin_lock_irqsave ( & worker - > lock , flags ) ;
2016-10-11 23:55:40 +03:00
/* Work must not be used with >1 worker, see kthread_queue_work(). */
WARN_ON_ONCE ( work - > worker ! = worker ) ;
/* Move the work from worker->delayed_work_list. */
WARN_ON_ONCE ( list_empty ( & work - > node ) ) ;
list_del_init ( & work - > node ) ;
2020-11-02 04:07:53 +03:00
if ( ! work - > canceling )
kthread_insert_work ( worker , work , & worker - > work_list ) ;
2016-10-11 23:55:40 +03:00
2019-02-12 19:25:54 +03:00
raw_spin_unlock_irqrestore ( & worker - > lock , flags ) ;
2016-10-11 23:55:40 +03:00
}
EXPORT_SYMBOL ( kthread_delayed_work_timer_fn ) ;
2019-10-16 14:24:58 +03:00
static void __kthread_queue_delayed_work ( struct kthread_worker * worker ,
struct kthread_delayed_work * dwork ,
unsigned long delay )
2016-10-11 23:55:40 +03:00
{
struct timer_list * timer = & dwork - > timer ;
struct kthread_work * work = & dwork - > work ;
2022-09-09 00:54:56 +03:00
WARN_ON_ONCE ( timer - > function ! = kthread_delayed_work_timer_fn ) ;
2016-10-11 23:55:40 +03:00
/*
* If @delay is 0, queue @dwork->work immediately. This is for
* both optimization and correctness. The earliest @timer can
* expire is on the closest next tick, and delayed_work users depend
* on there being no such delay when @delay is 0.
*/
if ( ! delay ) {
kthread_insert_work ( worker , work , & worker - > work_list ) ;
return ;
}
/* Be paranoid and try to detect possible races already now. */
kthread_insert_work_sanity_check ( worker , work ) ;
list_add ( & work - > node , & worker - > delayed_work_list ) ;
work - > worker = worker ;
timer - > expires = jiffies + delay ;
add_timer ( timer ) ;
}
/**
* kthread_queue_delayed_work - queue the associated kthread work
* after a delay .
* @ worker : target kthread_worker
* @ dwork : kthread_delayed_work to queue
* @ delay : number of jiffies to wait before queuing
*
* If the work is not pending, start a timer that will queue the work
* after the given @delay. If @delay is zero, queue the work
* immediately.
*
* Return: %false if @dwork was already pending, which means that
* either the timer was running or the work was queued. %true
* otherwise.
*/
bool kthread_queue_delayed_work ( struct kthread_worker * worker ,
struct kthread_delayed_work * dwork ,
unsigned long delay )
{
struct kthread_work * work = & dwork - > work ;
unsigned long flags ;
bool ret = false ;
2019-02-12 19:25:53 +03:00
raw_spin_lock_irqsave ( & worker - > lock , flags ) ;
2016-10-11 23:55:40 +03:00
kthread: allow to cancel kthread work
We are going to use kthread workers more widely and sometimes we will need
to make sure that the work is neither pending nor running.
This patch implements cancel_*_sync() operations as inspired by
workqueues. Well, we are synchronized against the other operations via
the worker lock, we use del_timer_sync() and a counter to count parallel
cancel operations. Therefore the implementation might be easier.
First, we check if a worker is assigned. If not, the work has never been
queued after it was initialized.
Second, we take the worker lock. It must be the right one. The work must
not be assigned to another worker unless it is initialized in between.
Third, we try to cancel the timer when it exists. The timer is deleted
synchronously to make sure that the timer callback is not running. We
need to temporarily release the worker->lock to avoid a possible deadlock
with the callback. In the meantime, we set the work->canceling counter to
avoid any queuing.
Fourth, we try to remove the work from a worker list. It might be
the list of either normal or delayed works.
Fifth, if the work is running, we call kthread_flush_work(). It might
take an arbitrary time. We need to release the worker->lock again. In the
meantime, we again block any queuing by the canceling counter.
As already mentioned, the check for a pending kthread work is done under a
lock. In comparison with workqueues, we do not need to fight for a single
PENDING bit to block other operations. Therefore we do not suffer from
the thundering storm problem and all parallel canceling jobs might use
kthread_flush_work(). Any queuing is blocked until the counter gets zero.
Link: http://lkml.kernel.org/r/1470754545-17632-10-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <bp@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 23:55:43 +03:00
if ( ! queuing_blocked ( worker , work ) ) {
2016-10-11 23:55:40 +03:00
__kthread_queue_delayed_work ( worker , dwork , delay ) ;
ret = true ;
}
2019-02-12 19:25:53 +03:00
raw_spin_unlock_irqrestore ( & worker - > lock , flags ) ;
2016-10-11 23:55:40 +03:00
return ret ;
}
EXPORT_SYMBOL_GPL ( kthread_queue_delayed_work ) ;
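/*
 * Usage sketch (illustrative only, not taken from an in-tree user): a
 * self-rearming poll built on kthread_queue_delayed_work(). The names
 * my_worker, my_poll, my_poll_fn and the 100 ms period are hypothetical,
 * and the kthread_create_worker() signature is assumed to be the
 * (flags, namefmt, ...) form.
 *
 *	static struct kthread_worker *my_worker;
 *	static struct kthread_delayed_work my_poll;
 *
 *	static void my_poll_fn(struct kthread_work *work)
 *	{
 *		// Do the periodic job, then arm the timer again.
 *		kthread_queue_delayed_work(my_worker, &my_poll,
 *					   msecs_to_jiffies(100));
 *	}
 *
 *	// Setup, in process context:
 *	my_worker = kthread_create_worker(0, "my_worker");
 *	if (IS_ERR(my_worker))
 *		return PTR_ERR(my_worker);
 *	kthread_init_delayed_work(&my_poll, my_poll_fn);
 *	// A zero delay queues the work immediately, without a timer.
 *	if (!kthread_queue_delayed_work(my_worker, &my_poll, 0))
 *		pr_debug("my_poll was already pending\n");
 *
 *	// Teardown: the cancel blocks re-queuing while it waits.
 *	kthread_cancel_delayed_work_sync(&my_poll);
 *	kthread_destroy_worker(my_worker);
 */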
2012-07-20 00:52:53 +04:00
struct kthread_flush_work {
struct kthread_work work ;
struct completion done ;
} ;
static void kthread_flush_work_fn ( struct kthread_work * work )
{
struct kthread_flush_work * fwork =
container_of ( work , struct kthread_flush_work , work ) ;
complete ( & fwork - > done ) ;
}
2010-06-29 12:07:09 +04:00
/**
2016-10-11 23:55:20 +03:00
* kthread_flush_work - flush a kthread_work
2010-06-29 12:07:09 +04:00
* @ work : work to flush
*
* If @ work is queued or executing , wait for it to finish execution .
*/
2016-10-11 23:55:20 +03:00
void kthread_flush_work ( struct kthread_work * work )
2010-06-29 12:07:09 +04:00
{
2012-07-20 00:52:53 +04:00
struct kthread_flush_work fwork = {
KTHREAD_WORK_INIT ( fwork . work , kthread_flush_work_fn ) ,
COMPLETION_INITIALIZER_ONSTACK ( fwork . done ) ,
} ;
struct kthread_worker * worker ;
bool noop = false ;
worker = work - > worker ;
if ( ! worker )
return ;
2010-06-29 12:07:09 +04:00
2019-02-12 19:25:53 +03:00
raw_spin_lock_irq ( & worker - > lock ) ;
2016-10-11 23:55:36 +03:00
/* Work must not be used with >1 worker, see kthread_queue_work(). */
WARN_ON_ONCE ( work - > worker ! = worker ) ;
2010-06-29 12:07:09 +04:00
2012-07-20 00:52:53 +04:00
if ( ! list_empty ( & work - > node ) )
2016-10-11 23:55:20 +03:00
kthread_insert_work ( worker , & fwork . work , work - > node . next ) ;
2012-07-20 00:52:53 +04:00
else if ( worker - > current_work = = work )
2016-10-11 23:55:20 +03:00
kthread_insert_work ( worker , & fwork . work ,
worker - > work_list . next ) ;
2012-07-20 00:52:53 +04:00
else
noop = true ;
2010-06-29 12:07:09 +04:00
2019-02-12 19:25:53 +03:00
raw_spin_unlock_irq ( & worker - > lock ) ;
2010-06-29 12:07:09 +04:00
2012-07-20 00:52:53 +04:00
if ( ! noop )
wait_for_completion ( & fwork . done ) ;
2010-06-29 12:07:09 +04:00
}
2016-10-11 23:55:20 +03:00
EXPORT_SYMBOL_GPL ( kthread_flush_work ) ;
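/*
 * Usage sketch (illustrative only): queue a work item and wait until the
 * worker has finished running it. my_fn, my_work and my_worker are
 * hypothetical names; my_worker is assumed to come from
 * kthread_create_worker().
 *
 *	static void my_fn(struct kthread_work *work)
 *	{
 *		// The actual job goes here.
 *	}
 *
 *	static struct kthread_work my_work;
 *
 *	kthread_init_work(&my_work, my_fn);
 *	kthread_queue_work(my_worker, &my_work);
 *	// Sleeps on a completion, so call it from process context only.
 *	kthread_flush_work(&my_work);
 *
 * As the code above suggests, the flush works by queuing a helper work
 * right behind @work and waiting for that helper to run. Calling
 * kthread_flush_work() from my_fn() itself would therefore presumably
 * deadlock, because the worker thread would wait on itself.
 */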
2010-06-29 12:07:09 +04:00
2021-06-25 04:39:45 +03:00
/*
* Make sure that the timer is neither set nor running and can no
* longer manipulate the work list_head.
*
* The function is called under worker->lock. The lock is temporarily
* released but the timer can't be set again in the meantime.
*/
static void kthread_cancel_delayed_work_timer ( struct kthread_work * work ,
unsigned long * flags )
{
struct kthread_delayed_work * dwork =
container_of ( work , struct kthread_delayed_work , work ) ;
struct kthread_worker * worker = work - > worker ;
/*
* del_timer_sync() must be called to make sure that the timer
* callback is not running. The lock must be temporarily released
* to avoid a deadlock with the callback. In the meantime,
* any queuing is blocked by setting the canceling counter.
*/
work - > canceling + + ;
raw_spin_unlock_irqrestore ( & worker - > lock , * flags ) ;
del_timer_sync ( & dwork - > timer ) ;
raw_spin_lock_irqsave ( & worker - > lock , * flags ) ;
work - > canceling - - ;
}
2016-10-11 23:55:43 +03:00
/*
2021-06-25 04:39:48 +03:00
* This function removes the work from the worker queue .
*
* It is called under worker - > lock . The caller must make sure that
* the timer used by delayed work is not running , e . g . by calling
* kthread_cancel_delayed_work_timer ( ) .
2016-10-11 23:55:43 +03:00
*
* The work might still be in use when this function finishes. See the
* current_work being processed by the worker.
*
* Return : % true if @ work was pending and successfully canceled ,
* % false if @ work was not pending
*/
2021-06-25 04:39:48 +03:00
static bool __kthread_cancel_work ( struct kthread_work * work )
2016-10-11 23:55:43 +03:00
{
/*
* Try to remove the work from a worker list . It might either
* be from worker - > work_list or from worker - > delayed_work_list .
*/
if ( ! list_empty ( & work - > node ) ) {
list_del_init ( & work - > node ) ;
return true ;
}
return false ;
}
2016-10-11 23:55:46 +03:00
/**
* kthread_mod_delayed_work - modify delay of or queue a kthread delayed work
* @ worker : kthread worker to use
* @ dwork : kthread delayed work to queue
* @ delay : number of jiffies to wait before queuing
*
* If @dwork is idle, equivalent to kthread_queue_delayed_work(). Otherwise,
* modify @dwork's timer so that it expires after @delay. If @delay is zero,
* @dwork is guaranteed to be queued immediately.
*
2021-06-29 05:33:35 +03:00
* Return : % false if @ dwork was idle and queued , % true otherwise .
2016-10-11 23:55:46 +03:00
*
* A special case is when the work is being canceled in parallel .
* It might be caused either by the real kthread_cancel_delayed_work_sync ( )
* or yet another kthread_mod_delayed_work ( ) call . We let the other command
2021-06-29 05:33:35 +03:00
* win and return %true here. The return value can be used for reference
* counting and the number of queued works stays the same. Anyway, the caller
* is supposed to synchronize these operations in a reasonable way.
2016-10-11 23:55:46 +03:00
*
* This function is safe to call from any context including IRQ handler .
* See __kthread_cancel_work ( ) and kthread_delayed_work_timer_fn ( )
* for details .
*/
bool kthread_mod_delayed_work ( struct kthread_worker * worker ,
struct kthread_delayed_work * dwork ,
unsigned long delay )
{
struct kthread_work * work = & dwork - > work ;
unsigned long flags ;
2021-06-29 05:33:35 +03:00
bool ret;
2016-10-11 23:55:46 +03:00
2019-02-12 19:25:53 +03:00
raw_spin_lock_irqsave ( & worker - > lock , flags ) ;
2016-10-11 23:55:46 +03:00
/* Do not bother with canceling when never queued. */
2021-06-29 05:33:35 +03:00
if ( ! work - > worker ) {
ret = false ;
2016-10-11 23:55:46 +03:00
goto fast_queue ;
2021-06-29 05:33:35 +03:00
}
2016-10-11 23:55:46 +03:00
/* Work must not be used with >1 worker, see kthread_queue_work() */
WARN_ON_ONCE ( work - > worker ! = worker ) ;
2021-06-25 04:39:48 +03:00
/*
* Temporarily cancel the work but do not fight with another command
* that is canceling the work as well.
*
* It is a bit tricky because of possible races with other
* mod_delayed_work() and cancel_delayed_work() callers.
*
* The timer must be canceled first because worker->lock is released
* when doing so. But the work can be removed from the queue (list)
* only when it can be queued again so that the return value can
* be used for reference counting.
*/
kthread_cancel_delayed_work_timer ( work , & flags ) ;
2021-06-29 05:33:35 +03:00
if ( work - > canceling ) {
/* The number of works in the queue does not change. */
ret = true ;
2016-10-11 23:55:46 +03:00
goto out ;
2021-06-29 05:33:35 +03:00
}
2021-06-25 04:39:48 +03:00
ret = __kthread_cancel_work ( work ) ;
2016-10-11 23:55:46 +03:00
fast_queue :
__kthread_queue_delayed_work ( worker , dwork , delay ) ;
out :
2019-02-12 19:25:53 +03:00
raw_spin_unlock_irqrestore ( & worker - > lock , flags ) ;
2016-10-11 23:55:46 +03:00
return ret ;
}
EXPORT_SYMBOL_GPL ( kthread_mod_delayed_work ) ;
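/*
 * Usage sketch (illustrative only): using the return value of
 * kthread_mod_delayed_work() for reference counting, as the comment
 * above suggests. struct my_obj and the my_obj_get() / my_obj_put()
 * helpers are hypothetical; the work callback is assumed to drop one
 * reference when it finishes.
 *
 *	static void my_obj_kick(struct kthread_worker *worker,
 *				struct my_obj *obj, unsigned long delay)
 *	{
 *		// Take the reference first so that the callback cannot
 *		// drop the last one before we are done here.
 *		my_obj_get(obj);
 *		// %true: the work was already pending (or is being
 *		// canceled), so no new queued instance was added and
 *		// the extra reference is not needed.
 *		if (kthread_mod_delayed_work(worker, &obj->dwork, delay))
 *			my_obj_put(obj);
 *	}
 *
 * Taking the reference before the call, and dropping it on %true, is one
 * possible ordering; callers are still expected to serialize mod and
 * cancel operations on the same work themselves.
 */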
2016-10-11 23:55:43 +03:00
static bool __kthread_cancel_work_sync ( struct kthread_work * work , bool is_dwork )
{
struct kthread_worker * worker = work - > worker ;
unsigned long flags ;
bool ret = false;
if ( ! worker )
goto out ;
2019-02-12 19:25:53 +03:00
raw_spin_lock_irqsave ( & worker - > lock , flags ) ;
2016-10-11 23:55:43 +03:00
/* Work must not be used with >1 worker, see kthread_queue_work(). */
WARN_ON_ONCE ( work - > worker ! = worker ) ;
2021-06-25 04:39:48 +03:00
if ( is_dwork )
kthread_cancel_delayed_work_timer ( work , & flags ) ;
ret = __kthread_cancel_work ( work ) ;
2016-10-11 23:55:43 +03:00
if ( worker - > current_work ! = work )
goto out_fast ;
/*
* The work is in progress and we need to wait with the lock released .
* In the meantime , block any queuing by setting the canceling counter .
*/
work - > canceling + + ;
2019-02-12 19:25:53 +03:00
raw_spin_unlock_irqrestore ( & worker - > lock , flags ) ;
2016-10-11 23:55:43 +03:00
kthread_flush_work ( work ) ;
2019-02-12 19:25:53 +03:00
raw_spin_lock_irqsave ( & worker - > lock , flags ) ;
2016-10-11 23:55:43 +03:00
	work->canceling--;
out_fast:
2019-02-12 19:25:53 +03:00
	raw_spin_unlock_irqrestore(&worker->lock, flags);
2016-10-11 23:55:43 +03:00
out:
	return ret;
}
/**
 * kthread_cancel_work_sync - cancel a kthread work and wait for it to finish
 * @work: the kthread work to cancel
 *
 * Cancel @work and wait for its execution to finish. This function
 * can be used even if the work re-queues itself. On return from this
 * function, @work is guaranteed to be not pending or executing on any CPU.
 *
 * kthread_cancel_work_sync(&delayed_work->work) must not be used for
 * delayed_work's. Use kthread_cancel_delayed_work_sync() instead.
 *
 * The caller must ensure that the worker on which @work was last
 * queued can't be destroyed before this function returns.
 *
 * Return: %true if @work was pending, %false otherwise.
 */
bool kthread_cancel_work_sync(struct kthread_work *work)
{
	return __kthread_cancel_work_sync(work, false);
}
EXPORT_SYMBOL_GPL(kthread_cancel_work_sync);
/**
 * kthread_cancel_delayed_work_sync - cancel a kthread delayed work and
 *	wait for it to finish.
 * @dwork: the kthread delayed work to cancel
 *
 * This is kthread_cancel_work_sync() for delayed works.
 *
 * Return: %true if @dwork was pending, %false otherwise.
 */
bool kthread_cancel_delayed_work_sync(struct kthread_delayed_work *dwork)
{
	return __kthread_cancel_work_sync(&dwork->work, true);
}
EXPORT_SYMBOL_GPL(kthread_cancel_delayed_work_sync);
2010-06-29 12:07:09 +04:00
/**
2016-10-11 23:55:20 +03:00
 * kthread_flush_worker - flush all current works on a kthread_worker
2010-06-29 12:07:09 +04:00
 * @worker: worker to flush
 *
 * Wait until all currently executing or pending works on @worker are
 * finished.
 */
2016-10-11 23:55:20 +03:00
void kthread_flush_worker(struct kthread_worker *worker)
2010-06-29 12:07:09 +04:00
{
	struct kthread_flush_work fwork = {
		KTHREAD_WORK_INIT(fwork.work, kthread_flush_work_fn),
		COMPLETION_INITIALIZER_ONSTACK(fwork.done),
	};
2016-10-11 23:55:20 +03:00
	kthread_queue_work(worker, &fwork.work);
2010-06-29 12:07:09 +04:00
	wait_for_completion(&fwork.done);
}
2016-10-11 23:55:20 +03:00
EXPORT_SYMBOL_GPL(kthread_flush_worker);
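The flush above works by queuing a sentinel work on the stack and waiting
for it to run: since the worker processes its list in order, everything
queued earlier has finished by then. The following userspace sketch models
that idea with pthreads. It is an illustration under stated assumptions,
not the kernel implementation: all model_* names are hypothetical, a mutex
plus condition variable stands in for the worker lock and the completion,
and the worker is a plain FIFO loop.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct model_work {
	void (*func)(struct model_work *work);
	struct model_work *next;
};

struct model_worker {
	pthread_mutex_t lock;
	pthread_cond_t cond;	/* list changed, stop requested, or flush done */
	struct model_work *head, *tail;
	bool stop;
	pthread_t thread;
};

struct model_flush_work {
	struct model_work work;	/* first member, so a cast recovers the parent */
	struct model_worker *worker;
	bool done;
};

static int model_counter;	/* bumped by the sample work function below */

static void model_count_fn(struct model_work *work)
{
	(void)work;
	model_counter++;
}

/* Worker loop: run queued works in FIFO order, calling them unlocked. */
static void *model_worker_fn(void *arg)
{
	struct model_worker *w = arg;

	pthread_mutex_lock(&w->lock);
	for (;;) {
		struct model_work *work = w->head;

		if (!work) {
			if (w->stop)
				break;
			pthread_cond_wait(&w->cond, &w->lock);
			continue;
		}
		w->head = work->next;
		if (!w->head)
			w->tail = NULL;
		pthread_mutex_unlock(&w->lock);
		work->func(work);	/* work is not touched afterwards */
		pthread_mutex_lock(&w->lock);
	}
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

static void model_queue_work(struct model_worker *w, struct model_work *work)
{
	pthread_mutex_lock(&w->lock);
	work->next = NULL;
	if (w->tail)
		w->tail->next = work;
	else
		w->head = work;
	w->tail = work;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
}

/* Sentinel callback: mark the flush as complete and wake the flusher. */
static void model_flush_work_fn(struct model_work *work)
{
	struct model_flush_work *fwork = (struct model_flush_work *)work;
	struct model_worker *w = fwork->worker;

	pthread_mutex_lock(&w->lock);
	fwork->done = true;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
}

/* Queue an on-stack sentinel and wait until the worker has run it. */
static void model_flush_worker(struct model_worker *w)
{
	struct model_flush_work fwork = { { model_flush_work_fn, NULL }, w, false };

	model_queue_work(w, &fwork.work);
	pthread_mutex_lock(&w->lock);
	while (!fwork.done)
		pthread_cond_wait(&w->cond, &w->lock);
	pthread_mutex_unlock(&w->lock);
}

static int model_worker_start(struct model_worker *w)
{
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->cond, NULL);
	w->head = w->tail = NULL;
	w->stop = false;
	return pthread_create(&w->thread, NULL, model_worker_fn, w);
}

static void model_worker_stop(struct model_worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->stop = true;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->lock);
	pthread_join(w->thread, NULL);
}
```

The sentinel can live on the flusher's stack because the flusher does not
return until the sentinel callback has set done, and the worker loop never
touches a work item after calling its function.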
2016-10-11 23:55:33 +03:00
/**
 * kthread_destroy_worker - destroy a kthread worker
 * @worker: worker to be destroyed
 *
 * Flush and destroy @worker. The simple flush is enough because the kthread
 * worker API is used only in trivial scenarios. There are no multi-step state
 * machines needed.
 */
void kthread_destroy_worker(struct kthread_worker *worker)
{
	struct task_struct *task;
	task = worker->task;
	if (WARN_ON(!task))
		return;
	kthread_flush_worker(worker);
	kthread_stop(task);
	WARN_ON(!list_empty(&worker->work_list));
	kfree(worker);
}
EXPORT_SYMBOL(kthread_destroy_worker);
2017-09-15 00:02:04 +03:00
2020-06-11 04:42:06 +03:00
/**
 * kthread_use_mm - make the calling kthread operate on an address space
 * @mm: address space to operate on
2020-06-11 04:41:59 +03:00
 */
2020-06-11 04:42:06 +03:00
void kthread_use_mm(struct mm_struct *mm)
2020-06-11 04:41:59 +03:00
{
	struct mm_struct *active_mm;
	struct task_struct *tsk = current;
2020-06-11 04:42:06 +03:00
	WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
	WARN_ON_ONCE(tsk->mm);
2020-06-11 04:41:59 +03:00
	task_lock(tsk);
2020-08-07 09:17:16 +03:00
	/* Hold off tlb flush IPIs while switching mm's */
	local_irq_disable();
2020-06-11 04:41:59 +03:00
	active_mm = tsk->active_mm;
	if (active_mm != mm) {
		mmgrab(mm);
		tsk->active_mm = mm;
	}
	tsk->mm = mm;
2020-10-20 16:47:14 +03:00
	membarrier_update_current_mm(mm);
2020-08-07 09:17:16 +03:00
	switch_mm_irqs_off(active_mm, mm, tsk);
	local_irq_enable();
2020-06-11 04:41:59 +03:00
	task_unlock(tsk);
#ifdef finish_arch_post_lock_switch
	finish_arch_post_lock_switch();
#endif
2020-10-20 16:47:14 +03:00
	/*
	 * When a kthread starts operating on an address space, the loop
	 * in membarrier_{private,global}_expedited() may not observe
	 * that tsk->mm, and not issue an IPI. Membarrier requires a
	 * memory barrier after storing to tsk->mm, before accessing
	 * user-space memory. A full memory barrier for membarrier
	 * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by
	 * mmdrop(), or explicitly with smp_mb().
	 */
2020-06-11 04:41:59 +03:00
	if (active_mm != mm)
		mmdrop(active_mm);
2020-10-20 16:47:14 +03:00
	else
		smp_mb();
2020-06-11 04:41:59 +03:00
}
2020-06-11 04:42:06 +03:00
EXPORT_SYMBOL_GPL(kthread_use_mm);
2020-06-11 04:41:59 +03:00
2020-06-11 04:42:06 +03:00
/**
 * kthread_unuse_mm - reverse the effect of kthread_use_mm()
 * @mm: address space to operate on
2020-06-11 04:41:59 +03:00
 */
2020-06-11 04:42:06 +03:00
void kthread_unuse_mm(struct mm_struct *mm)
2020-06-11 04:41:59 +03:00
{
	struct task_struct *tsk = current;
2020-06-11 04:42:06 +03:00
	WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
	WARN_ON_ONCE(!tsk->mm);
2020-06-11 04:41:59 +03:00
	task_lock(tsk);
2020-10-20 16:47:14 +03:00
	/*
	 * When a kthread stops operating on an address space, the loop
	 * in membarrier_{private,global}_expedited() may not observe
	 * that tsk->mm, and not issue an IPI. Membarrier requires a
	 * memory barrier after accessing user-space memory, before
	 * clearing tsk->mm.
	 */
	smp_mb__after_spinlock();
2020-06-11 04:41:59 +03:00
	sync_mm_rss(mm);
2020-08-07 09:17:16 +03:00
	local_irq_disable();
2020-06-11 04:41:59 +03:00
	tsk->mm = NULL;
2020-10-20 16:47:14 +03:00
	membarrier_update_current_mm(NULL);
2020-06-11 04:41:59 +03:00
	/* active_mm is still 'mm' */
	enter_lazy_tlb(mm, tsk);
2020-08-07 09:17:16 +03:00
	local_irq_enable();
2020-06-11 04:41:59 +03:00
	task_unlock(tsk);
}
2020-06-11 04:42:06 +03:00
EXPORT_SYMBOL_GPL(kthread_unuse_mm);
2020-06-11 04:41:59 +03:00
2017-09-26 21:02:12 +03:00
#ifdef CONFIG_BLK_CGROUP
2017-09-15 00:02:04 +03:00
/**
 * kthread_associate_blkcg - associate blkcg to current kthread
 * @css: the cgroup info
 *
 * Current thread must be a kthread. The thread is running jobs on behalf of
 * other threads. In some cases, we expect the jobs to attach the cgroup info
 * of the original threads instead of that of the current thread. This function
 * stores the original thread's cgroup info in the current kthread context for
 * later retrieval.
 */
void kthread_associate_blkcg(struct cgroup_subsys_state *css)
{
	struct kthread *kthread;
	if (!(current->flags & PF_KTHREAD))
		return;
	kthread = to_kthread(current);
	if (!kthread)
		return;
	if (kthread->blkcg_css) {
		css_put(kthread->blkcg_css);
		kthread->blkcg_css = NULL;
	}
	if (css) {
		css_get(css);
		kthread->blkcg_css = css;
	}
}
EXPORT_SYMBOL(kthread_associate_blkcg);
/**
 * kthread_blkcg - get associated blkcg css of current kthread
 *
 * Current thread must be a kthread.
 */
struct cgroup_subsys_state *kthread_blkcg(void)
{
	struct kthread *kthread;
	if (current->flags & PF_KTHREAD) {
		kthread = to_kthread(current);
		if (kthread)
			return kthread->blkcg_css;
	}
	return NULL;
}
#endif