Go to file
Chengming Zhou f841b682ba perf/core: Fix cgroup events tracking
We encounter perf warnings when using cgroup events like:

  cd /sys/fs/cgroup
  mkdir test
  perf stat -e cycles -a -G test

Which then triggers:

  WARNING: CPU: 0 PID: 690 at kernel/events/core.c:849 perf_cgroup_switch+0xb2/0xc0
  Call Trace:
   <TASK>
   __schedule+0x4ae/0x9f0
   ? _raw_spin_unlock_irqrestore+0x23/0x40
   ? __cond_resched+0x18/0x20
   preempt_schedule_common+0x2d/0x70
   __cond_resched+0x18/0x20
   wait_for_completion+0x2f/0x160
   ? cpu_stop_queue_work+0x9e/0x130
   affine_move_task+0x18a/0x4f0

  WARNING: CPU: 0 PID: 690 at kernel/events/core.c:829 ctx_sched_in+0x1cf/0x1e0
  Call Trace:
   <TASK>
   ? ctx_sched_out+0xb7/0x1b0
   perf_cgroup_switch+0x88/0xc0
   __schedule+0x4ae/0x9f0
   ? _raw_spin_unlock_irqrestore+0x23/0x40
   ? __cond_resched+0x18/0x20
   preempt_schedule_common+0x2d/0x70
   __cond_resched+0x18/0x20
   wait_for_completion+0x2f/0x160
   ? cpu_stop_queue_work+0x9e/0x130
   affine_move_task+0x18a/0x4f0

The above two warnings are not complete here since I remove other
unimportant information. The problem is caused by the perf cgroup
events tracking:

  CPU0					CPU1
  perf_event_open()
    perf_event_alloc()
      account_event()
	account_event_cpu()
	  atomic_inc(perf_cgroup_events)
					  __perf_event_task_sched_out()
					    if (atomic_read(perf_cgroup_events))
					      perf_cgroup_switch()
						// kernel/events/core.c:849
						WARN_ON_ONCE(cpuctx->ctx.nr_cgroups == 0)
						if (READ_ONCE(cpuctx->cgrp) == cgrp) // false
						  return
						perf_ctx_lock()
						ctx_sched_out()
						cpuctx->cgrp = cgrp
						ctx_sched_in()
						  perf_cgroup_set_timestamp()
						    // kernel/events/core.c:829
						    WARN_ON_ONCE(!ctx->nr_cgroups)
						perf_ctx_unlock()
    perf_install_in_context()
      cpu_function_call()
					  __perf_install_in_context()
					    add_event_to_ctx()
					      list_add_event()
						perf_cgroup_event_enable()
						  ctx->nr_cgroups++
						  cpuctx->cgrp = X

We can see from above that we wrongly use percpu atomic perf_cgroup_events
to check if we need to perf_cgroup_switch(), which should only be used
when we know this CPU has cgroup events enabled.

The commit bd27568117 ("perf: Rewrite core context handling") change
to have only one context per-CPU, so we can just use cpuctx->cgrp to
check if this CPU has cgroup events enabled.

So percpu atomic perf_cgroup_events is not needed.

Fixes: bd27568117 ("perf: Rewrite core context handling")
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lkml.kernel.org/r/20221207124023.66252-1-zhouchengming@bytedance.com
2022-12-27 12:44:00 +01:00
arch treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
block treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
certs certs: make system keyring depend on built-in x509 parser 2022-09-24 04:31:18 +09:00
crypto This update includes the following changes: 2022-12-14 12:31:09 -08:00
Documentation kernel hardening fixes for v6.2-rc1 2022-12-23 12:00:24 -08:00
drivers treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
fs treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
include 9p-for-6.2-rc1 2022-12-23 11:39:18 -08:00
init Kbuild updates for v6.2 2022-12-19 12:33:32 -06:00
io_uring io_uring/net: fix cleanup after recycle 2022-12-19 08:28:28 -07:00
ipc Non-MM patches for 6.2-rc1. 2022-12-12 17:28:58 -08:00
kernel perf/core: Fix cgroup events tracking 2022-12-27 12:44:00 +01:00
lib test_maple_tree: add test for mas_spanning_rebalance() on insufficient data 2022-12-21 14:31:52 -08:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm hugetlb: really allocate vma lock for all sharable vmas 2022-12-21 14:31:52 -08:00
net treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
rust rust: types: add Opaque type 2022-12-04 01:59:16 +01:00
samples Char/Misc driver changes for 6.2-rc1 2022-12-16 03:49:24 -08:00
scripts modernize use of grep in coccicheck 2022-12-23 13:56:41 -08:00
security kernel hardening fixes for v6.2-rc1 2022-12-23 12:00:24 -08:00
sound treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
tools perf tools fixes and improvements for v6.2: 2nd batch 2022-12-22 11:07:29 -08:00
usr usr/gen_init_cpio.c: remove unnecessary -1 values from int file 2022-10-03 14:21:44 -07:00
virt ARM64: 2022-12-15 11:12:21 -08:00
.clang-format iommufd for 6.2 2022-12-14 09:15:43 -08:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes
.gitignore kbuild: Cleanup DT Overlay intermediate files as appropriate 2022-11-18 14:45:30 -06:00
.mailmap Non-MM patches for 6.2-rc1. 2022-12-12 17:28:58 -08:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: zram: zsmalloc: Add an additional co-maintainer 2022-12-15 16:37:49 -08:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS io_uring-6.2-2022-12-19 2022-12-21 16:28:25 -08:00
Makefile Linux 6.2-rc1 2022-12-25 13:41:39 -08:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.