1270 Commits

Author SHA1 Message Date
Zhang Rui
1ab28f17ee perf/x86/rapl: Add support for Intel AlderLake-N
AlderLake-N RAPL support is the same as previous Sky Lake.
Add AlderLake-N model for RAPL.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Wang Wendy <wendy.wang@intel.com>
Link: https://lkml.kernel.org/r/20221023125120.2727-1-rui.zhang@intel.com
2022-10-27 10:27:31 +02:00
Maxim Levitsky
b329f5ddc9 perf/x86/intel/lbr: Use setup_clear_cpu_cap() instead of clear_cpu_cap()
clear_cpu_cap(&boot_cpu_data) is very similar to setup_clear_cpu_cap()
except that the latter also sets a bit in 'cpu_caps_cleared' which
later clears the same cap in secondary cpus, which is likely what is
meant here.

Fixes: 47125db27e47 ("perf/x86/intel/lbr: Support Architectural LBR")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lkml.kernel.org/r/20220718141123.136106-2-mlevitsk@redhat.com
2022-10-20 17:10:28 +02:00
Linus Torvalds
3871d93b82 Perf events updates for v6.1:
- PMU driver updates:
 
      - Add AMD Last Branch Record Extension Version 2 (LbrExtV2)
        feature support for Zen 4 processors.
 
      - Extend the perf ABI to provide branch speculation information,
        if available, and use this on CPUs that have it (eg. LbrExtV2).
 
      - Improve Intel PEBS TSC timestamp handling & integration.
 
      - Add Intel Raptor Lake S CPU support.
 
      - Add 'perf mem' and 'perf c2c' memory profiling support on
        AMD CPUs by utilizing IBS tagged load/store samples.
 
      - Clean up & optimize various x86 PMU details.
 
  - HW breakpoints:
 
      - Big rework to optimize the code for systems with hundreds of CPUs and
        thousands of breakpoints:
 
         - Replace the nr_bp_mutex global mutex with the bp_cpuinfo_sem
 	  per-CPU rwsem that is read-locked during most of the key operations.
 
 	- Improve the O(#cpus * #tasks) logic in toggle_bp_slot()
 	  and fetch_bp_busy_slots().
 
 	- Apply micro-optimizations & cleanups.
 
   - Misc cleanups & enhancements.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmM/2pMRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iIMA/+J+MCEVTt9kwZeBtHoPX7iZ5gnq1+McoQ
 f6ALX19AO/ZSuA7EBA3cS3Ny5eyGy3ofYUnRW+POezu9CpflLW/5N27R2qkZFrWC
 A09B86WH676ZrmXt+oI05rpZ2y/NGw4gJxLLa4/bWF3g9xLfo21i+YGKwdOnNFpl
 DEdCVHtjlMcOAU3+on6fOYuhXDcYd7PKGcCfLE7muOMOAtwyj0bUDBt7m+hneZgy
 qbZHzDU2DZ5L/LXiMyuZj5rC7V4xUbfZZfXglG38YCW1WTieS3IjefaU2tREhu7I
 rRkCK48ULDNNJR3dZK8IzXJRxusq1ICPG68I+nm/K37oZyTZWtfYZWehW/d/TnPa
 tUiTwimabz7UUqaGq9ZptxwINcAigax0nl6dZ3EseeGhkDE6j71/3kqrkKPz4jth
 +fCwHLOrI3c4Gq5qWgPvqcUlUneKf3DlOMtzPKYg7sMhla2zQmFpYCPzKfm77U/Z
 BclGOH3FiwaK6MIjPJRUXTePXqnUseqCR8PCH/UPQUeBEVHFcMvqCaa15nALed8x
 dFi76VywR9mahouuLNq6sUNePlvDd2B124PygNwegLlBfY9QmKONg9qRKOnQpuJ6
 UprRJjLOOucZ/N/jn6+ShHkqmXsnY2MhfUoHUoMQ0QAI+n++e+2AuePo251kKWr8
 QlqKxd9PMQU=
 =LcGg
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf events updates from Ingo Molnar:
 "PMU driver updates:

   - Add AMD Last Branch Record Extension Version 2 (LbrExtV2) feature
     support for Zen 4 processors.

   - Extend the perf ABI to provide branch speculation information, if
     available, and use this on CPUs that have it (eg. LbrExtV2).

   - Improve Intel PEBS TSC timestamp handling & integration.

   - Add Intel Raptor Lake S CPU support.

   - Add 'perf mem' and 'perf c2c' memory profiling support on AMD CPUs
     by utilizing IBS tagged load/store samples.

   - Clean up & optimize various x86 PMU details.

  HW breakpoints:

   - Big rework to optimize the code for systems with hundreds of CPUs
     and thousands of breakpoints:

      - Replace the nr_bp_mutex global mutex with the bp_cpuinfo_sem
        per-CPU rwsem that is read-locked during most of the key
        operations.

      - Improve the O(#cpus * #tasks) logic in toggle_bp_slot() and
        fetch_bp_busy_slots().

      - Apply micro-optimizations & cleanups.

  - Misc cleanups & enhancements"

* tag 'perf-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits)
  perf/hw_breakpoint: Annotate tsk->perf_event_mutex vs ctx->mutex
  perf: Fix pmu_filter_match()
  perf: Fix lockdep_assert_event_ctx()
  perf/x86/amd/lbr: Adjust LBR regardless of filtering
  perf/x86/utils: Fix uninitialized var in get_branch_type()
  perf/uapi: Define PERF_MEM_SNOOPX_PEER in kernel header file
  perf/x86/amd: Support PERF_SAMPLE_PHY_ADDR
  perf/x86/amd: Support PERF_SAMPLE_ADDR
  perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT}
  perf/x86/amd: Support PERF_SAMPLE_DATA_SRC
  perf/x86/amd: Add IBS OP_DATA2 DataSrc bit definitions
  perf/mem: Introduce PERF_MEM_LVLNUM_{EXTN_MEM|IO}
  perf/x86/uncore: Add new Raptor Lake S support
  perf/x86/cstate: Add new Raptor Lake S support
  perf/x86/msr: Add new Raptor Lake S support
  perf/x86: Add new Raptor Lake S support
  bpf: Check flags for branch stack in bpf_read_branch_records helper
  perf, hw_breakpoint: Fix use-after-free if perf_event_open() fails
  perf: Use sample_flags for raw_data
  perf: Use sample_flags for addr
  ...
2022-10-10 09:27:46 -07:00
Stephane Eranian
3f9a1b3591 perf/x86/amd/lbr: Adjust LBR regardless of filtering
In case of fused compare and taken branch instructions, the AMD LBR points to
the compare instruction instead of the branch. Users of LBR usually expects
the from address to point to a branch instruction. The kernel has code to
adjust the from address via get_branch_type_fused(). However this correction
is only applied when a branch filter is applied. That means that if no
filter is present, the quality of the data is lower.

Fix the problem by applying the adjustment regardless of the filter setting,
bringing the AMD LBR to the same level as other LBR implementations.

Fixes: 245268c19f70 ("perf/x86/amd/lbr: Use fusion-aware branch classifier")
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20220928184043.408364-3-eranian@google.com
2022-09-29 12:20:57 +02:00
Stephane Eranian
117ceeb1f4 perf/x86/utils: Fix uninitialized var in get_branch_type()
offset is passed as a pointer and on certain call path is not set by the
function. If the caller does not re-initialize offset between calls, value
could be inherited between calls. Prevent this by initializing offset on each
call.

This impacts the code in amd_pmu_lbr_filter() which does:

   for(i=0; ...) {
       ret = get_branch_type_fused(..., &offset);
       if (offset)
          lbr_entries[i].from += offset;
   }

Fixes: df3e9612f758 ("perf/x86: Make branch classifier fusion-aware")
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20220928184043.408364-2-eranian@google.com
2022-09-29 12:20:56 +02:00
Ravi Bangoria
5b26af6d2b perf/x86/amd: Support PERF_SAMPLE_PHY_ADDR
IBS_DC_PHYSADDR provides the physical data address for the tagged load/
store operation. Populate perf sample physical address using it.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220928095805.596-7-ravi.bangoria@amd.com
2022-09-29 12:20:56 +02:00
Ravi Bangoria
cb2bb85f7e perf/x86/amd: Support PERF_SAMPLE_ADDR
IBS_DC_LINADDR provides the linear data address for the tagged load/
store operation. Populate perf sample address using it.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220928095805.596-6-ravi.bangoria@amd.com
2022-09-29 12:20:55 +02:00
Ravi Bangoria
6b2ae4952e perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT}
IbsDcMissLat indicates the number of clock cycles from when a miss is
detected in the data cache to when the data was delivered to the core.
Similarly, IbsTagToRetCtr provides number of cycles from when the op
was tagged to when the op was retired. Consider these fields for
sample->weight.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220928095805.596-5-ravi.bangoria@amd.com
2022-09-29 12:20:55 +02:00
Ravi Bangoria
7c10dd0a88 perf/x86/amd: Support PERF_SAMPLE_DATA_SRC
struct perf_mem_data_src is used to pass arch specific memory access
details into generic form. These details gets consumed by tools like
perf mem and c2c. IBS tagged load/store sample provides most of the
information needed for these tools. Add a logic to convert IBS
specific raw data into perf_mem_data_src.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220928095805.596-4-ravi.bangoria@amd.com
2022-09-29 12:20:55 +02:00
Kan Liang
e04a1607c9 perf/x86/uncore: Add new Raptor Lake S support
From the perspective of the uncore PMU, the new Raptor Lake S is the
same as the other hybrid {ALDER,RAPTOP}LAKE.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220928153331.3757388-4-kan.liang@linux.intel.com
2022-09-29 12:20:53 +02:00
Kan Liang
d12940d2ea perf/x86/cstate: Add new Raptor Lake S support
From the perspective of Intel cstate residency counters, the new
Raptor Lake S is the same as the other hybrid {ALDER,RAPTOP}LAKE.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220928153331.3757388-3-kan.liang@linux.intel.com
2022-09-29 12:20:53 +02:00
Kan Liang
193c888b7f perf/x86/msr: Add new Raptor Lake S support
The same as the other hybrid {ALDER,RAPTOP}LAKE, the new Raptor Lake S
also support PPERF and SMI_COUNT MSRs.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220928153331.3757388-2-kan.liang@linux.intel.com
2022-09-29 12:20:53 +02:00
Kan Liang
50b0c97bf0 perf/x86: Add new Raptor Lake S support
From PMU's perspective, the new Raptor Lake S is the same as the other
of hybrid {ALDER,RAPTOP}LAKE.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220928153331.3757388-1-kan.liang@linux.intel.com
2022-09-29 12:20:52 +02:00
Peter Zijlstra
a1ebcd5943 Linux 6.0-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmMwwY4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGdlwH/0ESzdb6F9zYWwHR
 E08har56/IfwjOsn1y+JuHibpwUjzskLzdwIfI5zshSZAQTj5/UyC0P7G/wcYh/Z
 INh1uHGazmDUkx4O3lwuWLR+mmeUxZRWdq4NTwYDRNPMSiPInVxz+cZJ7y0aPr2e
 wii7kMFRHgXmX5DMDEwuHzehsJF7vZrp8zBu2DqzVUGnbwD50nPbyMM3H4g9mute
 fAEpDG0X3+smqMaKL+2rK0W/Av/87r3U8ZAztBem3nsCJ9jT7hqMO1ICcKmFMviA
 DTERRMwWjPq+mBPE2CiuhdaXvNZBW85Ds81mSddS6MsO6+Tvuzfzik/zSLQJxlBi
 vIqYphY=
 =NqG+
 -----END PGP SIGNATURE-----

Merge branch 'v6.0-rc7'

Merge upstream to get RAPTORLAKE_S

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-09-29 12:20:50 +02:00
Namhyung Kim
838d9bb62d perf: Use sample_flags for raw_data
Use the new sample_flags to indicate whether the raw data field is
filled by the PMU driver.  Although it could check with the NULL,
follow the same rule with other fields.

Remove the raw field from the perf_sample_data_init() to minimize
the number of cache lines touched.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220921220032.2858517-2-namhyung@kernel.org
2022-09-27 22:50:24 +02:00
Namhyung Kim
7b08463015 perf: Use sample_flags for addr
Use the new sample_flags to indicate whether the addr field is filled by
the PMU driver.  As most PMU drivers pass 0, it can set the flag only if
it has a non-zero value.  And use 0 in perf_sample_output() if it's not
filled already.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220921220032.2858517-1-namhyung@kernel.org
2022-09-27 22:50:24 +02:00
Namhyung Kim
b4e12b2d70 perf: Kill __PERF_SAMPLE_CALLCHAIN_EARLY
There's no in-tree user anymore.  Let's get rid of it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220908214104.3851807-3-namhyung@kernel.org
2022-09-13 15:03:23 +02:00
Namhyung Kim
3749d33e51 perf: Use sample_flags for callchain
So that it can call perf_callchain() only if needed.  Historically it used
__PERF_SAMPLE_CALLCHAIN_EARLY but we can do that with sample_flags in the
struct perf_sample_data.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220908214104.3851807-1-namhyung@kernel.org
2022-09-13 15:03:22 +02:00
Kan Liang
fae9ebde96 perf/x86/intel: Optimize FIXED_CTR_CTRL access
All the fixed counters share a fixed control register. The current
perf reads and re-writes the fixed control register for each fixed
counter disable/enable, which is unnecessary.

When changing the fixed control register, the entire PMU must be
disabled via the global control register. The changing cannot be taken
effect until the entire PMU is re-enabled. Only updating the fixed
control register once right before the entire PMU re-enabling is
enough.

The read of the fixed control register is not necessary either. The
value can be cached in the per CPU cpu_hw_events.

Test results:

Counting all the fixed counters with the perf bench sched pipe as below
on a SPR machine.

 $perf stat -e cycles,instructions,ref-cycles,slots --no-inherit --
  taskset -c 1 perf bench sched pipe

The Total elapsed time reduces from 5.36s (without the patch) to 4.99s
(with the patch), which is ~6.9% improvement.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220804140729.2951259-1-kan.liang@linux.intel.com
2022-09-07 21:54:04 +02:00
Peter Zijlstra
dbf4e792be perf/x86/p4: Remove perfctr_second_write quirk
Now that we have a x86_pmu::set_period() method, use it to remove the
perfctr_second_write quirk from the generic code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220829101321.839502514@infradead.org
2022-09-07 21:54:04 +02:00
Peter Zijlstra
1acab2e01c perf/x86/intel: Remove x86_pmu::update_topdown_event
Now that it is all internal to the intel driver, remove
x86_pmu::update_topdown_event.

Assumes that is_topdown_count(event) can only be true when the
hardware has topdown stuff and the function is set.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220829101321.771635301@infradead.org
2022-09-07 21:54:03 +02:00
Peter Zijlstra
2368516731 perf/x86/intel: Remove x86_pmu::set_topdown_event_period
Now that it is all internal to the intel driver, remove
x86_pmu::set_topdown_event_period.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220829101321.706354189@infradead.org
2022-09-07 21:54:03 +02:00
Peter Zijlstra
08b3068fab perf/x86: Add a x86_pmu::limit_period static_call
Avoid a branch and indirect call.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220829101321.640658334@infradead.org
2022-09-07 21:54:03 +02:00
Peter Zijlstra
28f0f3c44b perf/x86: Change x86_pmu::limit_period signature
In preparation for making it a static_call, change the signature.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220829101321.573713839@infradead.org
2022-09-07 21:54:02 +02:00
Peter Zijlstra
e577bb17a1 perf/x86/intel: Move the topdown stuff into the intel driver
Use the new x86_pmu::{set_period,update}() methods to push the topdown
stuff into the Intel driver, where it belongs.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220829101321.505933457@infradead.org
2022-09-07 21:54:02 +02:00
Peter Zijlstra
73759c3463 perf/x86: Add two more x86_pmu methods
In order to clean up x86_perf_event_{set_period,update)() start by
adding them as x86_pmu methods.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220829101321.440196408@infradead.org
2022-09-07 21:54:02 +02:00
Anshuman Khandual
88081cfb69 x86/perf: Assert all platform event flags are within PERF_EVENT_FLAG_ARCH
Ensure all platform specific event flags are within PERF_EVENT_FLAG_ARCH.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: James Clark <james.clark@arm.com>
Link: https://lkml.kernel.org/r/20220907091924.439193-5-anshuman.khandual@arm.com
2022-09-07 21:54:01 +02:00
Kan Liang
ee9db0e14b perf: Use sample_flags for txn
Use the new sample_flags to indicate whether the txn field is filled by
the PMU driver.

Remove the txn field from the perf_sample_data_init() to minimize the
number of cache lines touched.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220901130959.1285717-7-kan.liang@linux.intel.com
2022-09-06 11:33:03 +02:00
Kan Liang
e16fd7f2cb perf: Use sample_flags for data_src
Use the new sample_flags to indicate whether the data_src field is
filled by the PMU driver.

Remove the data_src field from the perf_sample_data_init() to minimize
the number of cache lines touched.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220901130959.1285717-6-kan.liang@linux.intel.com
2022-09-06 11:33:03 +02:00
Kan Liang
2abe681da0 perf: Use sample_flags for weight
Use the new sample_flags to indicate whether the weight field is filled
by the PMU driver.

Remove the weight field from the perf_sample_data_init() to minimize the
number of cache lines touched.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220901130959.1285717-5-kan.liang@linux.intel.com
2022-09-06 11:33:02 +02:00
Kan Liang
a9a931e266 perf: Use sample_flags for branch stack
Use the new sample_flags to indicate whether the branch stack is filled
by the PMU driver.

Remove the br_stack from the perf_sample_data_init() to minimize the number
of cache lines touched.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220901130959.1285717-4-kan.liang@linux.intel.com
2022-09-06 11:33:02 +02:00
Kan Liang
47a3aeb39e perf/x86/intel/pebs: Fix PEBS timestamps overwritten
The PEBS TSC-based timestamps do not appear correctly in the final
perf.data output file from perf record.

The data->time field setup by PEBS in the setup_pebs_fixed_sample_data()
is later overwritten by perf_events generic code in
perf_prepare_sample(). There is an ordering problem.

Set the sample flags when the data->time is updated by PEBS.
The data->time field will not be overwritten anymore.

Reported-by: Andreas Kogler <andreas.kogler.0x@gmail.com>
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220901130959.1285717-3-kan.liang@linux.intel.com
2022-09-06 11:33:01 +02:00
Paolo Bonzini
29250ba51b PCI interpretation compile fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEwGNS88vfc9+v45Yq41TmuOI4ufgFAmMMpCYACgkQ41TmuOI4
 ufirVA/+NpbDVgiGehgV9ngukDFWNNBOvPeKFIjnIWmhcTvRtQ/YWIALKJSinaVl
 jyPOTM0cEcrxhIPM2kHmnkZhwECRwM32Ox2N6ftmOPEJk8xrP6glfCuwJ6Yl/pNU
 EBLWzgZGRZyDGmCNo0jvaycPIh/WkeNKGwe/3FlI8gHx7iALvihE/0IFoPlCEWky
 dbetAAx5W0fRCoKscbMh0q2mDXuz4KBu3yij9wMGlhSzLcrSr9qejCmjEdERruYM
 c9hrLtIcy8cS9Epa5kOnz0PekK4EUlecB66BjnvsXabjTVNlF8MJwEG5EZfGxP+a
 jdthEjXaO5u7z4Y5NKxDwTFslmzX3PPbdBka9VGmhLkkV1/jGZ5RKKVY7+MljV74
 ZhhVkwUB0D3c7Wbpo+S1mVY6EHi/RO4TkaPCORsmCvwYXFZyhNsteKnWdRbcu3Iu
 aIjKamGeEfe8Mqojmzaksql9ScNnW9/INUyObz5UruGXFg2uFd0Vae20ZBpn7DYh
 gBYKZAFArNzHUUpP5zMBPnwpTGt8MuurI7rnVEvVRqsy4SofbYZYuFdxpn2obzui
 zPQY/lg6eXUWZkbfqJNrA1JN3cTSxVQCO96z8yYp/RazYiKMGUgZ7Ps7WlZkxRNE
 OSMkssXz6cIL1QPbu75iu6FMQGcVAywH2EN8jJh6E0ceUcVhsB0=
 =ENc3
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

PCI interpretation compile fixes
2022-09-01 19:21:27 -04:00
Like Xu
87693645ae perf/x86/core: Completely disable guest PEBS via guest's global_ctrl
When a guest PEBS counter is cross-mapped by a host counter, software
will remove the corresponding bit in the arr[global_ctrl].guest and
expect hardware to perform a change of state "from enable to disable"
via the msr_slot[] switch during the vmx transaction.

The real world is that if user adjust the counter overflow value small
enough, it still opens a tiny race window for the previously PEBS-enabled
counter to write cross-mapped PEBS records into the guest's PEBS buffer,
when arr[global_ctrl].guest has been prioritised (switch_msr_special stuff)
to switch into the enabled state, while the arr[pebs_enable].guest has not.

Close this window by clearing invalid bits in the arr[global_ctrl].guest.

Cc: linux-perf-users@vger.kernel.org
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Fixes: 854250329c02 ("KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations")
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220831033524.58561-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-01 19:20:59 -04:00
Like Xu
f2aeea5750 perf/x86/core: Completely disable guest PEBS via guest's global_ctrl
When a guest PEBS counter is cross-mapped by a host counter, software
will remove the corresponding bit in the arr[global_ctrl].guest and
expect hardware to perform a change of state "from enable to disable"
via the msr_slot[] switch during the vmx transaction.

The real world is that if user adjust the counter overflow value small
enough, it still opens a tiny race window for the previously PEBS-enabled
counter to write cross-mapped PEBS records into the guest's PEBS buffer,
when arr[global_ctrl].guest has been prioritised (switch_msr_special stuff)
to switch into the enabled state, while the arr[pebs_enable].guest has not.

Close this window by clearing invalid bits in the arr[global_ctrl].guest.

Fixes: 854250329c02 ("KVM: x86/pmu: Disable guest PEBS temporarily in two rare situations")
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220831033524.58561-1-likexu@tencent.com
2022-09-01 11:19:42 +02:00
Kan Liang
24919fdea6 perf/x86/intel: Fix unchecked MSR access error for Alder Lake N
For some Alder Lake N machine, the below unchecked MSR access error may be
triggered.

[ 0.088017] rcu: Hierarchical SRCU implementation.
[ 0.088017] unchecked MSR access error: WRMSR to 0x38f (tried to write
0x0001000f0000003f) at rIP: 0xffffffffb5684de8 (native_write_msr+0x8/0x30)
[ 0.088017] Call Trace:
[ 0.088017] <TASK>
[ 0.088017] __intel_pmu_enable_all.constprop.46+0x4a/0xa0

The Alder Lake N only has e-cores. The X86_FEATURE_HYBRID_CPU flag is
not set. The perf cannot retrieve the correct CPU type via
get_this_hybrid_cpu_type(). The model specific get_hybrid_cpu_type() is
hardcode to p-core. The wrong CPU type is given to the PMU of the
Alder Lake N.

Since Alder Lake N isn't in fact a hybrid CPU, remove ALDERLAKE_N from
the rest of {ALDER,RAPTOP}LAKE and create a non-hybrid PMU setup.

The differences between Gracemont and the previous Tremont are,
- Number of GP counters
- Load and store latency Events
- PEBS event_constraints
- Instruction Latency support
- Data source encoding
- Memory access latency encoding

Fixes: c2a960f7c574 ("perf/x86: Add new Alder Lake and Raptor Lake support")
Reported-by: Jianfeng Gao <jianfeng.gao@intel.com>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220831142702.153110-1-kan.liang@linux.intel.com
2022-09-01 11:19:41 +02:00
Anshuman Khandual
a724ec8296 perf: Add system error and not in transaction branch types
This expands generic branch type classification by adding two more entries
there in i.e system error and not in transaction. This also updates the x86
implementation to process X86_BR_NO_TX records as appropriate. This changes
branch types reported to user space on x86 platform but it should not be a
problem. The possible scenarios and impacts are enumerated here.

 --------------------------------------------------------------------------
 | kernel | perf tool |                     Impact                        |
 --------------------------------------------------------------------------
 |   old  |    old    |  Works as before                                  |
 --------------------------------------------------------------------------
 |   old  |    new    |  PERF_BR_UNKNOWN is processed                     |
 --------------------------------------------------------------------------
 |   new  |    old    |  PERF_BR_NO_TX is blocked via old PERF_BR_MAX     |
 --------------------------------------------------------------------------
 |   new  |    new    |  PERF_BR_NO_TX is recognized                      |
 --------------------------------------------------------------------------

When PERF_BR_NO_TX is blocked via old PERF_BR_MAX (new kernel with old perf
tool) the user space might throw up an warning complaining about an
unrecognized branch types being reported, but it's expected. PERF_BR_SERROR
& PERF_BR_NO_TX branch types will be used for BRBE implementation on arm64
platform.

PERF_BR_NO_TX complements 'abort' and 'in_tx' elements in perf_branch_entry
which represent other transaction states for a given branch record. Because
this completes the transaction state classification.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: James Clark <james.clark@arm.com>
Link: https://lkml.kernel.org/r/20220824044822.70230-2-anshuman.khandual@arm.com
2022-08-29 09:42:41 +02:00
Sandipan Das
0bc3be5b4b perf/x86/amd/lbr: Add LbrExtV2 branch speculation info support
Provide branch speculation information captured via AMD Last Branch Record
Extension Version 2 (LbrExtV2) by setting the speculation info in branch
records. The info is based on the "valid" and "spec" bits in the Branch To
registers.

Suggested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/ddc02f6320464cad0e3ff5bdb2314531568a91bc.1660211399.git.sandipan.das@amd.com
2022-08-27 00:05:45 +02:00
Sandipan Das
245268c19f perf/x86/amd/lbr: Use fusion-aware branch classifier
AMD Last Branch Record Extension Version 2 (LbrExtV2) can report a branch
from address that points to an instruction preceding the actual branch by
several bytes due to branch fusion and further optimizations in Zen4
processors.

In such cases, software should move forward sequentially in the instruction
stream from the reported address and the address of the first branch
encountered should be used instead. Hence, use the fusion-aware branch
classifier to determine the correct branch type and get the offset for
adjusting the branch from address.

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/c324d2d0a9c3976da30b9563d09e50bfee0f264d.1660211399.git.sandipan.das@amd.com
2022-08-27 00:05:45 +02:00
Sandipan Das
df3e9612f7 perf/x86: Make branch classifier fusion-aware
With branch fusion and other optimizations, branch sampling hardware in
some processors can report a branch from address that points to an
instruction preceding the actual branch by several bytes.

In such cases, the classifier cannot determine the branch type which leads
to failures such as with the recently added test from commit b55878c90ab9
("perf test: Add test for branch stack sampling"). Branch information is
also easier to consume and annotate if branch from addresses always point
to branch instructions.

Add a new variant of the branch classifier that can account for instruction
fusion. If fusion is expected and the current branch from address does not
point to a branch instruction, it attempts to find the first branch within
the next (MAX_INSN_SIZE - 1) bytes and if found, additionally provides the
offset between the reported branch from address and the address of the
expected branch instruction.

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/b6bb0abaa8a54c0b6d716344700ee11a1793d709.1660211399.git.sandipan.das@amd.com
2022-08-27 00:05:44 +02:00
Sandipan Das
f9c732249b perf/x86/amd/lbr: Add LbrExtV2 software branch filter support
With AMD Last Branch Record Extension Version 2 (LbrExtV2), it is necessary
to process the branch records further as hardware filtering is not granular
enough for identifying certain types of branches. E.g. to record system
calls, one should record far branches. The filter captures both far calls
and far returns but the irrelevant records are filtered out based on the
branch type as seen by the branch classifier.

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/e51de057517f77788abd393c832e8dea616d489c.1660211399.git.sandipan.das@amd.com
2022-08-27 00:05:44 +02:00
Sandipan Das
4462fbfe6e perf/x86: Move branch classifier
Commit 3e702ff6d1ea ("perf/x86: Add LBR software filter support for Intel
CPUs") introduces a software branch filter which complements the hardware
branch filter and adds an x86 branch classifier.

Move the branch classifier to arch/x86/events/ so that it can be utilized
by other vendors for branch record filtering.

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/bae5b95470d6bd49f40954bd379f414f5afcb965.1660211399.git.sandipan.das@amd.com
2022-08-27 00:05:44 +02:00
Sandipan Das
f4f925dae7 perf/x86/amd/lbr: Add LbrExtV2 hardware branch filter support
If AMD Last Branch Record Extension Version 2 (LbrExtV2) is detected,
convert the requested branch filter (PERF_SAMPLE_BRANCH_* flags) to the
corresponding hardware filter value and stash it in the event data when
a branch stack is requested. The hardware filter value is also saved in
per-CPU areas for use during event scheduling.

Hardware filtering is provided by the LBR Branch Select register. It has
bits which when set, suppress recording of the following types of branches:

  * CPL = 0 (Kernel only)
  * CPL > 0 (Userspace only)
  * Conditional Branches
  * Near Relative Calls
  * Near Indirect Calls
  * Near Returns
  * Near Indirect Jumps (excluding Near Indirect Calls and Near Returns)
  * Near Relative Jumps (excluding Near Relative Calls)
  * Far Branches

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/9336af5c9785b8e14c62220fc0e6cfb10ab97de3.1660211399.git.sandipan.das@amd.com
2022-08-27 00:05:43 +02:00
Sandipan Das
ca5b7c0d96 perf/x86/amd/lbr: Add LbrExtV2 branch record support
If AMD Last Branch Record Extension Version 2 (LbrExtV2) is detected,
enable it alongside LBR Freeze on PMI when an event requests branch stack
i.e. PERF_SAMPLE_BRANCH_STACK.

Each branch record is represented by a pair of registers, LBR From and LBR
To. The freeze feature prevents any updates to these registers once a PMC
overflows. The contents remain unchanged until the freeze bit is cleared by
the PMI handler.

The branch records are read and copied to sample data before unfreezing.
However, only valid entries are copied. There is no additional register to
denote which of the register pairs represent the top of the stack (TOS)
since internal register renaming always ensures that the first pair (i.e.
index 0) is the one representing the most recent branch and so on.

The LBR registers are per-thread resources and are cleared explicitly
whenever a new task is scheduled in. There are no special implications on
the contents of these registers when transitioning to deep C-states.

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/d3b8500a3627a0d4d0259b005891ee248f248d91.1660211399.git.sandipan.das@amd.com
2022-08-27 00:05:43 +02:00
Sandipan Das
703fb765f4 perf/x86/amd/lbr: Detect LbrExtV2 support
AMD Last Branch Record Extension Version 2 (LbrExtV2) is driven by Core PMC
overflows. It records recently taken branches up to the moment when the PMC
overflow occurs.

Detect the feature during PMU initialization and set the branch stack depth
using CPUID leaf 0x80000022 EBX.

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/fc6e45378ada258f1bab79b0de6e05c393a8f1dd.1660211399.git.sandipan.das@amd.com
2022-08-27 00:05:43 +02:00
Sandipan Das
706460a96f perf/x86/amd/core: Add generic branch record interfaces
AMD processors that are capable of recording branches support either Branch
Sampling (BRS) or Last Branch Record (LBR). In preparation for adding Last
Branch Record Extension Version 2 (LbrExtV2) support, introduce new static
calls which act as gateways to call into the feature-dependent functions
based on what is available on the processor.

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/b75dbc32663cb395f0d701167e952c6a6b0445a3.1660211399.git.sandipan.das@amd.com
2022-08-27 00:05:42 +02:00
Sandipan Das
9603aa79e8 perf/x86/amd/core: Refactor branch attributes
AMD processors that are capable of recording branches support either Branch
Sampling (BRS) or Last Branch Record (LBR). In preparation for adding Last
Branch Record Extension Version 2 (LbrExtV2) support, reuse the "branches"
capability to advertise information about both BRS and LBR but make the
"branch-brs" event exclusive to Family 19h processors that support BRS.

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/ba4a4cde6db79b1c65c49834027bbdb8a915546b.1660211399.git.sandipan.das@amd.com
2022-08-27 00:05:42 +02:00
Sandipan Das
b40d0156f5 perf/x86/amd/brs: Move feature-specific functions
Move some of the Branch Sampling (BRS) specific functions out of the Core
events sources and into the BRS sources in preparation for adding other
mechanisms to record branches.

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/b60283b57179475d18ee242d117c335c16733693.1660211399.git.sandipan.das@amd.com
2022-08-27 00:05:41 +02:00
Stephane Eranian
11745ecfe8 perf/x86/intel/uncore: Fix broken read_counter() for SNB IMC PMU
Existing code was generating bogus counts for the SNB IMC bandwidth counters:

$ perf stat -a -I 1000 -e uncore_imc/data_reads/,uncore_imc/data_writes/
     1.000327813           1,024.03 MiB  uncore_imc/data_reads/
     1.000327813              20.73 MiB  uncore_imc/data_writes/
     2.000580153         261,120.00 MiB  uncore_imc/data_reads/
     2.000580153              23.28 MiB  uncore_imc/data_writes/

The problem was introduced by commit:
  07ce734dd8ad ("perf/x86/intel/uncore: Clean up client IMC")

Where the read_counter callback was replace to point to the generic
uncore_mmio_read_counter() function.

The SNB IMC counters are freerunnig 32-bit counters laid out contiguously in
MMIO. But uncore_mmio_read_counter() is using a readq() call to read from
MMIO therefore reading 64-bit from MMIO. Although this is okay for the
uncore_perf_event_update() function because it is shifting the value based
on the actual counter width to compute a delta, it is not okay for the
uncore_pmu_event_start() which is simply reading the counter  and therefore
priming the event->prev_count with a bogus value which is responsible for
causing bogus deltas in the perf stat command above.

The fix is to reintroduce the custom callback for read_counter for the SNB
IMC PMU and use readl() instead of readq(). With the change the output of
perf stat is back to normal:
$ perf stat -a -I 1000 -e uncore_imc/data_reads/,uncore_imc/data_writes/
     1.000120987             296.94 MiB  uncore_imc/data_reads/
     1.000120987             138.42 MiB  uncore_imc/data_writes/
     2.000403144             175.91 MiB  uncore_imc/data_reads/
     2.000403144              68.50 MiB  uncore_imc/data_writes/

Fixes: 07ce734dd8ad ("perf/x86/intel/uncore: Clean up client IMC")
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20220803160031.1379788-1-eranian@google.com
2022-08-27 00:05:38 +02:00
Kan Liang
cde643ff75 perf/x86/intel: Fix pebs event constraints for ADL
According to the latest event list, the LOAD_LATENCY PEBS event only
works on the GP counter 0 and 1 for ADL and RPL.

Update the pebs event constraints table.

Fixes: f83d2f91d259 ("perf/x86/intel: Add Alder Lake Hybrid support")
Reported-by: Ammy Yi <ammy.yi@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20220818184429.2355857-1-kan.liang@linux.intel.com
2022-08-19 19:47:31 +02:00