IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
* Support for various vector-accelerated crypto routines.
* Hibernation is now enabled for portable kernel builds.
* mmap_rnd_bits_max is larger on systems with larger VAs.
* Support for fast GUP.
* Support for membarrier-based instruction cache synchronization.
* Support for the Andes hart-level interrupt controller and PMU.
* Some cleanups around unaligned access speed probing and Kconfig
settings.
* Support for ACPI LPI and CPPC.
* Various cleanus related to barriers.
* A handful of fixes.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmX9icgTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYib+UD/4xyL6UMixx6A06BVBL9UT4vOrxRvNr
JIihG5y5QNMjes9DHWL35mZTMqFtQ0tq94ViWFLmJWloV/8KRVM2C9R9KX7vplf3
M/OwvP106spxgvNHoeQbycgs42RU1t2mpqT7N1iK2hCjqieP3vLn6hsSLXWTAG0L
3gQbQw6XCLC3hPyLq+nbFY2i4faeCmpXWmixoy/IvQ5calZQrRU0LNlP6lcMBhVo
uocjG0uGAhrahw2s81jxcMZcxa3AvUCiplapdD5H5v9rBM85SkYJj2Q9SqdSorkb
xzuimRnKPI5s47yM3pTfZY0qnQUYHV7PXXuw4WujpCQVQdhaG+Ggq63UUZA61J9t
IzZK2zdcfHqICrGTtXImUzRT3dcc3oq+IFq4tTY+rEJm29hrXkAtx+qBm5xtMvax
fJz5feJ/iT0u7MDj4Oq24n+Kpl+Olm+MJaZX3m5Ovi/9V6a9iK9HXqxg9/Fs0fMO
+J/0kTgd8Vu9CYH7KNWz3uztcO9eMAH3VyzuXuab4BGj1i1Y/9EjpALQi7rDN73S
OsYQX6NnzMkBV4dvElJVLXiPlvNlMHZZwdak5CqPb48jaJu6iiIZAuvOrG6/naGP
wnQSLVA2WWWoOkl3AJhxfpa11CLhbMl9E2gYm1VtNvASXoSFIxlAq1Yv3sG8yjty
4ZT0rYFJOstYiQ==
=3dL5
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for various vector-accelerated crypto routines
- Hibernation is now enabled for portable kernel builds
- mmap_rnd_bits_max is larger on systems with larger VAs
- Support for fast GUP
- Support for membarrier-based instruction cache synchronization
- Support for the Andes hart-level interrupt controller and PMU
- Some cleanups around unaligned access speed probing and Kconfig
settings
- Support for ACPI LPI and CPPC
- Various cleanus related to barriers
- A handful of fixes
* tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (66 commits)
riscv: Fix syscall wrapper for >word-size arguments
crypto: riscv - add vector crypto accelerated AES-CBC-CTS
crypto: riscv - parallelize AES-CBC decryption
riscv: Only flush the mm icache when setting an exec pte
riscv: Use kcalloc() instead of kzalloc()
riscv/barrier: Add missing space after ','
riscv/barrier: Consolidate fence definitions
riscv/barrier: Define RISCV_FULL_BARRIER
riscv/barrier: Define __{mb,rmb,wmb}
RISC-V: defconfig: Enable CONFIG_ACPI_CPPC_CPUFREQ
cpufreq: Move CPPC configs to common Kconfig and add RISC-V
ACPI: RISC-V: Add CPPC driver
ACPI: Enable ACPI_PROCESSOR for RISC-V
ACPI: RISC-V: Add LPI driver
cpuidle: RISC-V: Move few functions to arch/riscv
riscv: Introduce set_compat_task() in asm/compat.h
riscv: Introduce is_compat_thread() into compat.h
riscv: add compile-time test into is_compat_task()
riscv: Replace direct thread flag check with is_compat_task()
riscv: Improve arch_get_mmap_end() macro
...
Add the Andes AX45 JSON files that allows specifying symbolic event
names for the raw PMU events.
Signed-off-by: Locus Wei-Han Chen <locus84@andestech.com>
Reviewed-by: Yu Chien Peter Lin <peterlin@andestech.com>
Reviewed-by: Charles Ci-Jyun Wu <dminus@andestech.com>
Reviewed-by: Leo Yu-Chi Liang <ycliang@andestech.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Acked-by: Atish Patra <atishp@rivosinc.com>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20240222083946.3977135-11-peterlin@andestech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
L3PMCx0AC and L3PMCx0AD, used in l3_xi_sampled_latency* events, have a
quirk that requires them to be programmed with SliceId set to 0x3.
Without this, the events do not count at all and affects dependent
metrics such as l3_read_miss_latency.
If ThreadMask is not specified, the amd-uncore driver internally sets
ThreadMask to 0x3, EnAllCores to 0x1 and EnAllSlices to 0x1 but does
not set SliceId. Since SliceId must also be set to 0x3 in this case,
specify all the other fields explicitly.
E.g.
$ sudo perf stat -e l3_xi_sampled_latency.all,l3_xi_sampled_latency_requests.all -a sleep 1
Before:
Performance counter stats for 'system wide':
0 l3_xi_sampled_latency.all
0 l3_xi_sampled_latency_requests.all
1.005155399 seconds time elapsed
After:
Performance counter stats for 'system wide':
921,446 l3_xi_sampled_latency.all
54,210 l3_xi_sampled_latency_requests.all
1.005664472 seconds time elapsed
Fixes: 5b2ca349c313 ("perf vendor events amd: Add Zen 4 uncore events")
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: ananth.narayan@amd.com
Cc: ravi.bangoria@amd.com
Cc: eranian@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240301084431.646221-1-sandipan.das@amd.com
Correct the short description of the following events:
DCW_REQ, DCW_REQ_CHIP_HIT, DCW_REQ_DRAWER_HIT, DCW_REQ_IV,
DCW_ON_CHIP, DCW_ON_CHIP_IV, DCW_ON_CHIP_CHIP_HIT,
DCW_ON_CHIP_DRAWER_HIT, CW_ON_MODULE, DCW_ON_DRAWER,
DCW_OFF_DRAWER, IDCW_ON_MODULE_IV, IDCW_ON_MODULE_CHIP_HIT,
IDCW_ON_MODULE_DRAWER_HIT, IDCW_ON_DRAWER_IV, IDCW_ON_DRAWER_CHIP_HIT,
IDCW_ON_DRAWER_DRAWER_HIT, IDCW_OFF_DRAWER_IV, IDCW_OFF_DRAWER_CHIP_HIT,
IDCW_OFF_DRAWER_DRAWER_HIT, ICW_REQ, ICW_REQ_IV, CW_REQ_CHIP_HIT,
ICW_REQ_DRAWER_HIT, ICW_ON_CHIP, ICW_ON_CHIP_IV, ICW_ON_CHIP_CHIP_HIT,
ICW_ON_CHIP_DRAWER_HIT, ICW_ON_MODULE and ICW_OFF_DRAWER.
The second Cache should be L2-Cache.
Output before (display diff of the first four events)
# perf list -d
DCW_REQ
[Directory Write Level 1 Data Cache from Cache. Unit: cpum_cf]
DCW_REQ_CHIP_HIT
[Directory Write Level 1 Data Cache from Cache with Chip HP \
Hit. Unit: cpum_cf]
DCW_REQ_DRAWER_HIT
[Directory Write Level 1 Data Cache from Cache with Drawer \
HP Hit. Unit: cpum_cf]
DCW_REQ_IV
[Directory Write Level 1 Data Cache from Cache with Intervention. \
Unit: cpum_cf]
Output after:
# perf list -d
DCW_REQ
[Directory Write Level 1 Data Cache from L2-Cache. Unit: cpum_cf]
DCW_REQ_CHIP_HIT
[Directory Write Level 1 Data Cache from L2-Cache with Chip HP \
Hit. Unit: cpum_cf]
DCW_REQ_DRAWER_HIT
[Directory Write Level 1 Data Cache from L2-Cache with Drawer \
HP Hit. Unit: cpum_cf]
DCW_REQ_IV
[Directory Write Level 1 Data Cache from L2-Cache with \
Intervention. Unit: cpum_cf]
Fixes: 7f76b3113068 ("perf list: Add IBM z16 event description for s390")
Reported-by: Andreas Krebbel <krebbel@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Andreas Krebbel <krebbel@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: sumanthk@linux.ibm.com
Cc: svens@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221091908.1759083-1-tmricht@linux.ibm.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- tma_info_bottleneck* metrics, an abstraction or summarization of
the 100+ TMA tree nodes into 12-entry familiar performance metrics.
- Reduce number of events (multiplexing) for tma_info_system_gflops,
tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
- Fixes for tma_info_bottleneck_mispredictions and
tma_info_bad_spec_branch_misprediction_cost.
- New tma_info_inst_mix_ippause metric.
- tma_serializing_operation is raised to level 3.
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- tma_nop_instructions and tma_shuffles_256b are lowered to level 4
under tma_other_light_ops_group.
- Reduced number of events when SMT is off.
- Tuned thresholds for tma_info_bottleneck_branching_overhead,
tma_fetch_bandwidth and tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-31-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- tma_info_bottleneck* metrics, an abstraction or summarization of
the 100+ TMA tree nodes into 12-entry familiar performance metrics.
- Reduce number of events (multiplexing) for tma_info_system_gflops,
tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
- Fixes for tma_info_bottleneck_mispredictions and
tma_info_bad_spec_branch_misprediction_cost.
- tma_serializing_operation is raised to level 3.
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- tma_nop_instructions and tma_shuffles_256b are lowered to level 4
under tma_other_light_ops_group.
- Reduced number of events when SMT is off.
- Tuned thresholds for tma_info_bottleneck_branching_overhead,
tma_fetch_bandwidth and tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-30-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- tma_info_bottleneck* metrics, an abstraction or summarization of
the 100+ TMA tree nodes into 12-entry familiar performance metrics.
- Reduce number of events (multiplexing) for tma_info_system_gflops,
tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
- Fixes for tma_info_bottleneck_mispredictions and
tma_info_bad_spec_branch_misprediction_cost.
- tma_serializing_operation is raised to level 3.
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- tma_nop_instructions and tma_shuffles_256b are lowered to level 4
under tma_other_light_ops_group.
- Reduced number of events when SMT is off.
- Tuned thresholds for tma_info_bottleneck_branching_overhead,
tma_fetch_bandwidth and tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-29-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- tma_info_bottleneck* metrics, an abstraction or summarization of
the 100+ TMA tree nodes into 12-entry familiar performance metrics.
- tma_c01_wait and tma_c02_wait metrics measure power-performance
states.
- Reduce number of events (multiplexing) for tma_info_system_gflops,
tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
- Fixes for tma_info_bottleneck_mispredictions and
tma_info_bad_spec_branch_misprediction_cost.
- New tma_info_inst_mix_ippause metric.
- tma_serializing_operation is raised to level 3.
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- tma_nop_instructions and tma_shuffles_256b are lowered to level 4
under tma_other_light_ops_group.
- Reduced number of events when SMT is off.
- Tuned thresholds for tma_info_bottleneck_branching_overhead,
tma_fetch_bandwidth and tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-28-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- Add metrics tma_fp_vector_128b, tma_fp_vector_256b and
tma_info_system_cpus_utilized.
- Remove metrics tma_info_system_mem_parallel_requests,
tma_info_system_core_frequency and
tma_info_system_mem_request_latency.
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- Tuned thresholds for tma_fetch_bandwidth.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-27-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- tma_info_bottleneck* metrics, an abstraction or summarization of
the 100+ TMA tree nodes into 12-entry familiar performance metrics.
- Reduce number of events (multiplexing) for tma_info_system_gflops,
tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
- Fixes for tma_info_bottleneck_mispredictions and
tma_info_bad_spec_branch_misprediction_cost.
- New tma_info_inst_mix_ippause metric.
- tma_serializing_operation is raised to level 3.
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- tma_nop_instructions and tma_shuffles_256b are lowered to level 4
under tma_other_light_ops_group.
- Reduced number of events when SMT is off.
- Tuned thresholds for tma_info_bottleneck_branching_overhead,
tma_fetch_bandwidth and tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-26-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- Tuned thresholds for tma_fetch_bandwidth.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-25-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- Reduced number of events when SMT is off.
- Tuned thresholds for tma_fetch_bandwidth and
tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-24-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- Reduced number of events when SMT is off.
- Tuned thresholds for tma_fetch_bandwidth and
tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-23-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- tma_info_bottleneck* metrics, an abstraction or summarization of
the 100+ TMA tree nodes into 12-entry familiar performance metrics.
- Reduce number of events (multiplexing) for tma_info_system_gflops,
tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
- Fixes for tma_info_bottleneck_mispredictions and
tma_info_bad_spec_branch_misprediction_cost.
- New tma_info_inst_mix_ippause metric.
- tma_serializing_operation is raised to level 3.
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- tma_nop_instructions and tma_shuffles_256b are lowered to level 4
under tma_other_light_ops_group.
- Reduced number of events when SMT is off.
- Tuned thresholds for tma_info_bottleneck_branching_overhead,
tma_fetch_bandwidth and tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-22-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- tma_info_bottleneck* metrics, an abstraction or summarization of
the 100+ TMA tree nodes into 12-entry familiar performance metrics.
- Reduce number of events (multiplexing) for tma_info_system_gflops,
tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
- Fixes for tma_info_bottleneck_mispredictions and
tma_info_bad_spec_branch_misprediction_cost.
- New tma_info_inst_mix_ippause metric.
- tma_serializing_operation is raised to level 3.
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- tma_nop_instructions and tma_shuffles_256b are lowered to level 4
under tma_other_light_ops_group.
- Reduced number of events when SMT is off.
- Tuned thresholds for tma_info_bottleneck_branching_overhead,
tma_fetch_bandwidth and tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-21-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- Tuned thresholds for tma_fetch_bandwidth and
tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-20-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- Tuned thresholds for tma_fetch_bandwidth and
tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-19-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- tma_info_bottleneck* metrics, an abstraction or summarization of
the 100+ TMA tree nodes into 12-entry familiar performance metrics.
- Reduce number of events (multiplexing) for tma_info_system_gflops,
tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
- Fixes for tma_info_bottleneck_mispredictions and
tma_info_bad_spec_branch_misprediction_cost.
- New tma_info_inst_mix_ippause metric.
- tma_serializing_operation is raised to level 3.
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- tma_nop_instructions and tma_shuffles_256b are lowered to level 4
under tma_other_light_ops_group.
- Reduced number of events when SMT is off.
- Tuned thresholds for tma_info_bottleneck_branching_overhead,
tma_fetch_bandwidth and tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-18-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- Reduce number of events (multiplexing) for tma_info_system_gflops,
tma_info_core_flopc and tma_info_inst_mix_ipflop.
- Removal of tma_info_bad_spec_branch_misprediction_cost.
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- Tuned thresholds for tma_fetch_bandwidth and tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-17-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- Reduce number of events (multiplexing) for tma_info_system_gflops,
tma_info_core_flopc and tma_info_inst_mix_ipflop.
- Removal of tma_info_bad_spec_branch_misprediction_cost.
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- Tuned thresholds for tma_fetch_bandwidth and tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-16-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- Reduce number of events (multiplexing) for tma_info_system_gflops,
tma_info_core_flopc and tma_info_inst_mix_ipflop.
- Removal of tma_info_bad_spec_branch_misprediction_cost.
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- Tuned thresholds for tma_fetch_bandwidth and tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-15-irogers@google.com
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.
The update includes:
- tma_info_bottleneck* metrics, an abstraction or summarization of
the 100+ TMA tree nodes into 12-entry familiar performance metrics.
- tma_c01_wait and tma_c02_wait metrics measure power-performance
states.
- Reduce number of events (multiplexing) for tma_info_system_gflops,
tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
- Fixes for tma_info_bottleneck_mispredictions and
tma_info_bad_spec_branch_misprediction_cost.
- New tma_info_inst_mix_ippause metric.
- tma_serializing_operation is raised to level 3.
- Swapped tma_info_core_ilp (becomes per SMT thread) and
tma_info_pipeline_execute (per physical core).
- tma_nop_instructions and tma_shuffles_256b are lowered to level 4
under tma_other_light_ops_group.
- Reduced number of events when SMT is off.
- Tuned thresholds for tma_info_bottleneck_branching_overhead,
tma_fetch_bandwidth and tma_ports_utilized_3m.
The update came from:
https://github.com/intel/perfmon/pull/140https://github.com/intel/perfmon/pull/138
Running the script:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-14-irogers@google.com
Update alderlake events to v1.15 released in:
282a6951fd
Documentation fixes, removal of TOPDOWN.BR_MISPREDICT_SLOTS,
deprecation of UNC_ARB_DAT_REQUESTS.RD, UNC_ARB_DAT_REQUESTS.RD and
UNC_ARB_IFA_OCCUPANCY.ALL.
Event json automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-13-irogers@google.com
Update sierraforest events to v1.01 released in:
582bca24aa
Adds the majority of core and uncore events.
Event json automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-11-irogers@google.com
Update haswell events to v35 released in:
c0f9b34d42
Updates "must be precise" on RTM_RETIRED.ABORTED.
Event json automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-7-irogers@google.com
Update grandridge events to v1.01 released in:
211d607165
Adds the majority of core and uncore events.
Event json automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-6-irogers@google.com
Update broadwell events to v29 released in:
47117146c6
Updates "must be precise" on RTM_RETIRED.ABORTED.
Event json automatically generated by:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-4-irogers@google.com
Prior to this patch '0' would be dropped as the config values default
to 0. Some json values are hex and the string '0' wouldn't match '0x0'
as zero. Add a more robust is_zero test to drop these event terms.
When encoding numbers as hex, if the number is between 0 and 9
inclusive then don't add a 0x prefix.
Update test expectations for these changes.
On x86 this reduces the event/metric C string by 58,411 bytes.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240131201429.792138-1-irogers@google.com
As events are deduplicated by name, ensure PMU prefixes are always
used in metrics. Previously they may be missed on the first event in a
formula.
Update metric constraints for architectures with topdown l2 events.
Conversion script updated in:
https://github.com/intel/perfmon/pull/128
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Closes: https://lore.kernel.org/lkml/ZZam-EG-UepcXtWw@kernel.org/
Link: https://lore.kernel.org/r/20240104231903.775717-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v1.17 released in:
https://github.com/intel/perfmon/pull/123
Add events FP_ARITH_DISPATCHED.V0, FP_ARITH_DISPATCHED.V1,
FP_ARITH_DISPATCHED.V2, UNC_IIO_IOMMU0.1G_HITS, UNC_IIO_IOMMU0.2M_HITS
and UNC_IIO_IOMMU0.4K_HITS. Description updates.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240104074259.653219-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v1.23 released in:
https://github.com/intel/perfmon/pull/123
Updates to event descriptions.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240104074259.653219-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v1.02 released in:
https://github.com/intel/perfmon/pull/123
Removes events AMX_OPS_RETIRED.BF16 and AMX_OPS_RETIRED.INT8. Add
events FP_ARITH_DISPATCHED.V0, FP_ARITH_DISPATCHED.V1,
FP_ARITH_DISPATCHED.V2, UNC_IIO_IOMMU0.1G_HITS, UNC_IIO_IOMMU0.2M_HITS
and UNC_IIO_IOMMU0.4K_HITS. Description updates.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240104074259.653219-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix that the core PMU is being specified for 2 uncore events. Specify
a PMU for the alderlake UNCORE_FREQ metric.
Conversion script updated in:
https://github.com/intel/perfmon/pull/126
Committer testing:
Before this patch the "perf all metricgroups test" was failing, now:
root@number:~# perf test metric
10: PMU events :
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
10.5: Parsing of metric thresholds with fake PMUs : Ok
61: Parse and process metrics : Ok
98: perf stat metrics (shadow stat) test : Skip
101: perf all metricgroups test : Ok
102: perf all metrics test : FAILED!
107: perf metrics value validation : Ok
root@number:~#
Test 102 is failing for another reason, not being able to get as many
counters as needed, Ian Rogers suggested disabling the NMI watchdog to
have more counters available:
root@number:/home/acme# cat /proc/sys/kernel/nmi_watchdog
1
root@number:/home/acme# echo 0 > /proc/sys/kernel/nmi_watchdog
root@number:/home/acme# perf test 102
102: perf all metrics test : Ok
root@number:/home/acme#
Closes: https://lore.kernel.org/lkml/ZZWOdHXJJ_oecWwm@kernel.org/
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240104074259.653219-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make the jevents parser aware of the Unified Memory Controller (UMC) PMU
and add events taken from Section 8.2.1 "UMC Performance Monitor Events"
of the Processor Programming Reference (PPR) for AMD Family 19h Model 11h
processors. The events capture UMC command activity such as CAS, ACTIVATE,
PRECHARGE etc. while the metrics derive data bus utilization and memory
bandwidth out of these events.
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/e0d8a7e8ca8ee3e378d8029e80b456ac327d6419.1701238314.git.sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
HX-C2000 is a new CPU made by HEXIN Technologies Co., Ltd. And a new PVN
0x0066 has been applied from the OpenPower Community for this CPU.
Here is a patch to make perf tool run in the CPU.
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: JiaLong.Yang <jialong.yang@shingroup.cn>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: shenghui.qu@shingroup.cn
Cc: Zhao Ke <ke.zhao@shingroup.cn>
Cc: zhijie.ren@shingroup.cn
Link: https://lore.kernel.org/r/20231221060242.4532-1-jialong.yang@shingroup.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up fixes that went thru perf-tools for v6.7 and to get in sync
with upstream to check for drift in the copies of headers, etc.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Running "perf list" on powerpc fails with segfault as below:
$ ./perf list
Segmentation fault (core dumped)
$
This happens because of duplicate events in the JSON list. The powerpc
JSON event list contains some event with same event name, but different
event code. They are:
- PM_INST_FROM_L3MISS (Present in datasource and frontend)
- PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked)
- PM_MRK_INST_FROM_L3MISS (Present in datasource and marked)
- PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked)
pmu_events_table__num_events() uses the value from table_pmu->num_entries
which includes duplicate events as well. This causes issue during "perf
list" and results in a segmentation fault.
Since both event codes are valid, append _DSRC to the Data Source events
(datasource.json), so that they would have a unique name.
Also add PM_DATA_FROM_L2MISS_DSRC and PM_DATA_FROM_L3MISS_DSRC events.
With the fix, 'perf list' works as expected.
Fixes: fc143580753348c6 ("perf vendor events power10: Update JSON/events")
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20231123160110.94090-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add JSON files for AmpereOneX core PMU events and metrics.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20231201021550.1109196-4-ilkka@os.amperecomputing.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The documentation wrongly called the event as BPU_FLUSH_MEM_FAULT and now
has been fixed. Correct the name in the perf tool as well.
Fixes: a9650b7f6fc09d16 ("perf vendor events arm64: Add AmpereOne core PMU events")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20231201021550.1109196-3-ilkka@os.amperecomputing.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>