d5d9696b03
The ARMv8.2 architecture introduces the optional Statistical Profiling Extension (SPE). SPE can be used to profile a population of operations in the CPU pipeline after instruction decode. These are either architected instructions (i.e. a dynamic instruction trace) or CPU-specific uops and the choice is fixed statically in the hardware and advertised to userspace via caps/. Sampling is controlled using a sampling interval, similar to a regular PMU counter, but also with an optional random perturbation to avoid falling into patterns where you continuously profile the same instruction in a hot loop. After each operation is decoded, the interval counter is decremented. When it hits zero, an operation is chosen for profiling and tracked within the pipeline until it retires. Along the way, information such as TLB lookups, cache misses, time spent to issue etc is captured in the form of a sample. The sample is then filtered according to certain criteria (e.g. load latency) that can be specified in the event config (described under format/) and, if the sample satisfies the filter, it is written out to memory as a record, otherwise it is discarded. Only one operation can be sampled at a time. The in-memory buffer is linear and virtually addressed, raising an interrupt when it fills up. The PMU driver handles these interrupts to give the appearance of a ring buffer, as expected by the AUX code. The in-memory trace-like format is self-describing (though not parseable in reverse) and written as a series of records, with each record corresponding to a sample and consisting of a sequence of packets. These packets are defined by the architecture, although some have CPU-specific fields for recording information specific to the microarchitecture. As a simple example, a record generated for a branch instruction may consist of the following packets: 0 (Address) : Virtual PC of the branch instruction 1 (Type) : Conditional direct branch 2 (Counter) : Number of cycles taken from Dispatch to Issue 3 (Address) : Virtual branch target + condition flags 4 (Counter) : Number of cycles taken from Dispatch to Complete 5 (Events) : Mispredicted as not-taken 6 (END) : End of record It is also possible to toggle properties such as timestamp packets in each record. This patch adds support for SPE in the form of a new perf driver. Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
55 lines
1.4 KiB
Plaintext
55 lines
1.4 KiB
Plaintext
#
|
|
# Performance Monitor Drivers
|
|
#
|
|
|
|
menu "Performance monitor support"
|
|
depends on PERF_EVENTS
|
|
|
|
config ARM_PMU
|
|
depends on ARM || ARM64
|
|
bool "ARM PMU framework"
|
|
default y
|
|
help
|
|
Say y if you want to use CPU performance monitors on ARM-based
|
|
systems.
|
|
|
|
config ARM_PMU_ACPI
|
|
depends on ARM_PMU && ACPI
|
|
def_bool y
|
|
|
|
config QCOM_L2_PMU
|
|
bool "Qualcomm Technologies L2-cache PMU"
|
|
depends on ARCH_QCOM && ARM64 && ACPI
|
|
help
|
|
Provides support for the L2 cache performance monitor unit (PMU)
|
|
in Qualcomm Technologies processors.
|
|
Adds the L2 cache PMU into the perf events subsystem for
|
|
monitoring L2 cache events.
|
|
|
|
config QCOM_L3_PMU
|
|
bool "Qualcomm Technologies L3-cache PMU"
|
|
depends on ARCH_QCOM && ARM64 && ACPI
|
|
select QCOM_IRQ_COMBINER
|
|
help
|
|
Provides support for the L3 cache performance monitor unit (PMU)
|
|
in Qualcomm Technologies processors.
|
|
Adds the L3 cache PMU into the perf events subsystem for
|
|
monitoring L3 cache events.
|
|
|
|
config XGENE_PMU
|
|
depends on ARCH_XGENE
|
|
bool "APM X-Gene SoC PMU"
|
|
default n
|
|
help
|
|
Say y if you want to use APM X-Gene SoC performance monitors.
|
|
|
|
config ARM_SPE_PMU
|
|
tristate "Enable support for the ARMv8.2 Statistical Profiling Extension"
|
|
depends on PERF_EVENTS && ARM64
|
|
help
|
|
Enable perf support for the ARMv8.2 Statistical Profiling
|
|
Extension, which provides periodic sampling of operations in
|
|
the CPU pipeline and reports this via the perf AUX interface.
|
|
|
|
endmenu
|