3374491619
The branch counters logging (a.k.a. LBR event logging) introduces a per-counter indication of precise event occurrences in LBRs. It can provide a means to attribute exposed retirement latency to combinations of events across a block of instructions. It also provides a means of attributing Timed LBR latencies to events.

The feature is first introduced on SRF/GRR. It is an enhancement of the ARCH LBR. It adds new fields in the LBR_INFO MSRs to log the occurrences of events on the GP counters. The information is displayed in counter order.

The design proposed in this patch requires that the events which are logged be in a group with the event that has LBR. If there is more than one LBR group, only the counter logging information from the current (overflowed) group is stored for the perf tool; otherwise the perf tool cannot know which other groups are scheduled and when, especially when multiplexing is triggered. The user can ensure it uses the maximum number of counters that support LBR info (4 for now) by making the group large enough.

The HW only logs events in counter order. That order may differ from the enabling order, which is the order the perf tool understands. When parsing the information of each branch entry, convert the counter order to the enabled order, and store the enabled order in the extension space.

Unconditionally reset LBRs for an LBR event group when it's deleted. The logged counter information is only valid for the current LBR group. Otherwise, if another LBR group is scheduled later, the information from the stale LBRs would be wrongly interpreted.

Add a sanity check in intel_pmu_hw_config(). Disable the feature if other counter filters (inv, cmask, edge, in_tx) are set or LBR call stack mode is enabled. (For the LBR call stack mode, we cannot simply flush the LBR, since it will break the call stack. Also, there is no obvious usage with the call stack mode for now.)

Applying only PERF_SAMPLE_BRANCH_COUNTERS doesn't require any branch stack setup.

Expose the maximum number of supported counters and the width of the counters in sysfs. The perf tool can use the information to parse the logged counters in each branch.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20231025201626.3000228-5-kan.liang@linux.intel.com
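The conversion from counter order to enabled order described in the commit message can be illustrated with a small user-space sketch. This is not the kernel's implementation: the helper names (`cntr_field`, `reorder_branch_counters`), the `order[]` array, and the packed fixed-width layout are all assumptions made for illustration. The field width is a parameter because the commit message says the real width is exposed through sysfs.

```c
#include <stdint.h>

/* "4 by now" per the commit message; hypothetical constant name. */
#define MAX_BRANCH_COUNTERS 4

/*
 * Hypothetical helper: extract the logged occurrence count of hardware
 * counter `idx` from a packed value in which each counter occupies
 * `width` bits (assumes 0 < width and idx * width < 64).
 */
static inline uint64_t cntr_field(uint64_t packed, unsigned idx, unsigned width)
{
    uint64_t mask = (1ULL << width) - 1;

    return (packed >> (idx * width)) & mask;
}

/*
 * Re-pack the per-branch counter fields from hardware counter order
 * into enabled order.
 *
 * order[i] is the hardware counter index assigned to the i-th enabled
 * event of the LBR group, or -1 if that slot is unused (assumed layout).
 */
static uint64_t reorder_branch_counters(uint64_t hw_packed, const int *order,
                                        unsigned nr, unsigned width)
{
    uint64_t out = 0;

    for (unsigned i = 0; i < nr && i < MAX_BRANCH_COUNTERS; i++) {
        if (order[i] < 0)
            continue;
        out |= cntr_field(hw_packed, (unsigned)order[i], width) << (i * width);
    }
    return out;
}
```

For example, with an assumed width of 2 bits and hardware counters 0, 1, 2 holding 1, 2, 3 occurrences, a group enabled as (counter 2, counter 0, counter 1) would be re-packed so that field 0 holds 3, field 1 holds 1, and field 2 holds 2.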
25 lines
1.3 KiB
C
/*
 * struct hw_perf_event.flags flags
 */
PERF_ARCH(PEBS_LDLAT,         0x00001) /* ld+ldlat data address sampling */
PERF_ARCH(PEBS_ST,            0x00002) /* st data address sampling */
PERF_ARCH(PEBS_ST_HSW,        0x00004) /* haswell style datala, store */
PERF_ARCH(PEBS_LD_HSW,        0x00008) /* haswell style datala, load */
PERF_ARCH(PEBS_NA_HSW,        0x00010) /* haswell style datala, unknown */
PERF_ARCH(EXCL,               0x00020) /* HT exclusivity on counter */
PERF_ARCH(DYNAMIC,            0x00040) /* dynamic alloc'd constraint */
                           /* 0x00080 */
PERF_ARCH(EXCL_ACCT,          0x00100) /* accounted EXCL event */
PERF_ARCH(AUTO_RELOAD,        0x00200) /* use PEBS auto-reload */
PERF_ARCH(LARGE_PEBS,         0x00400) /* use large PEBS */
PERF_ARCH(PEBS_VIA_PT,        0x00800) /* use PT buffer for PEBS */
PERF_ARCH(PAIR,               0x01000) /* Large Increment per Cycle */
PERF_ARCH(LBR_SELECT,         0x02000) /* Save/Restore MSR_LBR_SELECT */
PERF_ARCH(TOPDOWN,            0x04000) /* Count Topdown slots/metrics events */
PERF_ARCH(PEBS_STLAT,         0x08000) /* st+stlat data address sampling */
PERF_ARCH(AMD_BRS,            0x10000) /* AMD Branch Sampling */
PERF_ARCH(PEBS_LAT_HYBRID,    0x20000) /* ld and st lat for hybrid */
PERF_ARCH(NEEDS_BRANCH_STACK, 0x40000) /* require branch stack setup */
PERF_ARCH(BRANCH_COUNTERS,    0x80000) /* logs the counters in the extra space of each branch */
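The file consists only of `PERF_ARCH(name, value)` entries, which suggests it is meant to be included by code that defines `PERF_ARCH` first (the X-macro pattern). The sketch below shows that pattern in a self-contained form. Because it cannot include the header itself, it copies three entries into a hypothetical list macro `DEMO_FLAG_LIST`; the `DEMO_EVENT_` prefix and the helper `demo_wants_branch_counters()` are also assumptions, not names taken from the kernel.

```c
#include <stdint.h>

/* Three entries copied from the list above; the list macro is hypothetical. */
#define DEMO_FLAG_LIST(X)                 \
    X(PEBS_LDLAT,         0x00001)        \
    X(NEEDS_BRANCH_STACK, 0x40000)        \
    X(BRANCH_COUNTERS,    0x80000)

/* First expansion: one enum constant per entry. */
#define X_ENUM(name, val) DEMO_EVENT_##name = val,
enum demo_event_flags { DEMO_FLAG_LIST(X_ENUM) };
#undef X_ENUM

/* Second expansion: OR all values together to see which bits are taken. */
#define X_OR(name, val) | (val)
enum { DEMO_ALL_FLAGS = 0 DEMO_FLAG_LIST(X_OR) };
#undef X_OR

/* Hypothetical helper: does this flags word request branch counter logging? */
static inline int demo_wants_branch_counters(uint32_t flags)
{
    return (flags & DEMO_EVENT_BRANCH_COUNTERS) != 0;
}
```

Keeping the entries in one list lets each includer expand them differently, for example once into constants and once into a mask of used bits. Since `BRANCH_COUNTERS` and `NEEDS_BRANCH_STACK` are separate bits, an event can carry the former without the latter, which matches the commit message's note that applying only PERF_SAMPLE_BRANCH_COUNTERS requires no branch stack setup.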