The PMU can monitor the traffic of a given target Root Port or downstream
target Endpoint. The user specifies the target filter with the "port" or
"bdf" option respectively. The PMU can only monitor a Root Port or
Endpoint on the same PCIe core, so the value of "port" or "bdf" must
be valid and is checked by the driver.
Currently exactly one of the "port" and "bdf" options must be set.
If the "port" filter is not set, or is explicitly set to zero (the default),
the driver assumes that the user specified a "bdf" option, since "port"
is a bitmask of the target Root Ports and zero is not a valid
value.
If the user sets neither "port" nor "bdf" explicitly, the driver uses the
"bdf" default value (zero) as the target filter but skips the validity
check for bdf=0, although it is a valid value (meaning 0000:00:00.0).
The user then just gets a count of zero.
Therefore, check whether both "port" and "bdf" are invalid and, if so,
return failure and report a warning.
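The check described above can be sketched as a small userspace model. This is not the driver code: `struct model_pmu`, its `bdf_min`/`bdf_max` fields, and both function names are hypothetical, and the requester ID check is reduced to a simple range test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one PCIe PMU: the requester ID range it can see. */
struct model_pmu {
	uint16_t bdf_min;
	uint16_t bdf_max;
};

/* Simplified stand-in for the driver's requester ID check (range only). */
bool model_bdf_is_valid(const struct model_pmu *pmu, uint16_t bdf)
{
	assert(pmu != NULL);
	return bdf >= pmu->bdf_min && bdf <= pmu->bdf_max;
}

/*
 * A non-zero "port" is a Root Port bitmask and is accepted as the filter.
 * Otherwise "bdf" is the filter and must be valid for this PMU, and
 * bdf == 0 is checked like any other value instead of being skipped.
 */
bool model_filter_is_valid(const struct model_pmu *pmu,
			   uint16_t port, uint16_t bdf)
{
	assert(pmu != NULL);
	if (port)
		return true;
	return model_bdf_is_valid(pmu, bdf);
}
```

With a PMU whose assumed range covers bdf=0x1700, this model rejects an event with no filter and accepts `port=0x1` or `bdf=0x1700`, which matches the shape of the "after the patch" results.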
Testing:
before the patch:
0 hisi_pcie0_core1/rx_mrd_flux/
0 hisi_pcie0_core1/rx_mrd_flux,port=0/
24,124 hisi_pcie0_core1/rx_mrd_flux,port=1/
0 hisi_pcie0_core1/rx_mrd_flux,bdf=0/
0 hisi_pcie0_core1/rx_mrd_flux,port=0x800/
<not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/
24,132 hisi_pcie0_core1/rx_mrd_flux,bdf=0x1700/
<not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x0/
<not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1/
24,138 hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1700/
24,126 hisi_pcie0_core1/rx_mrd_flux,port=0x1,bdf=0x0/
after the patch:
<not supported> hisi_pcie0_core1/rx_mrd_flux/
<not supported> hisi_pcie0_core1/rx_mrd_flux,port=0/
24,153 hisi_pcie0_core1/rx_mrd_flux,port=1/
0 hisi_pcie0_core1/rx_mrd_flux,port=0x800/
<not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=0/
<not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/
24,117 hisi_pcie0_core1/rx_mrd_flux,bdf=0x1700/
<not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x0/
<not supported> hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1/
24,120 hisi_pcie0_core1/rx_mrd_flux,port=0x0,bdf=0x1700/
24,123 hisi_pcie0_core1/rx_mrd_flux,port=0x1,bdf=0x0/
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240223103359.18669-6-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
A typical PCIe transaction consists of various TLPs in both
directions. For counting bandwidth, only memory read events are currently
exported. Add memory write and completion counting events for both
directions to complete the bandwidth counting.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240223103359.18669-5-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
The metric counting shows incorrect results if the events in the
metric group use the same event code but different filter options.
This is because only the event code is used to decide whether
events in the metric group should share the same hardware
counter, while the filter settings are ignored.
For example, consider a platform with 2 ports, 0x1 and 0x2, where only
port 0x1 has a downstream PCIe NVMe device. The metric counting
shows the same counts for both ports because these two events are
wrongly assigned to the same hardware counter:
[root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}'
Performance counter stats for 'system wide':
7907484924 hisi_pcie0_core1/event=0x0104,port=0x2/
7907484924 hisi_pcie0_core1/event=0x0104,port=0x1/
10.153863691 seconds time elapsed
Fix this by using the whole config rather than only the event code
to judge whether two events are the same and should share the
same hardware counter. With this patch, the metric counting in
the above case is corrected:
[root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}'
Performance counter stats for 'system wide':
0 hisi_pcie0_core1/event=0x0104,port=0x2/
8123122077 hisi_pcie0_core1/event=0x0104,port=0x1/
10.152875631 seconds time elapsed
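The difference between the two comparisons can be illustrated with a minimal userspace sketch. `struct model_event` and both function names are hypothetical, and only the event code, "port" and "bdf" are modelled, whereas the driver compares the whole config.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of an event's configuration. */
struct model_event {
	uint16_t event; /* event code, e.g. 0x0104 */
	uint16_t port;  /* Root Port bitmask filter */
	uint16_t bdf;   /* requester ID filter */
};

/* Old behaviour: only the event code decides whether a counter is shared. */
bool model_same_event_code(const struct model_event *a,
			   const struct model_event *b)
{
	assert(a != NULL && b != NULL);
	return a->event == b->event;
}

/* Fixed behaviour: the filter settings must match as well. */
bool model_same_config(const struct model_event *a,
		       const struct model_event *b)
{
	assert(a != NULL && b != NULL);
	return a->event == b->event && a->port == b->port && a->bdf == b->bdf;
}
```

For the two events in the example (event=0x0104 with port=0x2 and port=0x1), the first comparison treats them as the same event while the second keeps them on separate counters.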
Fixes: 8404b0fbc7 ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240223103359.18669-4-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Factor out the retrieval of the register value for the
corresponding event from hisi_pcie_config_event_ctrl() into a
new function, hisi_pcie_pmu_get_event_ctrl_val(), to allow future
reuse.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240223103359.18669-3-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
hisi_pcie_pmu_{config,clear}_filter() configure/clear the
HISI_PCIE_EVENT_CTRL register, which contains not only the filter but
also the event code. The function names are a bit misleading. Rename them
to hisi_pcie_pmu_{config,clear}_event_ctrl() to reflect their functions
more accurately.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240223103359.18669-2-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Check whether the event type matches the PMU type first in
pmu::event_init(), before touching the event. Otherwise we'll
change events that belong to other PMUs and cause incorrect results.
Since perf_init_event() may call every PMU's event_init()
in certain cases, we should not modify the event if it's not
ours.
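The ordering matters more than the check itself, as this minimal userspace sketch shows. `struct init_event`, `model_event_init()` and its parameters are hypothetical stand-ins for the perf structures; the driver's actual event_init() does more work than this.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of a perf event touched here. */
struct init_event {
	int type; /* PMU type the event was created for */
	int cpu;  /* CPU the event is bound to */
};

/* Reject foreign events before modifying anything. */
int model_event_init(struct init_event *ev, int pmu_type, int pmu_cpu)
{
	assert(ev != NULL);

	/* Not our event: leave it untouched so another PMU can claim it. */
	if (ev->type != pmu_type)
		return -ENOENT;

	/* Only now is it safe to modify the event. */
	ev->cpu = pmu_cpu;
	return 0;
}
```

If the assignment came before the type check, an event belonging to another PMU would be returned with a changed CPU, which is the kind of side effect the patch avoids.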
Fixes: 8404b0fbc7 ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20231024092954.42297-2-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
The PCIe PMUs are located on different NUMA nodes, but currently this is
not considered and all the sessions are likely to be stacked on the same CPU:
[root@localhost tmp]# cat /sys/devices/hisi_pcie*/cpumask
0
0
0
0
0
0
This can be optimized a bit by using a CPU local to the PMU.
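The idea of preferring a local CPU can be sketched in userspace as follows. The function name and the `cpu_node` array are hypothetical. The kernel has its own helpers for this kind of selection (for example cpumask_local_spread()); which one the patch uses is not stated here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: cpu_node[i] is the NUMA node of CPU i. Return the
 * first CPU on pmu_node, or fall back to CPU 0 if that node has no CPU.
 */
int model_pick_local_cpu(const int *cpu_node, size_t nr_cpus, int pmu_node)
{
	assert(cpu_node != NULL && nr_cpus > 0);

	for (size_t i = 0; i < nr_cpus; i++) {
		if (cpu_node[i] == pmu_node)
			return (int)i;
	}
	return 0;
}
```

With PMUs spread over several nodes, each one would then report a different CPU in its cpumask instead of all of them reporting 0.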
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20230815131010.2147-1-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
The driver needs to migrate the perf context if the CPU currently in use
is going to be torn down. By the time the cpuhp::teardown() callback is
called, cpu_online_mask() has not been updated yet and still includes the
CPU being torn down. In the current implementation the driver may migrate
the context to the CPU being torn down, which leads to the call trace below:
...
[ 368.104662][ T932] task:cpuhp/0 state:D stack: 0 pid: 15 ppid: 2 flags:0x00000008
[ 368.113699][ T932] Call trace:
[ 368.116834][ T932] __switch_to+0x7c/0xbc
[ 368.120924][ T932] __schedule+0x338/0x6f0
[ 368.125098][ T932] schedule+0x50/0xe0
[ 368.128926][ T932] schedule_preempt_disabled+0x18/0x24
[ 368.134229][ T932] __mutex_lock.constprop.0+0x1d4/0x5dc
[ 368.139617][ T932] __mutex_lock_slowpath+0x1c/0x30
[ 368.144573][ T932] mutex_lock+0x50/0x60
[ 368.148579][ T932] perf_pmu_migrate_context+0x84/0x2b0
[ 368.153884][ T932] hisi_pcie_pmu_offline_cpu+0x90/0xe0 [hisi_pcie_pmu]
[ 368.160579][ T932] cpuhp_invoke_callback+0x2a0/0x650
[ 368.165707][ T932] cpuhp_thread_fun+0xe4/0x190
[ 368.170316][ T932] smpboot_thread_fn+0x15c/0x1a0
[ 368.175099][ T932] kthread+0x108/0x13c
[ 368.179012][ T932] ret_from_fork+0x10/0x18
...
Use cpumask_any_but() to find a correct active CPU to fix
this issue.
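The selection can be modelled in userspace with a plain bitmask. This is only a sketch of the idea behind cpumask_any_but(), not the kernel implementation; `model_any_but()` and `MODEL_NR_CPUS` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 32u

/*
 * Return a CPU that is set in online_mask and is not `excluded`, or
 * MODEL_NR_CPUS if no such CPU exists.
 */
unsigned int model_any_but(uint32_t online_mask, unsigned int excluded)
{
	assert(excluded < MODEL_NR_CPUS);

	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (cpu != excluded && (online_mask & (UINT32_C(1) << cpu)))
			return cpu;
	}
	return MODEL_NR_CPUS;
}
```

Because the outgoing CPU is still set in the online mask during teardown, excluding it explicitly is what keeps the context from being migrated back onto it.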
Fixes: 8404b0fbc7 ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20230608114326.27649-1-hejunhao3@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
The PMU supports filtering the TLPs when counting bandwidth, with the
options below:
- only count the TLP headers
- only count the TLP payloads
- count both TLP headers and payloads
The current driver counts only the TLP payloads by default, which has
the implicit side effect that no data is reported for traffic consisting
solely of header-only TLPs.
Make this user configurable through the "len_mode" parameter and default
to counting both TLP headers and payloads when the user does not specify it.
Also update the documentation accordingly.
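The defaulting rule can be sketched as below. The numeric encoding of "len_mode" is an assumption made for illustration and is not taken from this text; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed, illustrative encoding of "len_mode". */
enum model_len_mode {
	MODEL_LEN_UNSET   = 0, /* user did not specify len_mode */
	MODEL_LEN_PAYLOAD = 1, /* count TLP payloads only */
	MODEL_LEN_HEADER  = 2, /* count TLP headers only */
	MODEL_LEN_BOTH    = 3, /* count headers and payloads */
};

/* Use the user's choice if given, otherwise count headers and payloads. */
uint32_t model_effective_len_mode(uint32_t user_len_mode)
{
	assert(user_len_mode <= MODEL_LEN_BOTH);
	return user_len_mode ? user_len_mode : MODEL_LEN_BOTH;
}
```

Defaulting to both parts means header-only traffic still produces a non-zero bandwidth count when the user gives no "len_mode".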
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20221117084136.53572-5-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Some event IDs of hisi-pcie-pmu are incorrect; fix them.
Fixes: 8404b0fbc7 ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20221117084136.53572-2-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
A PCIe PMU Root Complex Integrated Endpoint (RCiEP) device is supported
for sampling bandwidth, latency, buffer occupancy, etc.
Each PMU RCiEP device monitors multiple Root Ports, and each RCiEP is
registered as a PMU in /sys/bus/event_source/devices, so users can
select the target PMU and use filters for further settings.
Filtering options include:
event - select the event.
port - select the target Root Ports. Information on the Root Ports is
shown under sysfs.
bdf - select the requester_id of the target EP device.
trig_len - set the trigger condition for starting event statistics.
trig_mode - set the trigger mode. 0 means statistics start when the value
is bigger than the trigger condition, and 1 means smaller.
thr_len - set the threshold for statistics.
thr_mode - set the threshold mode. 0 means count when bigger than the
threshold, and 1 means smaller.
Acked-by: Krzysztof Wilczyński <kw@linux.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20211202080633.2919-3-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>