13504 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Florian Fischer
|
c735b0a521 |
perf stat: Introduce stats for the user and system rusage times
This is preparation for exporting rusage values as tool events. Add new global stats tracking the values obtained via rusage. For now only ru_utime and ru_stime are part of the tracked stats. Both are stored as nanoseconds to be consistent with 'duration_time', although the finest resolution the struct timeval data in rusage provides are microseconds. Signed-off-by: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220420102354.468173-2-florian.fischer@muhq.space Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Martin Liška
|
c60664dea7 |
perf tools: Print warning when HAVE_DEBUGINFOD_SUPPORT is not set and user tries to use debuginfod support
When one requests debuginfod, either via --debuginfod option, or with a perf-config value, complain when perf is not built with it. Signed-off-by: Martin Liška <mliska@suse.cz> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/35bae747-3951-dc3d-a66b-abf4cebcd9cb@suse.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Martin Liška
|
b8836c2a4d |
perf version: Add HAVE_DEBUGINFOD_SUPPORT to built-in features
The change adds debuginfod to ./perf -vv: ... debuginfod: [ OFF ] # HAVE_DEBUGINFOD_SUPPORT ... Signed-off-by: Martin Liška <mliska@suse.cz> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lore.kernel.org/lkml/0d1c5ace-88e8-7102-1565-7c143f01a966@suse.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
87e0a30e9a |
perf vendor events intel: Update goldmont event topics
Apply topic updates from: https://github.com/intel/event-converter-for-linux-perf/ Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
f51c401f11 |
perf vendor events intel: Update goldmontplus event topics
Apply topic updates from: https://github.com/intel/event-converter-for-linux-perf/ Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
8f1a69825f |
perf vendor events intel: Update elkhartlake event topics
Apply topic updates from: https://github.com/intel/event-converter-for-linux-perf/ Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-12-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
44a4b9ad8e |
perf vendor events intel: Update westmereex event topics
Apply topic updates from: https://github.com/intel/event-converter-for-linux-perf/ Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
7f2c72fa69 |
perf vendor events intel: Update westmereep-sp event topics
Apply topic updates from: p https://github.com/intel/event-converter-for-linux-perf/ Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
a01174fc9e |
perf vendor events intel: Update westmereep-dp event topics
Apply topic updates from: https://github.com/intel/event-converter-for-linux-perf/ Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
55ae1b759e |
perf vendor events intel: Update tremontx uncore and topics
Update the topic of BTCLEAR.ANY and add additional uncore event names as per: https://github.com/intel/event-converter-for-linux-perf/ Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com>1 Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
45d97cdd2f |
perf vendor events intel: Update tigerlake topic
Update the topic of ASSISTS.ANY as per: https://github.com/intel/event-converter-for-linux-perf/ Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
da578feb70 |
perf vendor events intel: Update nehalemep event topics
Apply topic updates from: https://github.com/intel/event-converter-for-linux-perf/ Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
339ec95167 |
perf vendor events intel: Update SKX uncore
JSON uncore events are generated for Skylake Server for v1.26 with events from: https://download.01.org/perfmon/SKX/ New event names are added, that match the original JSON names, due to an update to: https://github.com/intel/event-converter-for-linux-perf/ Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
dd498d0804 |
perf vendor events intel: Update CLX uncore to v1.14
JSON uncore events are generated for CascadeLake Server for v1.14 with events from: https://download.01.org/perfmon/CLX/ New event names are added, that match the original JSON names, due to an update to: https://github.com/intel/event-converter-for-linux-perf/ Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
12c6385eeb |
perf vendor events intel: Add sapphirerapids events
Events were generated from 01.org using: https://github.com/intel/event-converter-for-linux-perf Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
cbeee6caa4 |
perf vendor events intel: Fix icelakex cstate metrics
Apply cstate fix from: https://github.com/intel/event-converter-for-linux-perf/ so that metrics for cstates that exist on the particular architecture are generated. This corrects issues with metric testing. Also correct topic of ASSISTS.ANY event. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
2c77f36a9a |
perf vendor events intel: Fix icelake cstate metrics
Apply cstate fix from: https://github.com/intel/event-converter-for-linux-perf/ so that metrics for cstates that exist on the particular architecture are generated. This corrects issues with metric testing. Also correct topic of ASSISTS.ANY event. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220413210503.3256922-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Leo Yan
|
fdefc3750e |
perf mem: Print memory operation type
The memory operation types are not only for load and store, for easier reviewing the memory operation type, this patch prints out it. Before: ls 14753 [011] 3678.072400: 1 l1d-miss: 88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 l1d-access: 88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 tlb-access: 88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 memory: 88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) After: ls 14753 [011] 3678.072400: 1 l1d-miss: 88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 l1d-access: 88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 tlb-access: 88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 memory: 88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220417124524.901148-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Athira Rajeev
|
f58faed7fb |
perf bench: Fix numa bench to fix usage of affinity for machines with #CPUs > 1K
The 'perf bench numa' testcase fails on systems with more than 1K CPUs. Testcase: perf bench numa mem -p 1 -t 3 -P 512 -s 100 -zZ0qcm --thp 1 Snippet of code: <<>> perf: bench/numa.c:302: bind_to_node: Assertion `!(ret)' failed. Aborted (core dumped) <<>> bind_to_node() uses "sched_getaffinity" to save the original cpumask and this call is returning EINVAL ((invalid argument). This happens because the default mask size in glibc is 1024. To overcome this 1024 CPUs mask size limitation of cpu_set_t, change the mask size using the CPU_*_S macros ie, use CPU_ALLOC to allocate cpumask, CPU_ALLOC_SIZE for size. Apart from fixing this for "orig_mask", apply same logic to "mask" as well which is used to setaffinity so that mask size is large enough to represent number of possible CPU's in the system. sched_getaffinity is used in one more place in perf numa bench. It is in "bind_to_cpu" function. Apply the same logic there also. Though currently no failure is reported from there, it is ideal to change getaffinity to work with such system configurations having CPU's more than default mask size supported by glibc. Also fix "sched_setaffinity" to use mask size which is large enough to represent number of possible CPU's in the system. Fixed all places where "bind_cpumask" which is part of "struct thread_data" is used such that bind_cpumask works in all configuration. Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220412164059.42654-3-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Athira Rajeev
|
8cb7a188ac |
perf bench: Fix numa testcase to check if CPU used to bind task is online
Perf numa bench test fails with error: Testcase: ./perf bench numa mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk Failure snippet: <<>> Running 'numa/mem' benchmark: # Running main, "perf bench numa numa-mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk" perf: bench/numa.c:333: bind_to_cpumask: Assertion `!(ret)' failed. <<>> The Testcases uses CPU's 0 and 8. In function "parse_setup_cpu_list", There is check to see if cpu number is greater than max cpu's possible in the system ie via "if (bind_cpu_0 >= g->p.nr_cpus || bind_cpu_1 >= g->p.nr_cpus) {". But it could happen that system has say 48 CPU's, but only number of online CPU's is 0-7. Other CPU's are offlined. Since "g->p.nr_cpus" is 48, so function will go ahead and set bit for CPU 8 also in cpumask ( td->bind_cpumask). bind_to_cpumask function is called to set affinity using sched_setaffinity and the cpumask. Since the CPU8 is not present, set affinity will fail here with EINVAL. Fix this issue by adding a check to make sure that, CPU's provided in the input argument values are online before proceeding further and skip the test. For this, include new helper function "is_cpu_online" in "tools/perf/util/header.c". Since "BIT(x)" definition will get included from header.h, remove that from bench/numa.c Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220412164059.42654-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
24f378e660 |
perf test: Add basic perf record tests
Test the --per-thread flag. Test Intel machine state capturing. Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.bayduraev@gmail.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220414014642.3308206-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Alexey Bayduraev
|
23380e4d53 |
perf record: Fix per-thread option
Per-thread mode doesn't have specific CPUs for events, add checks for this case. Minor fix to a pr_debug by Ian Rogers <irogers@google.com> to avoid an out of bound array access. Fixes: 7954f71689f90cb2 ("perf record: Introduce thread affinity and mmap masks") Reported-by: Ian Rogers <irogers@google.com> Signed-off-by: Alexey Bayduraev <alexey.bayduraev@gmail.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220414014642.3308206-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
James Clark
|
2adacd7f0a |
perf docs: Add man page entry for Arm SPE
The SPE integration in Perf has quite a few usability quirks that can't be found by just reading the reference manual. So document this and at the same time add a summary of the feature that is also hard to find elsewhere. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Co-authored-by: Al Grant <al.grant@arm.com> Co-authored-by: Luke Dare <Luke.Dare@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: German Gomez <german.gomez@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220413084021.2556142-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Lv Ruyi
|
d73f5d14e0 |
perf stat: Fix error check return value of hashmap__new(), must use IS_ERR()
hashmap__new() returns ERR_PTR(-ENOMEM) when it fails, so we should use IS_ERR() to check it in error handling path. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220413093302.2538128-1-lv.ruyi@zte.com.cn Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
f034fc50d3 |
perf tools: Fix misleading add event PMU debug message
Fix incorrect debug message: Attempting to add event pmu 'intel_pt' with '' that may result in non-fatal errors which always appears with perf record -vv and intel_pt e.g. perf record -vv -e intel_pt//u uname The message is incorrect because there will never be non-fatal errors. Suppress the message if the PMU is 'selectable' i.e. meant to be selected directly as an event. Fixes: 4ac22b484d4c79e8 ("perf parse-events: Make add PMU verbose output clearer") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20220411061758.2458417-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Carsten Haitzler
|
41204da4c1 |
perf test: Shell - Limit to only run executable scripts in tests
'perf test''s shell runner will just run everything in the tests directory (as long as it's not another directory or does not begin with a dot), but sometimes you find files in there that are not shell scripts - perf.data output for example if you do some testing and then the next time you run perf test it tries to run these. Check the files are executable so they are actually intended to be test scripts and not just some "random junk" files there. Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Link: http://lore.kernel.org/lkml/20220309122859.31487-1-carsten.haitzler@foss.arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Eelco Chaudron
|
ae24e9b53d |
perf scripting python: Expose symbol offset and source information
This change adds the symbol offset to the data exported for each call-chain entry. This can not be calculated from the script and only the ip value, and no related mmap information. In addition, also export the source file and line information, if available, to avoid an external lookup if this information is needed. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/164554263724.752731.14651017093796049736.stgit@wsfd-netdev64.ntdv.lab.eng.bos.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Eric Lin
|
335f70faa2 |
perf jitdump: Add riscv64 support
This patch enables perf jitdump for riscv64 and was tested with V8 on qemu rv64. Qemu rv64: $ perf record -e cpu-clock -c 1000 -g -k mono ./d8_rv64 --perf-prof --no-write-protect-code-memory test.js $ perf inject -j -i perf.data -o perf.data.jitted $ perf report -i perf.data.jitted Output: To display the perf.data header info, please use --header/--header-only options. Total Lost Samples: 0 Samples: 87K of event 'cpu-clock' Event count (approx.): 87974000 Children Self Command Shared Object Symbol .... 0.28% 0.06% d8_rv64 d8_rv64 [.] _ZN2v88internal6WasmJs7InstallEPNS0_7IsolateEb 0.28% 0.00% d8_rv64 d8_rv64 [.] _ZN2v88internal10ParserBaseINS0_6ParserEE22ParseLogicalExpressionEv 0.28% 0.03% d8_rv64 jitted-112-76.so [.] Builtin:InterpreterEntryTrampoline 0.12% 0.00% d8_rv64 d8_rv64 [.] _ZN2v88internal19ContextDeserializer11DeserializeEPNS0_7IsolateENS0_6HandleINS0_13JSGlobalProxyEEENS_33DeserializeInternalFieldsCallbackE 0.12% 0.01% d8_rv64 jitted-112-651.so [.] Builtin:CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit .... Signed-off-by: Eric Lin <eric.lin@sifive.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: greentime.hu@sifive.com Cc: linux-riscv@lists.infradead.org Link: http://lore.kernel.org/lkml/20220406142606.18464-2-eric.lin@sifive.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
940a445a90 |
perf annotate: Drop objdump stderr to avoid getting stuck waiting for stdout output
If objdump writes to stderr it can block waiting for it to be read. As perf doesn't read stderr then progress stops with perf waiting for stdout output. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Truong <alexandre.truong@arm.com> Cc: Dave Marchevsky <davemarchevsky@fb.com> Cc: Denis Nikitin <denik@chromium.org> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Lexi Shao <shaolexi@huawei.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Remi Bernon <rbernon@codeweavers.com> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: William Cohen <wcohen@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20220407230503.1265036-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Michael Petlan
|
3e6b43beb7 |
perf tools: Add external commands to list-cmds
The `perf --list-cmds` output prints only internal commands, although there is no reason for that from users' perspective. Adding the external commands to commands array with NULL function pointer allows printing all perf commands while not changing the logic of command handler selection. Signed-off-by: Michael Petlan <mpetlan@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220404221541.30312-2-mpetlan@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Michael Petlan
|
0ff26efe92 |
perf docs: Add perf-iostat link to manpages
Signed-off-by: Michael Petlan <mpetlan@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220404221541.30312-1-mpetlan@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Denis Nikitin
|
bc21e74d47 |
perf session: Remap buf if there is no space for event
If a perf event doesn't fit into remaining buffer space return NULL to remap buf and fetch the event again. Keep the logic to error out on inadequate input from fuzzing. This fixes perf failing on ChromeOS (with 32b userspace): $ perf report -v -i perf.data ... prefetch_event: head=0x1fffff8 event->header_size=0x30, mmap_size=0x2000000: fuzzed or compressed perf.data? Error: failed to process sample Fixes: 57fc032ad643ffd0 ("perf session: Avoid infinite loop when seeing invalid header.size") Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Denis Nikitin <denik@chromium.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220330031130.2152327-1-denik@chromium.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Athira Rajeev
|
299687e18a |
perf bench: Fix epoll bench to correct usage of affinity for machines with #CPUs > 1K
The 'perf bench epoll' testcase fails on systems with more than 1K CPUs. Testcase: perf bench epoll all Result snippet: <<>> Run summary [PID 106497]: 1399 threads monitoring on 64 file-descriptors for 8 secs. perf: pthread_create: No such file or directory <<>> In epoll benchmarks (ctl, wait) pthread_create is invoked in do_threads from respective bench_epoll_* function. Though the logs shows direct failure from pthread_create, the actual failure is from "sched_setaffinity" returning EINVAL (invalid argument). This happens because the default mask size in glibc is 1024. To overcome this 1024 CPUs mask size limitation of cpu_set_t, change the mask size using the CPU_*_S macros. Patch addresses this by fixing all the epoll benchmarks to use CPU_ALLOC to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the mask. Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220406175113.87881-3-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Athira Rajeev
|
c9c2a427dd |
perf bench: Fix futex bench to correct usage of affinity for machines with #CPUs > 1K
The 'perf bench futex' testcase fails on systems with more than 1K CPUs. Testcase: perf bench futex all Failure snippet: <<>>Running futex/hash benchmark... perf: pthread_create: No such file or directory <<>> All the futex benchmarks (ie hash, lock-api, requeue, wake, wake-parallel), pthread_create is invoked in respective bench_futex_* function. Though the logs shows direct failure from pthread_create, strace logs showed that actual failure is from "sched_setaffinity" returning EINVAL (invalid argument). This happens because the default mask size in glibc is 1024. To overcome this 1024 CPUs mask size limitation of cpu_set_t, change the mask size using the CPU_*_S macros. Patch addresses this by fixing all the futex benchmarks to use CPU_ALLOC to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the mask. Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220406175113.87881-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
aeee9dc53c |
perf tools: Fix perf's libperf_print callback
eprintf() does not expect va_list as the type of the 4th parameter. Use veprintf() because it does. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: 428dab813a56ce94 ("libperf: Merge libperf_set_print() into libperf_init()") Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220408132625.2451452-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
James Clark
|
ffab487052 |
perf: arm-spe: Fix perf report --mem-mode
Since commit bb30acae4c4dacfa ("perf report: Bail out --mem-mode if mem info is not available") "perf mem report" and "perf report --mem-mode" don't allow opening the file unless one of the events has PERF_SAMPLE_DATA_SRC set. SPE doesn't have this set even though synthetic memory data is generated after it is decoded. Fix this issue by setting DATA_SRC on SPE events. This has no effect on the data collected because the SPE driver doesn't do anything with that flag and doesn't generate samples. Fixes: bb30acae4c4dacfa ("perf report: Bail out --mem-mode if mem info is not available") Signed-off-by: James Clark <james.clark@arm.com> Tested-by: Leo Yan <leo.yan@linaro.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: German Gomez <german.gomez@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220408144056.1955535-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
James Clark
|
fa7095c5c3 |
perf unwind: Don't show unwind error messages when augmenting frame pointer stack
Commit Fixes: b9f6fbb3b2c29736 ("perf arm64: Inject missing frames when using 'perf record --call-graph=fp'") intended to add a 'best effort' DWARF unwind that improved the frame pointer stack in most scenarios. It's expected that the unwind will fail sometimes, but this shouldn't be reported as an error. It only works when the return address can be determined from the contents of the link register alone. Fix the error shown when the unwinder requires extra registers by adding a new flag that suppresses error messages. This flag is not set in the normal --call-graph=dwarf unwind mode so that behavior is not changed. Fixes: b9f6fbb3b2c29736 ("perf arm64: Inject missing frames when using 'perf record --call-graph=fp'") Reported-by: John Garry <john.garry@huawei.com> Signed-off-by: James Clark <james.clark@arm.com> Tested-by: John Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Truong <alexandre.truong@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220406145651.1392529-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Chengdong Li
|
290fa68bdc |
perf test tsc: Fix error message when not supported
By default `perf test tsc` does not return the error message when the child process detected kernel does not support it. Instead, the child process prints an error message to stderr, unfortunately stderr is redirected to /dev/null when verbose <= 0. This patch does: - return TEST_SKIP to the parent process instead of TEST_OK when perf_read_tsc_conversion() is not supported. - Add a new subtest of testing if TSC is supported on current architecture by moving exist code to a separate function. It avoids two places in test__perf_time_to_tsc() that return TEST_SKIP by doing this. - Extend the test suite definition to contain above two subtests. Current test_suite and test_case structs do not support printing skip reason when the number of subtest less than 1. To print skip reason, it is necessary to extend current test suite definition. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Chengdong Li <chengdongli@tencent.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: likexu@tencent.com Link: https://lore.kernel.org/r/20220408084748.43707-1-chengdongli@tencent.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
3a8a047586 |
perf build: Don't use -ffat-lto-objects in the python feature test when building with clang-13
Using -ffat-lto-objects in the python feature test when building with clang-13 results in: clang-13: error: optimization flag '-ffat-lto-objects' is not supported [-Werror,-Wignored-optimization-argument] error: command '/usr/sbin/clang' failed with exit code 1 cp: cannot stat '/tmp/build/perf/python_ext_build/lib/perf*.so': No such file or directory make[2]: *** [Makefile.perf:639: /tmp/build/perf/python/perf.so] Error 1 Noticed when building on a docker.io/library/archlinux:base container. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Fangrui Song <maskray@google.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Keeping <john@metanate.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
dd6e1fe91c |
perf python: Fix probing for some clang command line options
The clang compiler complains about some options even without a source file being available, while others require one, so use the simple tools/build/feature/test-hello.c file. Then check for the "is not supported" string in its output, in addition to the "unknown argument" already being looked for. This was noticed when building with clang-13 where -ffat-lto-objects isn't supported and since we were looking just for "unknown argument" and not providing a source code to clang, was mistakenly assumed as being available and not being filtered to set of command line options provided to clang, leading to a build failure. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Fangrui Song <maskray@google.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Keeping <john@metanate.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Link: http://lore.kernel.org/lkml/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
41caff459a |
tools build: Filter out options and warnings not supported by clang
These make the feature check fail when using clang, so remove them just like is done in tools/perf/Makefile.config to build perf itself. Adding -Wno-compound-token-split-by-macro to tools/perf/Makefile.config when building with clang is also necessary to avoid these warnings turned into errors (-Werror): CC /tmp/build/perf/util/scripting-engines/trace-event-perl.o In file included from util/scripting-engines/trace-event-perl.c:35: In file included from /usr/lib64/perl5/CORE/perl.h:4085: In file included from /usr/lib64/perl5/CORE/hv.h:659: In file included from /usr/lib64/perl5/CORE/hv_func.h:34: In file included from /usr/lib64/perl5/CORE/sbox32_hash.h:4: /usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: error: '(' and '{' tokens introducing statement expression appear in different macro expansion contexts [-Werror,-Wcompound-token-split-by-macro] ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /usr/lib64/perl5/CORE/zaphod32_hash.h:80:38: note: expanded from macro 'ZAPHOD32_SCRAMBLE32' #define ZAPHOD32_SCRAMBLE32(v,prime) STMT_START { \ ^~~~~~~~~~ /usr/lib64/perl5/CORE/perl.h:737:29: note: expanded from macro 'STMT_START' # define STMT_START (void)( /* gcc supports "({ STATEMENTS; })" */ ^ /usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: note: '{' token is here ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /usr/lib64/perl5/CORE/zaphod32_hash.h:80:49: note: expanded from macro 'ZAPHOD32_SCRAMBLE32' #define ZAPHOD32_SCRAMBLE32(v,prime) STMT_START { \ ^ /usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: error: '}' and ')' tokens terminating statement expression appear in different macro expansion contexts [-Werror,-Wcompound-token-split-by-macro] ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /usr/lib64/perl5/CORE/zaphod32_hash.h:87:41: note: expanded from macro 'ZAPHOD32_SCRAMBLE32' v ^= (v>>23); \ ^ /usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: note: ')' token is here ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /usr/lib64/perl5/CORE/zaphod32_hash.h:88:3: note: expanded from macro 'ZAPHOD32_SCRAMBLE32' } STMT_END ^~~~~~~~ /usr/lib64/perl5/CORE/perl.h:738:21: note: expanded from macro 'STMT_END' # define STMT_END ) ^ Please refer to the discussion on the Link: tag below, where Nathan clarifies the situation: <quote> acme> And then get to the problems at the end of this message, which seem acme> similar to the problem described here: acme> acme> From Nathan Chancellor <> acme> Subject [PATCH] mwifiex: Remove unnecessary braces from HostCmd_SET_SEQ_NO_BSS_INFO acme> acme> https://lkml.org/lkml/2020/9/1/135 acme> acme> So perhaps in this case its better to disable that acme> -Werror,-Wcompound-token-split-by-macro when building with clang? Yes, I think that is probably the best solution. As far as I can tell, at least in this file and context, the warning appears harmless, as the "create a GNU C statement expression from two different macros" is very much intentional, based on the presence of PERL_USE_GCC_BRACE_GROUPS. The warning is fixed in upstream Perl by just avoiding creating GNU C statement expressions using STMT_START and STMT_END: https://github.com/Perl/perl5/issues/18780 https://github.com/Perl/perl5/pull/18984 If I am reading the source code correctly, an alternative to disabling the warning would be specifying -DPERL_GCC_BRACE_GROUPS_FORBIDDEN but it seems like that might end up impacting more than just this site, according to the issue discussion above. </quote> Based-on-a-patch-by: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # Debian/Selfmade LLVM-14 (x86-64) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Fangrui Song <maskray@google.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Keeping <john@metanate.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Link: http://lore.kernel.org/lkml/YkxWcYzph5pC1EK8@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Tanu M
|
7e2022af79 |
perf python: Convert tracepoint.py example to python3
Convert the tracepoint.py file to python3 as many of the files in tools/perf are already written in python3. Committer testing: # export PYTHONPATH=/tmp/build/perf/python/ # python3 ~acme/git/perf/tools/perf/python/tracepoint.py | head time 67394457376909 prev_comm=swapper/12 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=gnome-terminal- next_pid=3313 next_prio=120 time 67394457807669 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120 time 67394457811859 prev_comm=swapper/13 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=python3 next_pid=1485930 next_prio=120 time 67394457824929 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120 time 67394457831899 prev_comm=swapper/13 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=python3 next_pid=1485930 next_prio=120 time 67394457842299 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120 time 67394457844179 prev_comm=swapper/13 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=python3 next_pid=1485930 next_prio=120 time 67394457853879 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120 time 67394457856339 prev_comm=swapper/13 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=python3 next_pid=1485930 next_prio=120 time 67394457865659 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120 Traceback (most recent call last): File "/var/home/acme/git/perf/tools/perf/python/tracepoint.py", line 48, in <module> main() File "/var/home/acme/git/perf/tools/perf/python/tracepoint.py", line 37, in main print("time %u prev_comm=%s prev_pid=%d prev_prio=%d prev_state=0x%x ==> next_comm=%s next_pid=%d next_prio=%d" % ( BrokenPipeError: [Errno 32] Broken pipe # Signed-off-by: Tanu M <tanu235m@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/linux-perf-users/CAPS78prawYzRZnyhWjgOnGw4EwoswNwztvfZFdCOPOydFzVwzQ@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Haowen Bai
|
f717d89a2b |
perf evlist: Directly return instead of using local ret variable
Addresses this coccinelle warning: ./tools/perf/util/evlist.c:1333:5-8: Unneeded variable: "err". Return "- ENOMEM" on line 1358 Signed-off-by: Haowen Bai <baihaowen@meizu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/1648432532-23151-1-git-send-email-baihaowen@meizu.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
0df6ade711 |
perf evlist: Rename cpus to user_requested_cpus
evlist contains cpus and all_cpus. all_cpus is the union of the cpu maps of all evsels. For non-task targets, cpus is set to be cpus requested from the command line, defaulting to all online cpus if no cpus are specified. For an uncore event, all_cpus may be just CPU 0 or every online CPU. This causes all_cpus to have fewer values than the cpus variable which is confusing given the 'all' in the name. To try to make the behavior clearer, rename cpus to user_requested_cpus and add comments on the two struct variables. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20220328232648.2127340-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
John Garry
|
d4ff926592 |
perf tools: Stop depending on .git files for building PERF-VERSION-FILE
This essentially reverts commit c72e3f04b45fb2e5 ("tools/perf/build: Speed up git-version test on re-make") and commit 4e666cdb06eede20 ("perf tools: Fix dependency for version file creation") In commit c72e3f04b45fb2e5 ("tools/perf/build: Speed up git-version test on re-make"), a makefile dependency on .git/HEAD was added. The background is that running PERF-VERSION-FILE is relatively slow, and commands like "git describe" are particularly slow. In commit 4e666cdb06eede20 ("perf tools: Fix dependency for version file creation"), an additional dependency on .git/ORIG_HEAD was added, as .git/HEAD may not change for "git reset --hard HEAD^" command. However, depending on whether we're on a branch or not, a "git cherry-pick" may not lead to the version being updated. As discussed with the git community in [0], using git internal files for dependencies is not reliable. Commit 4e666cdb06ee also breaks some build scenarios [1]. As mentioned, c72e3f04b45fb2e5 ("tools/perf/build: Speed up git-version test on re-make") was added to speed up the build. However in commit 7572733b84997d23 ("perf tools: Fix version kernel tag") we removed the call to "git describe", so just revert Makefile.perf back to same as pre c72e3f04b45fb2e5 ("tools/perf/build: Speed up git-version test on re-make") and the build should not be so slow, as below: Pre 7572733b8499: $> time util/PERF-VERSION-GEN PERF_VERSION = 5.17.rc8.g4e666cdb06ee real 0m0.110s user 0m0.091s sys 0m0.019s Post 7572733b8499: $> time util/PERF-VERSION-GEN PERF_VERSION = 5.17.rc8.g7572733b8499 real 0m0.039s user 0m0.036s sys 0m0.007s [0] https://lore.kernel.org/git/87wngkpddp.fsf@igel.home/T/#m4a4dd6de52fdbe21179306cd57b3761eb07f45f8 [1] https://lore.kernel.org/linux-perf-users/20220329093120.4173283-1-matthieu.baerts@tessares.net/T/#u Committer testing: After a fresh rebuild using 'make -C tools/perf O=/tmp/build/perf install-bin': $ perf -v perf version 5.17.g162f9db407b6 $ git log --oneline -1 162f9db407b6a6e5 (HEAD -> perf/core) perf tools: Stop depending on .git files for building PERF-VERSION-FILE $ Now using a detached tarball, i.e. outside the kernel source tree: $ ls -la perf*tar ls: cannot access 'perf*tar': No such file or directory $ make perf-tar-src-pkg TAR PERF_VERSION = 5.17.g31d10b3ef133 $ ls -la perf*tar -rw-r--r--. 1 acme acme 22241280 Mar 30 13:26 perf-5.17.0.tar $ mv perf-5.17.0.tar /tmp $ cd /tmp $ tar xf perf-5.17.0.tar $ cd perf-5.17.0/ $ make -C tools/perf |& tail CC util/pmu.o CC util/pmu-flex.o CC util/expr-flex.o CC util/expr.o LD util/scripting-engines/perf-in.o LD util/intel-pt-decoder/perf-in.o LD util/perf-in.o LD perf-in.o LINK perf make: Leaving directory '/tmp/perf-5.17.0/tools/perf' $ tools/perf/perf -v perf version 5.17.g31d10b3ef133 $ pwd /tmp/perf-5.17.0 $ cat PERF-VERSION-FILE #define PERF_VERSION "5.17.g31d10b3ef133" $ Fixes: 4e666cdb06eede20 ("perf tools: Fix dependency for version file creation") Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/1648635774-14581-1-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
9a195da42f |
perf beauty: Update copy of linux/socket.h with the kernel sources
To pick the changes in: a6a6fe27bab48f0d ("net/smc: Dynamic control handshake limitation by socket options") This automagically adds support for the SOL_MNC socket level: $ diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h --- tools/perf/trace/beauty/include/linux/socket.h 2022-03-14 17:55:22.277148656 -0300 +++ include/linux/socket.h 2022-03-27 19:12:48.908250063 -0300 @@ -366,6 +366,7 @@ #define SOL_XDP 283 #define SOL_MPTCP 284 #define SOL_MCTP 285 +#define SOL_SMC 286 /* IPX options */ #define IPX_TYPE 1 $ tools/perf/trace/beauty/socket.sh > before $ cp include/linux/socket.h tools/perf/trace/beauty/include/linux/socket.h $ tools/perf/trace/beauty/socket.sh > after $ diff -u before after --- before 2022-03-29 11:47:56.390258780 -0300 +++ after 2022-03-29 11:48:03.158436189 -0300 @@ -67,6 +67,7 @@ [283] = "XDP", [284] = "MPTCP", [285] = "MCTP", + [286] = "SMC", }; DEFINE_STRARRAY(socket_level, "SOL_"); $ This will allow 'perf trace' to translate 286 into "SMC" as is done with the other socket levels: # perf trace -e setsockopt --max-events 4 344.916 ( 0.003 ms): Socket Thread/3816 setsockopt(fd: 168, level: TCP, optname: 5, optval: 0x7f5797b9c4f8, optlen: 4) = 0 344.920 ( 0.002 ms): Socket Thread/3816 setsockopt(fd: 168, level: TCP, optname: 6, optval: 0x7f5797b9c4f4, optlen: 4) = 0 1246.974 ( 0.010 ms): systemd-resolv/1128 setsockopt(fd: 22, level: IP, optname: 11, optval: 0x7ffc96cd7244, optlen: 4) = 0 1246.986 ( 0.002 ms): systemd-resolv/1128 setsockopt(fd: 22, level: IP, optname: 8, optval: 0x7ffc96cd7264, optlen: 4) = 0 This addresses this perf build warning: Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h' diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: D. Wythe <alibuda@linux.alibaba.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/YkMdpzzjPu5VZtW3@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
4d4d00dd32 |
perf tools: Update copy of libbpf's hashmap.c
To pick the changes in: fba60b171a032283 ("libbpf: Use IS_ERR_OR_NULL() in hashmap__free()") That don't entail any changes in tools/perf. This addresses this perf build warning: Warning: Kernel ABI header at 'tools/perf/util/hashmap.h' differs from latest version at 'tools/lib/bpf/hashmap.h' diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h Not a kernel ABI, its just that this uses the mechanism in place for checking kernel ABI files drift. Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mauricio Vásquez <mauricio@kinvolk.io> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/YkMb2SAIai2VeuUD@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
8a96f454f5 |
perf stat: Avoid SEGV if core.cpus isn't set
Passing NULL to perf_cpu_map__max doesn't make sense as there is no valid max. Avoid this problem by null checking in perf_stat_init_aggr_mode. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20220328062414.1893550-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Linus Torvalds
|
b8321ed4a4 |
Kbuild updates for v5.18
- Add new environment variables, USERCFLAGS and USERLDFLAGS to allow additional flags to be passed to user-space programs. - Fix missing fflush() bugs in Kconfig and fixdep - Fix a minor bug in the comment format of the .config file - Make kallsyms ignore llvm's local labels, .L* - Fix UAPI compile-test for cross-compiling with Clang - Extend the LLVM= syntax to support LLVM=<suffix> form for using a particular version of LLVm, and LLVM=<prefix> form for using custom LLVM in a particular directory path. - Clean up Makefiles -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmJFGloVHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsGH0kP/j6Vx5BqEv3tP2Q+UANxLqITleJs IFpbSesz/BhlG7I/IapWmCDSqFbYd5uJTO4ko8CsPmZHcxr6Gw3y+DN5yQACKaG/ p9xiF6GjPyKR8+VdcT2tV50+dVY8ANe/DxCyzKrJd/uyYxgARPKJh0KRMNz+d9lj ixUpCXDhx/XlKzPIlcxrvhhjevKz+NnHmN0fe6rzcOw9KzBGBTsf20Q3PqUuBOKa rWHsRGcBPA8eKLfWT1Us1jjic6cT2g4aMpWjF20YgUWKHgWVKcNHpxYKGXASVo/z ewdDnNfmwo7f7fKMCDDro9iwFWV/BumGtn43U00tnqdBcTpFojPlEOga37UPbZDF nmTblGVUhR0vn4PmfBy8WkAkbW+IpVatKwJGV4J3KjSvdWvZOmVj9VUGLVAR0TXW /YcgRs6EtG8Hn0IlCj0fvZ5wRWoDLbP2DSZ67R/44EP0GaNQPwUe4FI1izEE4EYX oVUAIxcKixWGj4RmdtmtMMdUcZzTpbgS9uloMUmS3u9LK0Ir/8tcWaf2zfMO6Jl2 p4Q31s1dUUKCnFnj0xDKRyKGUkxYebrHLfuBqi0RIc0xRpSlxoXe3Dynm9aHEQoD ZSV0eouQJxnaxM1ck5Bu4AHLgEebHfEGjWVyUHno7jFU5EI9Wpbqpe4pCYEEDTm1 +LJMEpdZO0dFvpF+ =84rW -----END PGP SIGNATURE----- Merge tag 'kbuild-v5.18-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Add new environment variables, USERCFLAGS and USERLDFLAGS to allow additional flags to be passed to user-space programs. - Fix missing fflush() bugs in Kconfig and fixdep - Fix a minor bug in the comment format of the .config file - Make kallsyms ignore llvm's local labels, .L* - Fix UAPI compile-test for cross-compiling with Clang - Extend the LLVM= syntax to support LLVM=<suffix> form for using a particular version of LLVm, and LLVM=<prefix> form for using custom LLVM in a particular directory path. - Clean up Makefiles * tag 'kbuild-v5.18-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: Make $(LLVM) more flexible kbuild: add --target to correctly cross-compile UAPI headers with Clang fixdep: use fflush() and ferror() to ensure successful write to files arch: syscalls: simplify uapi/kapi directory creation usr/include: replace extra-y with always-y certs: simplify empty certs creation in certs/Makefile certs: include certs/signing_key.x509 unconditionally kallsyms: ignore all local labels prefixed by '.L' kconfig: fix missing '# end of' for empty menu kconfig: add fflush() before ferror() check kbuild: replace $(if A,A,B) with $(or A,B) kbuild: Add environment variables for userprogs flags kbuild: unify cmd_copy and cmd_shipped |
||
Linus Torvalds
|
7b58b82b86 |
perf tools changes for v5.18: 1st batch
New features: perf ftrace: - Add -n/--use-nsec option to the 'latency' subcommand. Default: usecs: $ sudo perf ftrace latency -T dput -a sleep 1 # DURATION | COUNT | GRAPH | 0 - 1 us | 2098375 | ############################# | 1 - 2 us | 61 | | 2 - 4 us | 33 | | 4 - 8 us | 13 | | 8 - 16 us | 124 | | 16 - 32 us | 123 | | 32 - 64 us | 1 | | 64 - 128 us | 0 | | 128 - 256 us | 1 | | 256 - 512 us | 0 | | Better granularity with nsec: $ sudo perf ftrace latency -T dput -a -n sleep 1 # DURATION | COUNT | GRAPH | 0 - 1 us | 0 | | 1 - 2 ns | 0 | | 2 - 4 ns | 0 | | 4 - 8 ns | 0 | | 8 - 16 ns | 0 | | 16 - 32 ns | 0 | | 32 - 64 ns | 0 | | 64 - 128 ns | 1163434 | ############## | 128 - 256 ns | 914102 | ############# | 256 - 512 ns | 884 | | 512 - 1024 ns | 613 | | 1 - 2 us | 31 | | 2 - 4 us | 17 | | 4 - 8 us | 7 | | 8 - 16 us | 123 | | 16 - 32 us | 83 | | perf lock: - Add -c/--combine-locks option to merge lock instances in the same class into a single entry. # perf lock report -c Name acquired contended avg wait(ns) total wait(ns) max wait(ns) min wait(ns) rcu_read_lock 251225 0 0 0 0 0 hrtimer_bases.lock 39450 0 0 0 0 0 &sb->s_type->i_l... 10301 1 662 662 662 662 ptlock_ptr(page) 10173 2 701 1402 760 642 &(ei->i_block_re... 8732 0 0 0 0 0 &xa->xa_lock 8088 0 0 0 0 0 &base->lock 6705 0 0 0 0 0 &p->pi_lock 5549 0 0 0 0 0 &dentry->d_lockr... 5010 4 1274 5097 1844 789 &ep->lock 3958 0 0 0 0 0 - Add -F/--field option to customize the list of fields to output: $ perf lock report -F contended,wait_max -k avg_wait Name contended max wait(ns) avg wait(ns) slock-AF_INET6 1 23543 23543 &lruvec->lru_lock 5 18317 11254 slock-AF_INET6 1 10379 10379 rcu_node_1 1 2104 2104 &dentry->d_lockr... 1 1844 1844 &dentry->d_lockr... 1 1672 1672 &newf->file_lock 15 2279 1025 &dentry->d_lockr... 1 792 792 - Add --synth=no option for record, as there is no need to symbolize, lock names comes from the tracepoints. perf record: - Threaded recording, opt-in, via the new --threads command line option. - Improve AMD IBS (Instruction-Based Sampling) error handling messages. perf script: - Add 'brstackinsnlen' field (use it with -F) for branch stacks. - Output branch sample type in 'perf script'. perf report: - Add "addr_from" and "addr_to" sort dimensions. - Print branch stack entry type in 'perf report --dump-raw-trace' - Fix symbolization for chrooted workloads. Hardware tracing: Intel PT: - Add CFE (Control Flow Event) and EVD (Event Data) packets support. - Add MODE.Exec IFLAG bit support. Explanation about these features from the "Intel® 64 and IA-32 architectures software developer’s manual combined volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4" PDF at: https://cdrdv2.intel.com/v1/dl/getContent/671200 At page 3951: <quote> 32.2.4 Event Trace is a capability that exposes details about the asynchronous events, when they are generated, and when their corresponding software event handler completes execution. These include: o Interrupts, including NMI and SMI, including the interrupt vector when defined. o Faults, exceptions including the fault vector. — Page faults additionally include the page fault address, when in context. o Event handler returns, including IRET and RSM. o VM exits and VM entries.¹ — VM exits include the values written to the “exit reason” and “exit qualification” VMCS fields. INIT and SIPI events. o TSX aborts, including the abort status returned for the RTM instructions. o Shutdown. Additionally, it provides indication of the status of the Interrupt Flag (IF), to indicate when interrupts are masked. </quote> ARM CoreSight: - Use advertised caps/min_interval as default sample_period on ARM spe. - Update deduction of TRCCONFIGR register for branch broadcast on ARM's CoreSight ETM. Vendor Events (JSON): Intel: - Update events and metrics for: Alderlake, Broadwell, Broadwell DE, BroadwellX, CascadelakeX, Elkhartlake, Bonnell, Goldmont, GoldmontPlus, Westmere EP-DP, Haswell, HaswellX, Icelake, IcelakeX, Ivybridge, Ivytown, Jaketown, Knights Landing, Nehalem EP, Sandybridge, Silvermont, Skylake, Skylake Server, SkylakeX, Tigerlake, TremontX, Westmere EP-SP, Westmere EX. ARM: - Add support for HiSilicon CPA PMU aliasing. perf stat: - Fix forked applications enablement of counters. - The 'slots' should only be printed on a different order than the one specified on the command line when 'topdown' events are present, fix it. Miscellaneous: - Sync msr-index, cpufeatures header files with the kernel sources. - Stop using some deprecated libbpf APIs in 'perf trace'. - Fix some spelling mistakes. - Refactor the maps pointers usage to pave the way for using refcount debugging. - Only offer the --tui option on perf top, report and annotate when perf was built with libslang. - Don't mention --to-ctf in 'perf data --help' when not linking with the required library, libbabeltrace. - Use ARRAY_SIZE() instead of ad hoc equivalent, spotted by array_size.cocci. - Enhance the matching of sub-commands abbreviations: 'perf c2c rec' -> 'perf c2c record' 'perf c2c recport -> error - Set build-id using build-id header on new mmap records. - Fix generation of 'perf --version' string. perf test: - Add test for the arm_spe event. - Add test to check unwinding using fame-pointer (fp) mode on arm64. - Make metric testing more robust in 'perf test'. - Add error message for unsupported branch stack cases. libperf: - Add API for allocating new thread map array. - Fix typo in perf_evlist__open() failure error messages in libperf tests. perf c2c: - Replace bitmap_weight() with bitmap_empty() where appropriate. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYj8viwAKCRCyPKLppCJ+ J8K3AQDpN45P4/TWJxVWhZlvYzJtWDSboXHZJfmBiEd4Xu2zbwD7BFW02f1ATHPr dGBFXxRQQufBIqfE+OQXG59Awp1m8wE= =1l8S -----END PGP SIGNATURE----- Merge tag 'perf-tools-for-v5.18-2022-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools updates from Arnaldo Carvalho de Melo: "New features: perf ftrace: - Add -n/--use-nsec option to the 'latency' subcommand. Default: usecs: $ sudo perf ftrace latency -T dput -a sleep 1 # DURATION | COUNT | GRAPH | 0 - 1 us | 2098375 | ############################# | 1 - 2 us | 61 | | 2 - 4 us | 33 | | 4 - 8 us | 13 | | 8 - 16 us | 124 | | 16 - 32 us | 123 | | 32 - 64 us | 1 | | 64 - 128 us | 0 | | 128 - 256 us | 1 | | 256 - 512 us | 0 | | Better granularity with nsec: $ sudo perf ftrace latency -T dput -a -n sleep 1 # DURATION | COUNT | GRAPH | 0 - 1 us | 0 | | 1 - 2 ns | 0 | | 2 - 4 ns | 0 | | 4 - 8 ns | 0 | | 8 - 16 ns | 0 | | 16 - 32 ns | 0 | | 32 - 64 ns | 0 | | 64 - 128 ns | 1163434 | ############## | 128 - 256 ns | 914102 | ############# | 256 - 512 ns | 884 | | 512 - 1024 ns | 613 | | 1 - 2 us | 31 | | 2 - 4 us | 17 | | 4 - 8 us | 7 | | 8 - 16 us | 123 | | 16 - 32 us | 83 | | perf lock: - Add -c/--combine-locks option to merge lock instances in the same class into a single entry. # perf lock report -c Name acquired contended avg wait(ns) total wait(ns) max wait(ns) min wait(ns) rcu_read_lock 251225 0 0 0 0 0 hrtimer_bases.lock 39450 0 0 0 0 0 &sb->s_type->i_l... 10301 1 662 662 662 662 ptlock_ptr(page) 10173 2 701 1402 760 642 &(ei->i_block_re... 8732 0 0 0 0 0 &xa->xa_lock 8088 0 0 0 0 0 &base->lock 6705 0 0 0 0 0 &p->pi_lock 5549 0 0 0 0 0 &dentry->d_lockr... 5010 4 1274 5097 1844 789 &ep->lock 3958 0 0 0 0 0 - Add -F/--field option to customize the list of fields to output: $ perf lock report -F contended,wait_max -k avg_wait Name contended max wait(ns) avg wait(ns) slock-AF_INET6 1 23543 23543 &lruvec->lru_lock 5 18317 11254 slock-AF_INET6 1 10379 10379 rcu_node_1 1 2104 2104 &dentry->d_lockr... 1 1844 1844 &dentry->d_lockr... 1 1672 1672 &newf->file_lock 15 2279 1025 &dentry->d_lockr... 1 792 792 - Add --synth=no option for record, as there is no need to symbolize, lock names comes from the tracepoints. perf record: - Threaded recording, opt-in, via the new --threads command line option. - Improve AMD IBS (Instruction-Based Sampling) error handling messages. perf script: - Add 'brstackinsnlen' field (use it with -F) for branch stacks. - Output branch sample type in 'perf script'. perf report: - Add "addr_from" and "addr_to" sort dimensions. - Print branch stack entry type in 'perf report --dump-raw-trace' - Fix symbolization for chrooted workloads. Hardware tracing: Intel PT: - Add CFE (Control Flow Event) and EVD (Event Data) packets support. - Add MODE.Exec IFLAG bit support. Explanation about these features from the "Intel® 64 and IA-32 architectures software developer’s manual combined volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4" PDF at: https://cdrdv2.intel.com/v1/dl/getContent/671200 At page 3951: "32.2.4 Event Trace is a capability that exposes details about the asynchronous events, when they are generated, and when their corresponding software event handler completes execution. These include: o Interrupts, including NMI and SMI, including the interrupt vector when defined. o Faults, exceptions including the fault vector. - Page faults additionally include the page fault address, when in context. o Event handler returns, including IRET and RSM. o VM exits and VM entries.¹ - VM exits include the values written to the “exit reason” and “exit qualification” VMCS fields. INIT and SIPI events. o TSX aborts, including the abort status returned for the RTM instructions. o Shutdown. Additionally, it provides indication of the status of the Interrupt Flag (IF), to indicate when interrupts are masked" ARM CoreSight: - Use advertised caps/min_interval as default sample_period on ARM spe. - Update deduction of TRCCONFIGR register for branch broadcast on ARM's CoreSight ETM. Vendor Events (JSON): Intel: - Update events and metrics for: Alderlake, Broadwell, Broadwell DE, BroadwellX, CascadelakeX, Elkhartlake, Bonnell, Goldmont, GoldmontPlus, Westmere EP-DP, Haswell, HaswellX, Icelake, IcelakeX, Ivybridge, Ivytown, Jaketown, Knights Landing, Nehalem EP, Sandybridge, Silvermont, Skylake, Skylake Server, SkylakeX, Tigerlake, TremontX, Westmere EP-SP, and Westmere EX. ARM: - Add support for HiSilicon CPA PMU aliasing. perf stat: - Fix forked applications enablement of counters. - The 'slots' should only be printed on a different order than the one specified on the command line when 'topdown' events are present, fix it. Miscellaneous: - Sync msr-index, cpufeatures header files with the kernel sources. - Stop using some deprecated libbpf APIs in 'perf trace'. - Fix some spelling mistakes. - Refactor the maps pointers usage to pave the way for using refcount debugging. - Only offer the --tui option on perf top, report and annotate when perf was built with libslang. - Don't mention --to-ctf in 'perf data --help' when not linking with the required library, libbabeltrace. - Use ARRAY_SIZE() instead of ad hoc equivalent, spotted by array_size.cocci. - Enhance the matching of sub-commands abbreviations: 'perf c2c rec' -> 'perf c2c record' 'perf c2c recport -> error - Set build-id using build-id header on new mmap records. - Fix generation of 'perf --version' string. perf test: - Add test for the arm_spe event. - Add test to check unwinding using fame-pointer (fp) mode on arm64. - Make metric testing more robust in 'perf test'. - Add error message for unsupported branch stack cases. libperf: - Add API for allocating new thread map array. - Fix typo in perf_evlist__open() failure error messages in libperf tests. perf c2c: - Replace bitmap_weight() with bitmap_empty() where appropriate" * tag 'perf-tools-for-v5.18-2022-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (143 commits) perf evsel: Improve AMD IBS (Instruction-Based Sampling) error handling messages perf python: Add perf_env stubs that will be needed in evsel__open_strerror() perf tools: Enhance the matching of sub-commands abbreviations libperf tests: Fix typo in perf_evlist__open() failure error messages tools arm64: Import cputype.h perf lock: Add -F/--field option to control output perf lock: Extend struct lock_key to have print function perf lock: Add --synth=no option for record tools headers cpufeatures: Sync with the kernel sources tools headers cpufeatures: Sync with the kernel sources perf stat: Fix forked applications enablement of counters tools arch x86: Sync the msr-index.h copy with the kernel sources perf evsel: Make evsel__env() always return a valid env perf build-id: Fix spelling mistake "Cant" -> "Can't" perf header: Fix spelling mistake "could't" -> "couldn't" perf script: Add 'brstackinsnlen' for branch stacks perf parse-events: Move slots only with topdown perf ftrace latency: Update documentation perf ftrace latency: Add -n/--use-nsec option perf tools: Fix version kernel tag ... |