IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
"retain good changes" means that I left the help string split up instead
of having this weird thing where it tries to merge together the last three
lines and it looks **really** bad
Signed-off-by: Gabriel Ravier <gabravier@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
There is a spelling mistake in an error message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
If '-o' was used more than 64 times in a single invocation of gpio-hammer,
this could lead to an overflow of the 'lines' array. This commit fixes
this by avoiding the overflow and giving a proper diagnostic back to the
user
Signed-off-by: Gabriel Ravier <gabravier@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Some specific tests in powerpc can take longer than the default 45
seconds that added in commit 852c8cbf34d3 ("selftests/kselftest/runner.sh:
Add 45 second timeout per test") to run, the following test result was
collected across 2 Power8 nodes and 1 Power9 node in our pool:
powerpc/benchmarks/futex_bench - 52s
powerpc/dscr/dscr_sysfs_test - 116s
powerpc/signal/signal_fuzzer - 88s
powerpc/tm/tm_unavailable_test - 168s
powerpc/tm/tm-poison - 240s
Thus they will fail with TIMEOUT error. Disable the timeout setting
for these sub-tests to allow them finish properly.
https://bugs.launchpad.net/bugs/1864642
Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test")
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200318060004.10685-1-po-hsu.lin@canonical.com
The test case tm-signal-context-force-tm expects a segfault to happen
on returning from signal handler, and then does a setcontext() to run
the test again. However, the test doesn't always segfault, causing the
test to run a single time.
This patch fixes the test by putting it within a loop and jumping, via
setcontext, just prior to the loop in case it segfaults. This way we
get the desired behavior (run the test COUNT_MAX times) regardless if
it segfaults or not. This also reduces the use of setcontext for
control flow logic, keeping it only in the segfault handler.
Also, since 'count' is changed within the signal handler, it is
declared as volatile to prevent any compiler optimization getting
confused with asynchronous changes.
Signed-off-by: Gustavo Luiz Duarte <gustavold@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200211033831.11165-3-gustavold@linux.ibm.com
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) A new selftest for nf_queue, from Florian Westphal. This test
covers two recent fixes: 07f8e4d0fddb ("tcp: also NULL skb->dev
when copy was needed") and b738a185beaa ("tcp: ensure skb->dev is
NULL before leaving TCP stack").
2) The fwd action breaks with ifb. For safety in next extensions,
make sure the fwd action only runs from ingress until it is extended
to be used from a different hook.
3) The pipapo set type now reports EEXIST in case of subrange overlaps.
Update the rbtree set to validate range overlaps, so far this
validation is only done only from userspace. From Stefano Brivio.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a test case to check nf queue infrastructure.
Could be extended in the future to also cover serialization of
conntrack, uid and secctx attributes in nfqueue.
For now, this checks that 'queue bypass' works, that a queue rule with
no bypass option blocks traffic and that userspace receives the expected
number of packets.
For this we add two queues and hook all of
prerouting/input/forward/output/postrouting.
Packets get queued twice with a dummy base chain in between:
This passes with current nf tree, but reverting
commit 946c0d8e6ed4 ("netfilter: nf_queue: fix reinject verdict handling")
makes this trip (it processes 30 instead of expected 20 packets).
v2: update config file with queue and other options missing/needed for
other tests.
v3: also test with tcp, this reveals problem with commit
28f8bfd1ac94 ("netfilter: Support iif matches in POSTROUTING"), due to
skb->dev pointing at another skb in the retransmit rbtree (skb->dev
aliases to rbnode child).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull perf tooling fixes from Ingo Molnar:
"A handful of tooling fixes all across the map, no kernel changes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools headers uapi: Update linux/in.h copy
perf probe: Do not depend on dwfl_module_addrsym()
perf probe: Fix to delete multiple probe event
perf parse-events: Fix reading of invalid memory in event parsing
perf python: Fix clang detection when using CC=clang-version
perf map: Fix off by one in strncpy() size argument
tools: Let O= makes handle a relative path with -C option
Perf gets dso details from two different sources. 1st, from builid
headers in perf.data and 2nd from MMAP2 samples. Dso from buildid
header does not have dso_id detail. And dso from MMAP2 samples does
not have buildid information. If detail of the same dso is present
at both the places, filename is common.
Previously, __dsos__findnew_link_by_longname_id() used to compare only
long or short names, but Commit 0e3149f86b99 ("perf dso: Move dso_id
from 'struct map' to 'struct dso'") also added a dso_id comparison.
Because of that, now perf is creating two different dso objects of the
same file, one from buildid header (with dso_id but without buildid)
and second from MMAP2 sample (with buildid but without dso_id).
This is causing issues with archive, buildid-list etc subcommands. Fix
this by comparing dso_id only when it's present. And incase dso is
present in 'dsos' list without dso_id, inject dso_id detail as well.
Before:
$ sudo ./perf buildid-list -H
0000000000000000000000000000000000000000 /usr/bin/ls
0000000000000000000000000000000000000000 /usr/lib64/ld-2.30.so
0000000000000000000000000000000000000000 /usr/lib64/libc-2.30.so
$ ./perf archive
perf archive: no build-ids found
After:
$ ./perf buildid-list -H
b6b1291d0cead046ed0fa5734037fa87a579adee /usr/bin/ls
641f0c90cfa15779352f12c0ec3c7a2b2b6f41e8 /usr/lib64/ld-2.30.so
675ace3ca07a0b863df01f461a7b0984c65c8b37 /usr/lib64/libc-2.30.so
$ ./perf archive
Now please run:
$ tar xvf perf.data.tar.bz2 -C ~/.debug
wherever you need to run 'perf report' on.
Committer notes:
Renamed is_empty_dso_id() to dso_id__empty() and inject_dso_id() to
dso__inject_id() to keep namespacing consistent.
Fixes: 0e3149f86b99 ("perf dso: Move dso_id from 'struct map' to 'struct dso'")
Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20200324042424.68366-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'snprintf' returns the number of characters which would be generated for
the given input.
If the returned value is *greater than* or equal to the buffer size, it
means that the output has been truncated.
Fix the overflow test accordingly.
Fixes: 7780c25bae59f ("perf tools: Allow ability to map cpus to nodes easily")
Fixes: 92a7e1278005b ("perf cpumap: Add cpu__max_present_cpu()")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: He Zhe <zhe.he@windriver.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200324070319.10901-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add creating event aliases to the pmu-events test.
So currently we verify that the generated pmu-events.c is as expected for
some test events. Now test that we generate aliases as expected for those
events during normal operation.
For that, we cycle through each HW PMU in the system, and use the test
events to create aliases, and verify those against known, expected values.
For core PMUs, we should create an alias for every event in
test_cpu_events[].
However, for uncore PMUs, they need to be matched by the pmu_event.pmu
member, so use test_uncore_events[]; so check the match beforehand with
pmu_uncore_alias_match().
A sample run is as follows for my x86 machine:
john@linux-3c19:~/linux> tools/perf/perf test -vv 10
10: PMU events :
--- start ---
...
testing PMU uncore_arb aliases: no events to match
testing PMU cstate_pkg aliases: no events to match
skipping testing PMU breakpoint
testing aliases PMU uncore_cbox_1: matched event unc_cbo_xsnp_response.miss_eviction
testing PMU uncore_cbox_1 aliases: pass
testing PMU power aliases: no events to match
testing aliases PMU cpu: matched event bp_l1_btb_correct
testing aliases PMU cpu: matched event bp_l2_btb_correct
testing aliases PMU cpu: matched event segment_reg_loads.any
testing aliases PMU cpu: matched event dispatch_blocked.any
testing aliases PMU cpu: matched event eist_trans
testing PMU cpu aliases: pass
testing PMU intel_pt aliases: no events to match
skipping testing PMU software
skipping testing PMU intel_bts
testing PMU uncore_imc aliases: no events to match
testing aliases PMU uncore_cbox_0: matched event unc_cbo_xsnp_response.miss_eviction
testing PMU uncore_cbox_0 aliases: pass
testing PMU cstate_core aliases: no events to match
skipping testing PMU tracepoint
testing PMU msr aliases: no events to match
test child finished with 0
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1584442939-8911-8-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf pmu-events test will want to use pmu_uncore_alias_match(), so
make it public.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1584442939-8911-7-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a function to decide whether a PMU is a core PMU.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1584442939-8911-6-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The initial test will verify that the test tables in generated pmu-events.c
match against known, expected values.
For known events added in pmu-events/arch/test, we need to add an entry
in test_cpu_aliases_events[] or test_uncore_events[].
A sample run is as follows for x86:
john@linux-3c19:~/linux> tools/perf/perf test -vv 10
10: PMU event aliases :
--- start ---
test child forked, pid 5316
testing event table bp_l1_btb_correct: pass
testing event table bp_l2_btb_correct: pass
testing event table segment_reg_loads.any: pass
testing event table dispatch_blocked.any: pass
testing event table eist_trans: pass
testing event table uncore_hisi_ddrc.flux_wcmd: pass
testing event table unc_cbo_xsnp_response.miss_eviction: pass
test child finished with 0
---- end ----
PMU event aliases: Ok
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
[ Fixup test_cpu_events[] and test_uncore_events[] sentinels to initialize one of its members to NULL, fixing the build in older compilers ]
Link: http://lore.kernel.org/lkml/1584442939-8911-5-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Create pmu_add_cpu_aliases_map() from pmu_add_cpu_aliases(), so the caller
can pass the map; the pmu-events test would use this since there would
be no CPUID matching to a mapfile there.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1584442939-8911-4-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the goal of supporting pmu-events test case, introduce support for
a test events folder.
These test events can be used for testing generation of pmu-event tables
and alias creation for any arch.
When running the pmu-events test case, these test events will be used as
the platform-agnostic events, so aliases can be created per-PMU and
validated against known expected values.
To support the test events, add a "testcpu" entry in pmu_events_map[].
The pmu-events test will be able to lookup the events map for "testcpu",
to verify the generated tables against expected values.
The resultant generated pmu-events.c will now look like the following:
struct pmu_event pme_ampere_emag[] = {
{
.name = "ldrex_spec",
.event = "event=0x6c",
.desc = "Exclusive operation spe...",
.topic = "intrinsic",
.long_desc = "Exclusive operation ...",
},
...
};
struct pmu_event pme_test_cpu[] = {
{
.name = "uncore_hisi_ddrc.flux_wcmd",
.event = "event=0x2",
.desc = "DDRC write commands. Unit: hisi_sccl,ddrc ",
.topic = "uncore",
.long_desc = "DDRC write commands",
.pmu = "hisi_sccl,ddrc",
},
{
.name = "unc_cbo_xsnp_response.miss_eviction",
.event = "umask=0x81,event=0x22",
.desc = "Unit: uncore_cbox A cross-core snoop resulted ...",
.topic = "uncore",
.long_desc = "A cross-core snoop resulted from L3 ...",
.pmu = "uncore_cbox",
},
{
.name = "eist_trans",
.event = "umask=0x0,period=200000,event=0x3a",
.desc = "Number of Enhanced Intel SpeedStep(R) ...",
.topic = "other",
},
{
.name = 0,
},
};
struct pmu_events_map pmu_events_map[] = {
...
{
.cpuid = "0x00000000500f0000",
.version = "v1",
.type = "core",
.table = pme_ampere_emag
},
...
{
.cpuid = "testcpu",
.version = "v1",
.type = "core",
.table = pme_test_cpu,
},
{
.cpuid = 0,
.version = 0,
.type = 0,
.table = 0,
},
};
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1584442939-8911-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add some test PMU events. The events are randomly chosen from x86 and
arm64 JSONs. The events include CPU and uncore events.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1584442939-8911-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get the changes in:
267762538705 ("seg6: fix SRv6 L2 tunnels to use IANA-assigned protocol number")
That ends up automatically adding the new IPPROTO_ETHERNET to the socket
args beautifiers:
$ tools/perf/trace/beauty/socket_ipproto.sh > before
Apply this patch:
$ tools/perf/trace/beauty/socket_ipproto.sh > after
$ diff -u before after
--- before 2020-03-19 11:48:36.876673819 -0300
+++ after 2020-03-19 11:49:00.148541377 -0300
@@ -6,6 +6,7 @@
[132] = "SCTP",
[136] = "UDPLITE",
[137] = "MPLS",
+ [143] = "ETHERNET",
[17] = "UDP",
[1] = "ICMP",
[22] = "IDP",
$
Addresses this tools/perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h'
diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h
Cc: David S. Miller <davem@davemloft.net>
Cc: Paolo Lungaroni <paolo.lungaroni@cnit.it>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch updates the PMCs for AMD Zen1 core based processors (Family
17h; Models 0 through 2F) to be in accordance with PMCs as
documented in the latest versions of the AMD Processor Programming
Reference [1], [2] and [3]. Note that some events, such as FPU pipe
assignment are missing in [1], and therefore [3] is included for full
coverage of events.
PMCs added:
fpu_pipe_assignment.dual{0|1|2|3}
fpu_pipe_assignment.total{0|1|2|3}
ls_mab_alloc.dc_prefetcher
ls_mab_alloc.stores
ls_mab_alloc.loads
bp_dyn_ind_pred
bp_de_redirect
PMC removed:
ex_ret_cond_misp
Cumulative counts, fpu_pipe_assignment.total and
fpu_pipe_assignment.dual, existed in v1, but did expose port-level
counters.
ex_ret_cond_misp has been removed as it has been removed from the latest
versions of the PPR, and when tested, always seems to sample zero as
tested on a Ryzen 3400G system.
[1]: Processor Programming Reference (PPR) for AMD Family 17h Models
01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019.
[2]: Processor Programming Reference (PPR) for AMD Family 17h Model 18h,
Revision B1 Processors, 55570-B1 Rev 3.14 - Sep 26, 2019.
[3]: OSRR for AMD Family 17h processors, Models 00h-2Fh, 56255 Rev 3.03 - July, 2018
All of the PPRs can be found at:
https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>
Acked-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: vijay thakkar <vijaythakkar@me.com>
Link: http://lore.kernel.org/lkml/20200318190002.307290-4-vijaythakkar@me.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch changes the previous blanket detection of AMD Family 17h
processors to be more specific to Zen1 core based products only by
replacing model detection regex pattern [[:xdigit:]]+ with
([12][0-9A-F]|[0-9A-F]), restricting to models 0 though 2f only.
This change is required to allow for the addition of separate PMU events
for Zen2 core based models in the following patches as those belong to
family 17h but have different PMCs. Current PMU events directory has
also been renamed to "amdzen1" from "amdfam17h" to reflect this
specificity.
Note that although this change does not break PMU counters for existing
zen1 based systems, it does disable the current set of counters for zen2
based systems. Counters for zen2 have been added in the following
patches in this patchset.
Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>
Acked-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200318190002.307290-2-vijaythakkar@me.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit f01642e4912b ("perf metricgroup: Support multiple events for
metricgroup") introduced support for multiple events in a metric group.
But with the current upstream, metric events names are not printed
properly incase we try to run multiple metric groups with overlapping
event.
With current upstream version, incase of overlapping metric events issue
is, we always start our comparision logic from start. So, the events
which already matched with some metric group also take part in
comparision logic. Because of that when we have overlapping events, we
end up matching current metric group event with already matched one.
For example, in skylake machine we have metric event CoreIPC and
Instructions. Both of them need 'inst_retired.any' event value. As
events in Instructions is subset of events in CoreIPC, they endup in
pointing to same 'inst_retired.any' value.
In skylake platform:
command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
Performance counter stats for 'CPU(s) 0':
1,254,992,790 inst_retired.any # 1254992790.0
Instructions
# 1.3 CoreIPC
977,172,805 cycles
1,254,992,756 inst_retired.any
1.000802596 seconds time elapsed
command:# sudo ./perf stat -M UPI,IPC sleep 1
Performance counter stats for 'sleep 1':
948,650 uops_retired.retire_slots
866,182 inst_retired.any # 0.7 IPC
866,182 inst_retired.any
1,175,671 cpu_clk_unhalted.thread
Patch fixes the issue by adding a new bool pointer 'evlist_used' to keep
track of events which already matched with some group by setting it
true. So, we skip all used events in list when we start comparision
logic. Patch also make some changes in comparision logic, incase we get
a match miss, we discard the whole match and start again with first
event id in metric event.
With this patch:
In skylake platform:
command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
Performance counter stats for 'CPU(s) 0':
3,348,415 inst_retired.any # 0.3 CoreIPC
11,779,026 cycles
3,348,381 inst_retired.any # 3348381.0
Instructions
1.001649056 seconds time elapsed
command:# ./perf stat -M UPI,IPC sleep 1
Performance counter stats for 'sleep 1':
1,023,148 uops_retired.retire_slots # 1.1 UPI
924,976 inst_retired.any
924,976 inst_retired.any # 0.6 IPC
1,489,414 cpu_clk_unhalted.thread
1.003064672 seconds time elapsed
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200221101121.28920-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is a slight misalignment in -A -I output.
For example:
# perf stat -e cpu/event=cpu-cycles/ -a -A -I 1000
# time CPU counts unit events
1.000440863 CPU0 1,068,388 cpu/event=cpu-cycles/
1.000440863 CPU1 875,954 cpu/event=cpu-cycles/
1.000440863 CPU2 3,072,538 cpu/event=cpu-cycles/
1.000440863 CPU3 4,026,870 cpu/event=cpu-cycles/
1.000440863 CPU4 5,919,630 cpu/event=cpu-cycles/
1.000440863 CPU5 2,714,260 cpu/event=cpu-cycles/
1.000440863 CPU6 2,219,240 cpu/event=cpu-cycles/
1.000440863 CPU7 1,299,232 cpu/event=cpu-cycles/
The value of counts is not aligned with the column "counts" and
the event name is not aligned with the column "events".
With this patch, the output is,
# perf stat -e cpu/event=cpu-cycles/ -a -A -I 1000
# time CPU counts unit events
1.000423009 CPU0 997,421 cpu/event=cpu-cycles/
1.000423009 CPU1 1,422,042 cpu/event=cpu-cycles/
1.000423009 CPU2 484,651 cpu/event=cpu-cycles/
1.000423009 CPU3 525,791 cpu/event=cpu-cycles/
1.000423009 CPU4 1,370,100 cpu/event=cpu-cycles/
1.000423009 CPU5 442,072 cpu/event=cpu-cycles/
1.000423009 CPU6 205,643 cpu/event=cpu-cycles/
1.000423009 CPU7 1,302,250 cpu/event=cpu-cycles/
Now output is aligned.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200218071614.25736-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When performing "perf report --group", it shows the event group information
together. In previous patch, we have supported a new option "--group-sort-idx"
to sort the output by the event at the index n in event group.
It would be nice if we can use a hotkey in browser to select a event
to sort.
For example,
# perf report --group
Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, ...
Overhead Command Shared Object Symbol
92.19% 98.68% 0.00% 93.30% mgen mgen [.] LOOP1
3.12% 0.29% 0.00% 0.16% gsd-color libglib-2.0.so.0.5600.4 [.] 0x0000000000049515
1.56% 0.03% 0.00% 0.04% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494b7
1.56% 0.01% 0.00% 0.00% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494ce
1.56% 0.00% 0.00% 0.00% mgen [kernel.kallsyms] [k] task_tick_fair
0.00% 0.15% 0.00% 0.04% perf [kernel.kallsyms] [k] smp_call_function_single
0.00% 0.13% 0.00% 6.08% swapper [kernel.kallsyms] [k] intel_idle
0.00% 0.03% 0.00% 0.00% gsd-color libglib-2.0.so.0.5600.4 [.] g_main_context_check
0.00% 0.03% 0.00% 0.00% swapper [kernel.kallsyms] [k] apic_timer_interrupt
0.00% 0.03% 0.00% 0.00% swapper [kernel.kallsyms] [k] check_preempt_curr
When user press hotkey '3' (event index, starting from 0), it indicates
to sort output by the forth event in group.
Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, ...
Overhead Command Shared Object Symbol
92.19% 98.68% 0.00% 93.30% mgen mgen [.] LOOP1
0.00% 0.13% 0.00% 6.08% swapper [kernel.kallsyms] [k] intel_idle
3.12% 0.29% 0.00% 0.16% gsd-color libglib-2.0.so.0.5600.4 [.] 0x0000000000049515
0.00% 0.00% 0.00% 0.06% swapper [kernel.kallsyms] [k] hrtimer_start_range_ns
1.56% 0.03% 0.00% 0.04% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494b7
0.00% 0.15% 0.00% 0.04% perf [kernel.kallsyms] [k] smp_call_function_single
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] update_curr
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] apic_timer_interrupt
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] native_apic_msr_eoi_write
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] __update_load_avg_se
v6:
---
Jiri provided a good improvement to eliminate unneeded refresh.
This improvement is added to v6.
v2:
---
1. Report warning at helpline when index is invalid.
2. Report warning at helpline when it's not group event.
3. Use "case '0' ... '9'" to refine the code
4. Split K_RELOAD implementation to another patch.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200220013616.19916-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Sometimes we may need to reload the browser to update the output since
some options are changed.
This patch creates a new key K_RELOAD. Once the __cmd_report() returns
K_RELOAD, it would repeat the whole process, such as, read samples from
data file, sort the data and display in the browser.
v5:
---
1. Fix the 'make NO_SLANG=1' error. Define K_RELOAD in util/hist.h.
2. Skip setup_sorting() in repeat path if last key is K_RELOAD.
v4:
---
Need to quit in perf_evsel_menu__run if key is K_RELOAD.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200220013616.19916-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When performing "perf report --group", it shows the event group
information together. By default, the output is sorted by the first
event in group.
It would be nice for user to select any event for sorting. This patch
introduces a new option "--group-sort-idx" to sort the output by the
event at the index n in event group.
For example,
Before:
# perf report --group --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, BR_MISP_RETIRED.ALL_BRANCHES:pp, cpu/event=0xc0,umask=1,cmask=1,
# Event count (approx.): 6451235635
#
# Overhead Command Shared Object Symbol
# ................................ ......... ....................... ...................................
#
92.19% 98.68% 0.00% 93.30% mgen mgen [.] LOOP1
3.12% 0.29% 0.00% 0.16% gsd-color libglib-2.0.so.0.5600.4 [.] 0x0000000000049515
1.56% 0.03% 0.00% 0.04% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494b7
1.56% 0.01% 0.00% 0.00% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494ce
1.56% 0.00% 0.00% 0.00% mgen [kernel.kallsyms] [k] task_tick_fair
0.00% 0.15% 0.00% 0.04% perf [kernel.kallsyms] [k] smp_call_function_single
0.00% 0.13% 0.00% 6.08% swapper [kernel.kallsyms] [k] intel_idle
0.00% 0.03% 0.00% 0.00% gsd-color libglib-2.0.so.0.5600.4 [.] g_main_context_check
0.00% 0.03% 0.00% 0.00% swapper [kernel.kallsyms] [k] apic_timer_interrupt
...
After:
# perf report --group --stdio --group-sort-idx 3
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, BR_MISP_RETIRED.ALL_BRANCHES:pp, cpu/event=0xc0,umask=1,cmask=1,
# Event count (approx.): 6451235635
#
# Overhead Command Shared Object Symbol
# ................................ ......... ....................... ...................................
#
92.19% 98.68% 0.00% 93.30% mgen mgen [.] LOOP1
0.00% 0.13% 0.00% 6.08% swapper [kernel.kallsyms] [k] intel_idle
3.12% 0.29% 0.00% 0.16% gsd-color libglib-2.0.so.0.5600.4 [.] 0x0000000000049515
0.00% 0.00% 0.00% 0.06% swapper [kernel.kallsyms] [k] hrtimer_start_range_ns
1.56% 0.03% 0.00% 0.04% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494b7
0.00% 0.15% 0.00% 0.04% perf [kernel.kallsyms] [k] smp_call_function_single
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] update_curr
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] apic_timer_interrupt
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] native_apic_msr_eoi_write
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] __update_load_avg_se
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] scheduler_tick
Now the output is sorted by the fourth event in group.
v7:
---
Rebase to latest perf/core, no other change.
v4:
---
1. Update Documentation/perf-report.txt to mention
'--group-sort-idx' support multiple groups with different
amount of events and it should be used on grouped events.
2. Update __hpp__group_sort_idx(), just return when the
idx is out of limit.
3. Return failure on symbol_conf.group_sort_idx && !session->evlist->nr_groups.
So now we don't need to use together with --group.
v3:
---
Refine the code in __hpp__group_sort_idx().
Before:
for (i = 1; i < nr_members; i++) {
if (i == idx) {
ret = field_cmp(fields_a[i], fields_b[i]);
if (ret)
goto out;
}
}
After:
if (idx >= 1 && idx < nr_members) {
ret = field_cmp(fields_a[idx], fields_b[idx]);
if (ret)
goto out;
}
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200220013616.19916-2-yao.jin@linux.intel.com
[ Renamed pair_fields_alloc() to hist_entry__new_pair() and combined decl + assignment of vars ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In previous patch, we have supported the annotation functionality even
without symbols.
For this patch, it supports the hotkey 'a' on address in report view.
Note that, for branch mode, we only support the annotation for "branch
to" address.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200227043939.4403-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For perf report on stripped binaries it is currently impossible to do
annotation. The annotation state is all tied to symbols, but there are
either no symbols, or symbols are not covering all the code.
We should support the annotation functionality even without symbols.
This patch fakes a symbol and the symbol name is the string of address.
After that, we just follow current annotation working flow.
For example,
1. perf report
Overhead Command Shared Object Symbol
20.67% div libc-2.27.so [.] __random_r
17.29% div libc-2.27.so [.] __random
10.59% div div [.] 0x0000000000000628
9.25% div div [.] 0x0000000000000612
6.11% div div [.] 0x0000000000000645
2. Select the line of "10.59% div div [.] 0x0000000000000628" and ENTER.
Annotate 0x0000000000000628
Zoom into div thread
Zoom into div DSO (use the 'k' hotkey to zoom directly into the kernel)
Browse map details
Run scripts for samples of symbol [0x0000000000000628]
Run scripts for all samples
Switch to another data file in PWD
Exit
3. Select the "Annotate 0x0000000000000628" and ENTER.
Percent│
│
│
│ Disassembly of section .text:
│
│ 0000000000000628 <.text+0x68>:
│ divsd %xmm4,%xmm0
│ divsd %xmm3,%xmm1
│ movsd (%rsp),%xmm2
│ addsd %xmm1,%xmm0
│ addsd %xmm2,%xmm0
│ movsd %xmm0,(%rsp)
Now we can see the dump of object starting from 0x628.
v5:
---
Remove the hotkey 'a' implementation from this patch. It
will be moved to a separate patch.
v4:
---
1. Support the hotkey 'a'. When we press 'a' on address,
now it supports the annotation.
2. Change the patch title from
"Support interactive annotation of code without symbols" to
"perf report: Support interactive annotation of code without symbols"
v3:
---
Keep just the ANNOTATION_DUMMY_LEN, and remove the
opts->annotate_dummy_len since it's the "maybe in future
we will provide" feature.
v2:
---
Fix a crash issue when annotating an address in "unknown" object.
The steps to reproduce this issue:
perf record -e cycles:u ls
perf report
75.29% ls ld-2.27.so [.] do_lookup_x
23.64% ls ld-2.27.so [.] __GI___tunables_init
1.04% ls [unknown] [k] 0xffffffff85c01210
0.03% ls ld-2.27.so [.] _start
When annotating 0xffffffff85c01210, the crash happens.
v2 adds checking for ms->map in add_annotate_opt(). If the object is
"unknown", ms->map is NULL.
Committer notes:
Renamed new_annotate_sym() to symbol__new_unresolved().
Use PRIx64 to fix this issue in some 32-bit arches:
ui/browsers/hists.c: In function 'symbol__new_unresolved':
ui/browsers/hists.c:2474:38: error: format '%lx' expects argument of type 'long unsigned int', but argument 5 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
snprintf(name, sizeof(name), "%-#.*lx", BITS_PER_LONG / 4, addr);
~~~~~~^ ~~~~
%-#.*llx
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200227043939.4403-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull RCU changes from Paul E. McKenney:
- Make kfree_rcu() use kfree_bulk() for added performance
- RCU updates
- Callback-overload handling updates
- Tasks-RCU KCSAN and sparse updates
- Locking torture test and RCU torture test updates
- Documentation updates
- Miscellaneous fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add missing Makefile for net/forwarding tests and include it to
the targets list, otherwise forwarding tests are not installed
in case of cross-compilation.
Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kunit.py utility builds an ARCH=um kernel and then runs it. Add
optional --make_options flag to kunit.py allowing for the operator to
specify extra build options.
This allows use of the clang compiler for kunit:
tools/testing/kunit/kunit.py run --defconfig \
--make_options CC=clang --make_options HOSTCC=clang
Signed-off-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: David Gow <davidgow@google.com>
Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
To reduce the reliance of trace samples (trace*_user) on bpf_load,
move read_trace_pipe to trace_helpers. By moving this bpf_loader helper
elsewhere, trace functions can be easily migrated to libbbpf.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200321100424.1593964-2-danieltimlee@gmail.com
This patch adds test to exercise the bpf_sk_storage_get()
and bpf_sk_storage_delete() helper from the bpf_dctcp.c.
The setup and check on the sk_storage is done immediately
before and after the connect().
This patch also takes this chance to move the pthread_create()
after the connect() has been done. That will remove the need of
the "wait_thread" label.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200320152107.2169904-1-kafai@fb.com
Add an alternative format that can be more easily used for further
processing later on.
Note that we add a timestamp in the first column for both, the regular
and the new csv format.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Message-Id: <20200306114250.57585-5-raspl@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This now controls both, the refresh rate of the interactive mode as well
as the logging mode. Which, as a consequence, means that the default of
logging mode is now 3s, too (use command line switch '-s' to adjust to
your liking).
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Message-Id: <20200306114250.57585-4-raspl@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
optparse is deprecated for a while, hence switching over to argparse
(which also works with python2).
As a consequence, help output has some subtle changes, the most
significant one being that the options are all listed explicitly
instead of a universal '[options]' indicator. Also, some of the error
messages are phrased slightly different.
While at it, squashed a number of minor PEP8 issues.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Message-Id: <20200306114250.57585-3-raspl@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make sure command line arguments are sorted alphabetically
everywhere, and adjusted existing texts for interactive command 's' to
become consistent with the long form --set-delay.
Throwing in some PEP8 fixes (all cosmetics) for good measure.
Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Message-Id: <20200306114250.57585-2-raspl@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After copying Arm64's perf archive with object files and perf.data file
to x86 laptop, the x86's perf kernel symbol resolution fails. It
outputs 'unknown' for all symbols parsing.
This issue is root caused by the function elf__needs_adjust_symbols(),
x86 perf tool uses one weak version, Arm64 (and powerpc) has rewritten
their own version. elf__needs_adjust_symbols() decides if need to parse
symbols with the relative offset address; but x86 building uses the weak
function which misses to check for the elf type 'ET_DYN', so that it
cannot parse symbols in Arm DSOs due to the wrong result from
elf__needs_adjust_symbols().
The DSO parsing should not depend on any specific architecture perf
building; e.g. x86 perf tool can parse Arm and Arm64 DSOs, vice versa.
And confirmed by Naveen N. Rao that powerpc64 kernels are not being
built as ET_DYN anymore and change to ET_EXEC.
This patch removes the arch specific functions for Arm64 and powerpc and
changes elf__needs_adjust_symbols() as a common function.
In the common elf__needs_adjust_symbols(), it checks an extra condition
'ET_DYN' for elf header type. With this fixing, the Arm64 DSO can be
parsed properly with x86's perf tool.
Before:
# perf script
main 3258 1 branches: 0 [unknown] ([unknown]) => ffff800010c4665c [unknown] ([kernel.kallsyms])
main 3258 1 branches: ffff800010c46670 [unknown] ([kernel.kallsyms]) => ffff800010c4eaec [unknown] ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4eaec [unknown] ([kernel.kallsyms]) => ffff800010c4eb00 [unknown] ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4eb08 [unknown] ([kernel.kallsyms]) => ffff800010c4e780 [unknown] ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4e7a0 [unknown] ([kernel.kallsyms]) => ffff800010c4eeac [unknown] ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4eebc [unknown] ([kernel.kallsyms]) => ffff800010c4ed80 [unknown] ([kernel.kallsyms])
After:
# perf script
main 3258 1 branches: 0 [unknown] ([unknown]) => ffff800010c4665c coresight_timeout+0x54 ([kernel.kallsyms])
main 3258 1 branches: ffff800010c46670 coresight_timeout+0x68 ([kernel.kallsyms]) => ffff800010c4eaec etm4_enable_hw+0x3cc ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4eaec etm4_enable_hw+0x3cc ([kernel.kallsyms]) => ffff800010c4eb00 etm4_enable_hw+0x3e0 ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4eb08 etm4_enable_hw+0x3e8 ([kernel.kallsyms]) => ffff800010c4e780 etm4_enable_hw+0x60 ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4e7a0 etm4_enable_hw+0x80 ([kernel.kallsyms]) => ffff800010c4eeac etm4_enable+0x2d4 ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4eebc etm4_enable+0x2e4 ([kernel.kallsyms]) => ffff800010c4ed80 etm4_enable+0x1a8 ([kernel.kallsyms])
v3: Changed to check for ET_DYN across all architectures.
v2: Fixed Arm64 and powerpc native building.
Reported-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Allison Randal <allison@lohutok.net>
Cc: Enrico Weigelt <info@metux.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Link: http://lore.kernel.org/lkml/20200306015759.10084-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reproducible with a clang asan build and then running perf test in
particular 'Parse event definition strings'.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20200314170356.62914-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Handy for testing with distro kernels.
Warn that the resulting module is completely unsupported,
and isn't intended for production use.
Usage:
make oot # builds vhost_test.ko, vhost.ko
make oot-clean # cleans out files created
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Many systems build/test up-to-date kernels with older libcs, and
an older glibc (2.17) lacks the definition of SOL_DCCP in
/usr/include/bits/socket.h (it was added in the 4.6 timeframe).
Adding the definition to the test program avoids a compilation
failure that gets in the way of building tools/testing/selftests/net.
The test itself will work once the definition is added; either
skipping due to DCCP not being configured in the kernel under test
or passing, so there are no other more up-to-date glibc dependencies
here it seems beyond that missing definition.
Fixes: 11fb60d1089f ("selftests: net: reuseport_addr_any: add DCCP")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Statistics on timestamps is useful to quantify average and tail latency.
Print timestamp statistics in count/avg/min/max format.
Signed-off-by: Jian Yang <jianyang@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the following new flags:
-e: use level-triggered epoll() instead of poll().
-E: use event-triggered epoll() instead of poll().
Signed-off-by: Jian Yang <jianyang@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A longer sleep duration between sendmsg()s makes more cachelines to be
evicted and results in higher latency. Making the duration configurable.
Add the following new flags:
-S: Configurable sleep duration.
-b: Busy loop instead of poll().
Remove the following flag:
-D: No delay between packets: subsumed by -S.
Signed-off-by: Jian Yang <jianyang@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Txtimestamp reports latencies in uses resolution, while nsec is needed
in cases such as measuring latencies on localhost.
Add the following new flag:
-N: print timestamps and durations in nsec (instead of usec)
Signed-off-by: Jian Yang <jianyang@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>