IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Currently in perf IPC and other metrics cannot be directly shown
separately for both user and kernel in a single run. The problem was
that the metrics matching code did not check event qualifiers.
With this patch the following case works correctly.
% perf stat -e cycles:k,cycles:u,instructions:k,instructions:u true
Performance counter stats for 'true':
531,718 cycles:k
203,895 cycles:u
338,151 instructions:k # 0.64 insns per cycle
105,961 instructions:u # 0.52 insns per cycle
0.002989739 seconds time elapsed
Previously it would misreport the ratios because they were matching the
wrong value.
The patch is fairly big, but quite mechanic as it just adds context
indexes everywhere.
Reported-by: William Cohen <wcohen@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: William Cohen <wcohen@redhat.com>
Link: http://lkml.kernel.org/r/1428441919-23099-3-git-send-email-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding support to return error information from parse_events function.
Following struct will be populated by parse_events function on return:
struct parse_events_error {
int idx;
char *str;
char *help;
};
where 'idx' is the position in the string where the parsing failed,
'str' contains dynamically allocated error string describing the error
and 'help' is optional help string.
The change contains reporting function, which currently does not display
anything. The code changes to supply error data for specific event types
are coming in next patches. However this is what the expected output is:
$ sudo perf record -e 'sched:krava' ls
event syntax error: 'sched:krava'
\___ unknown tracepoint
...
$ perf record -e 'cpu/even=0x1/' ls
event syntax error: 'cpu/even=0x1/'
\___ unknown term
valid terms: pc,any,inv,edge,cmask,event,in_tx,ldlat,umask,in_tx_cp,offcore_rsp,config,config1,config2,name,period,branch_type
...
$ perf record -e cycles,cache-mises ls
event syntax error: '..es,cache-mises'
\___ parser error
...
The output functions cut the beginning of the event string so the error
starts up to 10th character and cut the end of the string of it crosses
the terminal width.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1429729824-13932-2-git-send-email-jolsa@kernel.org
[ Renamed 'error' variables to 'err', not to clash with util.h error() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use of a bad filter currently generates the message:
Error: failed to set filter with 22 (Invalid argument)
Add the event name to make it clear to which event the filter
failed to apply:
Error: Failed to set filter "foo" on event sched:sg_lb_stats: 22: Invalid argument
To test it use something like:
# perf record -e sched:sched_switch -e sched:*fork --filter parent_pid==1 -e sched:*wait* --filter bla usleep 1
Error: failed to set filter "bla" on event sched:sched_stat_iowait with 22 (Invalid argument)
#
Based-on-a-patch-by: David Ahern <dsahern@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-d7gq2fjvaecozp9o2i0siifu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When cycles or instructions do not print anything, as in being,
--per-socket or --per-core modi, the ratio column was not correctly
indented for them. This lead to some ratios not lining up with the
others. Always indent correctly when nothing is printed.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1426087682-22765-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The information how much a counter ran in 'perf stat' can be quite
interesting for other tools to judge how trustworthy a measurement is.
Currently it is only output in non CSV mode.
This patches make perf stat always output the running time and the
enabled/running ratio in CSV mode.
This adds two new fields at the end for each line. I assume that
existing tools ignore new fields at the end, so it's on by default.
Only CSV mode is affected, no difference otherwise.
v2: Add extra print_running function
v3: Avoid printing nan
v4: Remove some elses and add brackets.
v5: Move non CSV case into print_running
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1426083387-17006-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 1971f59 (perf stat: Use read_counter in read_counter_aggr )
broke the perf stat output for unsupported counters.
$ perf stat -v -a -C 0 -e CCI_400/config=24/ sleep 1
Warning:
CCI_400/config=24/ event is not supported by the kernel.
Performance counter stats for 'system wide':
0 CCI_400/config=24/
1.080265400 seconds time elapsed
Where it used to be :
$ perf stat -v -a -C 0 -e CCI_400/config=24/ sleep 1
Warning:
CCI_400/config=24/ event is not supported by the kernel.
Performance counter stats for 'system wide':
<not supported> CCI_400/config=24/
1.083840675 seconds time elapsed
This patch fixes the issues by checking if the counter is supported,
before reading and logging the counter value.
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Acked-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1423852858-8455-1-git-send-email-suzuki.poulose@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On systems with more than one socket perf stat --per-core would either
segfault or stop before outputting all cores.
The problem was that the output code referenced the id including the
socket number in the higher bits, which is far beyond any per cpu array.
Mask out the socket number before referencing cpus in abs_printout.
I also renamed the variable in nsec_printout to be clear what it is,
even though it doesn't reference cpus.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Stephane Eranian <eranian@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1411591846-32736-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf stat did initialize the stats structure used to compute
stddev etc. incorrectly. It merely zeroes it. But one member
(min) needs to be set to a non zero value. This causes min
to be not computed at all. Call init_stats() correctly.
It doesn't matter for stat currently because it doesn't use
min, but it's still better to do it correctly.
The other users of statistics are already correct.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1395768699-16060-1-git-send-email-andi@firstfloor.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
When starting a workload 'stat' wasn't using prepare_workload evlist
method's signal based exec() error reporting mechanism.
Use it so that the we don't report 'not counted' counters.
Before:
[acme@zoo linux]$ perf stat dfadsfa
dfadsfa: No such file or directory
Performance counter stats for 'dfadsfa':
<not counted> task-clock
<not counted> context-switches
<not counted> cpu-migrations
<not counted> page-faults
<not counted> cycles
<not counted> stalled-cycles-frontend
<not supported> stalled-cycles-backend
<not counted> instructions
<not counted> branches
<not counted> branch-misses
0.001831462 seconds time elapsed
[acme@zoo linux]$
After:
[acme@zoo linux]$ perf stat dfadsfa
dfadsfa: No such file or directory
[acme@zoo linux]$
Reported-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-5yui3bv7e3hitxucnjsn6z8q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds perf stat support for handling event units and
scales as exported by the kernel.
The kernel can export PMU events actual unit and scaling factor
via sysfs:
$ ls -1 /sys/devices/power/events/energy-*
/sys/devices/power/events/energy-cores
/sys/devices/power/events/energy-cores.scale
/sys/devices/power/events/energy-cores.unit
/sys/devices/power/events/energy-pkg
/sys/devices/power/events/energy-pkg.scale
/sys/devices/power/events/energy-pkg.unit
$ cat /sys/devices/power/events/energy-cores.scale
2.3283064365386962890625e-10
$ cat cat /sys/devices/power/events/energy-cores.unit
Joules
This patch modifies the pmu event alias code to check
for the presence of the .unit and .scale files to load
the corresponding values. They are then used by perf stat
transparently:
# perf stat -a -e power/energy-pkg/,power/energy-cores/,cycles -I 1000 sleep 1000
# time counts unit events
1.000214717 3.07 Joules power/energy-pkg/ [100.00%]
1.000214717 0.53 Joules power/energy-cores/
1.000214717 12965028 cycles [100.00%]
2.000749289 3.01 Joules power/energy-pkg/
2.000749289 0.52 Joules power/energy-cores/
2.000749289 15817043 cycles
When the event does not have an explicit unit exported by
the kernel, nothing is printed. In csv output mode, there
will be an empty field.
Special thanks to Jiri for providing the supporting code
in the parser to trigger reading of the scale and unit files.
Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: zheng.z.yan@intel.com
Cc: bp@alien8.de
Cc: maria.n.dimakopoulou@gmail.com
Cc: acme@redhat.com
Link: http://lkml.kernel.org/r/1384275531-10892-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When only the instructions event is requested:
$ perf stat -e instructions git s
M builtin-stat.c
Performance counter stats for 'git s':
917,453,420 instructions # 0.00 insns per cycle
0.213002926 seconds time elapsed
The 0.00 insns per cycle comment in the output is totally bogus and
misleading. It happens because update_shadow_stats() doesn't touch
runtime_cycles_stats when only the instructions event is requested. So,
omit printing the bogus data altogether.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1380616604-4077-1-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When only the cycles event is requested:
$ perf stat -e cycles dd if=/dev/zero of=/dev/null count=1000000
1000000+0 records in
1000000+0 records out
512000000 bytes (512 MB) copied, 0.26123 s, 2.0 GB/s
Performance counter stats for 'dd if=/dev/zero of=/dev/null count=1000000':
911,626,453 cycles # 0.000 GHz
0.262113350 seconds time elapsed
The 0.000 GHz comment in the output is totally bogus and misleading. It
happens because update_shadow_stats() doesn't touch runtime_nsecs_stats;
it is only written when a requested counter matches a SW_TASK_CLOCK. In
our case, since we have only requested HW_CPU_CYCLES,
runtime_nsecs_stats is unavailable. So, omit printing the comment
altogether.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1380539585-23859-3-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support to perf stat to print the basic transactional execution statistics:
Total cycles, Cycles in Transaction, Cycles in aborted transsactions
using the in_tx and in_tx_checkpoint qualifiers.
Transaction Starts and Elision Starts, to compute the average transaction
length.
This is a reasonable overview over the success of the transactions.
Also support architectures that have a transaction aborted cycles
counter like POWER8. Since that is awkward to handle in the kernel
abstract handle both cases here.
Enable with a new --transaction / -T option.
This requires measuring these events in a group, since they depend on each
other.
This is implemented by using TM sysfs events exported by the kernel
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnaldo Carvalho de Melo <acme@infradead.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1377128846-977-5-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When measuring workloads the startup phase -- doing page faults, dynamic
linking, opening files -- is often very different from the rest of the
workload. Especially with smaller kernels and using counter
multiplexing this can give significant measurement errors.
Multiplexing assumes that the workload is mostly the same over longer
periods. But at startup there is typically some spike of activity which
is relatively short. If many groups are multiplexing the one group
seeing the spike, and which is then scaled up over the time to run all
groups, may see a significant error.
Also in general it's often not useful to measure the startup, because it
is so different from the rest.
One way around this is to use interval mode and discard the first
sample, but this can be awkward because interval mode doesn't support
intervals of less than 100ms, and also a useful interval is not
necessarily the same as a useful startup delay.
This patch adds a new --initial-delay / -D option to skip measuring for
the startup phase. The time can be specified in ms
Here's a simple example:
perf stat -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
...
3,721 page-faults
...
If we just wait 20 ms the number of page faults is 1/3 less:
perf stat -D 20 -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
...
2,823 page-faults
...
So we filtered out most of the startup noise from bash.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1375490473-1503-4-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch fixes a problem with perf stat whereby on termination it may
send a SIGTERM signal to random processes on systems with high PID
recycling. I got some actual bug reports on this.
There is race between the SIGCHLD and sig_atexit() handlers. This patch
addresses this problem by clearing child_pid in the SIGCHLD handler.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130604154426.GA2928@quad
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds the --per-core option to perf stat.
This option is used to aggregate system-wide counts
on a per physical core basis. On processors with
hyperthreading, this means counts of all HT threads
running on a physical core are aggregated.
This mode is useful to find imblance between physical
cores running an uniform workload. Cores are identified
by socket: S0-C1, means physical core 1 on socket 0. Note
that cores are identified using their physical core id,
thus their numbering may not be continuous.
Per core aggregation can be combined with interval printing:
# perf stat -a --per-core -I 1000 -e cycles sleep 1000
# time core cpus counts events
1.000090030 S0-C0 1 4,765,747 cycles
1.000090030 S0-C1 1 5,580,647 cycles
1.000090030 S0-C2 1 221,181 cycles
1.000090030 S0-C3 1 266,092 cycles
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1360846649-6411-4-git-send-email-eranian@google.com
[ committer note: Remove parts already applied on 86ee6e1 to keep bisectability ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>