IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
There have one problem about hw_breakpoint perf event, as watched, the
events reported to userspace is not correctly, sometime one trigger
bp_event report several events, sometime bp_event cannot go through to
user.
The root cause is attr->freq is 1 passed to kernel defaultly in bp
events, this make kernel calculate event period not as expect, make
sample period to 1 will change attr->freq to 0, to fix this problem.
This patch is similar with commit f92128 about tracepoint events:
perf: Make the trace events sample period default to 1
Signed-off-by: Jovi Zhang <bookjovi@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/CACV3sbLF8taiCq_VYW-sgRJyupeMzg58C7ZXfMe3xZUiH_Mx6w@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Trace events have a period (weight) of 1 by default. This can be
overriden on events definition by using the __perf_count() macro.
For example, the sched_stat_runtime() is weighted with the runtime of
the task that fired the event.
By default, perf handles such weighted event by dividing it into
individual events carrying a weight of 1. For example if
sched_stat_runtime is fired and the task has run 5000000 nsecs, perf
divides it into 5000000 events in the buffer.
This behaviour makes weighted events unusable because they quickly
fullfill the buffers and we lose most events.
The commit 5d81e5cfb3 ("events: Don't
divide events if it has field period") solves this problem by sending
only one event when PERF_SAMPLE_PERIOD flag is set. The weight is
carried in the sample itself such that we don't need to demultiplex it
anymore.
This patch provides the last missing piece to use this feature by
setting PERF_SAMPLE_PERIOD from perf tools when we deal with trace
events.
Before:
$ ./perf record -e sched:* -a sleep 1
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 1.619 MB perf.data (~70749 samples) ]
Warning:
Processed 16909 events and lost 1 chunks!
Check IO/CPU overload!
$ ./perf script
perf 1894 [003] 824.898327: sched_migrate_task: comm=perf pid=1898 prio=120 orig_cpu=2 dest_cpu=0
perf 1894 [003] 824.898335: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
perf 1894 [003] 824.898336: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
perf 1894 [003] 824.898337: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
perf 1894 [003] 824.898338: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
perf 1894 [003] 824.898339: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
perf 1894 [003] 824.898340: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
perf 1894 [003] 824.898341: sched_stat_sleep: comm=perf pid=1898 delay=113179500 [ns]
[...]
After:
$ ./perf record -e sched:* -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.074 MB perf.data (~3228 samples) ]
$ ./perf script
perf 1461 [000] 554.286957: sched_migrate_task: comm=perf pid=1465 prio=120 orig_cpu=3 dest_cpu=1
perf 1461 [000] 554.286964: sched_stat_sleep: comm=perf pid=1465 delay=133047190 [ns]
perf 1461 [000] 554.286967: sched_wakeup: comm=perf pid=1465 prio=120 success=1 target_cpu=001
swapper 0 [001] 554.286976: sched_stat_wait: comm=perf pid=1465 delay=0 [ns]
swapper 0 [001] 554.286983: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=perf
[...]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1342631456-7233-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf improvements from Arnaldo Carvalho de Melo:
* Replace event_name with perf_evsel__name, that handles the event
modifiers and doesn't use static variables.
* GTK browser improvements, from Namhyung Kim
* Fix possible NULL pointer deref in the TUI annotate browser, from
Samuel Liao
* Add sort by source file:line number, using addr2line.
* Allow printing histogram text snapshots at any point in top/report.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We want to reuse the event grammar for parsing aliased terms.
The obvious reason is we dont need to add new code when there's
already support for this in event grammar.
Doing this by adding terms and event start entries into event
parse grammar. The grammar forks on the begining based on the
starting token, which is supplied via bison interface into the
lexer. The lexer then returns the starting token as the first
token, thus making the grammar switch accordingly.
Currently 2 starting tokens/grammars are supported:
PE_START_TERMS, PE_START_EVENTS
The PE_START_TERMS related grammar uses 'event_config' part
of the grammar for term parsing.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1339741902-8449-12-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The modifiers:
k kernel space
u user space
h hypervisor
G guest
H host
p, pp, ppp precision level (PEBS)
that can be suffixed to an event were lost when tools used event_name()
to reconstruct them from the perf_event_attr entries in a perf.data
file.
Fix it by following the defaults used for these modifiers in the current
codebase, so:
$ perf record -e instructions:u usleep 1 2> /dev/null
$ perf evlist
instructions:u
$ perf record -e cycles:k usleep 1 2> /dev/null
$ perf evlist
cycles:k
$ perf record -e cycles:kh usleep 1 2> /dev/null
$ perf evlist
cycles:kh
$ perf record -e cache-misses:G usleep 1 2> /dev/null
$ perf evlist
cache-misses:G
$ perf record -e cycles:ppk usleep 1 2> /dev/null
$ perf evlist
cycles:kpp
$
Also works with 'top', 'report', etc.
More work needed to cover tracepoints and software events while not
dragging lots of baggage to the python binding, this is a minimal fix
for v3.5.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4hl5glle0hxlklw4usva1mkt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding a new hardcoded term 'name' allowing to specify a name for the
pmu event. The term is defined along with standard pmu terms. If no
'name' term is given, the event name follows following template:
"raw 0x<perf_event_attr::config>"
running:
perf stat -e cpu/config=1,name=krava1/u ls
will produce following output:
...
Performance counter stats for 'ls':
0 krava1
...
running:
perf stat -e cpu/config=1/u ls
will produce following output:
...
Performance counter stats for 'ls':
0 raw 0x1
...
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337584373-2741-6-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Added new event rule to the event definition grammar:
event_def: event_pmu |
...
event_pmu: PE_NAME '/' event_config '/'
Using this rule, event could be now specified like:
cpu/config=1,config1=2,config2=3/u
where pmu name 'cpu' is looked up via following path:
${sysfs_mount}/bus/event_source/devices/${pmu}
and config options are bound to the pmu's format definiton:
${sysfs_mount}/bus/event_source/devices/${pmu}/format
The hardcoded config options still stays and have precedence
over any format field defined with same name.
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-50d8nr94f8k4wkezutrxvthe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding a new rule to the event grammar to be able to specify
values of additional attributes of symbolic event.
The new syntax for event symbolic definition is:
event_legacy_symbol: PE_NAME_SYM '/' event_config '/' |
PE_NAME_SYM sep_slash_dc
event_config: event_config ',' event_term | event_term
event_term: PE_NAME '=' PE_NAME |
PE_NAME '=' PE_VALUE
PE_NAME
sep_slash_dc: '/' | ':' |
At the moment the config options are hardcoded to be used for legacy
symbol events to define several perf_event_attr fields. It is:
'config' to define perf_event_attr::config
'config1' to define perf_event_attr::config1
'config2' to define perf_event_attr::config2
'period' to define perf_event_attr::sample_period
Legacy events could be now specified as:
cycles/period=100000/
If term is specified without the value assignment, then 1 is
assigned by default.
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-mgkavww9790jbt2jdkooyv4q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch fixes a buffer overrun bug in
tracepoint_id_to_path(). The bug manisfested itself as a memory
error reported by perf record. I ran into it with perf sched:
$ perf sched rec noploop 2 noploop for 2 seconds
[ perf record: Woken up 14 times to write data ]
[ perf record: Captured and wrote 42.701 MB perf.data (~1865622 samples) ]
Fatal: No memory to alloc tracepoints list
It turned out that tracepoint_id_to_path() was reading the
tracepoint id using read() but the buffer was not large enough
to include the \n terminator for id with 4 digits or more.
The patch fixes the problem by extending the buffer to a more
reasonable size covering all possible id length include \n
terminator. Note that atoll() stops at the first non digit
character, thus it is not necessary to clear the buffer between
each read.
Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: fweisbec@gmail.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/20120313155102.GA6465@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make use of exclude_guest and exlude_host in perf-kvm to do only
guest-only counting by default.
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
[ committer note: Moved perf_{guest,host} & event_attr_init to util.c ]
[ so as not to drag more stuff to the python binding]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We don't need to have two PATH_MAX char sized arrays holding it, just
one in util/debugfs.c will do.
Also rename debugfs_path to tracing_events_path, as it is not the path
to debugfs, that is debugfs_mountpoint. Both are now accessible.
This will allow accessing this code in the perf python binding without
having to drag in perf.c and util/parse-events.c.
The defaults for these variables are the canonical "/sys/kernel/debug"
and "/sys/kernel/debug/tracing/events/", removing the need for simple
tools to call debugfs_mount(NULL).
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ug9jvtjrsqbluuhqqxpvg30f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There was a problem with the parse_events() code not printing the
correct event name when an event was unknown and starting with an 'r'.
The source of the problem was the way raw notation was parsed.
Without the patch:
$ perf stat -e retired_foo
invalid event modifier: 'tired_foo'
With the patch:
$ perf stat -e retired_foo
invalid or unsupported event: 'retired_foo'
This also covers the case where the name of the event was not printed at
all when perf was linked with libpfm4.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20110723021043.GA20178@quad
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add "node" as a simple alias for NODE cache events.
The addition of NODE cache events broke the parse_alias
function, so any mismatched event caused the segfault, like:
# ./perf stat -e krava ls
The hw_cache/hw_cache_op/hw_cache_result arrays needs to follow
PERF_COUNT_HW_CACHE_*MAX enums. Adding those MAXs to be size
of those arrays, so possible ommision in future wil not lead to
segfault.
Adding read/write/prefetch as allowed operations for node cache
event.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: acme@redhat.com
Link: http://lkml.kernel.org/r/20110713205818.GB7827@jolsa.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch fixes an issue with event parsing.
The following commit appears to have broken the
ability to specify a comma separated list of events:
commit ceb53fbf6d
Author: Ingo Molnar <mingo@elte.hu>
Date: Wed Apr 27 04:06:33 2011 +0200
perf stat: Fail more clearly when an invalid modifier is specified
This patch fixes this while preserving the desired effect:
$ perf stat -e instructions:u,instructions:k ls /dev/null /dev/null
Performance counter stats for 'ls /dev/null':
365956 instructions:u # 0.00 insns per cycle
731806 instructions:k # 0.00 insns per cycle
0.001108862 seconds time elapsed
$ perf stat -e task-clock-msecs true
invalid event modifier: '-msecs'
Run 'perf list' for a list of valid events and modifiers
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: acme@redhat.com
Cc: peterz@infradead.org
Cc: fweisbec@gmail.com
Link: http://lkml.kernel.org/r/20110517133619.GA6999@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The new default output looks like this:
Performance counter stats for './loop_1b_instructions':
236.010686 task-clock # 0.996 CPUs utilized
0 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
99 page-faults # 0.000 M/sec
756,487,646 cycles # 3.205 GHz
354,938,996 stalled-cycles # 46.92% of all cycles are idle
1,001,403,797 instructions # 1.32 insns per cycle
# 0.35 stalled cycles per insn
100,279,773 branches # 424.895 M/sec
12,646 branch-misses # 0.013 % of all branches
0.236902540 seconds time elapsed
We dropped cache-refs and cache-misses and added stalled-cycles - this is a
more generic "how well utilized is the CPU" metric.
If the stalled-cycles ratio is too high then more specific measurements can be
taken to figure out the source of the inefficiency.
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-pbpl2l4mn797s69bclfpwkwn@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Right now we display this by default:
0.202204 task-clock-msecs # 0.282 CPUs
0 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
85 page-faults # 0.420 M/sec
The task-clock-msecs event cannot actually be passed back as an
event name, the event name we recognize is 'task-clock'.
So change the output of the cpu-clock and task-clock events
to be idempotent.
( Units should be printed out in the right-side column, if needed. )
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-lexrnbzy09asscgd4f7oac4i@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently we fail without printing any error message on "perf stat -e task-clock-msecs".
The reason is that the task-clock event is matched and the "-msecs" postfix is assumed
to be an event modifier - but is not recognized.
This patch changes the code to be more informative:
$ perf stat -e task-clock-msecs true
invalid event modifier: '-msecs'
Run 'perf list' for a list of valid events and modifiers
And restructures the return value of parse_event_modifier() to allow
the printing of all variants of invalid event modifiers.
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-wlaw3dvz1ly6wple8l52cfca@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We currently fail on something like '-e CPU-migrations', with:
invalid or unsupported event: 'CPU-migrations'
While 'CPU-migrations' is how we actually print out the event
in the default perf stat output:
Performance counter stats for 'true':
0.202204 task-clock-msecs # 0.282 CPUs
0 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
So change the matching to be case-insensitive.
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-omcm3edjjtx83a4kh2e244se@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch changes the way perf stat prints event names at the end of a
run. Until now, it was trying to reconstruct the event name from its
encoding. The problem is that it would only print generic events without
their modifiers (u, k, pp).
This patch saves the event name as passed by the user in the evsel
struct and uses it to print the final event name.
This would also work in case perf is linked with a library (such as
libpfm4) which provides full PMU event tables.
$ perf stat -e cycles:u,cycles:k date
Wed Feb 16 14:58:52 CET 2011
Performance counter stats for 'date':
568600 cycles:u
2779715 cycles:k
0.001908182 seconds time elapsed
Cc: Arun Sharma <arun@sharma-home.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@gmail.com>
LPU-Reference: <4d5bdc64.98a1df0a.7aa3.06c2@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
[ committer note: Fixed a merge problem with 023695d "Add cgroup support" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is needed because it will call parse_event for each tracepoint
name that matches, and we pass the perf_evlist via opt->value.
Problem introduced in 4503fdd where my assumption about opt being
always non NULL made me not look at callers of parse_events outside
builtin-*.c.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>