2012-11-20 02:21:03 +04:00
# include <linux/hw_breakpoint.h>
2015-09-07 11:38:06 +03:00
# include <linux/err.h>
2009-05-26 13:10:09 +04:00
# include "util.h"
2009-09-04 23:39:51 +04:00
# include "../perf.h"
2011-01-12 01:56:53 +03:00
# include "evlist.h"
2011-01-03 21:39:04 +03:00
# include "evsel.h"
2009-05-26 13:10:09 +04:00
# include "parse-options.h"
# include "parse-events.h"
# include "exec_cmd.h"
2014-10-07 19:08:49 +04:00
# include "string.h"
2010-03-26 01:59:00 +03:00
# include "symbol.h"
2009-07-21 22:16:29 +04:00
# include "cache.h"
2009-09-12 09:52:51 +04:00
# include "header.h"
perf tools: Enable passing bpf object file to --event
By introducing new rules in tools/perf/util/parse-events.[ly], this
patch enables 'perf record --event bpf_file.o' to select events by an
eBPF object file. It calls parse_events_load_bpf() to load that file,
which uses bpf__prepare_load() and finally calls bpf_object__open() for
the object files.
After applying this patch, commands like:
# perf record --event foo.o sleep
become possible.
However, at this point nothing useful can be linked onto the
evsel list because probe point creation and BPF program
attaching have not been implemented yet. Until real events can
be extracted, this patch links a dummy evsel to avoid a perf report
error caused by an empty evsel list. The dummy event related code will be
removed when the probing and extracting code is ready.
Committer notes:
Using it:
$ ls -la foo.o
ls: cannot access foo.o: No such file or directory
$ perf record --event foo.o sleep
libbpf: failed to open foo.o: No such file or directory
event syntax error: 'foo.o'
\___ BPF object file 'foo.o' is invalid
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
$
$ file /tmp/build/perf/perf.o
/tmp/build/perf/perf.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
$ perf record --event /tmp/build/perf/perf.o sleep
libbpf: /tmp/build/perf/perf.o is not an eBPF object file
event syntax error: '/tmp/build/perf/perf.o'
\___ BPF object file '/tmp/build/perf/perf.o' is invalid
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
$
$ file /tmp/foo.o
/tmp/foo.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped
$ perf record --event /tmp/foo.o sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data ]
$ perf evlist
/tmp/foo.o
$ perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
$
So, type 1 is PERF_TYPE_SOFTWARE, config 0x9 is PERF_COUNT_SW_DUMMY, ok.
$ perf report --stdio
Error:
The perf.data file has no samples!
# To display the perf.data header info, please use --header/--header-only options.
#
$
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-14 15:41:14 +03:00
# include "bpf-loader.h"
2014-08-14 06:22:36 +04:00
# include "debug.h"
2015-09-02 10:56:34 +03:00
# include <api/fs/tracing_path.h>
2012-06-15 10:31:39 +04:00
# include "parse-events-bison.h"
2012-06-15 10:31:40 +04:00
# define YY_EXTRA_TYPE int
2012-03-15 23:09:15 +04:00
# include "parse-events-flex.h"
2012-03-15 23:09:18 +04:00
# include "pmu.h"
2013-08-27 06:41:53 +04:00
# include "thread_map.h"
2015-06-23 01:36:04 +03:00
# include "cpumap.h"
perf tools: Add parse_events_error interface
Adding support to return error information from parse_events function.
Following struct will be populated by parse_events function on return:
struct parse_events_error {
int idx;
char *str;
char *help;
};
where 'idx' is the position in the string where the parsing failed,
'str' contains dynamically allocated error string describing the error
and 'help' is optional help string.
The change contains reporting function, which currently does not display
anything. The code changes to supply error data for specific event types
are coming in next patches. However this is what the expected output is:
$ sudo perf record -e 'sched:krava' ls
event syntax error: 'sched:krava'
\___ unknown tracepoint
...
$ perf record -e 'cpu/even=0x1/' ls
event syntax error: 'cpu/even=0x1/'
\___ unknown term
valid terms: pc,any,inv,edge,cmask,event,in_tx,ldlat,umask,in_tx_cp,offcore_rsp,config,config1,config2,name,period,branch_type
...
$ perf record -e cycles,cache-mises ls
event syntax error: '..es,cache-mises'
\___ parser error
...
The output functions cut the beginning of the event string so the error
starts at up to the 10th character, and cut the end of the string if it
crosses the terminal width.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1429729824-13932-2-git-send-email-jolsa@kernel.org
[ Renamed 'error' variables to 'err', not to clash with util.h error() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-22 22:10:16 +03:00
# include "asm/bug.h"
2012-03-15 23:09:15 +04:00
# define MAX_NAME_LEN 100
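The parse_events_error commit message quoted above says the reporting code trims the start of the event string so that the error position stays near the left edge. A minimal sketch of that trimming idea follows. The names `demo_trim_event` and `DEMO_MAX_ERR_IDX`, and the limit of 10 columns, are assumptions made for illustration and are not taken from perf; the terminal-width cut at the end of the string is left out.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical limit: keep the error position within this many columns. */
#define DEMO_MAX_ERR_IDX 10

/*
 * Copy 'event' into 'buf' so that the character at offset 'idx' lands no
 * further than DEMO_MAX_ERR_IDX columns in. When a prefix is cut, the first
 * two characters of the result are overwritten with ".." to mark the cut.
 * Assumes 0 <= idx <= strlen(event) and len > 0.
 * Returns the column of the error inside 'buf'.
 */
static int demo_trim_event(const char *event, int idx, char *buf, size_t len)
{
	int cut = idx > DEMO_MAX_ERR_IDX ? idx - DEMO_MAX_ERR_IDX : 0;

	snprintf(buf, len, "%s", event + cut);
	if (cut && strlen(buf) >= 2)
		buf[0] = buf[1] = '.';

	return idx - cut;
}
```

Called with the event string 'cycles,cache-mises' and an error offset of 12, this sketch yields '..es,cache-mises' with the error at column 10.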
2009-05-26 13:10:09 +04:00
2012-05-21 11:12:50 +04:00
#ifdef PARSER_DEBUG
extern int parse_events_debug;
#endif
2012-06-15 10:31:39 +04:00
int parse_events_parse(void *data, void *scanner);
2015-09-28 06:52:16 +03:00
static int get_config_terms(struct list_head *head_config,
			    struct list_head *head_terms __maybe_unused);
2009-09-12 01:19:45 +04:00
2014-10-07 19:08:50 +04:00
static struct perf_pmu_event_symbol *perf_pmu_events_list;
/*
 * The variable indicates the number of supported pmu event symbols.
 * 0 means not initialized and ready to init
 * -1 means failed to init, don't try anymore
 * >0 is the number of supported pmu event symbols
 */
static int perf_pmu_events_list_num;
2015-02-27 13:21:27 +03:00
struct event_symbol event_symbols_hw[PERF_COUNT_HW_MAX] = {
2012-07-04 02:00:44 +04:00
	[PERF_COUNT_HW_CPU_CYCLES] = {
		.symbol = "cpu-cycles",
		.alias  = "cycles",
	},
	[PERF_COUNT_HW_INSTRUCTIONS] = {
		.symbol = "instructions",
		.alias  = "",
	},
	[PERF_COUNT_HW_CACHE_REFERENCES] = {
		.symbol = "cache-references",
		.alias  = "",
	},
	[PERF_COUNT_HW_CACHE_MISSES] = {
		.symbol = "cache-misses",
		.alias  = "",
	},
	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = {
		.symbol = "branch-instructions",
		.alias  = "branches",
	},
	[PERF_COUNT_HW_BRANCH_MISSES] = {
		.symbol = "branch-misses",
		.alias  = "",
	},
	[PERF_COUNT_HW_BUS_CYCLES] = {
		.symbol = "bus-cycles",
		.alias  = "",
	},
	[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = {
		.symbol = "stalled-cycles-frontend",
		.alias  = "idle-cycles-frontend",
	},
	[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = {
		.symbol = "stalled-cycles-backend",
		.alias  = "idle-cycles-backend",
	},
	[PERF_COUNT_HW_REF_CPU_CYCLES] = {
		.symbol = "ref-cycles",
		.alias  = "",
	},
};
2015-02-27 13:21:27 +03:00
struct event_symbol event_symbols_sw[PERF_COUNT_SW_MAX] = {
2012-07-04 02:00:44 +04:00
	[PERF_COUNT_SW_CPU_CLOCK] = {
		.symbol = "cpu-clock",
		.alias  = "",
	},
	[PERF_COUNT_SW_TASK_CLOCK] = {
		.symbol = "task-clock",
		.alias  = "",
	},
	[PERF_COUNT_SW_PAGE_FAULTS] = {
		.symbol = "page-faults",
		.alias  = "faults",
	},
	[PERF_COUNT_SW_CONTEXT_SWITCHES] = {
		.symbol = "context-switches",
		.alias  = "cs",
	},
	[PERF_COUNT_SW_CPU_MIGRATIONS] = {
		.symbol = "cpu-migrations",
		.alias  = "migrations",
	},
	[PERF_COUNT_SW_PAGE_FAULTS_MIN] = {
		.symbol = "minor-faults",
		.alias  = "",
	},
	[PERF_COUNT_SW_PAGE_FAULTS_MAJ] = {
		.symbol = "major-faults",
		.alias  = "",
	},
	[PERF_COUNT_SW_ALIGNMENT_FAULTS] = {
		.symbol = "alignment-faults",
		.alias  = "",
	},
	[PERF_COUNT_SW_EMULATION_FAULTS] = {
		.symbol = "emulation-faults",
		.alias  = "",
	},
2013-08-31 22:50:52 +04:00
	[PERF_COUNT_SW_DUMMY] = {
		.symbol = "dummy",
		.alias  = "",
	},
2015-11-27 21:54:33 +03:00
	[PERF_COUNT_SW_BPF_OUTPUT] = {
		.symbol = "bpf-output",
		.alias  = "",
	},
2009-05-26 13:10:09 +04:00
};
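The two tables above pair each event's canonical name with an optional alias, indexed by the event's enum value. A minimal sketch of resolving a user-supplied name against such a table follows. `struct demo_symbol`, `demo_symbols` and `demo_symbol_lookup` are hypothetical stand-ins with a shortened three-entry table; perf itself may resolve these names differently (for example in its lexer), so treat this only as an illustration of the data layout.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for perf's struct event_symbol. */
struct demo_symbol {
	const char *symbol;
	const char *alias;
};

/* Hypothetical, shortened table using the same designated-initializer layout. */
static const struct demo_symbol demo_symbols[] = {
	[0] = { .symbol = "cpu-cycles",          .alias = "cycles"   },
	[1] = { .symbol = "instructions",        .alias = ""         },
	[2] = { .symbol = "branch-instructions", .alias = "branches" },
};

/*
 * Return the index of the entry whose symbol, or non-empty alias,
 * equals 'name'; return -1 when nothing matches.
 */
static int demo_symbol_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(demo_symbols) / sizeof(demo_symbols[0]); i++) {
		const struct demo_symbol *s = &demo_symbols[i];

		if (!strcmp(name, s->symbol))
			return (int)i;
		/* An empty alias means "no alias", so never match on it. */
		if (s->alias[0] && !strcmp(name, s->alias))
			return (int)i;
	}
	return -1;
}
```

Skipping empty aliases matters here: most entries in the real tables carry `""` as their alias, and an empty input string should not match any of them.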
perf: Do the big rename: Performance Counters -> Performance Events
Bye-bye Performance Counters, welcome Performance Events!
In the past few months the perfcounters subsystem has outgrown its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.
Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.
All in one, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)
The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.
Thanks goes to Stephane Eranian who first observed this misnomer and
suggested a rename.
User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)
This patch has been generated via the following script:
FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
sed -i \
-e 's/PERF_EVENT_/PERF_RECORD_/g' \
-e 's/PERF_COUNTER/PERF_EVENT/g' \
-e 's/perf_counter/perf_event/g' \
-e 's/nb_counters/nb_events/g' \
-e 's/swcounter/swevent/g' \
-e 's/tpcounter_event/tp_event/g' \
$FILES
for N in $(find . -name perf_counter.[ch]); do
M=$(echo $N | sed 's/perf_counter/perf_event/g')
mv $N $M
done
FILES=$(find . -name perf_event.*)
sed -i \
-e 's/COUNTER_MASK/REG_MASK/g' \
-e 's/COUNTER/EVENT/g' \
-e 's/\<event\>/event_id/g' \
-e 's/counter/event/g' \
-e 's/Counter/Event/g' \
$FILES
... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.
Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.
( NOTE: 'counters' are still the proper terminology when we deal
with hardware registers - and these sed scripts are a bit
over-eager in renaming them. I've undone some of that, but
in case there's something left where 'counter' would be
better than 'event' we can undo that on an individual basis
instead of touching an otherwise nicely automated patch. )
Suggested-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21 14:02:48 +04:00
#define __PERF_EVENT_FIELD(config, name) \
	((config & PERF_EVENT_##name##_MASK) >> PERF_EVENT_##name##_SHIFT)
2009-05-26 11:17:18 +04:00
perf stat: Add stalled cycles to the default output
The new default output looks like this:
Performance counter stats for './loop_1b_instructions':
236.010686 task-clock # 0.996 CPUs utilized
0 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
99 page-faults # 0.000 M/sec
756,487,646 cycles # 3.205 GHz
354,938,996 stalled-cycles # 46.92% of all cycles are idle
1,001,403,797 instructions # 1.32 insns per cycle
# 0.35 stalled cycles per insn
100,279,773 branches # 424.895 M/sec
12,646 branch-misses # 0.013 % of all branches
0.236902540 seconds time elapsed
We dropped cache-refs and cache-misses and added stalled-cycles - this is a
more generic "how well utilized is the CPU" metric.
If the stalled-cycles ratio is too high then more specific measurements can be
taken to figure out the source of the inefficiency.
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-pbpl2l4mn797s69bclfpwkwn@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-27 07:20:22 +04:00
# define PERF_EVENT_RAW(config) __PERF_EVENT_FIELD(config, RAW)
2009-09-21 14:02:48 +04:00
# define PERF_EVENT_CONFIG(config) __PERF_EVENT_FIELD(config, CONFIG)
2011-04-27 07:20:22 +04:00
# define PERF_EVENT_TYPE(config) __PERF_EVENT_FIELD(config, TYPE)
2009-09-21 14:02:48 +04:00
# define PERF_EVENT_ID(config) __PERF_EVENT_FIELD(config, EVENT)
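The `__PERF_EVENT_FIELD` family above builds the mask and shift names by token pasting, then extracts one bit field from a packed config value. The `PERF_EVENT_*_MASK` and `PERF_EVENT_*_SHIFT` constants are not defined in this part of the file, so the sketch below uses made-up `DEMO_FIELD_*` constants (an assumed layout with the top 8 bits as a type and the low 56 bits as an event id) to show the same pattern in a compilable form.

```c
#include <stdint.h>

/* Hypothetical field layout: these masks and shifts are invented for the demo. */
#define DEMO_FIELD_TYPE_MASK   0xff00000000000000ULL
#define DEMO_FIELD_TYPE_SHIFT  56
#define DEMO_FIELD_EVENT_MASK  0x00ffffffffffffffULL
#define DEMO_FIELD_EVENT_SHIFT 0

/* Same token-pasting pattern as __PERF_EVENT_FIELD: 'name' selects the field. */
#define DEMO_FIELD(config, name) \
	(((config) & DEMO_FIELD_##name##_MASK) >> DEMO_FIELD_##name##_SHIFT)

/* Extract the (hypothetical) type field from a packed config value. */
static uint64_t demo_type(uint64_t config)
{
	return DEMO_FIELD(config, TYPE);
}

/* Extract the (hypothetical) event id field from a packed config value. */
static uint64_t demo_event(uint64_t config)
{
	return DEMO_FIELD(config, EVENT);
}
```

The sketch parenthesizes `config` inside the macro so that an expression argument such as `a | b` is masked as a whole.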
2009-05-26 11:17:18 +04:00
2009-09-04 23:39:51 +04:00
#define for_each_subsystem(sys_dir, sys_dirent, sys_next) \
2009-07-21 20:20:22 +04:00
	while (!readdir_r(sys_dir, &sys_dirent, &sys_next) && sys_next) \
2009-09-04 23:39:51 +04:00
	if (sys_dirent.d_type == DT_DIR && \
2009-07-21 20:20:22 +04:00
	   (strcmp(sys_dirent.d_name, ".")) && \
	   (strcmp(sys_dirent.d_name, "..")))
2009-08-06 18:48:54 +04:00
static int tp_event_has_id(struct dirent *sys_dir, struct dirent *evt_dir)
{
	char evt_path[MAXPATHLEN];
	int fd;
2011-11-16 20:03:07 +04:00
	snprintf(evt_path, MAXPATHLEN, "%s/%s/%s/id", tracing_events_path,
2009-08-06 18:48:54 +04:00
		 sys_dir->d_name, evt_dir->d_name);
	fd = open(evt_path, O_RDONLY);
	if (fd < 0)
		return -EINVAL;
	close(fd);
	return 0;
}
2009-09-04 23:39:51 +04:00
#define for_each_event(sys_dirent, evt_dir, evt_dirent, evt_next) \
2009-07-21 20:20:22 +04:00
	while (!readdir_r(evt_dir, &evt_dirent, &evt_next) && evt_next) \
2009-09-04 23:39:51 +04:00
	if (evt_dirent.d_type == DT_DIR && \
2009-07-21 20:20:22 +04:00
	   (strcmp(evt_dirent.d_name, ".")) && \
2009-08-06 18:48:54 +04:00
	   (strcmp(evt_dirent.d_name, "..")) && \
	   (!tp_event_has_id(&sys_dirent, &evt_dirent)))
2009-07-21 20:20:22 +04:00
2009-09-17 12:34:51 +04:00
# define MAX_EVENT_LENGTH 512
2009-07-21 20:20:22 +04:00
2009-08-28 05:09:58 +04:00
struct tracepoint_path *tracepoint_id_to_path(u64 config)
2009-07-21 20:20:22 +04:00
{
2009-08-28 05:09:58 +04:00
	struct tracepoint_path *path = NULL;
2009-07-21 20:20:22 +04:00
	DIR *sys_dir, *evt_dir;
	struct dirent *sys_next, *evt_next, sys_dirent, evt_dirent;
2012-03-13 19:51:02 +04:00
	char id_buf[24];
2009-09-24 17:39:09 +04:00
	int fd;
2009-07-21 20:20:22 +04:00
	u64 id;
	char evt_path[MAXPATHLEN];
2009-09-24 17:39:09 +04:00
	char dir_path[MAXPATHLEN];
2009-07-21 20:20:22 +04:00
2011-11-16 20:03:07 +04:00
	sys_dir = opendir(tracing_events_path);
2009-07-21 20:20:22 +04:00
	if (!sys_dir)
2009-09-24 17:39:09 +04:00
		return NULL;
2009-09-04 23:39:51 +04:00
	for_each_subsystem(sys_dir, sys_dirent, sys_next) {
2009-09-24 17:39:09 +04:00
2011-11-16 20:03:07 +04:00
		snprintf(dir_path, MAXPATHLEN, "%s/%s", tracing_events_path,
2009-09-24 17:39:09 +04:00
			 sys_dirent.d_name);
		evt_dir = opendir(dir_path);
		if (!evt_dir)
2009-09-04 23:39:51 +04:00
			continue;
2009-09-24 17:39:09 +04:00
2009-09-04 23:39:51 +04:00
		for_each_event(sys_dirent, evt_dir, evt_dirent, evt_next) {
2009-09-24 17:39:09 +04:00
			snprintf(evt_path, MAXPATHLEN, "%s/%s/id", dir_path,
2009-07-21 20:20:22 +04:00
				 evt_dirent.d_name);
2009-09-24 17:39:09 +04:00
			fd = open(evt_path, O_RDONLY);
2009-07-21 20:20:22 +04:00
			if (fd < 0)
				continue;
			if (read(fd, id_buf, sizeof(id_buf)) < 0) {
				close(fd);
				continue;
			}
			close(fd);
			id = atoll(id_buf);
			if (id == config) {
				closedir(evt_dir);
				closedir(sys_dir);
2009-12-06 12:16:30 +03:00
				path = zalloc(sizeof(*path));
				/* zalloc() can fail; do not dereference a NULL path. */
				if (!path)
					return NULL;
2009-08-28 05:09:58 +04:00
				path->system = malloc(MAX_EVENT_LENGTH);
				if (!path->system) {
					free(path);
					return NULL;
				}
				path->name = malloc(MAX_EVENT_LENGTH);
				if (!path->name) {
2013-12-27 23:55:14 +04:00
					zfree(&path->system);
2009-08-28 05:09:58 +04:00
					free(path);
					return NULL;
				}
				strncpy(path->system, sys_dirent.d_name,
					MAX_EVENT_LENGTH);
				strncpy(path->name, evt_dirent.d_name,
					MAX_EVENT_LENGTH);
				return path;
2009-07-21 20:20:22 +04:00
			}
		}
		closedir(evt_dir);
	}
	closedir(sys_dir);
2009-08-28 05:09:58 +04:00
	return NULL;
}
2013-06-26 11:14:05 +04:00
struct tracepoint_path *tracepoint_name_to_path(const char *name)
{
	struct tracepoint_path *path = zalloc(sizeof(*path));
	char *str = strchr(name, ':');
	if (path == NULL || str == NULL) {
		free(path);
		return NULL;
	}
	path->system = strndup(name, str - name);
	path->name = strdup(str + 1);
	if (path->system == NULL || path->name == NULL) {
2013-12-27 23:55:14 +04:00
		zfree(&path->system);
		zfree(&path->name);
2013-06-26 11:14:05 +04:00
		free(path);
		path = NULL;
	}
	return path;
}
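`tracepoint_name_to_path()` splits a `system:name` string at the first colon and duplicates each half. The following standalone sketch shows the same split without perf's helpers (`zalloc`, `zfree`). `struct demo_path`, `demo_copy_range`, `demo_path_free` and `demo_name_to_path` are hypothetical names, and the copy helper is written with plain `malloc`/`memcpy` so the example depends only on ISO C.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for perf's struct tracepoint_path. */
struct demo_path {
	char *system;
	char *name;
};

/* Duplicate the first n bytes of s into a fresh NUL-terminated string. */
static char *demo_copy_range(const char *s, size_t n)
{
	char *copy = malloc(n + 1);

	if (copy) {
		memcpy(copy, s, n);
		copy[n] = '\0';
	}
	return copy;
}

/* Release a demo_path and both of its strings; accepts NULL. */
static void demo_path_free(struct demo_path *p)
{
	if (!p)
		return;
	free(p->system);
	free(p->name);
	free(p);
}

/* Split "system:name" at the first ':'; return NULL on bad input or OOM. */
static struct demo_path *demo_name_to_path(const char *name)
{
	const char *colon = strchr(name, ':');
	struct demo_path *p;

	if (!colon)
		return NULL;

	p = calloc(1, sizeof(*p));
	if (!p)
		return NULL;

	p->system = demo_copy_range(name, (size_t)(colon - name));
	p->name = demo_copy_range(colon + 1, strlen(colon + 1));
	if (!p->system || !p->name) {
		demo_path_free(p);
		return NULL;
	}
	return p;
}
```

For the input `sched:sched_switch`, the sketch returns a path whose `system` is `sched` and whose `name` is `sched_switch`; the caller releases it with `demo_path_free()`.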
2011-03-10 08:23:28 +03:00
const char *event_type(int type)
{
	switch (type) {
	case PERF_TYPE_HARDWARE:
		return "hardware";
	case PERF_TYPE_SOFTWARE:
		return "software";
	case PERF_TYPE_TRACEPOINT:
		return "tracepoint";
	case PERF_TYPE_HW_CACHE:
		return "hardware-cache";
	default:
		break;
	}
	return "unknown";
}
2012-09-10 11:53:50 +04:00
2013-11-12 20:58:49 +04:00
static struct perf_evsel *
__add_event(struct list_head *list, int *idx,
	    struct perf_event_attr *attr,
2015-07-29 12:42:10 +03:00
	    char *name, struct cpu_map *cpus,
	    struct list_head *config_terms)
2012-03-15 23:09:15 +04:00
{
	struct perf_evsel *evsel;
	event_attr_init(attr);
2013-11-07 23:41:19 +04:00
	evsel = perf_evsel__new_idx(attr, (*idx)++);
2013-07-02 23:27:25 +04:00
	if (!evsel)
2013-11-12 20:58:49 +04:00
		return NULL;
2012-03-15 23:09:15 +04:00
2015-09-08 10:58:55 +03:00
	evsel->cpus = cpu_map__get(cpus);
	evsel->own_cpus = cpu_map__get(cpus);
2015-06-23 01:36:04 +03:00
2012-06-12 20:45:00 +04:00
	if (name)
		evsel->name = strdup(name);
2015-07-29 12:42:10 +03:00
	if (config_terms)
		list_splice(config_terms, &evsel->config_terms);
2012-05-21 11:12:51 +04:00
	list_add_tail(&evsel->node, list);
2013-11-12 20:58:49 +04:00
	return evsel;
2012-03-15 23:09:15 +04:00
}
2013-07-02 23:27:25 +04:00
static int add_event(struct list_head *list, int *idx,
2015-07-29 12:42:10 +03:00
		     struct perf_event_attr *attr, char *name,
		     struct list_head *config_terms)
2012-09-10 11:53:50 +04:00
{
2015-07-29 12:42:10 +03:00
	return __add_event(list, idx, attr, name, NULL, config_terms) ? 0 : -ENOMEM;
2012-09-10 11:53:50 +04:00
}
2012-06-11 21:08:07 +04:00
static int parse_aliases(char *str, const char *names[][PERF_EVSEL__MAX_ALIASES], int size)
perf_counter: Implement generalized cache event types
Extend generic event enumeration with the PERF_TYPE_HW_CACHE
method.
This is a 3-dimensional space:
{ L1-D, L1-I, L2, ITLB, DTLB, BPU } x
{ load, store, prefetch } x
{ accesses, misses }
User-space passes in the 3 coordinates and the kernel provides
a counter. (if the hardware supports that type and if the
combination makes sense.)
Combinations that make no sense produce a -EINVAL.
Combinations that are not supported by the hardware produce -ENOTSUP.
Extend the tools to deal with this, and rewrite the event symbol
parsing code with various popular aliases for the units and
access methods above. So 'l1-cache-miss' and 'l1d-read-ops' are
both valid aliases.
( x86 is supported for now, with the Nehalem event table filled in,
and with Core2 and Atom having placeholder tables. )
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-05 22:22:46 +04:00
{
	int i, j;
2009-07-01 07:04:34 +04:00
	int n, longest = -1;
2009-06-05 22:22:46 +04:00
	for (i = 0; i < size; i++) {
2012-06-11 21:08:07 +04:00
		for (j = 0; j < PERF_EVSEL__MAX_ALIASES && names[i][j]; j++) {
2009-07-01 07:04:34 +04:00
			n = strlen(names[i][j]);
2012-03-15 23:09:15 +04:00
			if (n > longest && !strncasecmp(str, names[i][j], n))
2009-07-01 07:04:34 +04:00
				longest = n;
		}
2012-03-15 23:09:15 +04:00
		if (longest > 0)
2009-07-01 07:04:34 +04:00
			return i;
2009-06-05 22:22:46 +04:00
	}
2009-06-06 23:04:17 +04:00
	return -1;
2009-06-05 22:22:46 +04:00
}
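The alias lookup above can be sketched in isolation as follows. This is a minimal sketch, not the perf implementation: the table `demo_names`, the constant `MAX_ALIASES`, and the function name `match_alias` are hypothetical stand-ins for `perf_evsel__hw_cache`, `PERF_EVSEL__MAX_ALIASES`, and `parse_aliases()`. It assumes the same rule as the loop above: a row matches when any of its aliases is a case-insensitive prefix of the input string.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

#define MAX_ALIASES 4	/* hypothetical; stands in for PERF_EVSEL__MAX_ALIASES */

/* Hypothetical alias table: one row per event, NULL-terminated within the row. */
static const char *const demo_names[][MAX_ALIASES] = {
	{ "L1-dcache", "l1-d", "l1d", NULL },
	{ "L1-icache", "l1-i", "l1i", NULL },
	{ "LLC", "L2", NULL, NULL },
};

/*
 * Return the index of the first row that has an alias which is a
 * case-insensitive prefix of str, or -1 if no row matches.
 */
static int match_alias(const char *str,
		       const char *const names[][MAX_ALIASES], int size)
{
	int i, j;

	for (i = 0; i < size; i++) {
		size_t longest = 0;

		for (j = 0; j < MAX_ALIASES && names[i][j]; j++) {
			size_t n = strlen(names[i][j]);

			if (n > longest && !strncasecmp(str, names[i][j], n))
				longest = n;
		}
		if (longest > 0)
			return i;
	}
	return -1;
}
```

Because only a prefix has to match, a string such as "l1d-read" selects the first row even though "-read" is not part of any alias; the caller is expected to parse the remainder separately.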
2013-07-02 23:27:25 +04:00
int parse_events_add_cache(struct list_head *list, int *idx,
2012-03-15 23:09:15 +04:00
			   char *type, char *op_result1, char *op_result2)
2009-06-05 22:22:46 +04:00
{
2012-03-15 23:09:15 +04:00
	struct perf_event_attr attr;
	char name[MAX_NAME_LEN];
2009-07-01 07:04:34 +04:00
	int cache_type = -1, cache_op = -1, cache_result = -1;
2012-03-15 23:09:15 +04:00
	char *op_result[2] = { op_result1, op_result2 };
	int i, n;
2009-06-05 22:22:46 +04:00
	/*
	 * No fallback - if we cannot get a clear cache type
	 * then bail out:
	 */
2012-06-11 21:08:07 +04:00
	cache_type = parse_aliases(type, perf_evsel__hw_cache,
2012-03-15 23:09:15 +04:00
				   PERF_COUNT_HW_CACHE_MAX);
2009-06-05 22:22:46 +04:00
	if (cache_type == -1)
2012-03-15 23:09:15 +04:00
		return -EINVAL;
	n = snprintf(name, MAX_NAME_LEN, "%s", type);
2009-07-01 07:04:34 +04:00
2012-03-15 23:09:15 +04:00
	for (i = 0; (i < 2) && (op_result[i]); i++) {
		char *str = op_result[i];
2012-09-05 21:51:33 +04:00
		n += snprintf(name + n, MAX_NAME_LEN - n, "-%s", str);
2009-07-01 07:04:34 +04:00
		if (cache_op == -1) {
2012-06-11 21:08:07 +04:00
			cache_op = parse_aliases(str, perf_evsel__hw_cache_op,
2012-03-15 23:09:15 +04:00
						 PERF_COUNT_HW_CACHE_OP_MAX);
2009-07-01 07:04:34 +04:00
			if (cache_op >= 0) {
2012-06-11 21:08:07 +04:00
				if (!perf_evsel__is_cache_op_valid(cache_type, cache_op))
2012-03-15 23:09:15 +04:00
					return -EINVAL;
2009-07-01 07:04:34 +04:00
				continue;
			}
		}
		if (cache_result == -1) {
2012-06-11 21:08:07 +04:00
			cache_result = parse_aliases(str, perf_evsel__hw_cache_result,
						     PERF_COUNT_HW_CACHE_RESULT_MAX);
2009-07-01 07:04:34 +04:00
			if (cache_result >= 0)
				continue;
		}
	}
2009-06-05 22:22:46 +04:00
	/*
	 * Fall back to reads:
	 */
2009-06-06 23:04:17 +04:00
	if (cache_op == -1)
		cache_op = PERF_COUNT_HW_CACHE_OP_READ;
2009-06-05 22:22:46 +04:00
	/*
	 * Fall back to accesses:
	 */
	if (cache_result == -1)
		cache_result = PERF_COUNT_HW_CACHE_RESULT_ACCESS;
2012-03-15 23:09:15 +04:00
	memset(&attr, 0, sizeof(attr));
	attr.config = cache_type | (cache_op << 8) | (cache_result << 16);
	attr.type = PERF_TYPE_HW_CACHE;
2015-07-29 12:42:10 +03:00
	return add_event(list, idx, &attr, name, NULL);
2009-09-12 01:19:45 +04:00
}
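The three cache coordinates are packed into a single `attr.config` value, one byte-aligned field each. A minimal sketch of that packing is shown below. The enum names and the helper `hw_cache_config()` are hypothetical; the numeric values are assumed to mirror the `PERF_COUNT_HW_CACHE_*` enums in `linux/perf_event.h` and should be checked against the installed headers.

```c
#include <stdint.h>

/* Assumed to mirror the perf_event.h cache enums; verify before relying on them. */
enum { CACHE_L1D = 0, CACHE_L1I = 1, CACHE_LL = 2 };
enum { OP_READ = 0, OP_WRITE = 1, OP_PREFETCH = 2 };
enum { RESULT_ACCESS = 0, RESULT_MISS = 1 };

/* Pack (type, op, result) the same way the function above fills attr.config. */
static uint64_t hw_cache_config(int type, int op, int result)
{
	return (uint64_t)type | ((uint64_t)op << 8) | ((uint64_t)result << 16);
}
```

Under those assumed values, an L1-D read miss encodes as 0x10000 and a last-level-cache write miss as 0x10102.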
2015-09-29 18:05:31 +03:00
static void tracepoint_error(struct parse_events_error *e, int err,
2015-09-07 11:38:07 +03:00
			     char *sys, char *name)
{
	char help[BUFSIZ];
	/*
	 * We get error directly from syscall errno ( > 0),
	 * or from encoded pointer's error ( < 0).
	 */
	err = abs(err);
	switch (err) {
	case EACCES:
2015-09-29 18:05:31 +03:00
		e->str = strdup("can't access trace events");
2015-09-07 11:38:07 +03:00
		break;
	case ENOENT:
2015-09-29 18:05:31 +03:00
		e->str = strdup("unknown tracepoint");
2015-09-07 11:38:07 +03:00
		break;
	default:
2015-09-29 18:05:31 +03:00
		e->str = strdup("failed to add tracepoint");
2015-09-07 11:38:07 +03:00
		break;
	}
	tracing_path__strerror_open_tp(err, help, sizeof(help), sys, name);
2015-09-29 18:05:31 +03:00
	e->help = strdup(help);
2015-09-07 11:38:07 +03:00
}
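The `abs(err)` step matters because callers pass either a positive `errno` or a negative value recovered with `PTR_ERR()`. A minimal sketch of that normalization follows; the helper name `tracepoint_error_str()` is hypothetical, and it returns string literals rather than `strdup()` copies to keep the example self-contained.

```c
#include <errno.h>
#include <stdlib.h>

/* Map either a positive errno or a negative encoded error to one message. */
static const char *tracepoint_error_str(int err)
{
	switch (abs(err)) {
	case EACCES:
		return "can't access trace events";
	case ENOENT:
		return "unknown tracepoint";
	default:
		return "failed to add tracepoint";
	}
}
```

With this shape, `tracepoint_error_str(ENOENT)` and `tracepoint_error_str(-ENOENT)` select the same message.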
2013-07-02 23:27:25 +04:00
static int add_tracepoint(struct list_head *list, int *idx,
2015-09-07 11:38:05 +03:00
			  char *sys_name, char *evt_name,
2015-09-29 18:05:31 +03:00
			  struct parse_events_error *err,
2015-09-28 06:52:16 +03:00
			  struct list_head *head_config)
2009-09-12 01:19:45 +04:00
{
2012-09-27 00:13:07 +04:00
	struct perf_evsel *evsel;
2009-09-12 01:19:45 +04:00
2013-11-07 23:41:19 +04:00
	evsel = perf_evsel__newtp_idx(sys_name, evt_name, (*idx)++);
2015-09-07 11:38:07 +03:00
	if (IS_ERR(evsel)) {
2015-09-29 18:05:31 +03:00
		tracepoint_error(err, PTR_ERR(evsel), sys_name, evt_name);
2015-09-07 11:38:06 +03:00
		return PTR_ERR(evsel);
2015-09-07 11:38:07 +03:00
	}
2009-09-12 01:19:45 +04:00
2015-09-28 06:52:16 +03:00
	if (head_config) {
		LIST_HEAD(config_terms);
		if (get_config_terms(head_config, &config_terms))
			return -ENOMEM;
		list_splice(&config_terms, &evsel->config_terms);
	}
2012-09-27 00:13:07 +04:00
	list_add_tail(&evsel->node, list);
	return 0;
2009-06-05 22:22:46 +04:00
}
2013-07-02 23:27:25 +04:00
static int add_tracepoint_multi_event(struct list_head *list, int *idx,
2015-09-07 11:38:05 +03:00
				      char *sys_name, char *evt_name,
2015-09-29 18:05:31 +03:00
				      struct parse_events_error *err,
2015-09-28 06:52:16 +03:00
				      struct list_head *head_config)
2009-09-12 01:19:45 +04:00
{
	char evt_path[MAXPATHLEN];
	struct dirent *evt_ent;
	DIR *evt_dir;
2015-10-05 22:31:17 +03:00
	int ret = 0, found = 0;
2009-09-12 01:19:45 +04:00
2011-11-16 20:03:07 +04:00
	snprintf(evt_path, MAXPATHLEN, "%s/%s", tracing_events_path, sys_name);
2009-09-12 01:19:45 +04:00
	evt_dir = opendir(evt_path);
	if (!evt_dir) {
2015-09-29 18:05:31 +03:00
		tracepoint_error(err, errno, sys_name, evt_name);
2012-03-15 23:09:15 +04:00
		return -1;
2009-09-12 01:19:45 +04:00
	}
2012-03-15 23:09:15 +04:00
	while (!ret && (evt_ent = readdir(evt_dir))) {
2009-09-12 01:19:45 +04:00
		if (!strcmp(evt_ent->d_name, ".")
		    || !strcmp(evt_ent->d_name, "..")
		    || !strcmp(evt_ent->d_name, "enable")
		    || !strcmp(evt_ent->d_name, "filter"))
			continue;
2012-03-15 23:09:15 +04:00
		if (!strglobmatch(evt_ent->d_name, evt_name))
2010-01-06 01:47:17 +03:00
			continue;
2015-10-05 22:31:17 +03:00
		found++;
2015-09-28 06:52:16 +03:00
		ret = add_tracepoint(list, idx, sys_name, evt_ent->d_name,
2015-09-29 18:05:31 +03:00
				     err, head_config);
2009-09-12 01:19:45 +04:00
	}
2015-10-05 22:31:17 +03:00
	if (!found) {
		tracepoint_error(err, ENOENT, sys_name, evt_name);
		ret = -1;
	}
2012-12-17 17:08:36 +04:00
	closedir(evt_dir);
2012-03-15 23:09:15 +04:00
	return ret;
2009-09-12 01:19:45 +04:00
}
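The directory walk above skips the control entries and then applies a glob to each remaining name, reporting ENOENT when nothing matched. A minimal sketch of that filtering step is given below, operating on an in-memory list instead of `readdir()`. It uses POSIX `fnmatch()` as a stand-in for perf's own `strglobmatch()`, which may differ in detail; the names `demo_entries`, `is_control_entry()` and `count_matching_events()` are hypothetical.

```c
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical directory listing for one tracing subsystem. */
static const char *const demo_entries[] = {
	".", "..", "enable", "filter",
	"sched_switch", "sched_wakeup", "sched_process_fork",
};

/* Entries that exist in every subsystem directory but are not events. */
static int is_control_entry(const char *name)
{
	return !strcmp(name, ".") || !strcmp(name, "..") ||
	       !strcmp(name, "enable") || !strcmp(name, "filter");
}

/* Count the event names matching the glob; 0 corresponds to the ENOENT case. */
static int count_matching_events(const char *const *names, size_t n,
				 const char *pattern)
{
	size_t i;
	int found = 0;

	for (i = 0; i < n; i++) {
		if (is_control_entry(names[i]))
			continue;
		if (fnmatch(pattern, names[i], 0) == 0)
			found++;
	}
	return found;
}
```

Note that a pattern such as "enable" matches nothing here, because control entries are dropped before the glob is applied.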
2013-07-02 23:27:25 +04:00
static int add_tracepoint_event(struct list_head *list, int *idx,
2015-09-07 11:38:05 +03:00
				char *sys_name, char *evt_name,
2015-09-29 18:05:31 +03:00
				struct parse_events_error *err,
2015-09-28 06:52:16 +03:00
				struct list_head *head_config)
2012-12-17 17:08:37 +04:00
{
	return strpbrk(evt_name, "*?") ?
2015-09-28 06:52:16 +03:00
	       add_tracepoint_multi_event(list, idx, sys_name, evt_name,
2015-09-29 18:05:31 +03:00
					  err, head_config) :
2015-09-28 06:52:16 +03:00
	       add_tracepoint(list, idx, sys_name, evt_name,
2015-09-29 18:05:31 +03:00
			      err, head_config);
2012-12-17 17:08:37 +04:00
}
2013-07-02 23:27:25 +04:00
static int add_tracepoint_multi_sys(struct list_head *list, int *idx,
2015-09-07 11:38:05 +03:00
				    char *sys_name, char *evt_name,
2015-09-29 18:05:31 +03:00
				    struct parse_events_error *err,
2015-09-28 06:52:16 +03:00
				    struct list_head *head_config)
2012-12-17 17:08:37 +04:00
{
	struct dirent *events_ent;
	DIR *events_dir;
	int ret = 0;
	events_dir = opendir(tracing_events_path);
	if (!events_dir) {
2015-09-29 18:05:31 +03:00
		tracepoint_error(err, errno, sys_name, evt_name);
2012-12-17 17:08:37 +04:00
		return -1;
	}
	while (!ret && (events_ent = readdir(events_dir))) {
		if (!strcmp(events_ent->d_name, ".")
		    || !strcmp(events_ent->d_name, "..")
		    || !strcmp(events_ent->d_name, "enable")
		    || !strcmp(events_ent->d_name, "header_event")
		    || !strcmp(events_ent->d_name, "header_page"))
			continue;
		if (!strglobmatch(events_ent->d_name, sys_name))
			continue;
		ret = add_tracepoint_event(list, idx, events_ent->d_name,
2015-09-29 18:05:31 +03:00
					   evt_name, err, head_config);
2012-12-17 17:08:37 +04:00
	}
	closedir(events_dir);
	return ret;
}
perf bpf: Collect perf_evsel in BPF object files
This patch creates a 'struct perf_evsel' for every probe in a BPF object
file(s) and fills 'struct evlist' with them. The previously introduced
dummy event is now removed. After this patch, the following command:
# perf record --event filter.o ls
Can trace on each of the probes defined in filter.o.
The core of this patch is bpf__foreach_tev(), which calls a callback
function for each 'struct probe_trace_event' of a BPF program, passing
the associated file descriptor. The add_bpf_event() callback
creates evsels by calling parse_events_add_tracepoint().
Since bpf-loader.c will not be built if libbpf is turned off, an empty
bpf__foreach_tev() is defined in bpf-loader.h to avoid build errors.
Committer notes:
Before:
# /tmp/oldperf record --event /tmp/foo.o -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.198 MB perf.data ]
# perf evlist
/tmp/foo.o
# perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
exclude_guest: 1, mmap2: 1, comm_exec: 1
I.e. we create just the PERF_TYPE_SOFTWARE (type: 1),
PERF_COUNT_SW_DUMMY(config 0x9) event, now, with this patch:
# perf record --event /tmp/foo.o -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.210 MB perf.data ]
# perf evlist -v
perf_bpf_probe:fork: type: 2, size: 112, config: 0x6bd, { sample_period,
sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1,
inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest:
1, mmap2: 1, comm_exec: 1
#
We now have a PERF_TYPE_TRACEPOINT (type: 2) event, and the config, 0x6bd,
is how, after setting up the event via the kprobes interface, the
'perf_bpf_probe:fork' event is accessible via the perf_event_open
syscall. This is all transient: as soon as the 'perf record' session
ends, these probes will go away.
To see what it looks like, let's try a never-ending session, one that
expects a control+C to end:
# perf record --event /tmp/foo.o -a
So, with that in place, we can use 'perf probe' to see what is in place:
# perf probe -l
perf_bpf_probe:fork (on _do_fork@acme/git/linux/kernel/fork.c)
We also can use debugfs:
[root@felicio ~]# cat /sys/kernel/debug/tracing/kprobe_events
p:perf_bpf_probe/fork _text+638512
OK, now let's stop and see if we got some forks:
[root@felicio linux]# perf record --event /tmp/foo.o -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.325 MB perf.data (111 samples) ]
[root@felicio linux]# perf script
sshd 1271 [003] 81797.507678: perf_bpf_probe:fork: (ffffffff8109be30)
sshd 18309 [000] 81797.524917: perf_bpf_probe:fork: (ffffffff8109be30)
sshd 18309 [001] 81799.381603: perf_bpf_probe:fork: (ffffffff8109be30)
sshd 18309 [001] 81799.408635: perf_bpf_probe:fork: (ffffffff8109be30)
<SNIP>
Sure enough, we have 111 forks :-)
Callchains seem to work as well:
# perf report --stdio --no-child
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 562 of event 'perf_bpf_probe:fork'
# Event count (approx.): 562
#
# Overhead Command Shared Object Symbol
# ........ ........ ................ ............
#
44.66% sh [kernel.vmlinux] [k] _do_fork
|
---_do_fork
entry_SYSCALL_64_fastpath
__libc_fork
make_child
26.16% make [kernel.vmlinux] [k] _do_fork
<SNIP>
#
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-14 15:41:17 +03:00
struct __add_bpf_event_param {
	struct parse_events_evlist *data;
	struct list_head *list;
};
static int add_bpf_event(struct probe_trace_event *tev, int fd,
			 void *_param)
{
	LIST_HEAD(new_evsels);
	struct __add_bpf_event_param *param = _param;
	struct parse_events_evlist *evlist = param->data;
	struct list_head *list = param->list;
perf bpf: Attach eBPF filter to perf event
This is the final patch which makes the basic BPF filter work. After
applying this patch, users are allowed to use a BPF filter like:
# perf record --event ./hello_world.o ls
A bpf_fd field is appended to 'struct evsel', and set up during the
callback function add_bpf_event() for each 'probe_trace_event'.
The PERF_EVENT_IOC_SET_BPF ioctl is used to attach an eBPF program to a newly
created perf event. The file descriptor of the eBPF program is passed to
perf record using previous patches, and stored into evsel->bpf_fd.
It is possible that different perf events are created for one kprobe
event for different CPUs. In this case, when trying to call the ioctl,
EEXIST will be returned. This patch doesn't treat it as an error.
Committer note:
The bpf proggie used so far:
__attribute__((section("fork=_do_fork"), used))
int fork(void *ctx)
{
return 0;
}
char _license[] __attribute__((section("license"), used)) = "GPL";
int _version __attribute__((section("version"), used)) = 0x40300;
failed to produce any samples, even with forks happening and with it
running in system-wide mode.
That is because the filter is now being associated, and the code above
always returns zero, meaning that all forks will be probed but filtered
away ;-/
Change it to 'return 1;' instead and after that:
# trace --no-syscalls --event /tmp/foo.o
0.000 perf_bpf_probe:fork:(ffffffff8109be30))
2.333 perf_bpf_probe:fork:(ffffffff8109be30))
3.725 perf_bpf_probe:fork:(ffffffff8109be30))
4.550 perf_bpf_probe:fork:(ffffffff8109be30))
^C#
And it works with all tools, including 'perf trace'.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-8-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-14 15:41:18 +03:00
	struct perf_evsel *pos;
2015-10-14 15:41:17 +03:00
	int err;
	pr_debug("add bpf event %s:%s and attach bpf program %d\n",
		 tev->group, tev->event, fd);
	err = parse_events_add_tracepoint(&new_evsels, &evlist->idx, tev->group,
					  tev->event, evlist->error, NULL);
	if (err) {
		struct perf_evsel *evsel, *tmp;
		pr_debug("Failed to add BPF event %s:%s\n",
			 tev->group, tev->event);
		list_for_each_entry_safe(evsel, tmp, &new_evsels, node) {
			list_del(&evsel->node);
			perf_evsel__delete(evsel);
		}
		return err;
	}
	pr_debug("adding %s:%s\n", tev->group, tev->event);
2015-10-14 15:41:18 +03:00
	list_for_each_entry(pos, &new_evsels, node) {
		pr_debug("adding %s:%s to %p\n",
			 tev->group, tev->event, pos);
		pos->bpf_fd = fd;
	}
perf bpf: Collect perf_evsel in BPF object files
This patch creates a 'struct perf_evsel' for every probe in a BPF object
file(s) and fills 'struct evlist' with them. The previously introduced
dummy event is now removed. After this patch, the following command:
# perf record --event filter.o ls
Can trace on each of the probes defined in filter.o.
The core of this patch is bpf__foreach_tev(), which calls a callback
function for each 'struct probe_trace_event' event for a bpf program
with each associated file descriptors. The add_bpf_event() callback
creates evsels by calling parse_events_add_tracepoint().
Since bpf-loader.c will not be built if libbpf is turned off, an empty
bpf__foreach_tev() is defined in bpf-loader.h to avoid build errors.
Committer notes:
Before:
# /tmp/oldperf record --event /tmp/foo.o -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.198 MB perf.data ]
# perf evlist
/tmp/foo.o
# perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
exclude_guest: 1, mmap2: 1, comm_exec: 1
I.e. we create just the PERF_TYPE_SOFTWARE (type: 1),
PERF_COUNT_SW_DUMMY(config 0x9) event, now, with this patch:
# perf record --event /tmp/foo.o -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.210 MB perf.data ]
# perf evlist -v
perf_bpf_probe:fork: type: 2, size: 112, config: 0x6bd, { sample_period,
sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1,
inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest:
1, mmap2: 1, comm_exec: 1
#
We now have a PERF_TYPE_TRACEPOINT (type: 2) event, and the config states
0x6bd, which is how, after setting up the event via the kprobes interface,
the 'perf_bpf_probe:fork' event is accessible via the perf_event_open
syscall. This is all transient: as soon as the 'perf record' session
ends, these probes will go away.
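For reference, the 'type: 2' and 'config: 0x6bd' fields in the output above map onto 'struct perf_event_attr' roughly as in the sketch below. init_tracepoint_attr() is a hypothetical helper, not a perf function; the id value is system specific and is normally read from the event's 'id' file under the tracing events directory.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * Fill an attr for a tracepoint-style event, as used for kprobe-based
 * events: type is PERF_TYPE_TRACEPOINT and config carries the event id.
 */
static void init_tracepoint_attr(struct perf_event_attr *attr,
				 unsigned long long id)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_TRACEPOINT;	/* numeric value 2 */
	attr->config = id;			/* e.g. 0x6bd in the output above */
	attr->sample_period = 1;		/* sample every hit */
	attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID |
			    PERF_SAMPLE_TIME | PERF_SAMPLE_CPU |
			    PERF_SAMPLE_PERIOD | PERF_SAMPLE_RAW;
	attr->disabled = 1;
}
```

The resulting structure is what would be handed to the perf_event_open syscall; the sketch stops short of making that call because it needs privileges and an existing probe.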
To see what it looks like, let's try a never-ending session, one that
expects a control+C to end:
# perf record --event /tmp/foo.o -a
With that session running, we can use 'perf probe' to see which probes are in place:
# perf probe -l
perf_bpf_probe:fork (on _do_fork@acme/git/linux/kernel/fork.c)
We can also use debugfs:
[root@felicio ~]# cat /sys/kernel/debug/tracing/kprobe_events
p:perf_bpf_probe/fork _text+638512
OK, now let's stop the session and see if we got some forks:
[root@felicio linux]# perf record --event /tmp/foo.o -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.325 MB perf.data (111 samples) ]
[root@felicio linux]# perf script
sshd 1271 [003] 81797.507678: perf_bpf_probe:fork: (ffffffff8109be30)
sshd 18309 [000] 81797.524917: perf_bpf_probe:fork: (ffffffff8109be30)
sshd 18309 [001] 81799.381603: perf_bpf_probe:fork: (ffffffff8109be30)
sshd 18309 [001] 81799.408635: perf_bpf_probe:fork: (ffffffff8109be30)
<SNIP>
Sure enough, we have 111 forks :-)
Callchains seem to work as well:
# perf report --stdio --no-child
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 562 of event 'perf_bpf_probe:fork'
# Event count (approx.): 562
#
# Overhead Command Shared Object Symbol
# ........ ........ ................ ............
#
44.66% sh [kernel.vmlinux] [k] _do_fork
|
---_do_fork
entry_SYSCALL_64_fastpath
__libc_fork
make_child
26.16% make [kernel.vmlinux] [k] _do_fork
<SNIP>
#
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-14 15:41:17 +03:00
	list_splice(&new_evsels, list);
	return 0;
}
2015-10-14 15:41:14 +03:00
int parse_events_load_bpf_obj(struct parse_events_evlist *data,
			      struct list_head *list,
			      struct bpf_object *obj)
{
	int err;
	char errbuf[BUFSIZ];
2015-10-14 15:41:17 +03:00
	struct __add_bpf_event_param param = {data, list};
perf tools: Create probe points for BPF programs
This patch introduces the bpf__{un,}probe() functions to enable callers to
create kprobe points based on the section names of a BPF program. It parses
the section names in the program and creates corresponding 'struct
perf_probe_event' structures. The parse_perf_probe_command() function is
used to do the main parsing work. The resulting 'struct perf_probe_event'
is stored in the program's private data for later use.
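As a rough illustration of what a section name such as "fork=_do_fork" encodes, the sketch below splits it into an event name and a probe point. This is a deliberately simplified, hypothetical helper: perf itself hands the section name to parse_perf_probe_command(), which understands a much richer syntax.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not part of perf: split "event=point" into its
 * two halves. Returns 0 on success, -1 if the name has no '=', has an
 * empty half, or does not fit into the caller's buffers.
 */
static int split_section_name(const char *sec,
			      char *event, size_t event_sz,
			      char *point, size_t point_sz)
{
	const char *eq = strchr(sec, '=');
	size_t event_len;

	if (!eq || eq == sec || eq[1] == '\0')
		return -1;

	event_len = (size_t)(eq - sec);
	if (event_len >= event_sz || strlen(eq + 1) >= point_sz)
		return -1;

	memcpy(event, sec, event_len);
	event[event_len] = '\0';
	strcpy(point, eq + 1);	/* length checked above */
	return 0;
}
```

Sections such as "license" and "version" contain no '=' and are rejected by this helper, which mirrors the idea that only probe-defining sections produce events.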
By utilizing the new probing API, this patch creates probe points during
event parsing.
To ensure probe points are removed correctly, an atexit hook is registered
so that bpf__clear() is still called, and the probe points cleared, even if
perf quits through exit(). Note that bpf__clear() should be registered
before bpf__probe() is called, so that a failure in bpf__probe() can still
trigger bpf__clear() to remove the probe points that were already created.
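The ordering argument can be seen in a small stand-alone sketch. All names here (probes_active, clear_probes(), create_probes(), setup_probes()) are hypothetical and only model the idea: the cleanup hook is registered once, before any probe is created, so a partial failure needs no local unwinding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static int probes_active;		/* models probe points currently installed */
static bool cleanup_registered;	/* register the atexit hook only once */

/* Cleanup hook: removes everything, however far creation got. */
static void clear_probes(void)
{
	probes_active = 0;
}

/* Create 'wanted' probes; fail at index 'fail_at' (use -1 for no failure). */
static int create_probes(int wanted, int fail_at)
{
	for (int i = 0; i < wanted; i++) {
		if (i == fail_at)
			return -1;	/* earlier probes stay until cleanup runs */
		probes_active++;
	}
	return 0;
}

static int setup_probes(int wanted, int fail_at)
{
	/* Register first, so a failure below is still cleaned up at exit. */
	if (!cleanup_registered) {
		if (atexit(clear_probes) != 0)
			return -1;
		cleanup_registered = true;
	}
	return create_probes(wanted, fail_at);
}
```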
A strerror-style error reporting scaffold is created by this patch.
bpf__strerror_probe() is the first error reporting function in
bpf-loader.c.
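A strerror-style reporter takes an error code and writes a readable message into a caller-supplied buffer. The sketch below shows the general shape only. probe_strerror_sketch() is a hypothetical function; the -ENOENT text is copied from the output quoted in this message, and falling back to strerror() for every other code is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Translate 'err' (positive or negative errno) into a message in 'buf'.
 * Returns 0 on success, -EINVAL if no usable buffer was supplied.
 */
static int probe_strerror_sketch(int err, char *buf, size_t size)
{
	if (!buf || !size)
		return -EINVAL;

	if (err > 0)
		err = -err;

	switch (err) {
	case -ENOENT:
		/* Domain-specific hint instead of the generic errno text. */
		snprintf(buf, size, "%s",
			 "You need to check probing points in BPF file");
		break;
	default:
		snprintf(buf, size, "%s", strerror(-err));
		break;
	}
	return 0;
}
```

Keeping the message formatting in one function lets the event parser copy the buffer straight into its error report, as the errout path of parse_events_load_bpf_obj() does with errbuf.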
Committer note:
Trying it:
To build a test eBPF object file:
I am testing using a script I built from the 'perf test -v LLVM' output:
$ cat ~/bin/hello-ebpf
export KERNEL_INC_OPTIONS="-nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.8.3/include -I/home/acme/git/linux/arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated -I/home/acme/git/linux/include -Iinclude -I/home/acme/git/linux/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -Iinclude/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h"
export WORKING_DIR=/lib/modules/4.2.0/build
export CLANG_SOURCE=-
export CLANG_OPTIONS=-xc
OBJ=/tmp/foo.o
rm -f $OBJ
echo '__attribute__((section("fork=do_fork"), used)) int fork(void *ctx) {return 0;} char _license[] __attribute__((section("license"), used)) = "GPL";int _version __attribute__((section("version"), used)) = 0x40100;' | \
clang -D__KERNEL__ $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o /tmp/foo.o && file $OBJ
---
First asking to put a probe in a function not present in the kernel
(misses the initial _):
$ perf record --event /tmp/foo.o sleep 1
Probe point 'do_fork' not found.
event syntax error: '/tmp/foo.o'
\___ You need to check probing points in BPF file
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
$
---
Now, with "__attribute__((section("fork=_do_fork"), used))":
$ grep _do_fork /proc/kallsyms
ffffffff81099ab0 T _do_fork
$ perf record --event /tmp/foo.o sleep 1
Failed to open kprobe_events: Permission denied
event syntax error: '/tmp/foo.o'
\___ Permission denied
---
Cool, but we need to provide better hints: "kprobe_events" is too low
level. One doesn't strictly need to know the precise details of how
these things are put in place, so something that shows the command
needed to fix the permissions would be more helpful.
Let's try as root instead:
# perf record --event /tmp/foo.o sleep 1
Lowering default frequency rate to 1000.
Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data ]
# perf evlist
/tmp/foo.o
[root@felicio ~]# perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
sample_freq }: 1000, sample_type: IP|TID|TIME|PERIOD, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1,
sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
---
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-14 15:41:15 +03:00
static bool registered_unprobe_atexit = false;
2015-10-14 15:41:14 +03:00
	if (IS_ERR(obj) || !obj) {
		snprintf(errbuf, sizeof(errbuf),
			 "Internal error: load bpf obj with NULL");
		err = -EINVAL;
		goto errout;
	}
2015-10-14 15:41:15 +03:00
	/*
	 * Register the atexit handler before calling bpf__probe() so
	 * that bpf__probe() doesn't need to unprobe the probe points
	 * it has already created when it fails.
	 */
	if (!registered_unprobe_atexit) {
		atexit(bpf__clear);
		registered_unprobe_atexit = true;
	}
	err = bpf__probe(obj);
	if (err) {
		bpf__strerror_probe(obj, err, errbuf, sizeof(errbuf));
		goto errout;
	}
perf tools: Load eBPF object into kernel
This patch uses bpf_object__load(), provided by libbpf, to load all
objects into the kernel.
Committer notes:
Testing it:
When using an incorrect kernel version number, i.e., having this in your
eBPF proggie:
int _version __attribute__((section("version"), used)) = 0x40100;
on a 4.3.0-rc6+ kernel, say, the load fails as shown below. This needs
checking at event parsing time, to provide a better error report to the user:
# perf record --event /tmp/foo.o sleep 1
libbpf: load bpf program failed: Invalid argument
libbpf: -- BEGIN DUMP LOG ---
libbpf:
libbpf: -- END LOG --
libbpf: failed to load program 'fork=_do_fork'
libbpf: failed to load object '/tmp/foo.o'
event syntax error: '/tmp/foo.o'
\___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
If we instead make it match, i.e. use 0x40300 on this v4.3.0-rc6+
kernel, the whole process goes through:
# perf record --event /tmp/foo.o -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.202 MB perf.data ]
# perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
exclude_guest: 1, mmap2: 1, comm_exec: 1
#
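The two "version" values used above follow the kernel's version-code packing, with major, minor and patch levels in successive bytes, so 0x40100 stands for 4.1.0 and 0x40300 for 4.3.0. A small sketch of that encoding, using hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Pack a version as (major << 16) | (minor << 8) | patch. */
static uint32_t kver_encode(uint32_t major, uint32_t minor, uint32_t patch)
{
	return (major << 16) | (minor << 8) | patch;
}

/* Unpack a version code into its three components. */
static void kver_decode(uint32_t code, uint32_t *major, uint32_t *minor,
			uint32_t *patch)
{
	*major = code >> 16;
	*minor = (code >> 8) & 0xff;
	*patch = code & 0xff;
}
```

With this packing the "-rc6+" suffix has no representation, which is consistent with 0x40300 being accepted on the v4.3.0-rc6+ kernel in the test above.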
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-14 15:41:16 +03:00
	err = bpf__load(obj);
	if (err) {
		bpf__strerror_load(obj, err, errbuf, sizeof(errbuf));
		goto errout;
	}
2015-10-14 15:41:17 +03:00
	err = bpf__foreach_tev(obj, add_bpf_event, &param);
	if (err) {
		snprintf(errbuf, sizeof(errbuf),
			 "Attach events in BPF object failed");
		goto errout;
	}
	return 0;
2015-10-14 15:41:14 +03:00
errout:
	data->error->help = strdup("(add -v to see detail)");
	data->error->str = strdup(errbuf);
	return err;
}
int parse_events_load_bpf(struct parse_events_evlist *data,
			  struct list_head *list,
2015-10-14 15:41:20 +03:00
			  char *bpf_file_name,
			  bool source)
2015-10-14 15:41:14 +03:00
{
	struct bpf_object *obj;
2015-10-14 15:41:20 +03:00
	obj = bpf__prepare_load(bpf_file_name, source);
2015-11-06 16:49:37 +03:00
	if (IS_ERR(obj)) {
2015-10-14 15:41:14 +03:00
		char errbuf[BUFSIZ];
		int err;
2015-11-06 16:49:37 +03:00
		err = PTR_ERR(obj);
2015-10-14 15:41:14 +03:00
		if (err == -ENOTSUP)
			snprintf(errbuf, sizeof(errbuf),
				 "BPF support is not compiled");
		else
2015-11-06 16:58:09 +03:00
			bpf__strerror_prepare_load(bpf_file_name,
						   source,
						   -err, errbuf,
						   sizeof(errbuf));
2015-10-14 15:41:14 +03:00
		data->error->help = strdup("(add -v to see detail)");
		data->error->str = strdup(errbuf);
		return err;
	}
	return parse_events_load_bpf_obj(data, list, obj);
}
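The IS_ERR()/PTR_ERR() pair used by parse_events_load_bpf() above relies on the kernel-style convention of encoding a negative errno in a pointer value. The following is a minimal userspace sketch of that convention, not the definitions from the real headers: the helper bodies are re-implemented here for illustration, and demo_prepare_load() is a hypothetical stand-in for a loader such as bpf__prepare_load().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors the kernel convention of reserving the top 4095 addresses. */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the negative errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when the pointer falls in the range reserved for encoded errors. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical loader: returns a valid object or an encoded -ENOENT. */
static void *demo_prepare_load(const char *name)
{
	static int dummy_object;

	assert(name != NULL);
	if (name[0] == '\0')
		return ERR_PTR(-ENOENT);
	return &dummy_object;
}
```

A caller would test the result with IS_ERR() first and only then convert it with PTR_ERR(), which is the order parse_events_load_bpf() follows.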
2012-03-15 23:09:15 +04:00
static int
parse_breakpoint_type(const char *type, struct perf_event_attr *attr)
2009-11-23 17:42:35 +03:00
{
	int i;
	for (i = 0; i < 3; i++) {
2012-03-15 23:09:15 +04:00
		if (!type || !type[i])
2009-11-23 17:42:35 +03:00
			break;
2012-06-29 11:22:54 +04:00
#define CHECK_SET_TYPE(bit) \
do { \
	if (attr->bp_type & bit) \
		return -EINVAL; \
	else \
		attr->bp_type |= bit; \
} while (0)
2009-11-23 17:42:35 +03:00
		switch (type[i]) {
		case 'r':
2012-06-29 11:22:54 +04:00
			CHECK_SET_TYPE(HW_BREAKPOINT_R);
2009-11-23 17:42:35 +03:00
			break;
		case 'w':
2012-06-29 11:22:54 +04:00
			CHECK_SET_TYPE(HW_BREAKPOINT_W);
2009-11-23 17:42:35 +03:00
			break;
		case 'x':
2012-06-29 11:22:54 +04:00
			CHECK_SET_TYPE(HW_BREAKPOINT_X);
2009-11-23 17:42:35 +03:00
			break;
		default:
2012-03-15 23:09:15 +04:00
			return -EINVAL;
2009-11-23 17:42:35 +03:00
		}
	}
2012-03-15 23:09:15 +04:00
2012-06-29 11:22:54 +04:00
#undef CHECK_SET_TYPE
2009-11-23 17:42:35 +03:00
	if (!attr->bp_type) /* Default */
		attr->bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
2012-03-15 23:09:15 +04:00
	return 0;
2009-11-23 17:42:35 +03:00
}
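The parsing rule in parse_breakpoint_type() can be tried in isolation. The sketch below is a standalone approximation: the DEMO_BP_* constants and demo_parse_bp_type() are hypothetical names introduced here so that it builds without <linux/hw_breakpoint.h>, and it writes into a plain unsigned int rather than a struct perf_event_attr.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Local stand-ins for the HW_BREAKPOINT_R/W/X bits (assumed values). */
enum {
	DEMO_BP_R = 1,
	DEMO_BP_W = 2,
	DEMO_BP_X = 4,
};

/* Parse up to three of 'r', 'w', 'x'; reject unknown or repeated letters. */
static int demo_parse_bp_type(const char *type, unsigned int *bp_type)
{
	int i;

	assert(bp_type != NULL);
	*bp_type = 0;

	for (i = 0; i < 3 && type && type[i]; i++) {
		unsigned int bit;

		switch (type[i]) {
		case 'r':
			bit = DEMO_BP_R;
			break;
		case 'w':
			bit = DEMO_BP_W;
			break;
		case 'x':
			bit = DEMO_BP_X;
			break;
		default:
			return -EINVAL;
		}
		if (*bp_type & bit)	/* same letter given twice */
			return -EINVAL;
		*bp_type |= bit;
	}

	if (!*bp_type)		/* nothing specified: default to read/write */
		*bp_type = DEMO_BP_R | DEMO_BP_W;
	return 0;
}
```

Folding the duplicate check into the loop body replaces the CHECK_SET_TYPE macro of the original; this is a design choice of the sketch, not a claim about how the file should be written.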
2013-07-02 23:27:25 +04:00
int parse_events_add_breakpoint(struct list_head *list, int *idx,
2014-05-29 19:26:51 +04:00
				void *ptr, char *type, u64 len)
2009-11-23 17:42:35 +03:00
{
2012-03-15 23:09:15 +04:00
	struct perf_event_attr attr;
2009-11-23 17:42:35 +03:00
2012-03-15 23:09:15 +04:00
	memset(&attr, 0, sizeof(attr));
2012-03-20 22:15:39 +04:00
	attr.bp_addr = (unsigned long) ptr;
2009-11-23 17:42:35 +03:00
2012-03-15 23:09:15 +04:00
	if (parse_breakpoint_type(type, &attr))
		return -EINVAL;
2009-11-23 17:42:35 +03:00
2014-05-29 19:26:51 +04:00
	/* Provide some defaults if len is not specified */
	if (!len) {
		if (attr.bp_type == HW_BREAKPOINT_X)
			len = sizeof(long);
		else
			len = HW_BREAKPOINT_LEN_4;
	}
	attr.bp_len = len;
2009-07-01 07:04:34 +04:00
2012-03-15 23:09:15 +04:00
	attr.type = PERF_TYPE_BREAKPOINT;
2012-07-14 23:03:10 +04:00
	attr.sample_period = 1;
2011-04-27 05:55:40 +04:00
2015-07-29 12:42:10 +03:00
	return add_event(list, idx, &attr, NULL, NULL);
2009-06-22 15:14:28 +04:00
}
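The default-length rule in parse_events_add_breakpoint() is small enough to restate on its own. This is a sketch under assumptions: demo_bp_len() and the DEMO_HW_* constants are hypothetical stand-ins defined locally, with values assumed to match the corresponding HW_BREAKPOINT_* definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins (assumed values) for the hw_breakpoint constants. */
enum {
	DEMO_HW_BP_R = 1,
	DEMO_HW_BP_W = 2,
	DEMO_HW_BP_X = 4,
};
#define DEMO_HW_BP_LEN_4 4

/* Return the caller's length, or a default chosen from the breakpoint type. */
static uint64_t demo_bp_len(unsigned int bp_type, uint64_t len)
{
	assert(bp_type != 0);	/* the type is expected to be resolved first */

	if (len)
		return len;
	if (bp_type == DEMO_HW_BP_X)	/* pure execute breakpoint */
		return sizeof(long);
	return DEMO_HW_BP_LEN_4;
}
```

Note that only a pure execute breakpoint gets sizeof(long); any type that includes a read or write bit falls back to the 4-byte length, as in the original comparison against HW_BREAKPOINT_X.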
2015-04-22 22:10:22 +03:00
static int check_type_val(struct parse_events_term *term,
			  struct parse_events_error *err,
			  int type)
{
	if (type == term->type_val)
		return 0;
	if (err) {
		err->idx = term->err_val;
		if (type == PARSE_EVENTS__TERM_TYPE_NUM)
			err->str = strdup("expected numeric value");
		else
			err->str = strdup("expected string value");
	}
	return -EINVAL;
}
2015-09-28 06:52:13 +03:00
typedef int config_term_func_t(struct perf_event_attr *attr,
			       struct parse_events_term *term,
			       struct parse_events_error *err);
static int config_term_common(struct perf_event_attr *attr,
			      struct parse_events_term *term,
			      struct parse_events_error *err)
2012-03-15 23:09:16 +04:00
{
2015-04-22 22:10:22 +03:00
#define CHECK_TYPE_VAL(type) \
do { \
	if (check_type_val(term, err, PARSE_EVENTS__TERM_TYPE_##type)) \
		return -EINVAL; \
2012-04-25 20:24:57 +04:00
} while (0)
	switch (term->type_term) {
2012-03-15 23:09:16 +04:00
	case PARSE_EVENTS__TERM_TYPE_CONFIG:
2012-04-25 20:24:57 +04:00
		CHECK_TYPE_VAL(NUM);
2012-03-15 23:09:16 +04:00
		attr->config = term->val.num;
		break;
	case PARSE_EVENTS__TERM_TYPE_CONFIG1:
2012-04-25 20:24:57 +04:00
		CHECK_TYPE_VAL(NUM);
2012-03-15 23:09:16 +04:00
		attr->config1 = term->val.num;
		break;
	case PARSE_EVENTS__TERM_TYPE_CONFIG2:
2012-04-25 20:24:57 +04:00
		CHECK_TYPE_VAL(NUM);
2012-03-15 23:09:16 +04:00
		attr->config2 = term->val.num;
		break;
	case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD:
2012-04-25 20:24:57 +04:00
		CHECK_TYPE_VAL(NUM);
2012-03-15 23:09:16 +04:00
		break;
2015-08-09 09:45:23 +03:00
	case PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ:
		CHECK_TYPE_VAL(NUM);
		break;
2012-03-15 23:09:16 +04:00
	case PARSE_EVENTS__TERM_TYPE_BRANCH_SAMPLE_TYPE:
		/*
		 * TODO uncomment when the field is available
		 * attr->branch_sample_type = term->val.num;
		 */
		break;
2015-08-04 11:30:19 +03:00
	case PARSE_EVENTS__TERM_TYPE_TIME:
		CHECK_TYPE_VAL(NUM);
		if (term->val.num > 1) {
			err->str = strdup("expected 0 or 1");
			err->idx = term->err_val;
			return -EINVAL;
		}
		break;
perf callchain: Per-event type selection support
This patchkit adds the ability to set the callgraph mode (fp, dwarf, lbr) per
event. This in turn can reduce sampling overhead and the size of
perf.data.
Here is an example.
perf record -e 'cpu/cpu-cycles,period=1000,call-graph=fp,time=1/,cpu/instructions,call-graph=lbr/' sleep 1
perf evlist -v
cpu/cpu-cycles,period=1000,call-graph=fp,time=1/: type: 4, size: 112,
config: 0x3c, { sample_period, sample_freq }: 1000, sample_type:
IP|TID|TIME|CALLCHAIN|PERIOD|IDENTIFIER, read_format: ID, disabled: 1,
inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all:
1, exclude_guest: 1, mmap2: 1, comm_exec: 1
cpu/instructions,call-graph=lbr/: type: 4, size: 112, config: 0xc0, {
sample_period, sample_freq }: 4000, sample_type:
IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK|IDENTIFIER, read_format: ID,
disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1,
exclude_guest: 1
Signed-off-by: Kan Liang <kan.liang@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1439289050-40510-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-11 13:30:47 +03:00
	case PARSE_EVENTS__TERM_TYPE_CALLGRAPH:
		CHECK_TYPE_VAL(STR);
		break;
	case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
		CHECK_TYPE_VAL(NUM);
		break;
perf tools: Enable pre-event inherit setting by config terms
This patch lets perf record set an event's attr.inherit bit through
config terms such as:
# perf record -e cycles/no-inherit/ ...
# perf record -e cycles/inherit/ ...
so the user can control the inherit bit for each event separately.
In the following example, a.out fork()s in main and then does some complex
CPU-intensive computations in both of its children.
Basic result with and without inherit:
# perf record -e cycles -e instructions ./a.out
[ perf record: Woken up 9 times to write data ]
[ perf record: Captured and wrote 2.205 MB perf.data (47920 samples) ]
# perf report --stdio
# ...
# Samples: 23K of event 'cycles'
# Event count (approx.): 23641752891
...
# Samples: 24K of event 'instructions'
# Event count (approx.): 30428312415
# perf record -i -e cycles -e instructions ./a.out
[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 1.111 MB perf.data (24019 samples) ]
...
# Samples: 12K of event 'cycles'
# Event count (approx.): 11699501775
...
# Samples: 12K of event 'instructions'
# Event count (approx.): 15058023559
Cancel inherit for one event when it is enabled globally:
# perf record -e cycles/no-inherit/ -e instructions ./a.out
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.660 MB perf.data (36004 samples) ]
...
# Samples: 12K of event 'cycles/no-inherit/'
# Event count (approx.): 11895759282
...
# Samples: 24K of event 'instructions'
# Event count (approx.): 30668000441
Enable inherit for one event when it is disabled globally:
# perf record -i -e cycles/inherit/ -e instructions ./a.out
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.654 MB perf.data (35868 samples) ]
...
# Samples: 23K of event 'cycles/inherit/'
# Event count (approx.): 23285400229
...
# Samples: 11K of event 'instructions'
# Event count (approx.): 14969050259
Committer note:
One can check if the bit was set, in addition to seeing the result in
the perf.data file size as above by doing one of:
# perf record -e cycles -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.911 MB perf.data (63 samples) ]
# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
#
So, the inherit bit was set in both, now, if we disable it globally using
--no-inherit:
# perf record --no-inherit -e cycles -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.910 MB perf.data (56 samples) ]
# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
No inherit bit is set. Now disabling it globally and setting it just on the cycles event:
# perf record --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.909 MB perf.data (48 samples) ]
# perf evlist -v
cycles/inherit/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
#
We can also see it by using a more verbose level of debug messages in
the tool that sets up the perf_event_attr, 'perf record' in this case:
[root@zoo ~]# perf record -vv --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
------------------------------------------------------------
perf_event_attr:
size 112
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID
disabled 1
inherit 1
mmap 1
comm 1
freq 1
task 1
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8
------------------------------------------------------------
perf_event_attr:
size 112
config 0x1
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID
disabled 1
freq 1
sample_id_all 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
<SNIP>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446029705-199659-2-git-send-email-wangnan0@huawei.com
[ s/u64/bool/ for the perf_evsel_config_term inherit field - jolsa]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28 13:55:02 +03:00
	case PARSE_EVENTS__TERM_TYPE_INHERIT:
		CHECK_TYPE_VAL(NUM);
		break;
	case PARSE_EVENTS__TERM_TYPE_NOINHERIT:
		CHECK_TYPE_VAL(NUM);
		break;
2012-05-21 11:12:53 +04:00
	case PARSE_EVENTS__TERM_TYPE_NAME:
		CHECK_TYPE_VAL(STR);
		break;
2012-03-15 23:09:16 +04:00
	default:
perf tools: Show proper error message for wrong terms of hw/sw events
Show a proper error message and the valid terms when a wrong config term
is specified for hw/sw type perf events.
This patch makes the original error format function formats_error_string()
more generic: it outputs only the static config terms for hw/sw perf
events, and prepends pmu formats for pmu events.
Before this patch:
$ perf record -e 'cpu-clock/freqx=200/' -a sleep 1
invalid or unsupported event: 'cpu-clock/freqx=200/'
Run 'perf list' for a list of valid events
usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
After this patch:
$ perf record -e 'cpu-clock/freqx=200/' -a sleep 1
event syntax error: 'cpu-clock/freqx=200/'
\___ unknown term
valid terms: config,config1,config2,name,period,freq,branch_type,time,call-graph,stack-size
Run 'perf list' for a list of valid events
usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1443412336-120050-2-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 06:52:14 +03:00
		err->str = strdup("unknown term");
		err->idx = term->err_term;
		err->help = parse_events_formats_error_string(NULL);
2012-03-15 23:09:16 +04:00
		return -EINVAL;
	}
2012-04-25 20:24:57 +04:00
2012-03-15 23:09:16 +04:00
	return 0;
2012-04-25 20:24:57 +04:00
#undef CHECK_TYPE_VAL
2012-03-15 23:09:16 +04:00
}
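CHECK_TYPE_VAL in config_term_common() combines preprocessor token pasting with an early return from the enclosing function. The sketch below isolates that pattern; every demo_* and DEMO_* name is hypothetical and introduced only for illustration, and the simplified struct stands in for struct parse_events_term.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum demo_term_val {
	DEMO_TERM_TYPE_NUM,
	DEMO_TERM_TYPE_STR,
};

/* Simplified stand-in for a parsed term: only the value type is kept. */
struct demo_term {
	enum demo_term_val type_val;
};

static int demo_check_type_val(const struct demo_term *term,
			       enum demo_term_val type)
{
	assert(term != NULL);
	return term->type_val == type ? 0 : -EINVAL;
}

/* DEMO_CHECK(NUM) pastes to a check against DEMO_TERM_TYPE_NUM. */
#define DEMO_CHECK(type) \
do { \
	if (demo_check_type_val(term, DEMO_TERM_TYPE_##type)) \
		return -EINVAL; \
} while (0)

/* Store value into *out only when the term carries a numeric value. */
static int demo_config_numeric(const struct demo_term *term, long *out,
			       long value)
{
	DEMO_CHECK(NUM);	/* returns early on a string-valued term */
	*out = value;
	return 0;
}

#undef DEMO_CHECK
```

Because the macro expands to a return statement and refers to a variable named term, it only makes sense inside functions that have such a parameter, which is presumably why the original defines and undefines it within a single function body.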
2015-09-28 06:52:13 +03:00
static int config_term_pmu(struct perf_event_attr *attr,
			   struct parse_events_term *term,
			   struct parse_events_error *err)
{
	if (term->type_term == PARSE_EVENTS__TERM_TYPE_USER)
		/*
		 * Always succeed for sysfs terms, as we don't know
		 * at this point what type they need to have.
		 */
		return 0;
	else
		return config_term_common(attr, term, err);
}
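config_term_func_t is a typedef of a function type, so handlers such as config_term_common() and config_term_pmu() can be passed around through a pointer to it. How the typedef is consumed is not shown in this part of the file, so the iteration helper below is an assumption; all demo_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum {
	DEMO_TERM_USER,
	DEMO_TERM_CONFIG,
	DEMO_TERM_BOGUS,
};

struct demo_term {
	int type_term;
};

/* A function type; callers hold it through a pointer (demo_term_func_t *). */
typedef int demo_term_func_t(const struct demo_term *term);

/* Accepts only the statically known term. */
static int demo_term_common(const struct demo_term *term)
{
	return term->type_term == DEMO_TERM_CONFIG ? 0 : -EINVAL;
}

/* Accepts user-defined terms unconditionally, defers the rest. */
static int demo_term_pmu(const struct demo_term *term)
{
	if (term->type_term == DEMO_TERM_USER)
		return 0;
	return demo_term_common(term);
}

/* Apply one handler to every term, stopping at the first error. */
static int demo_config_all(const struct demo_term *terms, size_t n,
			   demo_term_func_t *check)
{
	size_t i;

	assert(check != NULL);
	for (i = 0; i < n; i++) {
		int err = check(&terms[i]);

		if (err)
			return err;
	}
	return 0;
}
```

With this shape, the same term list can be validated under different policies just by passing a different handler.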
2015-09-28 06:52:16 +03:00
static int config_term_tracepoint(struct perf_event_attr *attr,
				  struct parse_events_term *term,
				  struct parse_events_error *err)
{
	switch (term->type_term) {
	case PARSE_EVENTS__TERM_TYPE_CALLGRAPH:
	case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
perf tools: Enable pre-event inherit setting by config terms
This patch lets perf record set an event's attr.inherit bit through
config terms such as:
# perf record -e cycles/no-inherit/ ...
# perf record -e cycles/inherit/ ...
so the user can control the inherit bit for each event separately.
In the following example, a.out fork()s in main and then does some complex
CPU-intensive computations in both of its children.
Basic result with and without inherit:
# perf record -e cycles -e instructions ./a.out
[ perf record: Woken up 9 times to write data ]
[ perf record: Captured and wrote 2.205 MB perf.data (47920 samples) ]
# perf report --stdio
# ...
# Samples: 23K of event 'cycles'
# Event count (approx.): 23641752891
...
# Samples: 24K of event 'instructions'
# Event count (approx.): 30428312415
# perf record -i -e cycles -e instructions ./a.out
[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 1.111 MB perf.data (24019 samples) ]
...
# Samples: 12K of event 'cycles'
# Event count (approx.): 11699501775
...
# Samples: 12K of event 'instructions'
# Event count (approx.): 15058023559
Cancel inherit for one event when it is globally enabled:
# perf record -e cycles/no-inherit/ -e instructions ./a.out
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.660 MB perf.data (36004 samples) ]
...
# Samples: 12K of event 'cycles/no-inherit/'
# Event count (approx.): 11895759282
...
# Samples: 24K of event 'instructions'
# Event count (approx.): 30668000441
Enable inherit for one event when it is globally disabled:
# perf record -i -e cycles/inherit/ -e instructions ./a.out
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.654 MB perf.data (35868 samples) ]
...
# Samples: 23K of event 'cycles/inherit/'
# Event count (approx.): 23285400229
...
# Samples: 11K of event 'instructions'
# Event count (approx.): 14969050259
Committer note:
One can check if the bit was set, in addition to seeing the result in
the perf.data file size as above by doing one of:
# perf record -e cycles -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.911 MB perf.data (63 samples) ]
# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
#
So, the inherit bit was set in both, now, if we disable it globally using
--no-inherit:
# perf record --no-inherit -e cycles -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.910 MB perf.data (56 samples) ]
# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
No inherit bit is set. Next, disabling it globally and setting it just on the cycles event:
# perf record --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.909 MB perf.data (48 samples) ]
# perf evlist -v
cycles/inherit/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
#
We can see it as well by using a more verbose level of debug messages in
the tool that sets up the perf_event_attr, 'perf record' in this case:
[root@zoo ~]# perf record -vv --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
------------------------------------------------------------
perf_event_attr:
size 112
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID
disabled 1
inherit 1
mmap 1
comm 1
freq 1
task 1
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8
------------------------------------------------------------
perf_event_attr:
size 112
config 0x1
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID
disabled 1
freq 1
sample_id_all 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
<SNIP>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446029705-199659-2-git-send-email-wangnan0@huawei.com
[ s/u64/bool/ for the perf_evsel_config_term inherit field - jolsa]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-28 13:55:02 +03:00
case PARSE_EVENTS__TERM_TYPE_INHERIT :
case PARSE_EVENTS__TERM_TYPE_NOINHERIT :
2015-09-28 06:52:16 +03:00
return config_term_common ( attr , term , err ) ;
default :
if ( err ) {
err - > idx = term - > err_term ;
err - > str = strdup ( " unknown term " ) ;
err - > help = strdup ( " valid terms: call-graph,stack-size,inherit,no-inherit \n " ) ;
}
return - EINVAL ;
}
return 0 ;
}
2012-03-15 23:09:16 +04:00
static int config_attr ( struct perf_event_attr * attr ,
2015-04-22 22:10:22 +03:00
struct list_head * head ,
2015-09-28 06:52:13 +03:00
struct parse_events_error * err ,
config_term_func_t config_term )
2012-03-15 23:09:16 +04:00
{
2013-01-18 23:29:49 +04:00
struct parse_events_term * term ;
2012-03-15 23:09:16 +04:00
list_for_each_entry ( term , head , list )
2015-04-22 22:10:22 +03:00
if ( config_term ( attr , term , err ) )
2012-03-15 23:09:16 +04:00
return - EINVAL ;
return 0 ;
}
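The callback dispatch used by config_attr() can be illustrated with a small
standalone sketch. It is not the perf code: the demo_* names, the enum values
and the singly linked list are hypothetical stand-ins chosen so the example
compiles on its own. It only shows the idea that one list walker is reused
with different per-event-type term checkers and stops at the first rejected
term.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the perf structures. */
enum demo_term_type { DEMO_TERM_PERIOD, DEMO_TERM_CALLGRAPH, DEMO_TERM_USER };

struct demo_term {
	enum demo_term_type type;
	struct demo_term *next;	/* singly linked here instead of list_head */
};

/* Same role as config_term_func_t: validate one term, 0 on success. */
typedef int (*demo_term_func_t)(const struct demo_term *term);

/* Common checker (assumption): rejects USER terms, accepts the rest. */
static int demo_check_common(const struct demo_term *term)
{
	return term->type == DEMO_TERM_USER ? -EINVAL : 0;
}

/* Tracepoint-style checker: only call-graph terms are allowed. */
static int demo_check_tracepoint(const struct demo_term *term)
{
	return term->type == DEMO_TERM_CALLGRAPH ? 0 : -EINVAL;
}

/* Walk the list and stop at the first term the callback rejects. */
static int demo_config_attr(const struct demo_term *head,
			    demo_term_func_t check)
{
	const struct demo_term *t;

	for (t = head; t != NULL; t = t->next)
		if (check(t))
			return -EINVAL;
	return 0;
}
```

Passing demo_check_tracepoint instead of demo_check_common changes which
terms are accepted without touching the loop, which is the design choice the
config_term parameter appears to serve.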
2015-07-29 12:42:10 +03:00
static int get_config_terms ( struct list_head * head_config ,
struct list_head * head_terms __maybe_unused )
{
# define ADD_CONFIG_TERM(__type, __name, __val) \
do { \
struct perf_evsel_config_term * __t ; \
\
__t = zalloc ( sizeof ( * __t ) ) ; \
if ( ! __t ) \
return - ENOMEM ; \
\
INIT_LIST_HEAD ( & __t - > list ) ; \
__t - > type = PERF_EVSEL__CONFIG_TERM_ # # __type ; \
__t - > val . __name = __val ; \
list_add_tail ( & __t - > list , head_terms ) ; \
} while ( 0 )
struct parse_events_term * term ;
list_for_each_entry ( term , head_config , list ) {
switch ( term - > type_term ) {
2015-07-29 12:42:11 +03:00
case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD :
ADD_CONFIG_TERM ( PERIOD , period , term - > val . num ) ;
2015-08-04 11:30:19 +03:00
break ;
2015-08-09 09:45:23 +03:00
case PARSE_EVENTS__TERM_TYPE_SAMPLE_FREQ :
ADD_CONFIG_TERM ( FREQ , freq , term - > val . num ) ;
break ;
2015-08-04 11:30:19 +03:00
case PARSE_EVENTS__TERM_TYPE_TIME :
ADD_CONFIG_TERM ( TIME , time , term - > val . num ) ;
break ;
perf callchain: Per-event type selection support
This patchkit adds the ability to set the callgraph mode (fp, dwarf, lbr) per
event. This in turn can reduce sampling overhead and the size of the
perf.data file.
Here is an example.
perf record -e 'cpu/cpu-cycles,period=1000,call-graph=fp,time=1/,cpu/instructions,call-graph=lbr/' sleep 1
perf evlist -v
cpu/cpu-cycles,period=1000,call-graph=fp,time=1/: type: 4, size: 112,
config: 0x3c, { sample_period, sample_freq }: 1000, sample_type:
IP|TID|TIME|CALLCHAIN|PERIOD|IDENTIFIER, read_format: ID, disabled: 1,
inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all:
1, exclude_guest: 1, mmap2: 1, comm_exec: 1
cpu/instructions,call-graph=lbr/: type: 4, size: 112, config: 0xc0, {
sample_period, sample_freq }: 4000, sample_type:
IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK|IDENTIFIER, read_format: ID,
disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1,
exclude_guest: 1
Signed-off-by: Kan Liang <kan.liang@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1439289050-40510-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-11 13:30:47 +03:00
case PARSE_EVENTS__TERM_TYPE_CALLGRAPH :
ADD_CONFIG_TERM ( CALLGRAPH , callgraph , term - > val . str ) ;
break ;
case PARSE_EVENTS__TERM_TYPE_STACKSIZE :
ADD_CONFIG_TERM ( STACK_USER , stack_user , term - > val . num ) ;
break ;
2015-10-28 13:55:02 +03:00
case PARSE_EVENTS__TERM_TYPE_INHERIT :
ADD_CONFIG_TERM ( INHERIT , inherit , term - > val . num ? 1 : 0 ) ;
break ;
case PARSE_EVENTS__TERM_TYPE_NOINHERIT :
ADD_CONFIG_TERM ( INHERIT , inherit , term - > val . num ? 0 : 1 ) ;
break ;
2015-07-29 12:42:10 +03:00
default :
break ;
}
}
# undef ADD_CONFIG_TERM
return 0 ;
}
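The ADD_CONFIG_TERM pattern (a statement-like macro that allocates a record,
selects a union member by name, and returns from the enclosing function on
allocation failure) can be sketched in isolation as follows. All demo_* and
DEMO_* names are hypothetical, and a plain singly linked list replaces
list_head so the sketch stays self-contained. It is an illustration of the
technique, not the perf implementation.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, simplified config-term record. */
enum demo_cfg_type { DEMO_CFG_PERIOD, DEMO_CFG_INHERIT };

struct demo_cfg {
	enum demo_cfg_type type;
	union {
		unsigned long long period;
		int inherit;
	} val;
	struct demo_cfg *next;
};

/*
 * Allocate one record, fill the union member named by __name and append it.
 * The do { } while (0) wrapper makes the macro behave as a single statement;
 * the embedded return leaves the *calling* function on allocation failure.
 */
#define DEMO_ADD_CFG(__tail, __type, __name, __val)		\
do {								\
	struct demo_cfg *__t = calloc(1, sizeof(*__t));		\
								\
	if (!__t)						\
		return -ENOMEM;					\
	__t->type = DEMO_CFG_##__type;				\
	__t->val.__name = (__val);				\
	*(__tail) = __t;					\
	(__tail) = &__t->next;					\
} while (0)

/* Build a two-entry list: a period term and an inherit term. */
static int demo_build_terms(struct demo_cfg **head,
			    unsigned long long period, int no_inherit)
{
	struct demo_cfg **tail = head;

	*head = NULL;
	DEMO_ADD_CFG(tail, PERIOD, period, period);
	/* A no-inherit request stores the inverted flag. */
	DEMO_ADD_CFG(tail, INHERIT, inherit, no_inherit ? 0 : 1);
	return 0;
}

#undef DEMO_ADD_CFG
```

Because the macro returns early, records added before a failed allocation
stay on the list; in this sketch the caller is assumed to free them.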
2015-09-28 06:52:16 +03:00
int parse_events_add_tracepoint ( struct list_head * list , int * idx ,
char * sys , char * event ,
2015-09-29 18:05:31 +03:00
struct parse_events_error * err ,
2015-09-28 06:52:16 +03:00
struct list_head * head_config )
{
if ( head_config ) {
struct perf_event_attr attr ;
2015-09-29 18:05:31 +03:00
if ( config_attr ( & attr , head_config , err ,
2015-09-28 06:52:16 +03:00
config_term_tracepoint ) )
return - EINVAL ;
}
if ( strpbrk ( sys , " *? " ) )
return add_tracepoint_multi_sys ( list , idx , sys , event ,
2015-09-29 18:05:31 +03:00
err , head_config ) ;
2015-09-28 06:52:16 +03:00
else
return add_tracepoint_event ( list , idx , sys , event ,
2015-09-29 18:05:31 +03:00
err , head_config ) ;
2015-09-28 06:52:16 +03:00
}
2015-04-22 22:10:24 +03:00
int parse_events_add_numeric ( struct parse_events_evlist * data ,
struct list_head * list ,
2012-08-07 21:43:13 +04:00
u32 type , u64 config ,
2012-03-15 23:09:16 +04:00
struct list_head * head_config )
2009-05-26 13:10:09 +04:00
{
2012-03-15 23:09:15 +04:00
struct perf_event_attr attr ;
2015-07-29 12:42:10 +03:00
LIST_HEAD ( config_terms ) ;
2009-07-01 07:04:34 +04:00
2012-03-15 23:09:15 +04:00
memset ( & attr , 0 , sizeof ( attr ) ) ;
attr . type = type ;
attr . config = config ;
2012-03-15 23:09:16 +04:00
2015-07-29 12:42:10 +03:00
if ( head_config ) {
2015-09-28 06:52:13 +03:00
if ( config_attr ( & attr , head_config , data - > error ,
config_term_common ) )
2015-07-29 12:42:10 +03:00
return - EINVAL ;
if ( get_config_terms ( head_config , & config_terms ) )
return - ENOMEM ;
}
2012-03-15 23:09:16 +04:00
2015-07-29 12:42:10 +03:00
return add_event ( list , & data - > idx , & attr , NULL , & config_terms ) ;
2009-07-01 07:04:34 +04:00
}
2009-05-26 13:10:09 +04:00
2013-01-18 23:29:49 +04:00
static int parse_events__is_name_term ( struct parse_events_term * term )
2012-05-21 11:12:53 +04:00
{
return term - > type_term = = PARSE_EVENTS__TERM_TYPE_NAME ;
}
2012-06-12 20:45:00 +04:00
static char * pmu_event_name ( struct list_head * head_terms )
2012-05-21 11:12:53 +04:00
{
2013-01-18 23:29:49 +04:00
struct parse_events_term * term ;
2012-05-21 11:12:53 +04:00
list_for_each_entry ( term , head_terms , list )
if ( parse_events__is_name_term ( term ) )
return term - > val . str ;
2012-06-12 20:45:00 +04:00
return NULL ;
2012-05-21 11:12:53 +04:00
}
2015-04-22 22:10:19 +03:00
int parse_events_add_pmu ( struct parse_events_evlist * data ,
struct list_head * list , char * name ,
struct list_head * head_config )
2012-03-15 23:09:18 +04:00
{
struct perf_event_attr attr ;
2014-09-24 18:04:06 +04:00
struct perf_pmu_info info ;
2012-03-15 23:09:18 +04:00
struct perf_pmu * pmu ;
2013-11-12 20:58:49 +04:00
struct perf_evsel * evsel ;
2015-07-29 12:42:10 +03:00
LIST_HEAD ( config_terms ) ;
2012-03-15 23:09:18 +04:00
pmu = perf_pmu__find ( name ) ;
if ( ! pmu )
return - EINVAL ;
2014-07-31 10:00:49 +04:00
if ( pmu - > default_config ) {
memcpy ( & attr , pmu - > default_config ,
sizeof ( struct perf_event_attr ) ) ;
} else {
memset ( & attr , 0 , sizeof ( attr ) ) ;
}
2012-03-15 23:09:18 +04:00
2014-08-15 23:08:40 +04:00
if ( ! head_config ) {
attr . type = pmu - > type ;
2015-07-29 12:42:10 +03:00
evsel = __add_event ( list , & data - > idx , & attr , NULL , pmu - > cpus , NULL ) ;
2014-08-15 23:08:40 +04:00
return evsel ? 0 : - ENOMEM ;
}
2014-09-24 18:04:06 +04:00
if ( perf_pmu__check_alias ( pmu , head_config , & info ) )
2012-06-15 10:31:41 +04:00
return - EINVAL ;
2012-03-15 23:09:18 +04:00
/*
* Configure hardcoded terms first . config_term_pmu ( ) accepts
* sysfs terms unchecked , as their type is not known at this point .
*/
2015-09-28 06:52:13 +03:00
if ( config_attr ( & attr , head_config , data - > error , config_term_pmu ) )
2015-04-22 22:10:18 +03:00
return - EINVAL ;
2012-03-15 23:09:18 +04:00
2015-07-29 12:42:10 +03:00
if ( get_config_terms ( head_config , & config_terms ) )
return - ENOMEM ;
perf tools: Add term support for parse_events_error
Allow event term processing to report errors back, like:
$ perf record -e 'cpu/even=0x1/' ls
event syntax error: 'cpu/even=0x1/'
\___ unknown term
valid terms: pc,any,inv,edge,cmask,event,in_tx,ldlat,umask,in_tx_cp,offcore_rsp,config,config1,config2,name,period,branch_type
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1429729824-13932-7-git-send-email-jolsa@kernel.org
[ Renamed 'error' variables to 'err', not to clash with util.h error() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-22 22:10:21 +03:00
if ( perf_pmu__config ( pmu , & attr , head_config , data - > error ) )
2012-03-15 23:09:18 +04:00
return - EINVAL ;
2015-04-22 22:10:19 +03:00
evsel = __add_event ( list , & data - > idx , & attr ,
2015-07-29 12:42:10 +03:00
pmu_event_name ( head_config ) , pmu - > cpus ,
& config_terms ) ;
2013-11-12 20:58:49 +04:00
if ( evsel ) {
2014-09-24 18:04:06 +04:00
evsel - > unit = info . unit ;
evsel - > scale = info . scale ;
2014-11-21 12:31:12 +03:00
evsel - > per_pkg = info . per_pkg ;
2014-11-21 12:31:13 +03:00
evsel - > snapshot = info . snapshot ;
2013-11-12 20:58:49 +04:00
}
return evsel ? 0 : - ENOMEM ;
2012-03-15 23:09:18 +04:00
}
perf tools: Enable grouping logic for parsed events
This patch adds functionality that creates event groups based on the
way they are specified on the command line. It extends the '{}' group
syntax introduced in an earlier patch.
The current '--group/-g' option behaviour remains intact. If you
specify it for record/stat/top command, all the specified events
become members of a single group with the first event as a group
leader.
With the new '{}' group syntax you can create group like:
# perf record -e '{cycles,faults}' ls
resulting in single event group containing 'cycles' and 'faults'
events, with cycles event as group leader.
All groups are created per thread and per CPU. Thus recording an
event group for 2 threads on a server with 4 CPUs will create
8 separate groups.
Examples (first event in brackets is group leader):
# 1 group (cpu-clock,task-clock)
perf record --group -e cpu-clock,task-clock ls
perf record -e '{cpu-clock,task-clock}' ls
# 2 groups (cpu-clock,task-clock) (minor-faults,major-faults)
perf record -e '{cpu-clock,task-clock},{minor-faults,major-faults}' ls
# 1 group (cpu-clock,task-clock,minor-faults,major-faults)
perf record --group -e cpu-clock,task-clock -e minor-faults,major-faults ls
perf record -e '{cpu-clock,task-clock,minor-faults,major-faults}' ls
# 2 groups (cpu-clock,task-clock) (minor-faults,major-faults)
perf record -e '{cpu-clock,task-clock}' -e '{minor-faults,major-faults}' \
-e instructions ls
# 1 group
# (cpu-clock,task-clock,minor-faults,major-faults,instructions)
perf record --group -e cpu-clock,task-clock \
-e minor-faults,major-faults -e instructions ls
perf record -e '{cpu-clock,task-clock,minor-faults,major-faults,instructions}' ls
It's possible to use a standard event modifier for a group. It applies
to all events in the group and updates each event's modifier settings,
for example:
# perf record -e '{faults:k,cache-references}:p'
resulting in the ':kp' modifier being used for the 'faults' event and the
':p' modifier being used for the 'cache-references' event.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ulrich Drepper <drepper@gmail.com>
Link: http://lkml.kernel.org/n/tip-ho42u0wcr8mn1otkalqi13qp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-08-08 14:22:36 +04:00
int parse_events__modifier_group ( struct list_head * list ,
char * event_mod )
2012-08-08 14:14:14 +04:00
{
2012-08-08 14:22:36 +04:00
return parse_events__modifier_event ( list , event_mod , true ) ;
}
2012-08-14 23:35:48 +04:00
void parse_events__set_leader ( char * name , struct list_head * list )
2012-08-08 14:22:36 +04:00
{
struct perf_evsel * leader ;
2015-09-06 10:13:17 +03:00
if ( list_empty ( list ) ) {
WARN_ONCE ( true , " WARNING: failed to set leader: empty list " ) ;
return ;
}
2012-08-14 23:35:48 +04:00
__perf_evlist__set_leader ( list ) ;
leader = list_entry ( list - > next , struct perf_evsel , node ) ;
2012-08-08 14:22:36 +04:00
	leader->group_name = name ? strdup(name) : NULL;
2012-08-08 14:14:14 +04:00
}
2013-07-02 23:27:25 +04:00
/* list_event is assumed to point to malloc'ed memory */
2012-03-20 22:15:40 +04:00
void parse_events_update_lists(struct list_head *list_event,
			       struct list_head *list_all)
{
	/*
	 * Called for single event definition. Update the
2012-08-08 14:14:14 +04:00
	 * 'all event' list, and reinit the 'single event'
2012-03-20 22:15:40 +04:00
	 * list, for next event definition.
	 */
	list_splice_tail(list_event, list_all);
2012-05-21 11:12:51 +04:00
	free(list_event);
2012-03-20 22:15:40 +04:00
}
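The ownership rule in parse_events_update_lists() (entries move to the
'all event' list, then the malloc'ed list head is freed) can be sketched
outside the tree. This is a minimal illustration, not the kernel's list.h:
the list type and the *_sketch helpers below are hypothetical stand-ins
written only for this example.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init_sketch(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add_tail_sketch(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Move every entry of 'list' to the tail of 'head'; 'list' itself is not linked in. */
static void list_splice_tail_sketch(struct list_head *list,
				    struct list_head *head)
{
	struct list_head *first = list->next;
	struct list_head *last = list->prev;

	if (first == list)
		return;		/* nothing to move */

	first->prev = head->prev;
	head->prev->next = first;
	last->next = head;
	head->prev = last;
}

/* Same shape as parse_events_update_lists(): splice, then free the head. */
static void update_lists_sketch(struct list_head *list_event,
				struct list_head *list_all)
{
	list_splice_tail_sketch(list_event, list_all);
	free(list_event);	/* only the head is freed, entries live on */
}
```

After the call, the entries that used to hang off list_event are reachable
only through list_all, which is why the head can be freed safely.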
2012-08-08 14:21:54 +04:00
struct event_modifier {
	int eu;
	int ek;
	int eh;
	int eH;
	int eG;
2015-04-08 00:25:14 +03:00
	int eI;
2012-08-08 14:21:54 +04:00
	int precise;
perf tools: Introduce 'P' modifier to request max precision
The 'P' modifier causes the event to use the maximum precise level that
can be detected.
The following record command:
$ perf record -e cycles:P ...
will detect the maximum precise level for the 'cycles' event and use it.
Committer note:
Testing it:
$ perf record -e cycles:P usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data (9 samples) ]
$ perf evlist
cycles:P
$ perf evlist -v
cycles:P: size: 112, { sample_period, sample_freq }: 4000, sample_type:
IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
enable_on_exec: 1, task: 1, precise_ip: 2, sample_id_all: 1, mmap2: 1,
comm_exec: 1
$
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-10-05 21:06:05 +03:00
	int precise_max;
2012-08-08 14:21:54 +04:00
	int exclude_GH;
2012-10-10 19:39:03 +04:00
	int sample_read;
perf tools: Add support for pinned modifier
This commit adds support for a new modifier "D", which requests that the
event, or group of events, be pinned to the PMU.
The "p" modifier is already taken for precise, and "P" may be used in
future to mean "fully precise".
So we use "D", which stands for pinneD - and looks like a padlock, or if
you're using the ":D" syntax perf smiles at you.
This is an oft-requested feature from our HW folks, who want to be able
to run a large number of events, but also want 100% accurate results for
instructions per cycle.
Comparison of results with and without pinning:
$ perf stat -e '{cycles,instructions}:D' -e cycles,instructions,...
79,590,480,683 cycles # 0.000 GHz
166,123,716,524 instructions # 2.09 insns per cycle
# 0.11 stalled cycles per insn
79,352,134,463 cycles # 0.000 GHz [11.11%]
165,178,301,818 instructions # 2.08 insns per cycle
# 0.11 stalled cycles per insn [11.13%]
As you can see, although perf does a very good job of scaling the values
in the non-pinned case, there is some small discrepancy.
The patch is fairly straightforward; the one detail is that we need to
make sure we only request pinning for the group leader when we have a
group.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1375795686-4226-1-git-send-email-michael@ellerman.id.au
[ Use perf_evsel__is_group_leader instead of open coded equivalent, as
suggested by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-08-06 17:28:05 +04:00
	int pinned;
2012-08-08 14:21:54 +04:00
};
static int get_event_modifier(struct event_modifier *mod, char *str,
			      struct perf_evsel *evsel)
2009-07-01 07:04:34 +04:00
{
2012-08-08 14:21:54 +04:00
	int eu = evsel ? evsel->attr.exclude_user : 0;
	int ek = evsel ? evsel->attr.exclude_kernel : 0;
	int eh = evsel ? evsel->attr.exclude_hv : 0;
	int eH = evsel ? evsel->attr.exclude_host : 0;
	int eG = evsel ? evsel->attr.exclude_guest : 0;
2015-04-08 00:25:14 +03:00
	int eI = evsel ? evsel->attr.exclude_idle : 0;
2012-08-08 14:21:54 +04:00
	int precise = evsel ? evsel->attr.precise_ip : 0;
2015-10-05 21:06:05 +03:00
	int precise_max = 0;
2012-10-10 19:39:03 +04:00
	int sample_read = 0;
2013-08-06 17:28:05 +04:00
	int pinned = evsel ? evsel->attr.pinned : 0;
2009-06-06 11:58:57 +04:00
2012-08-08 14:21:54 +04:00
	int exclude = eu | ek | eh;
	int exclude_GH = evsel ? evsel->exclude_GH : 0;
	memset(mod, 0, sizeof(*mod));
2011-04-27 06:06:33 +04:00
2009-07-01 07:04:34 +04:00
	while (*str) {
2010-04-09 01:03:20 +04:00
		if (*str == 'u') {
			if (!exclude)
				exclude = eu = ek = eh = 1;
2009-07-01 07:04:34 +04:00
			eu = 0;
2010-04-09 01:03:20 +04:00
		} else if (*str == 'k') {
			if (!exclude)
				exclude = eu = ek = eh = 1;
2009-07-01 07:04:34 +04:00
			ek = 0;
2010-04-09 01:03:20 +04:00
		} else if (*str == 'h') {
			if (!exclude)
				exclude = eu = ek = eh = 1;
2009-07-01 07:04:34 +04:00
			eh = 0;
2012-01-04 20:54:19 +04:00
		} else if (*str == 'G') {
			if (!exclude_GH)
				exclude_GH = eG = eH = 1;
			eG = 0;
		} else if (*str == 'H') {
			if (!exclude_GH)
				exclude_GH = eG = eH = 1;
			eH = 0;
2015-04-08 00:25:14 +03:00
		} else if (*str == 'I') {
			eI = 1;
2010-04-09 01:03:20 +04:00
		} else if (*str == 'p') {
			precise++;
2012-09-14 00:59:13 +04:00
			/* use of precise requires exclude_guest */
			if (!exclude_GH)
				eG = 1;
2015-10-05 21:06:05 +03:00
		} else if (*str == 'P') {
			precise_max = 1;
2012-10-10 19:39:03 +04:00
		} else if (*str == 'S') {
			sample_read = 1;
2013-08-06 17:28:05 +04:00
		} else if (*str == 'D') {
			pinned = 1;
2010-04-09 01:03:20 +04:00
		} else
2009-07-01 07:04:34 +04:00
			break;
2010-04-09 01:03:20 +04:00
2009-07-01 07:04:34 +04:00
		++str;
2009-05-26 11:17:18 +04:00
	}
2011-04-27 06:06:33 +04:00
2012-03-15 23:09:15 +04:00
	/*
	 * precise ip:
	 *
	 *  0 - SAMPLE_IP can have arbitrary skid
	 *  1 - SAMPLE_IP must have constant skid
	 *  2 - SAMPLE_IP requested to have 0 skid
	 *  3 - SAMPLE_IP must have 0 skid
	 *
	 *  See also PERF_RECORD_MISC_EXACT_IP
	 */
	if (precise > 3)
		return -EINVAL;
2011-04-27 06:06:33 +04:00
2012-08-08 14:21:54 +04:00
	mod->eu = eu;
	mod->ek = ek;
	mod->eh = eh;
	mod->eH = eH;
	mod->eG = eG;
2015-04-08 00:25:14 +03:00
	mod->eI = eI;
2012-08-08 14:21:54 +04:00
	mod->precise = precise;
2015-10-05 21:06:05 +03:00
	mod->precise_max = precise_max;
2012-08-08 14:21:54 +04:00
	mod->exclude_GH = exclude_GH;
2012-10-10 19:39:03 +04:00
	mod->sample_read = sample_read;
2013-08-06 17:28:05 +04:00
	mod->pinned = pinned;
2012-08-08 14:21:54 +04:00
	return 0;
}
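The privilege-level handling above is easy to misread: the first 'u', 'k'
or 'h' excludes every level and then re-enables the one named, while a
modifier applied on top of an existing event starts from that event's
current settings. The sketch below reproduces only the u/k/h/p part in a
standalone form. struct mod_sketch and apply_modifier_sketch() are
hypothetical names for this illustration and are not part of perf.

```c
#include <errno.h>

/* Hypothetical, trimmed-down stand-in for struct event_modifier. */
struct mod_sketch {
	int eu;		/* exclude user */
	int ek;		/* exclude kernel */
	int eh;		/* exclude hypervisor */
	int precise;	/* number of 'p' characters seen so far */
};

/*
 * Apply modifier string 'str' on top of 'prev' (NULL means no previous
 * settings). Only the u/k/h/p modifiers are handled in this sketch.
 * On error 'out' is left untouched.
 */
static int apply_modifier_sketch(struct mod_sketch *out, const char *str,
				 const struct mod_sketch *prev)
{
	int eu = prev ? prev->eu : 0;
	int ek = prev ? prev->ek : 0;
	int eh = prev ? prev->eh : 0;
	int precise = prev ? prev->precise : 0;
	int exclude = eu | ek | eh;

	for (; *str; str++) {
		if (*str == 'u' || *str == 'k' || *str == 'h') {
			/* First level modifier: exclude all, then re-enable one. */
			if (!exclude)
				exclude = eu = ek = eh = 1;
			if (*str == 'u')
				eu = 0;
			else if (*str == 'k')
				ek = 0;
			else
				eh = 0;
		} else if (*str == 'p') {
			precise++;
		} else {
			break;	/* unknown character ends parsing, as above */
		}
	}

	if (precise > 3)
		return -EINVAL;

	out->eu = eu;
	out->ek = ek;
	out->eh = eh;
	out->precise = precise;
	return 0;
}
```

Applying "k" with no previous settings and then "p" on top of the result
keeps kernel-only counting and raises precise to 1, which matches the
':kp' outcome described for group modifiers.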
2012-11-13 18:32:58 +04:00
/*
 * Basic modifier sanity check to validate it contains only one
 * instance of any modifier (apart from 'p') present.
 */
static int check_modifier(char *str)
{
	char *p = str;
	/* The sizeof includes 0 byte as well. */
2015-10-05 21:06:05 +03:00
	if (strlen(str) > (sizeof("ukhGHpppPSDI") - 1))
2012-11-13 18:32:58 +04:00
		return -1;
	while (*p) {
		if (*p != 'p' && strchr(p + 1, *p))
			return -1;
		p++;
	}
	return 0;
}
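The sanity check can be exercised on its own. The sketch below is a
standalone copy of the same logic under a hypothetical name,
check_modifier_sketch(), so that a few inputs can be tried without the
rest of perf. Note that it only rejects over-long strings and repeated
non-'p' characters; limiting the number of 'p' characters is left to
get_event_modifier().

```c
#include <string.h>

/* Standalone copy of the check above, renamed for illustration. */
static int check_modifier_sketch(const char *str)
{
	const char *p = str;

	/* sizeof counts the trailing 0 byte, hence the -1. */
	if (strlen(str) > sizeof("ukhGHpppPSDI") - 1)
		return -1;

	while (*p) {
		/* Any character other than 'p' may appear only once. */
		if (*p != 'p' && strchr(p + 1, *p))
			return -1;
		p++;
	}

	return 0;
}
```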
2012-08-08 14:21:54 +04:00
int parse_events__modifier_event(struct list_head *list, char *str, bool add)
{
	struct perf_evsel *evsel;
	struct event_modifier mod;
	if (str == NULL)
		return 0;
2012-11-13 18:32:58 +04:00
	if (check_modifier(str))
		return -EINVAL;
2012-08-08 14:21:54 +04:00
	if (!add && get_event_modifier(&mod, str, NULL))
		return -EINVAL;
2014-01-10 17:37:27 +04:00
	__evlist__for_each(list, evsel) {
2012-08-08 14:21:54 +04:00
		if (add && get_event_modifier(&mod, str, evsel))
			return -EINVAL;
		evsel->attr.exclude_user = mod.eu;
		evsel->attr.exclude_kernel = mod.ek;
		evsel->attr.exclude_hv = mod.eh;
		evsel->attr.precise_ip = mod.precise;
		evsel->attr.exclude_host = mod.eH;
		evsel->attr.exclude_guest = mod.eG;
2015-04-08 00:25:14 +03:00
		evsel->attr.exclude_idle = mod.eI;
2012-08-08 14:21:54 +04:00
		evsel->exclude_GH = mod.exclude_GH;
2012-10-10 19:39:03 +04:00
		evsel->sample_read = mod.sample_read;
2015-10-05 21:06:05 +03:00
		evsel->precise_max = mod.precise_max;
2013-08-06 17:28:05 +04:00
		if (perf_evsel__is_group_leader(evsel))
			evsel->attr.pinned = mod.pinned;
2012-03-15 23:09:15 +04:00
	}
2011-04-27 06:06:33 +04:00
2009-07-01 07:04:34 +04:00
	return 0;
}
2009-05-26 13:10:09 +04:00
2012-08-16 23:10:21 +04:00
int parse_events_name(struct list_head *list, char *name)
{
	struct perf_evsel *evsel;
2014-01-10 17:37:27 +04:00
	__evlist__for_each(list, evsel) {
2012-08-16 23:10:21 +04:00
		if (!evsel->name)
			evsel->name = strdup(name);
	}
	return 0;
}
2014-10-07 19:08:50 +04:00
static int
comp_pmu(const void *p1, const void *p2)
{
	struct perf_pmu_event_symbol *pmu1 = (struct perf_pmu_event_symbol *) p1;
	struct perf_pmu_event_symbol *pmu2 = (struct perf_pmu_event_symbol *) p2;
	return strcmp(pmu1->symbol, pmu2->symbol);
}
static void perf_pmu__parse_cleanup(void)
{
	if (perf_pmu_events_list_num > 0) {
		struct perf_pmu_event_symbol *p;
		int i;
		for (i = 0; i < perf_pmu_events_list_num; i++) {
			p = perf_pmu_events_list + i;
			free(p->symbol);
		}
		free(perf_pmu_events_list);
		perf_pmu_events_list = NULL;
		perf_pmu_events_list_num = 0;
	}
}
#define SET_SYMBOL(str, stype)		\
do {					\
	p->symbol = str;		\
	if (!p->symbol)			\
		goto err;		\
	p->type = stype;		\
} while (0)
/*
 * Read the pmu events list from sysfs
 * Save it into perf_pmu_events_list
 */
static void perf_pmu__parse_init(void)
{
	struct perf_pmu *pmu = NULL;
	struct perf_pmu_alias *alias;
	int len = 0;
	pmu = perf_pmu__find("cpu");
	if ((pmu == NULL) || list_empty(&pmu->aliases)) {
		perf_pmu_events_list_num = -1;
		return;
	}
	list_for_each_entry(alias, &pmu->aliases, list) {
		if (strchr(alias->name, '-'))
			len++;
		len++;
	}
	perf_pmu_events_list = malloc(sizeof(struct perf_pmu_event_symbol) * len);
	if (!perf_pmu_events_list)
		return;
	perf_pmu_events_list_num = len;
	len = 0;
	list_for_each_entry(alias, &pmu->aliases, list) {
		struct perf_pmu_event_symbol *p = perf_pmu_events_list + len;
		char *tmp = strchr(alias->name, '-');
		if (tmp != NULL) {
			SET_SYMBOL(strndup(alias->name, tmp - alias->name),
				   PMU_EVENT_SYMBOL_PREFIX);
			p++;
			SET_SYMBOL(strdup(++tmp), PMU_EVENT_SYMBOL_SUFFIX);
			len += 2;
		} else {
			SET_SYMBOL(strdup(alias->name), PMU_EVENT_SYMBOL);
			len++;
		}
	}
	qsort(perf_pmu_events_list, len,
	      sizeof(struct perf_pmu_event_symbol), comp_pmu);
	return;
err:
	perf_pmu__parse_cleanup();
}
enum perf_pmu_event_symbol_type
perf_pmu__parse_check(const char *name)
{
	struct perf_pmu_event_symbol p, *r;
	/* scan kernel pmu events from sysfs if needed */
	if (perf_pmu_events_list_num == 0)
		perf_pmu__parse_init();
	/*
	 * name "cpu" could be prefix of cpu-cycles or cpu// events.
	 * cpu-cycles has been handled by hardcode.
	 * So it must be cpu// events, not kernel pmu event.
	 */
	if ((perf_pmu_events_list_num <= 0) || !strcmp(name, "cpu"))
		return PMU_EVENT_SYMBOL_ERR;
	p.symbol = strdup(name);
	r = bsearch(&p, perf_pmu_events_list,
		    (size_t) perf_pmu_events_list_num,
		    sizeof(struct perf_pmu_event_symbol), comp_pmu);
	free(p.symbol);
	return r ? r->type : PMU_EVENT_SYMBOL_ERR;
}
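The init/check pair above amounts to: split each alias name at the first
'-' into a prefix and a suffix symbol, sort the table once with qsort(),
then classify a token with bsearch(). The following is a minimal
standalone sketch of that idea. It assumes fixed-size symbol buffers
instead of strdup() to keep the example short, and every name in it
(sym_type, struct sym, set_sym, build_table, lookup) is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

enum sym_type { SYM_ERR, SYM_WHOLE, SYM_PREFIX, SYM_SUFFIX };

/* Assumption: symbols fit in 32 bytes; longer ones are truncated. */
struct sym {
	char symbol[32];
	enum sym_type type;
};

static int comp_sym(const void *a, const void *b)
{
	const struct sym *s1 = a, *s2 = b;

	return strcmp(s1->symbol, s2->symbol);
}

static void set_sym(struct sym *s, const char *str, size_t n,
		    enum sym_type type)
{
	if (n >= sizeof(s->symbol))
		n = sizeof(s->symbol) - 1;
	memcpy(s->symbol, str, n);
	s->symbol[n] = '\0';
	s->type = type;
}

/* 'tab' must have room for 2 * n entries. Returns the number filled in. */
static int build_table(struct sym *tab, const char *const *names, int n)
{
	int len = 0;

	for (int i = 0; i < n; i++) {
		const char *dash = strchr(names[i], '-');

		if (dash) {
			/* "mem-loads" becomes prefix "mem" and suffix "loads". */
			set_sym(&tab[len++], names[i],
				(size_t)(dash - names[i]), SYM_PREFIX);
			set_sym(&tab[len++], dash + 1, strlen(dash + 1),
				SYM_SUFFIX);
		} else {
			set_sym(&tab[len++], names[i], strlen(names[i]),
				SYM_WHOLE);
		}
	}

	qsort(tab, (size_t)len, sizeof(*tab), comp_sym);
	return len;
}

static enum sym_type lookup(const struct sym *tab, int len, const char *name)
{
	struct sym key;
	const struct sym *r;

	set_sym(&key, name, strlen(name), SYM_ERR);
	r = bsearch(&key, tab, (size_t)len, sizeof(*tab), comp_sym);
	return r ? r->type : SYM_ERR;
}
```

Sorting once and searching with the same comparator is what lets the
lexer classify each token in logarithmic time rather than walking the
alias list for every word.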
2012-06-15 10:31:40 +04:00
static int parse_events__scanner(const char *str, void *data, int start_token)
2009-07-01 07:04:34 +04:00
{
2012-03-15 23:09:15 +04:00
	YY_BUFFER_STATE buffer;
2012-06-15 10:31:39 +04:00
	void *scanner;
2012-06-15 10:31:38 +04:00
	int ret;
2009-09-12 01:19:45 +04:00
2012-06-15 10:31:40 +04:00
	ret = parse_events_lex_init_extra(start_token, &scanner);
2012-06-15 10:31:39 +04:00
	if (ret)
		return ret;
	buffer = parse_events__scan_string(str, scanner);
2009-06-06 11:58:57 +04:00
2012-05-21 11:12:50 +04:00
#ifdef PARSER_DEBUG
	parse_events_debug = 1;
#endif
2012-06-15 10:31:39 +04:00
	ret = parse_events_parse(data, scanner);
	parse_events__flush_buffer(buffer, scanner);
	parse_events__delete_buffer(buffer, scanner);
	parse_events_lex_destroy(scanner);
	return ret;
}
2009-09-12 01:19:45 +04:00
2012-06-15 10:31:40 +04:00
/*
 * parse event config string, return a list of event terms.
 */
int parse_events_terms(struct list_head *terms, const char *str)
{
2013-01-18 23:56:57 +04:00
	struct parse_events_terms data = {
2012-06-15 10:31:40 +04:00
		.terms = NULL,
	};
	int ret;
	ret = parse_events__scanner(str, &data, PE_START_TERMS);
	if (!ret) {
		list_splice(data.terms, terms);
2013-12-27 23:55:14 +04:00
		zfree(&data.terms);
2012-06-15 10:31:40 +04:00
		return 0;
	}
2013-07-04 17:20:23 +04:00
	if (data.terms)
		parse_events__free_terms(data.terms);
2012-06-15 10:31:40 +04:00
	return ret;
}
perf tools: Add parse_events_error interface
Add support for returning error information from the parse_events function.
The following struct will be populated by parse_events on return:
struct parse_events_error {
int idx;
char *str;
char *help;
};
where 'idx' is the position in the string where the parsing failed,
'str' contains dynamically allocated error string describing the error
and 'help' is optional help string.
The change contains reporting function, which currently does not display
anything. The code changes to supply error data for specific event types
are coming in next patches. However this is what the expected output is:
$ sudo perf record -e 'sched:krava' ls
event syntax error: 'sched:krava'
\___ unknown tracepoint
...
$ perf record -e 'cpu/even=0x1/' ls
event syntax error: 'cpu/even=0x1/'
\___ unknown term
valid terms: pc,any,inv,edge,cmask,event,in_tx,ldlat,umask,in_tx_cp,offcore_rsp,config,config1,config2,name,period,branch_type
...
$ perf record -e cycles,cache-mises ls
event syntax error: '..es,cache-mises'
\___ parser error
...
The output function cuts the beginning of the event string so that the
error starts no later than the 10th character, and cuts the end of the
string if it crosses the terminal width.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1429729824-13932-2-git-send-email-jolsa@kernel.org
[ Renamed 'error' variables to 'err', not to clash with util.h error() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
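To make the 'idx' field concrete, here is a small sketch of how an error
marker can be lined up under the failing position of the event string. It
is an illustration only: the struct is a local stand-in with a
hypothetical name, format_error_sketch() is not a perf function, and the
cutting of over-long event strings described above is left out.

```c
#include <stdio.h>
#include <string.h>

/* Local stand-in mirroring the three fields described above. */
struct parse_events_error_sketch {
	int idx;		/* position in the event string */
	const char *str;	/* error description */
	const char *help;	/* optional help text, unused here */
};

/*
 * Write a two-line report into 'out': the quoted event string, then
 * the "\___" marker indented so that it sits under event[idx].
 * Returns the snprintf() result.
 */
static int format_error_sketch(char *out, size_t size,
			       const struct parse_events_error_sketch *err,
			       const char *event)
{
	const char *prefix = "event syntax error: ";
	/* +1 accounts for the opening quote printed before the event. */
	int indent = (int)strlen(prefix) + 1 + err->idx;

	return snprintf(out, size, "%s'%s'\n%*s\\___ %s\n",
			prefix, event, indent, "", err->str);
}
```

With idx 0 the marker starts directly under the first character of the
event string; a larger idx shifts it right by the same number of columns.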
2015-04-22 22:10:16 +03:00
int parse_events(struct perf_evlist *evlist, const char *str,
		 struct parse_events_error *err)
2012-06-15 10:31:39 +04:00
{
2013-01-18 23:56:57 +04:00
	struct parse_events_evlist data = {
2015-04-22 22:10:16 +03:00
		.list = LIST_HEAD_INIT(data.list),
		.idx = evlist->nr_entries,
		.error = err,
2012-06-15 10:31:39 +04:00
	};
	int ret;
2009-09-12 01:19:45 +04:00
2012-06-15 10:31:40 +04:00
	ret = parse_events__scanner(str, &data, PE_START_EVENTS);
2014-10-07 19:08:50 +04:00
	perf_pmu__parse_cleanup();
2012-03-15 23:09:15 +04:00
	if (!ret) {
2015-07-10 10:36:09 +03:00
		struct perf_evsel *last;
2015-09-06 10:13:17 +03:00
		if (list_empty(&data.list)) {
			WARN_ONCE(true, "WARNING: event parser found nothing");
			return -1;
		}
2015-09-08 10:58:53 +03:00
		perf_evlist__splice_list_tail(evlist, &data.list);
2013-01-22 13:09:29 +04:00
		evlist->nr_groups += data.nr_groups;
2015-07-10 10:36:09 +03:00
		last = perf_evlist__last(evlist);
		last->cmdline_group_boundary = true;
2012-03-15 23:09:15 +04:00
		return 0;
	}
2009-09-12 01:19:45 +04:00
2012-03-20 22:15:40 +04:00
	/*
	 * There are 2 users - builtin-record and builtin-test objects.
	 * Both call perf_evlist__delete in case of error, so we dont
	 * need to bother.
	 */
2009-09-12 01:19:45 +04:00
	return ret;
2009-05-26 13:10:09 +04:00
}
2015-04-22 22:10:16 +03:00
# define MAX_WIDTH 1000
static int get_term_width ( void )
{
struct winsize ws ;
get_term_dimensions ( & ws ) ;
return ws . ws_col > MAX_WIDTH ? MAX_WIDTH : ws . ws_col ;
}
static void parse_events_print_error ( struct parse_events_error * err ,
const char * event )
{
const char * str = " invalid or unsupported event: " ;
char _buf [ MAX_WIDTH ] ;
char * buf = ( char * ) event ;
int idx = 0 ;
if ( err - > str ) {
/* -2 for extra '' in the final fprintf */
int width = get_term_width ( ) - 2 ;
int len_event = strlen ( event ) ;
int len_str , max_len , cut = 0 ;
/*
 * Maximum error index indent, we will cut
 * the event string if it's bigger.
 */
2015-07-17 19:33:51 +03:00
int max_err_idx = 13 ;
2015-04-22 22:10:16 +03:00
/*
 * Let's be specific with the message when
 * we have the precise error.
 */
str = " event syntax error: " ;
len_str = strlen ( str ) ;
max_len = width - len_str ;
buf = _buf ;
/* We're cutting from the beginning. */
if ( err - > idx > max_err_idx )
cut = err - > idx - max_err_idx ;
strncpy ( buf , event + cut , max_len ) ;
/* strncpy() does not terminate buf when the source fills max_len. */
buf [ max_len ] = 0 ;
/* Mark cut parts with '..' on both sides. */
if ( cut )
buf [ 0 ] = buf [ 1 ] = ' . ' ;
if ( ( len_event - cut ) > max_len ) {
buf [ max_len - 1 ] = buf [ max_len - 2 ] = ' . ' ;
buf [ max_len ] = 0 ;
}
idx = len_str + err - > idx - cut ;
}
fprintf ( stderr , " %s'%s' \n " , str , buf ) ;
if ( idx ) {
fprintf ( stderr , " %*s \\ ___ %s \n " , idx + 1 , " " , err - > str ) ;
if ( err - > help )
fprintf ( stderr , " \n %s \n " , err - > help ) ;
free ( err - > str ) ;
free ( err - > help ) ;
}
fprintf ( stderr , " Run 'perf list' for a list of valid events \n " ) ;
}
# undef MAX_WIDTH
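The cutting logic in parse_events_print_error() can be sketched in isolation. The helper below is hypothetical (excerpt_event() and its parameters are not part of perf); it assumes the caller supplies a buffer of max_len + 1 bytes and always terminates the result, which the original only does on the truncation path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of perf: copy at most max_len characters of
 * 'event' into 'out' (which must hold max_len + 1 bytes), keeping the error
 * position 'err_idx' within 'max_indent' characters of the left edge.
 * Cut parts are marked with "..". Returns the error column inside 'out'.
 */
static int excerpt_event(char *out, size_t max_len, const char *event,
			 int err_idx, int max_indent)
{
	size_t len, cut = 0, n;

	assert(out && event && err_idx >= 0 && max_indent >= 0);
	len = strlen(event);

	/* Drop leading characters so the error is not indented too far. */
	if (err_idx > max_indent)
		cut = (size_t)(err_idx - max_indent);
	if (cut > len)
		cut = len;

	/* Copy the visible window and always terminate it. */
	n = len - cut;
	if (n > max_len)
		n = max_len;
	memcpy(out, event + cut, n);
	out[n] = '\0';

	/* Mark a cut at the start. */
	if (cut && n >= 2)
		out[0] = out[1] = '.';

	/* Mark a cut at the end. */
	if (len - cut > max_len && n >= 2)
		out[n - 1] = out[n - 2] = '.';

	return err_idx - (int)cut;
}
```

With an indent limit of 10 and an error at index 12, "cycles,cache-mises" becomes "..es,cache-mises", matching the sample output in the commit message.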
2011-07-14 13:25:32 +04:00
int parse_events_option ( const struct option * opt , const char * str ,
2012-09-11 02:15:03 +04:00
int unset __maybe_unused )
2011-07-14 13:25:32 +04:00
{
struct perf_evlist * evlist = * ( struct perf_evlist * * ) opt - > value ;
2015-04-22 22:10:16 +03:00
struct parse_events_error err = { . idx = 0 , } ;
int ret = parse_events ( evlist , str , & err ) ;
if ( ret )
parse_events_print_error ( & err , str ) ;
2012-10-27 00:30:06 +04:00
return ret ;
2011-07-14 13:25:32 +04:00
}
2015-07-10 10:36:10 +03:00
static int
foreach_evsel_in_last_glob ( struct perf_evlist * evlist ,
int ( * func ) ( struct perf_evsel * evsel ,
const void * arg ) ,
const void * arg )
2009-10-15 07:22:07 +04:00
{
2011-01-03 21:39:04 +03:00
struct perf_evsel * last = NULL ;
2015-07-10 10:36:10 +03:00
int err ;
2009-10-15 07:22:07 +04:00
2015-09-06 10:13:17 +03:00
/*
 * Don't return when the list is empty; give func a chance to report
 * an error when it finds last == NULL.
 *
 * So there is no need to WARN here, let *func do this.
 */
2011-01-12 01:56:53 +03:00
if ( evlist - > nr_entries > 0 )
2012-08-14 23:42:15 +04:00
last = perf_evlist__last ( evlist ) ;
2011-01-03 21:39:04 +03:00
2015-07-10 10:36:09 +03:00
do {
2015-07-10 10:36:10 +03:00
err = ( * func ) ( last , arg ) ;
if ( err )
2015-07-10 10:36:09 +03:00
return - 1 ;
2015-07-10 10:36:10 +03:00
if ( ! last )
return 0 ;
2015-07-10 10:36:09 +03:00
if ( last - > node . prev = = & evlist - > entries )
return 0 ;
last = list_entry ( last - > node . prev , struct perf_evsel , node ) ;
} while ( ! last - > cmdline_group_boundary ) ;
2009-10-15 07:22:07 +04:00
return 0 ;
}
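The backward walk in foreach_evsel_in_last_glob() can be illustrated on a plain array. This is a minimal sketch under stated assumptions: struct item, foreach_in_last_group() and mark() are hypothetical names, and group_boundary is assumed to be set on the last item of each command line group, so the walk stops when the previous item carries the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_evsel. */
struct item {
	bool group_boundary;	/* assumed: set on the last item of a group */
	int visited;		/* set by the callback below */
};

/*
 * Apply func to every item of the last group, walking backwards from the
 * end. With an empty array the callback is still called once with NULL so
 * that it can report the problem itself.
 */
static int foreach_in_last_group(struct item *items, size_t n,
				 int (*func)(struct item *it, void *arg),
				 void *arg)
{
	size_t i = n;

	assert(func != NULL);

	if (n == 0)
		return func(NULL, arg) ? -1 : 0;

	do {
		i--;
		if (func(&items[i], arg))
			return -1;
		/* Stop once the previous item closes an earlier group. */
	} while (i > 0 && !items[i - 1].group_boundary);

	return 0;
}

/* Example callback: rejects NULL, otherwise marks the item. */
static int mark(struct item *it, void *arg)
{
	(void)arg;
	if (!it)
		return -1;
	it->visited = 1;
	return 0;
}
```

For four items where the second and fourth carry the boundary flag, only the third and fourth are visited.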
2015-07-10 10:36:10 +03:00
static int set_filter ( struct perf_evsel * evsel , const void * arg )
{
const char * str = arg ;
if ( evsel = = NULL | | evsel - > attr . type ! = PERF_TYPE_TRACEPOINT ) {
fprintf ( stderr ,
" --filter option should follow a -e tracepoint option \n " ) ;
return - 1 ;
}
if ( perf_evsel__append_filter ( evsel , " && " , str ) < 0 ) {
fprintf ( stderr ,
" not enough memory to hold filter string \n " ) ;
return - 1 ;
}
return 0 ;
}
int parse_filter ( const struct option * opt , const char * str ,
int unset __maybe_unused )
{
struct perf_evlist * evlist = * ( struct perf_evlist * * ) opt - > value ;
return foreach_evsel_in_last_glob ( evlist , set_filter ,
( const void * ) str ) ;
}
static int add_exclude_perf_filter ( struct perf_evsel * evsel ,
const void * arg __maybe_unused )
{
char new_filter [ 64 ] ;
if ( evsel = = NULL | | evsel - > attr . type ! = PERF_TYPE_TRACEPOINT ) {
fprintf ( stderr ,
" --exclude-perf option should follow a -e tracepoint option \n " ) ;
return - 1 ;
}
snprintf ( new_filter , sizeof ( new_filter ) , " common_pid != %d " , getpid ( ) ) ;
if ( perf_evsel__append_filter ( evsel , " && " , new_filter ) < 0 ) {
fprintf ( stderr ,
" not enough memory to hold filter string \n " ) ;
return - 1 ;
}
return 0 ;
}
int exclude_perf ( const struct option * opt ,
const char * arg __maybe_unused ,
int unset __maybe_unused )
{
struct perf_evlist * evlist = * ( struct perf_evlist * * ) opt - > value ;
return foreach_evsel_in_last_glob ( evlist , add_exclude_perf_filter ,
NULL ) ;
}
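Both set_filter() and add_exclude_perf_filter() combine a new clause with whatever filter the event already has. The sketch below shows one way such an append could work; append_filter() is a hypothetical helper, and the "(old) op (new)" layout is an assumption of this example, not something this file confirms about perf_evsel__append_filter().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: combine an existing filter with a new clause.
 * The "(old) op (new)" layout is an assumption made for this sketch.
 * Returns 0 on success, -1 on failure (the old filter is kept).
 */
static int append_filter(char **filter, const char *op, const char *clause)
{
	const char *old;
	char *joined;
	int len;

	assert(filter && op && clause);
	old = *filter;

	/* First pass: measure the result without writing anything. */
	if (old)
		len = snprintf(NULL, 0, "(%s) %s (%s)", old, op, clause);
	else
		len = snprintf(NULL, 0, "%s", clause);
	if (len < 0)
		return -1;

	joined = malloc((size_t)len + 1);
	if (!joined)
		return -1;

	/* Second pass: format into the exactly sized buffer. */
	if (old)
		snprintf(joined, (size_t)len + 1, "(%s) %s (%s)", old, op, clause);
	else
		snprintf(joined, (size_t)len + 1, "%s", clause);

	free(*filter);
	*filter = joined;
	return 0;
}
```

Appending "common_pid != 42" to an existing "prev_pid == 0" filter with "&&" yields "(prev_pid == 0) && (common_pid != 42)" in this sketch.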
2009-06-06 14:24:17 +04:00
static const char * const event_type_descriptors [ ] = {
" Hardware event " ,
" Software event " ,
" Tracepoint event " ,
" Hardware cache event " ,
2009-12-29 11:37:07 +03:00
" Raw hardware event descriptor " ,
" Hardware breakpoint " ,
2009-06-06 14:24:17 +04:00
} ;
2015-02-27 13:21:25 +03:00
static int cmp_string ( const void * a , const void * b )
{
const char * const * as = a ;
const char * const * bs = b ;
return strcmp ( * as , * bs ) ;
}
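cmp_string() takes pointers to array elements, which is why it dereferences twice before calling strcmp(). A small standalone usage sketch follows; sort_names() is a hypothetical wrapper added only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() hands the comparator pointers to the elements, i.e. char ** here. */
static int cmp_string(const void *a, const void *b)
{
	const char *const *as = a;
	const char *const *bs = b;

	return strcmp(*as, *bs);
}

/* Hypothetical wrapper: sort an array of C strings in place. */
static void sort_names(const char **names, size_t n)
{
	assert(names != NULL || n == 0);
	if (n > 1)
		qsort(names, n, sizeof(*names), cmp_string);
}
```

Sorting { "sched:sched_switch", "irq:softirq_entry", "block:block_rq_issue" } puts the block event first and the sched event last.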
2009-07-21 20:20:22 +04:00
/*
 * Print the events from <debugfs_mount_point>/tracing/events
 */
2012-08-09 18:31:52 +04:00
void print_tracepoint_events ( const char * subsys_glob , const char * event_glob ,
bool name_only )
2009-07-21 20:20:22 +04:00
{
DIR * sys_dir , * evt_dir ;
struct dirent * sys_next , * evt_next , sys_dirent , evt_dirent ;
char evt_path [ MAXPATHLEN ] ;
2009-09-24 17:39:09 +04:00
char dir_path [ MAXPATHLEN ] ;
2015-02-27 13:21:25 +03:00
char * * evt_list = NULL ;
unsigned int evt_i = 0 , evt_num = 0 ;
bool evt_num_known = false ;
2009-07-21 20:20:22 +04:00
2015-02-27 13:21:25 +03:00
restart :
2011-11-16 20:03:07 +04:00
sys_dir = opendir ( tracing_events_path ) ;
2009-07-21 20:20:22 +04:00
if ( ! sys_dir )
2009-09-24 17:39:09 +04:00
return ;
2009-09-04 23:39:51 +04:00
2015-02-27 13:21:25 +03:00
if ( evt_num_known ) {
evt_list = zalloc ( sizeof ( char * ) * evt_num ) ;
if ( ! evt_list )
goto out_close_sys_dir ;
}
2009-09-04 23:39:51 +04:00
for_each_subsystem ( sys_dir , sys_dirent , sys_next ) {
2014-12-17 23:24:45 +03:00
if ( subsys_glob ! = NULL & &
2011-02-17 20:38:58 +03:00
! strglobmatch ( sys_dirent . d_name , subsys_glob ) )
continue ;
2009-09-24 17:39:09 +04:00
2011-11-16 20:03:07 +04:00
snprintf ( dir_path , MAXPATHLEN , " %s/%s " , tracing_events_path ,
2009-09-24 17:39:09 +04:00
sys_dirent . d_name ) ;
evt_dir = opendir ( dir_path ) ;
if ( ! evt_dir )
2009-09-04 23:39:51 +04:00
continue ;
2009-09-24 17:39:09 +04:00
2009-09-04 23:39:51 +04:00
for_each_event ( sys_dirent , evt_dir , evt_dirent , evt_next ) {
2014-12-17 23:24:45 +03:00
if ( event_glob ! = NULL & &
2011-02-17 20:38:58 +03:00
! strglobmatch ( evt_dirent . d_name , event_glob ) )
continue ;
2015-02-27 13:21:25 +03:00
if ( ! evt_num_known ) {
evt_num + + ;
2012-08-09 18:31:52 +04:00
continue ;
}
2009-07-21 20:20:22 +04:00
snprintf ( evt_path , MAXPATHLEN , " %s:%s " ,
sys_dirent . d_name , evt_dirent . d_name ) ;
2015-02-27 13:21:25 +03:00
evt_list [ evt_i ] = strdup ( evt_path ) ;
if ( evt_list [ evt_i ] = = NULL )
goto out_close_evt_dir ;
evt_i + + ;
2009-07-21 20:20:22 +04:00
}
closedir ( evt_dir ) ;
}
closedir ( sys_dir ) ;
2015-02-27 13:21:25 +03:00
if ( ! evt_num_known ) {
evt_num_known = true ;
goto restart ;
}
qsort ( evt_list , evt_num , sizeof ( char * ) , cmp_string ) ;
evt_i = 0 ;
while ( evt_i < evt_num ) {
if ( name_only ) {
printf ( " %s " , evt_list [ evt_i + + ] ) ;
continue ;
}
printf ( " %-50s [%s] \n " , evt_list [ evt_i + + ] ,
event_type_descriptors [ PERF_TYPE_TRACEPOINT ] ) ;
}
2015-09-30 23:13:26 +03:00
if ( evt_num & & pager_in_use ( ) )
2015-02-27 13:21:25 +03:00
printf ( " \n " ) ;
out_free :
evt_num = evt_i ;
for ( evt_i = 0 ; evt_i < evt_num ; evt_i + + )
zfree ( & evt_list [ evt_i ] ) ;
zfree ( & evt_list ) ;
return ;
out_close_evt_dir :
closedir ( evt_dir ) ;
out_close_sys_dir :
closedir ( sys_dir ) ;
printf ( " FATAL: not enough memory to print %s \n " ,
event_type_descriptors [ PERF_TYPE_TRACEPOINT ] ) ;
if ( evt_list )
goto out_free ;
2009-07-21 20:20:22 +04:00
}
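print_tracepoint_events() runs its directory scan twice: the first pass only counts matches, the second pass fills an array of exactly that size. The same count-then-fill pattern is sketched below over an in-memory list; collect() and matches() are hypothetical, and a simple prefix test stands in for strglobmatch().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical predicate: keep names that start with 'prefix'. */
static int matches(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/*
 * Collect copies of the matching names. Pass 1 counts, pass 2 fills.
 * Returns a malloc'ed array of *out_n strings, or NULL when nothing
 * matched or memory ran out.
 */
static char **collect(const char *const *src, size_t n_src,
		      const char *prefix, size_t *out_n)
{
	char **list = NULL;
	size_t num = 0, filled = 0, i, len;
	int counted = 0;

	assert(src && prefix && out_n);
	*out_n = 0;

restart:
	if (counted) {
		if (num == 0)
			return NULL;
		list = calloc(num, sizeof(*list));
		if (!list)
			return NULL;
	}

	for (i = 0; i < n_src; i++) {
		if (!matches(src[i], prefix))
			continue;
		if (!counted) {
			num++;		/* pass 1: only count */
			continue;
		}
		len = strlen(src[i]) + 1;
		list[filled] = malloc(len);
		if (!list[filled])
			goto fail;
		memcpy(list[filled], src[i], len);
		filled++;
	}

	if (!counted) {
		counted = 1;
		goto restart;
	}

	*out_n = filled;
	return list;

fail:
	/* Free only the entries that were actually filled. */
	while (filled > 0)
		free(list[--filled]);
	free(list);
	return NULL;
}
```

The trade-off is a second scan in exchange for a single exact allocation, with no realloc() growth logic.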
2011-01-03 19:50:45 +03:00
/*
 * Check whether event is in <debugfs_mount_point>/tracing/events
 */
int is_valid_tracepoint ( const char * event_string )
{
DIR * sys_dir , * evt_dir ;
struct dirent * sys_next , * evt_next , sys_dirent , evt_dirent ;
char evt_path [ MAXPATHLEN ] ;
char dir_path [ MAXPATHLEN ] ;
2011-11-16 20:03:07 +04:00
sys_dir = opendir ( tracing_events_path ) ;
2011-01-03 19:50:45 +03:00
if ( ! sys_dir )
return 0 ;
for_each_subsystem ( sys_dir , sys_dirent , sys_next ) {
2011-11-16 20:03:07 +04:00
snprintf ( dir_path , MAXPATHLEN , " %s/%s " , tracing_events_path ,
2011-01-03 19:50:45 +03:00
sys_dirent . d_name ) ;
evt_dir = opendir ( dir_path ) ;
if ( ! evt_dir )
continue ;
for_each_event ( sys_dirent , evt_dir , evt_dirent , evt_next ) {
snprintf ( evt_path , MAXPATHLEN , " %s:%s " ,
sys_dirent . d_name , evt_dirent . d_name ) ;
if ( ! strcmp ( evt_path , event_string ) ) {
closedir ( evt_dir ) ;
closedir ( sys_dir ) ;
return 1 ;
}
}
closedir ( evt_dir ) ;
}
closedir ( sys_dir ) ;
return 0 ;
}
2013-08-27 06:41:53 +04:00
static bool is_event_supported ( u8 type , unsigned config )
{
bool ret = true ;
2013-12-31 00:39:45 +04:00
int open_return ;
2013-08-27 06:41:53 +04:00
struct perf_evsel * evsel ;
struct perf_event_attr attr = {
. type = type ,
. config = config ,
. disabled = 1 ,
} ;
struct {
struct thread_map map ;
int threads [ 1 ] ;
} tmap = {
. map . nr = 1 ,
. threads = { 0 } ,
} ;
2013-11-07 23:41:19 +04:00
evsel = perf_evsel__new ( & attr ) ;
2013-08-27 06:41:53 +04:00
if ( evsel ) {
2013-12-31 00:39:45 +04:00
open_return = perf_evsel__open ( evsel , NULL , & tmap . map ) ;
ret = open_return > = 0 ;
if ( open_return = = - EACCES ) {
/*
 * This happens if /proc/sys/kernel/perf_event_paranoid
 * is set to 2. Re-run with exclude_kernel set; we don't do that
 * by default as some ARM machines do not support it.
 */
evsel - > attr . exclude_kernel = 1 ;
ret = perf_evsel__open ( evsel , NULL , & tmap . map ) > = 0 ;
}
2013-08-27 06:41:53 +04:00
perf_evsel__delete ( evsel ) ;
}
return ret ;
}
2012-08-09 18:31:52 +04:00
int print_hwcache_events ( const char * event_glob , bool name_only )
2011-02-17 20:38:58 +03:00
{
2015-02-27 13:21:25 +03:00
unsigned int type , op , i , evt_i = 0 , evt_num = 0 ;
2012-06-11 21:08:07 +04:00
char name [ 64 ] ;
2015-02-27 13:21:25 +03:00
char * * evt_list = NULL ;
bool evt_num_known = false ;
restart :
if ( evt_num_known ) {
evt_list = zalloc ( sizeof ( char * ) * evt_num ) ;
if ( ! evt_list )
goto out_enomem ;
}
2011-02-17 20:38:58 +03:00
for ( type = 0 ; type < PERF_COUNT_HW_CACHE_MAX ; type + + ) {
for ( op = 0 ; op < PERF_COUNT_HW_CACHE_OP_MAX ; op + + ) {
/* skip invalid cache type */
2012-06-11 21:08:07 +04:00
if ( ! perf_evsel__is_cache_op_valid ( type , op ) )
2011-02-17 20:38:58 +03:00
continue ;
for ( i = 0 ; i < PERF_COUNT_HW_CACHE_RESULT_MAX ; i + + ) {
2012-06-11 21:08:07 +04:00
__perf_evsel__hw_cache_type_op_res_name ( type , op , i ,
name , sizeof ( name ) ) ;
2011-04-30 00:52:42 +04:00
if ( event_glob ! = NULL & & ! strglobmatch ( name , event_glob ) )
2011-02-17 20:38:58 +03:00
continue ;
2013-08-27 06:41:53 +04:00
if ( ! is_event_supported ( PERF_TYPE_HW_CACHE ,
type | ( op < < 8 ) | ( i < < 16 ) ) )
continue ;
2015-02-27 13:21:25 +03:00
if ( ! evt_num_known ) {
evt_num + + ;
continue ;
}
evt_list [ evt_i ] = strdup ( name ) ;
if ( evt_list [ evt_i ] = = NULL )
goto out_enomem ;
evt_i + + ;
2011-02-17 20:38:58 +03:00
}
}
}
2015-02-27 13:21:25 +03:00
if ( ! evt_num_known ) {
evt_num_known = true ;
goto restart ;
}
qsort ( evt_list , evt_num , sizeof ( char * ) , cmp_string ) ;
evt_i = 0 ;
while ( evt_i < evt_num ) {
if ( name_only ) {
printf ( " %s " , evt_list [ evt_i + + ] ) ;
continue ;
}
printf ( " %-50s [%s] \n " , evt_list [ evt_i + + ] ,
event_type_descriptors [ PERF_TYPE_HW_CACHE ] ) ;
}
2015-09-30 23:13:26 +03:00
if ( evt_num & & pager_in_use ( ) )
2013-04-20 22:02:29 +04:00
printf ( " \n " ) ;
2015-02-27 13:21:25 +03:00
out_free :
evt_num = evt_i ;
for ( evt_i = 0 ; evt_i < evt_num ; evt_i + + )
zfree ( & evt_list [ evt_i ] ) ;
zfree ( & evt_list ) ;
return evt_num ;
out_enomem :
printf ( " FATAL: not enough memory to print %s \n " , event_type_descriptors [ PERF_TYPE_HW_CACHE ] ) ;
if ( evt_list )
goto out_free ;
return evt_num ;
2011-02-17 20:38:58 +03:00
}
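The expression type | (op << 8) | (i << 16) passed to is_event_supported() packs three small fields into one config value. A minimal sketch of that packing and its inverse follows; the enum names and helper functions are local to this example and only mirror a few of the PERF_COUNT_HW_CACHE_* constants.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirrors of a few PERF_COUNT_HW_CACHE_* values, for illustration. */
enum cache_id     { CACHE_L1D = 0, CACHE_L1I = 1, CACHE_LL = 2 };
enum cache_op     { CACHE_OP_READ = 0, CACHE_OP_WRITE = 1, CACHE_OP_PREFETCH = 2 };
enum cache_result { CACHE_RESULT_ACCESS = 0, CACHE_RESULT_MISS = 1 };

/* Pack cache id, operation and result into bits 0-7, 8-15 and 16-23. */
static uint64_t hw_cache_config(unsigned id, unsigned op, unsigned result)
{
	assert(id < 256 && op < 256 && result < 256);
	return (uint64_t)id | ((uint64_t)op << 8) | ((uint64_t)result << 16);
}

/* Unpack the individual fields again. */
static unsigned hw_cache_id(uint64_t config)     { return config & 0xff; }
static unsigned hw_cache_op(uint64_t config)     { return (config >> 8) & 0xff; }
static unsigned hw_cache_result(uint64_t config) { return (config >> 16) & 0xff; }
```

A last-level cache read miss, for example, encodes as 0x10002.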
2015-02-27 13:21:27 +03:00
void print_symbol_events ( const char * event_glob , unsigned type ,
2012-08-09 18:31:52 +04:00
struct event_symbol * syms , unsigned max ,
bool name_only )
2009-05-26 13:10:09 +04:00
{
2015-02-27 13:21:25 +03:00
unsigned int i , evt_i = 0 , evt_num = 0 ;
2011-04-30 00:52:42 +04:00
char name [ MAX_NAME_LEN ] ;
2015-02-27 13:21:25 +03:00
char * * evt_list = NULL ;
bool evt_num_known = false ;
restart :
if ( evt_num_known ) {
evt_list = zalloc ( sizeof ( char * ) * evt_num ) ;
if ( ! evt_list )
goto out_enomem ;
syms - = max ;
}
2009-05-26 13:10:09 +04:00
2012-07-04 02:00:44 +04:00
for ( i = 0 ; i < max ; i + + , syms + + ) {
2011-02-17 20:38:58 +03:00
2014-12-17 23:24:45 +03:00
if ( event_glob ! = NULL & &
2011-02-17 20:38:58 +03:00
! ( strglobmatch ( syms - > symbol , event_glob ) | |
( syms - > alias & & strglobmatch ( syms - > alias , event_glob ) ) ) )
continue ;
2009-05-26 13:10:09 +04:00
2013-08-27 06:41:53 +04:00
if ( ! is_event_supported ( type , i ) )
continue ;
2015-02-27 13:21:25 +03:00
if ( ! evt_num_known ) {
evt_num + + ;
2012-08-09 18:31:52 +04:00
continue ;
}
2015-02-27 13:21:25 +03:00
if ( ! name_only & & strlen ( syms - > alias ) )
2011-04-30 00:52:42 +04:00
snprintf ( name , MAX_NAME_LEN , " %s OR %s " , syms - > symbol , syms - > alias ) ;
2009-06-22 15:14:28 +04:00
else
2011-04-30 00:52:42 +04:00
strncpy ( name , syms - > symbol , MAX_NAME_LEN ) ;
2009-05-26 13:10:09 +04:00
2015-02-27 13:21:25 +03:00
evt_list [ evt_i ] = strdup ( name ) ;
if ( evt_list [ evt_i ] = = NULL )
goto out_enomem ;
evt_i + + ;
2009-05-26 13:10:09 +04:00
}
2015-02-27 13:21:25 +03:00
if ( ! evt_num_known ) {
evt_num_known = true ;
goto restart ;
}
qsort ( evt_list , evt_num , sizeof ( char * ) , cmp_string ) ;
evt_i = 0 ;
while ( evt_i < evt_num ) {
if ( name_only ) {
printf ( " %s " , evt_list [ evt_i + + ] ) ;
continue ;
}
printf ( " %-50s [%s] \n " , evt_list [ evt_i + + ] , event_type_descriptors [ type ] ) ;
}
2015-09-30 23:13:26 +03:00
if ( evt_num & & pager_in_use ( ) )
2011-02-17 20:38:58 +03:00
printf ( " \n " ) ;
2015-02-27 13:21:25 +03:00
out_free :
evt_num = evt_i ;
for ( evt_i = 0 ; evt_i < evt_num ; evt_i + + )
zfree ( & evt_list [ evt_i ] ) ;
zfree ( & evt_list ) ;
return ;
out_enomem :
printf ( " FATAL: not enough memory to print %s \n " , event_type_descriptors [ type ] ) ;
if ( evt_list )
goto out_free ;
2012-07-04 02:00:44 +04:00
}
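The name shown for each symbol event is either "symbol OR alias" or just the symbol. The sketch below isolates that choice; format_event_name() is a hypothetical helper, and unlike the loop above it treats a NULL alias the same as an empty one and relies on snprintf() so that the result is always terminated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the display name for an event symbol.
 * A NULL or empty alias means "no alias".
 */
static void format_event_name(char *buf, size_t size, const char *symbol,
			      const char *alias, bool name_only)
{
	assert(buf && size > 0 && symbol);

	if (!name_only && alias && strlen(alias) > 0)
		snprintf(buf, size, "%s OR %s", symbol, alias);
	else
		snprintf(buf, size, "%s", symbol);
}
```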
/*
 * Print the help text for the event symbols:
 */
2012-08-09 18:31:52 +04:00
void print_events ( const char * event_glob , bool name_only )
2012-07-04 02:00:44 +04:00
{
print_symbol_events ( event_glob , PERF_TYPE_HARDWARE ,
2012-08-09 18:31:52 +04:00
event_symbols_hw , PERF_COUNT_HW_MAX , name_only ) ;
2012-07-04 02:00:44 +04:00
print_symbol_events ( event_glob , PERF_TYPE_SOFTWARE ,
2012-08-09 18:31:52 +04:00
event_symbols_sw , PERF_COUNT_SW_MAX , name_only ) ;
2012-07-04 02:00:44 +04:00
2012-08-09 18:31:52 +04:00
print_hwcache_events ( event_glob , name_only ) ;
2011-02-17 20:38:58 +03:00
2013-04-20 22:02:29 +04:00
print_pmu_events ( event_glob , name_only ) ;
2011-02-17 20:38:58 +03:00
if ( event_glob ! = NULL )
return ;
2009-07-01 17:06:18 +04:00
2012-08-09 18:31:52 +04:00
if ( ! name_only ) {
printf ( " %-50s [%s] \n " ,
" rNNN " ,
event_type_descriptors [ PERF_TYPE_RAW ] ) ;
printf ( " %-50s [%s] \n " ,
" cpu/t1=v1[,t2=v2,t3 ...]/modifier " ,
event_type_descriptors [ PERF_TYPE_RAW ] ) ;
2015-09-30 23:13:26 +03:00
if ( pager_in_use ( ) )
printf ( " (see 'man perf-list' on how to encode it) \n \n " ) ;
2012-08-09 18:31:52 +04:00
printf ( " %-50s [%s] \n " ,
2014-05-29 19:26:51 +04:00
" mem:<addr>[/len][:access] " ,
2009-12-29 11:37:07 +03:00
event_type_descriptors [ PERF_TYPE_BREAKPOINT ] ) ;
2015-09-30 23:13:26 +03:00
if ( pager_in_use ( ) )
printf ( " \n " ) ;
2012-08-09 18:31:52 +04:00
}
2009-11-23 17:42:35 +03:00
2012-08-09 18:31:52 +04:00
print_tracepoint_events ( NULL , NULL , name_only ) ;
2009-05-26 13:10:09 +04:00
}
2012-03-15 23:09:16 +04:00
2013-01-18 23:29:49 +04:00
int parse_events__is_hardcoded_term ( struct parse_events_term * term )
2012-03-15 23:09:16 +04:00
{
2012-04-25 20:24:57 +04:00
return term - > type_term ! = PARSE_EVENTS__TERM_TYPE_USER ;
2012-03-15 23:09:16 +04:00
}
2013-01-18 23:29:49 +04:00
static int new_term ( struct parse_events_term * * _term , int type_val ,
2012-04-25 20:24:57 +04:00
int type_term , char * config ,
2015-04-22 22:10:20 +03:00
char * str , u64 num , int err_term , int err_val )
2012-03-15 23:09:16 +04:00
{
2013-01-18 23:29:49 +04:00
struct parse_events_term * term ;
2012-03-15 23:09:16 +04:00
term = zalloc ( sizeof ( * term ) ) ;
if ( ! term )
return - ENOMEM ;
INIT_LIST_HEAD ( & term - > list ) ;
2012-04-25 20:24:57 +04:00
term - > type_val = type_val ;
term - > type_term = type_term ;
2012-03-15 23:09:16 +04:00
term - > config = config ;
2015-04-22 22:10:20 +03:00
term - > err_term = err_term ;
term - > err_val = err_val ;
2012-03-15 23:09:16 +04:00
2012-04-25 20:24:57 +04:00
switch ( type_val ) {
2012-03-15 23:09:16 +04:00
case PARSE_EVENTS__TERM_TYPE_NUM :
term - > val . num = num ;
break ;
case PARSE_EVENTS__TERM_TYPE_STR :
term - > val . str = str ;
break ;
default :
2013-07-04 17:20:24 +04:00
free ( term ) ;
2012-03-15 23:09:16 +04:00
return - EINVAL ;
}
* _term = term ;
return 0 ;
}
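new_term() builds a tagged value: type_val selects which member of the value union is meaningful, and an unknown tag frees the allocation and returns -EINVAL. A reduced sketch of the same construction follows; struct term, enum term_val_type and term_new() are hypothetical simplifications that omit the list linkage and the error columns.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

enum term_val_type { TERM_NUM, TERM_STR };

/* Simplified, hypothetical counterpart of struct parse_events_term. */
struct term {
	enum term_val_type type_val;	/* selects the valid union member */
	const char *config;
	union {
		uint64_t num;
		const char *str;
	} val;
};

/* Allocate a term; on failure *out is left untouched. */
static int term_new(struct term **out, int type_val, const char *config,
		    const char *str, uint64_t num)
{
	struct term *t;

	assert(out != NULL);

	t = calloc(1, sizeof(*t));
	if (!t)
		return -ENOMEM;

	t->config = config;

	switch (type_val) {
	case TERM_NUM:
		t->val.num = num;
		break;
	case TERM_STR:
		t->val.str = str;
		break;
	default:
		/* Unknown tag: do not leak the allocation. */
		free(t);
		return -EINVAL;
	}

	t->type_val = (enum term_val_type)type_val;
	*out = t;
	return 0;
}
```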
2013-01-18 23:29:49 +04:00
int parse_events_term__num ( struct parse_events_term * * term ,
2015-04-22 22:10:20 +03:00
int type_term , char * config , u64 num ,
2015-05-19 16:05:42 +03:00
void * loc_term_ , void * loc_val_ )
2012-04-25 20:24:57 +04:00
{
2015-05-19 16:05:42 +03:00
YYLTYPE * loc_term = loc_term_ ;
YYLTYPE * loc_val = loc_val_ ;
2012-04-25 20:24:57 +04:00
return new_term ( term , PARSE_EVENTS__TERM_TYPE_NUM , type_term ,
2015-04-22 22:10:20 +03:00
config , NULL , num ,
loc_term ? loc_term - > first_column : 0 ,
loc_val ? loc_val - > first_column : 0 ) ;
2012-04-25 20:24:57 +04:00
}
2013-01-18 23:29:49 +04:00
int parse_events_term__str ( struct parse_events_term * * term ,
2015-04-22 22:10:20 +03:00
int type_term , char * config , char * str ,
2015-05-19 16:05:42 +03:00
void * loc_term_ , void * loc_val_ )
2012-04-25 20:24:57 +04:00
{
2015-05-19 16:05:42 +03:00
YYLTYPE * loc_term = loc_term_ ;
YYLTYPE * loc_val = loc_val_ ;
2012-04-25 20:24:57 +04:00
return new_term ( term , PARSE_EVENTS__TERM_TYPE_STR , type_term ,
2015-04-22 22:10:20 +03:00
config , str , 0 ,
loc_term ? loc_term - > first_column : 0 ,
loc_val ? loc_val - > first_column : 0 ) ;
2012-04-25 20:24:57 +04:00
}
2013-01-18 23:29:49 +04:00
int parse_events_term__sym_hw ( struct parse_events_term * * term ,
2012-10-10 16:53:17 +04:00
char * config , unsigned idx )
{
struct event_symbol * sym ;
BUG_ON ( idx > = PERF_COUNT_HW_MAX ) ;
sym = & event_symbols_hw [ idx ] ;
if ( config )
return new_term ( term , PARSE_EVENTS__TERM_TYPE_STR ,
PARSE_EVENTS__TERM_TYPE_USER , config ,
2015-04-22 22:10:20 +03:00
( char * ) sym - > symbol , 0 , 0 , 0 ) ;
2012-10-10 16:53:17 +04:00
else
return new_term ( term , PARSE_EVENTS__TERM_TYPE_STR ,
PARSE_EVENTS__TERM_TYPE_USER ,
2015-04-22 22:10:20 +03:00
( char * ) " event " , ( char * ) sym - > symbol ,
0 , 0 , 0 ) ;
2012-10-10 16:53:17 +04:00
}
2013-01-18 23:29:49 +04:00
int parse_events_term__clone ( struct parse_events_term * * new ,
struct parse_events_term * term )
2012-06-15 10:31:41 +04:00
{
return new_term ( new , term - > type_val , term - > type_term , term - > config ,
2015-04-22 22:10:20 +03:00
term - > val . str , term - > val . num ,
term - > err_term , term - > err_val ) ;
2012-06-15 10:31:41 +04:00
}
2012-03-15 23:09:16 +04:00
void parse_events__free_terms ( struct list_head * terms )
{
2013-01-18 23:29:49 +04:00
struct parse_events_term * term , * h ;
2012-03-15 23:09:16 +04:00
list_for_each_entry_safe ( term , h , terms , list )
free ( term ) ;
}
2015-04-22 22:10:16 +03:00
void parse_events_evlist_error ( struct parse_events_evlist * data ,
int idx , const char * str )
{
struct parse_events_error * err = data - > error ;
2015-05-19 16:05:44 +03:00
if ( ! err )
return ;
2015-04-22 22:10:16 +03:00
err - > idx = idx ;
err - > str = strdup ( str ) ;
WARN_ONCE ( ! err - > str , " WARNING: failed to allocate error string " ) ;
}
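The error-reporting contract is small: the parser stores a position and a heap-allocated message, and whoever prints the error frees both strings. The sketch below models that ownership; struct parse_error mirrors the fields of struct parse_events_error, while dup_string(), record_error() and release_error() are hypothetical helpers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same fields as struct parse_events_error. */
struct parse_error {
	int idx;	/* position in the event string where parsing failed */
	char *str;	/* heap-allocated description, owned by this struct */
	char *help;	/* optional heap-allocated hint, may be NULL */
};

/* Hypothetical helper: duplicate a string with malloc + memcpy. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Record an error; a NULL err means the caller does not want details. */
static void record_error(struct parse_error *err, int idx, const char *msg)
{
	assert(msg != NULL);

	if (!err)
		return;

	free(err->str);		/* drop any earlier message */
	err->idx = idx;
	err->str = dup_string(msg);
}

/* Free both strings once the error has been reported. */
static void release_error(struct parse_error *err)
{
	free(err->str);
	free(err->help);
	err->str = err->help = NULL;
}
```

The struct must start out zero-initialized, as parse_events_option() does with `{ .idx = 0, }`, so that the free() calls are safe.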
perf tools: Show proper error message for wrong terms of hw/sw events
Show a proper error message and the valid terms when wrong config terms
are specified for hw/sw type perf events.
This patch makes the original error formatting function formats_error_string()
more generic: it outputs only the static config terms for hw/sw perf
events, and prepends the pmu formats for pmu events.
Before this patch:
$ perf record -e 'cpu-clock/freqx=200/' -a sleep 1
invalid or unsupported event: 'cpu-clock/freqx=200/'
Run 'perf list' for a list of valid events
usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
After this patch:
$ perf record -e 'cpu-clock/freqx=200/' -a sleep 1
event syntax error: 'cpu-clock/freqx=200/'
\___ unknown term
valid terms: config,config1,config2,name,period,freq,branch_type,time,call-graph,stack-size
Run 'perf list' for a list of valid events
usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1443412336-120050-2-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28 06:52:14 +03:00
/*
 * Return a string containing the valid config terms of an event.
 * @additional_terms: For terms such as PMU sysfs terms.
 */
char * parse_events_formats_error_string ( char * additional_terms )
{
char * str ;
static const char * static_terms = " config,config1,config2,name, "
" period,freq,branch_type,time, "
" call-graph,stack-size \n " ;
/* valid terms */
if ( additional_terms ) {
/* asprintf() returns -1 on failure, never 0. */
if ( asprintf ( & str , " valid terms: %s,%s " ,
additional_terms , static_terms ) < 0 )
goto fail ;
} else {
if ( asprintf ( & str , " valid terms: %s " , static_terms ) < 0 )
goto fail ;
}
return str ;
fail :
return NULL ;
}
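For readers without asprintf(), which is a GNU/BSD extension, the same "valid terms" string can be built with standard C. The sketch below is a portable stand-in, not the perf implementation: valid_terms_string() is a hypothetical name, and it sizes the buffer with strlen() and formats with snprintf().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical portable variant: returns a malloc'ed "valid terms: ..."
 * string, optionally prefixed with additional (e.g. PMU) terms, or NULL
 * on failure. The caller frees the result.
 */
static char *valid_terms_string(const char *additional_terms)
{
	static const char static_terms[] =
		"config,config1,config2,name,"
		"period,freq,branch_type,time,"
		"call-graph,stack-size\n";
	static const char prefix[] = "valid terms: ";
	size_t len = strlen(prefix) + strlen(static_terms) + 1;
	char *str;
	int n;

	if (additional_terms)
		len += strlen(additional_terms) + 1;	/* +1 for the comma */

	str = malloc(len);
	if (!str)
		return NULL;

	if (additional_terms)
		n = snprintf(str, len, "%s%s,%s", prefix,
			     additional_terms, static_terms);
	else
		n = snprintf(str, len, "%s%s", prefix, static_terms);

	/* snprintf() signals an output error with a negative value. */
	if (n < 0) {
		free(str);
		return NULL;
	}
	assert((size_t)n == len - 1);	/* the length was computed exactly */
	return str;
}
```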