linux

iv/linux

Author	SHA1	Message	Date
Arnaldo Carvalho de Melo	8ab2e96d8f	perf evsel: Rename perf_evsel__name() to evsel__name() As they are 'struct evsel' methods or related routines, not part of tools/lib/perf/, aka libperf, to whom the perf_ prefix belongs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2020-05-05 16:35:30 -03:00
Arnaldo Carvalho de Melo	6ec17b4e25	perf evsel: Rename perf_evsel__config() to evsel__config() As they are all 'struct evsel' methods, not part of tools/lib/perf/, aka libperf, to whom the perf_ prefix belongs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2020-05-05 16:35:30 -03:00
Arnaldo Carvalho de Melo	d7a07b2932	perf trace: Resolve prctl's 'option' arg strings to numbers # perf trace -e syscalls:sys_enter_prctl --filter="option==SET_NAME" 0.000 Socket Thread/3860 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7fc50b9733e8) 0.053 SSL Cert #78/3860 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7fc50b9733e8) ^C # If one uses '-v' with 'perf trace', we can see the filter it puts in place: New filter for syscalls:sys_enter_prctl: (option==0xf) && (common_pid != 3859 && common_pid != 2757) We still need to allow using plain '-e prctl' and have this turn into creating a 'syscalls:sys_enter_prctl' event so that the filter can be applied only to it as right now '-e prctl' ends up using the 'raw_syscalls:sys_enter/sys_exit'. The end goal is to have something like: # perf trace -e prctl/option==SET_NAME/ And have that use tracepoint filters or eBPF ones. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2020-02-11 16:41:50 -03:00
Ian Rogers	a910e4666d	perf parse: Report initial event parsing error Record the first event parsing error and report. Implementing feedback from Jiri Olsa: https://lkml.org/lkml/2019/10/28/680 An example error is: $ tools/perf/perf stat -e c/c/ WARNING: multiple event parsing errors event syntax error: 'c/c/' \___ unknown term valid terms: event,filter_rem,filter_opc0,edge,filter_isoc,filter_tid,filter_loc,filter_nc,inv,umask,filter_opc1,tid_en,thresh,filter_all_op,filter_not_nm,filter_state,filter_nm,config,config1,config2,name,period,percore Initial error: event syntax error: 'c/c/' \___ Cannot find PMU `c'. Missing kernel support? Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Allison Randal <allison@lohutok.net> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: http://lore.kernel.org/lkml/20191116074652.9960-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-11-18 19:14:29 -03:00
Arnaldo Carvalho de Melo	27198a893b	perf trace: Use STUL_STRARRAY_FLAGS with mmap The 'mmap' syscall has special needs so it doesn't use SCA_STRARRAY_FLAGS, see its implementation in syscall_arg__scnprintf_mmap_flags(), related to special handling of MAP_ANONYMOUS, so set ->parm to the strarray__mmap_flags and hook up with strarray__strtoul_flags manually, now we can filter by those or-ed string expressions: # perf trace -e syscalls:sys_enter_mmap sleep 1 0.000 syscalls:sys_enter_mmap(addr: NULL, len: 134346, prot: READ, flags: PRIVATE, fd: 3, off: 0) 0.026 syscalls:sys_enter_mmap(addr: NULL, len: 8192, prot: READ\|WRITE, flags: PRIVATE\|ANONYMOUS) 0.036 syscalls:sys_enter_mmap(addr: NULL, len: 1857472, prot: READ, flags: PRIVATE\|DENYWRITE, fd: 3, off: 0) 0.046 syscalls:sys_enter_mmap(addr: 0x7fae003d9000, len: 1363968, prot: READ\|EXEC, flags: PRIVATE\|FIXED\|DENYWRITE, fd: 3, off: 0x22000) 0.052 syscalls:sys_enter_mmap(addr: 0x7fae00526000, len: 311296, prot: READ, flags: PRIVATE\|FIXED\|DENYWRITE, fd: 3, off: 0x16f000) 0.055 syscalls:sys_enter_mmap(addr: 0x7fae00573000, len: 24576, prot: READ\|WRITE, flags: PRIVATE\|FIXED\|DENYWRITE, fd: 3, off: 0x1bb000) 0.062 syscalls:sys_enter_mmap(addr: 0x7fae00579000, len: 14272, prot: READ\|WRITE, flags: PRIVATE\|FIXED\|ANONYMOUS) 0.253 syscalls:sys_enter_mmap(addr: NULL, len: 217750512, prot: READ, flags: PRIVATE, fd: 3, off: 0) # # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE\|FIXED\|DENYWRITE" sleep 1 0.000 syscalls:sys_enter_mmap(addr: 0x7f6ab3dcb000, len: 1363968, prot: READ\|EXEC, flags: PRIVATE\|FIXED\|DENYWRITE, fd: 3, off: 0x22000) 0.010 syscalls:sys_enter_mmap(addr: 0x7f6ab3f18000, len: 311296, prot: READ, flags: PRIVATE\|FIXED\|DENYWRITE, fd: 3, off: 0x16f000) 0.014 syscalls:sys_enter_mmap(addr: 0x7f6ab3f65000, len: 24576, prot: READ\|WRITE, flags: PRIVATE\|FIXED\|DENYWRITE, fd: 3, off: 0x1bb000) # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE\|ANONYMOUS" sleep 1 0.000 syscalls:sys_enter_mmap(addr: NULL, len: 8192, prot: READ\|WRITE, flags: PRIVATE\|ANONYMOUS) # # perf trace -v -e syscalls:sys_enter_mmap --filter="flags==PRIVATE\|ANONYMOUS" sleep 1 \|& grep "New filter" New filter for syscalls:sys_enter_mmap: flags==0x22 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-czw754b7m9rp9ibq2f6be2o1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-19 15:35:02 -03:00
Arnaldo Carvalho de Melo	e0712baa00	perf trace: Wire up strarray__strtoul_flags() Now anything that uses STRARRAY_FLAGS, like the 'fsmount' syscall will support mapping or-ed strings back to a value that can be used in a filter. In some cases, where STRARRAY_FLAGS isn't used but instead the scnprintf is a special one because of specific needs, like for mmap, then one has to set the ->pars to the strarray. See the next cset. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-r2lpqo7dfsrhi4ll0npsb3u7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-19 15:35:02 -03:00
Arnaldo Carvalho de Melo	154c978d48	libbeauty: Introduce strarray__strtoul_flags() Counterpart of strarray__scnprintf_flags(), i.e. from a expression like: # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE\|FIXED\|DENYWRITE" I.e. that "flags==PRIVATE\|FIXED\|DENYWRITE", turn that into # perf trace -e syscalls:sys_enter_mmap --filter=0x812 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-8xst3zrqqogax7fmfzwymvbl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-19 15:35:02 -03:00
Arnaldo Carvalho de Melo	82c38338e0	perf trace: Use strtoul for the fcntl 'cmd' argument Since its values are in two ranges of values we ended up codifying it using a 'struct strarrays', so now hook it up with STUL_STRARRAYS so that we can do: # perf trace -e syscalls:*enter_fcntl --filter=cmd==SETLK\|\|cmd==SETLKW 0.000 sssd_kcm/19021 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7ffcf0a4dee0) 1.523 sssd_kcm/19021 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7ffcf0a4de90) 1.629 sssd_kcm/19021 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7ffcf0a4de90) 2.711 sssd_kcm/19021 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7ffcf0a4de70) ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-mob96wyzri4r3rvyigqfjv0a@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-19 15:35:02 -03:00
Arnaldo Carvalho de Melo	1a8a90b823	libbeauty: Introduce syscall_arg__strtoul_strarrays() To allow going from string to integer for 'struct strarrays'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-b1ia3xzcy72hv0u4m168fcd0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-19 15:35:01 -03:00
Arnaldo Carvalho de Melo	9afec87ec1	perf trace: Pass a syscall_arg to syscall_arg_fmt->strtoul() With just what we need for the STUL_STRARRAY, i.e. the 'struct strarray' pointer to be used, just like with syscall_arg_fmt->scnprintf() for the other direction (number -> string). With this all the strarrays that are associated with syscalls can be used with '-e syscalls:sys_enter_SYSCALLNAME --filter', and soon will be possible as well to use with the strace-like shorter form, with just the syscall names, i.e. something like: -e lseek/whence==END/ For now we have to use the longer form: # perf trace -e syscalls:sys_enter_lseek 0.000 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR) 0.031 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR) 0.046 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR) 5003.528 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR) 5003.575 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR) 5003.593 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR) 10002.017 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR) 10002.051 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR) 10002.068 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR) ^C# perf trace -e syscalls:sys_enter_lseek --filter="whence!=CUR" 0.000 sshd/24476 syscalls:sys_enter_lseek(fd: 3, offset: 9032, whence: SET) 0.060 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 9032, whence: SET) 0.187 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 118632, whence: SET) 0.203 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 118632, whence: SET) 0.349 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 61936, whence: SET) ^C# And for those curious about what are those lseek(DSO, offset, SET), well, its the loader: # perf trace -e syscalls:sys_enter_lseek/max-stack=16/ --filter="whence!=CUR" 0.000 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 9032, whence: SET) __libc_lseek64 (/usr/lib64/ld-2.29.so) _dl_map_object (/usr/lib64/ld-2.29.so) 0.067 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 9032, whence: SET) __libc_lseek64 (/usr/lib64/ld-2.29.so) _dl_map_object_from_fd (/usr/lib64/ld-2.29.so) _dl_map_object (/usr/lib64/ld-2.29.so) 0.198 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 118632, whence: SET) __libc_lseek64 (/usr/lib64/ld-2.29.so) _dl_map_object (/usr/lib64/ld-2.29.so) 0.219 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 118632, whence: SET) __libc_lseek64 (/usr/lib64/ld-2.29.so) _dl_map_object_from_fd (/usr/lib64/ld-2.29.so) _dl_map_object (/usr/lib64/ld-2.29.so) ^C# :-) With this we can use strings in strarrays in filters, which allows us to reuse all these that are in place for syscalls: $ find tools/perf/trace/beauty/ -name ".c" \| xargs grep -w DEFINE_STRARRAY tools/perf/trace/beauty/fcntl.c: static DEFINE_STRARRAY(fcntl_setlease, "F_"); tools/perf/trace/beauty/mmap.c: static DEFINE_STRARRAY(mmap_flags, "MAP_"); tools/perf/trace/beauty/mmap.c: static DEFINE_STRARRAY(madvise_advices, "MADV_"); tools/perf/trace/beauty/sync_file_range.c: static DEFINE_STRARRAY(sync_file_range_flags, "SYNC_FILE_RANGE_"); tools/perf/trace/beauty/socket.c: static DEFINE_STRARRAY(socket_ipproto, "IPPROTO_"); tools/perf/trace/beauty/mount_flags.c: static DEFINE_STRARRAY(mount_flags, "MS_"); tools/perf/trace/beauty/pkey_alloc.c: static DEFINE_STRARRAY(pkey_alloc_access_rights, "PKEY_"); tools/perf/trace/beauty/sockaddr.c:DEFINE_STRARRAY(socket_families, "PF_"); tools/perf/trace/beauty/tracepoints/x86_irq_vectors.c:static DEFINE_STRARRAY(x86_irq_vectors, "_VECTOR"); tools/perf/trace/beauty/tracepoints/x86_msr.c:static DEFINE_STRARRAY(x86_MSRs, "MSR_"); tools/perf/trace/beauty/prctl.c: static DEFINE_STRARRAY(prctl_options, "PR_"); tools/perf/trace/beauty/prctl.c: static DEFINE_STRARRAY(prctl_set_mm_options, "PR_SET_MM_"); tools/perf/trace/beauty/fspick.c: static DEFINE_STRARRAY(fspick_flags, "FSPICK_"); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(ioctl_tty_cmd, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(drm_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(sndrv_pcm_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(sndrv_ctl_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(kvm_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(vhost_virtio_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(vhost_virtio_ioctl_read_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(perf_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(usbdevfs_ioctl_cmds, ""); tools/perf/trace/beauty/fsmount.c: static DEFINE_STRARRAY(fsmount_attr_flags, "MOUNT_ATTR_"); tools/perf/trace/beauty/renameat.c: static DEFINE_STRARRAY(rename_flags, "RENAME_"); tools/perf/trace/beauty/kcmp.c: static DEFINE_STRARRAY(kcmp_types, "KCMP_"); tools/perf/trace/beauty/move_mount.c: static DEFINE_STRARRAY(move_mount_flags, "MOVE_MOUNT_"); $ Well, some, as the mmap flags are like: $ tools/perf/trace/beauty/mmap_flags.sh static const char mmap_flags[] = { [ilog2(0x40) + 1] = "32BIT", [ilog2(0x01) + 1] = "SHARED", [ilog2(0x02) + 1] = "PRIVATE", [ilog2(0x10) + 1] = "FIXED", [ilog2(0x20) + 1] = "ANONYMOUS", [ilog2(0x008000) + 1] = "POPULATE", [ilog2(0x010000) + 1] = "NONBLOCK", [ilog2(0x020000) + 1] = "STACK", [ilog2(0x040000) + 1] = "HUGETLB", [ilog2(0x080000) + 1] = "SYNC", [ilog2(0x100000) + 1] = "FIXED_NOREPLACE", [ilog2(0x0100) + 1] = "GROWSDOWN", [ilog2(0x0800) + 1] = "DENYWRITE", [ilog2(0x1000) + 1] = "EXECUTABLE", [ilog2(0x2000) + 1] = "LOCKED", [ilog2(0x4000) + 1] = "NORESERVE", }; $ So we'll need a strarray__strtoul_flags() that will break donw the flags into tokens separated by '\|' before doing the lookup and then go on reconstructing the value from, say: # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE\|FIXED\|DENYWRITE" into: # perf trace -e syscalls:sys_enter_mmap --filter="flags==0x2\|0x10\|0x0800" and finally into: # perf trace -e syscalls:sys_enter_mmap --filter="flags==0x812" That is what we see if we don't use the augmented view obtained from: # perf trace -e mmap <SNIP> 211792.885 procmail/15393 mmap(addr: 0x7fcd11645000, len: 8192, prot: READ, flags: PRIVATE\|FIXED\|DENYWRITE, fd: 8, off: 0xa000) = 0x7fcd11645000 <SNIP> But plain use tracefs: procmail-15559 [000] .... 54557.178262: sys_mmap(addr: 7f5c9bf7a000, len: 9b000, prot: 1, flags: 812, fd: 3, off: a9000) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-c6mgkjt8ujnc263eld5tb7q3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-19 15:34:48 -03:00
Arnaldo Carvalho de Melo	db25bf98a3	perf trace: Honour --max-events in processing syscalls:sys_enter_* We were doing this only at the sys_exit syscall tracepoint, as for strace-like we count the pair of sys_enter and sys_exit as one event, but when asking specifically for a the syscalls:sys_enter_NAME tracepoint we need to count each of those as an event. I.e. things like: # perf trace --max-events=4 -e syscalls:sys_enter_lseek 0.000 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR) 0.034 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR) 0.051 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR) 2307.900 sshd/30800 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libsystemd.so.0.25.0>, offset: 9032, whence: SET) # Were going on forever, since we only had sys_enter events. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-0ob1dky1a9ijlfrfhxyl40wr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-18 12:19:02 -03:00
Arnaldo Carvalho de Melo	d066da978f	libbeauty: Introduce syscall_arg__strtoul_strarray() To go from strarrays strings to its indexes. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-wta0qvo207z27huib2c4ijxq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-18 12:07:46 -03:00
Arnaldo Carvalho de Melo	362222f877	perf trace: Initialize evsel_trace->fmt for syscalls:sys_enter_* tracepoints From the syscall_fmts->arg entries for formatting strace-like syscalls. This is when resolving the string "whence" on a filter expression for the syscalls:sys_enter_lseek: Breakpoint 3, perf_evsel__syscall_arg_fmt (evsel=0xc91ed0, arg=0x7fffffff7cd0 "whence") at builtin-trace.c:3626 3626 { (gdb) n 3628 struct syscall_arg_fmt fmt = __evsel__syscall_arg_fmt(evsel); (gdb) n 3630 if (evsel->tp_format == NULL \|\| fmt == NULL) (gdb) n 3633 for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt) (gdb) n 3634 if (strcmp(field->name, arg) == 0) (gdb) p field->name $3 = 0xc945e0 "__syscall_nr" (gdb) n 3633 for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt) (gdb) p fmt $4 = {scnprintf = 0x0, strtoul = 0x0, mask_val = 0x0, parm = 0x0, name = 0x0, nr_entries = 0, show_zero = false} (gdb) n 3634 if (strcmp(field->name, arg) == 0) (gdb) p field->name $5 = 0xc94690 "fd" (gdb) n 3633 for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt) (gdb) n 3634 if (strcmp(field->name, arg) == 0) (gdb) n 3633 for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt) (gdb) n 3634 if (strcmp(field->name, arg) == 0) (gdb) p fmt $9 = {scnprintf = 0x489be2 <syscall_arg__scnprintf_strarray>, strtoul = 0x0, mask_val = 0x0, parm = 0xa2da80 <strarray.whences>, name = 0x0, nr_entries = 0, show_zero = false} (gdb) p field->name $10 = 0xc947b0 "whence" (gdb) p fmt->parm $11 = (void ) 0xa2da80 <strarray.whences> (gdb) p (struct strarray )fmt->parm $12 = {offset = 0, nr_entries = 5, prefix = 0x724d37 "SEEK_", entries = 0xa2da40 <whences>} (gdb) p (struct strarray )fmt->parm)->entries Junk after end of expression. (gdb) p ((struct strarray )fmt->parm)->entries $13 = (const char *) 0xa2da40 <whences> (gdb) p ((struct strarray )fmt->parm)->entries[0] $14 = 0x724d21 "SET" (gdb) p ((struct strarray )fmt->parm)->entries[1] $15 = 0x724d25 "CUR" (gdb) p ((struct strarray )fmt->parm)->entries[2] $16 = 0x724d29 "END" (gdb) p ((struct strarray )fmt->parm)->entries[2] $17 = 0x724d29 "END" (gdb) p ((struct strarray )fmt->parm)->entries[3] $18 = 0x724d2d "DATA" (gdb) p ((struct strarray *)fmt->parm)->entries[4] $19 = 0x724d32 "HOLE" (gdb) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-lc8h9jgvbnboe0g7ic8tra1y@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-18 12:07:42 -03:00
Arnaldo Carvalho de Melo	2b00bb627f	perf trace: Introduce 'struct evsel__trace' for evsel->priv needs For syscalls we need to cache the 'syscall_id' and 'ret' field offsets but as well have a pointer to the syscall_fmt_arg array for the fields, so that we can expand strings in filter expressions, so introduce a 'struct evsel_trace' to have in evsel->priv that allows for that. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-hx8ukasuws5sz6rsar73cocv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-17 17:27:43 -03:00
Arnaldo Carvalho de Melo	8b913df50f	perf trace: Hide evsel->access further, simplify code Next step will be to have a 'struct evsel_trace' to allow for handling the syscalls tracepoints via the strace-like code while reusing parts of that code with the other tracepoints, where we don't have things like the 'syscall_nr' or 'ret' ((raw_)?syscalls:sys_{enter,exit}(_SYSCALL)?) args that we want to cache offsets and have been using evsel->priv for that, while for the other tracepoints we'll have just an array of 'struct syscall_arg_fmt' (i.e. ->scnprint() for number->string and ->strtoul() string->number conversions and other state those functions need). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-fre21jbyoqxmmquxcho7oa0x@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-17 17:26:35 -03:00
Arnaldo Carvalho de Melo	fecd990720	perf trace: Introduce accessors to trace specific evsel->priv We're using evsel->priv in syscalls:sys_{enter,exit}_SYSCALL and in raw_syscalls:sys_{enter,exit} to cache the offset of the common fields, the multiplexor id/syscall_id in the sys_enter case and syscall_id + ret for sys_exit. And for the rest of the tracepoints we use it to have a syscall_arg_fmt array to have scnprintf/strtoul for tracepoint args. So we better clearly mark them with accessors so that we can move to having a 'struct evsel_trace' struct for all 'perf trace' specific evsel->priv usage. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-dcoyxfslg7atz821tz9aupjh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-17 17:26:35 -03:00
Arnaldo Carvalho de Melo	3cdc8db91e	perf trace: Show error message when not finding a field used in a filter expression It was there, but as pr_debug(), make it pr_err() so that we can see it without -v: # trace -e syscalls:*lseek --filter="whenc==SET" sleep 1 "whenc" not found in "syscalls:sys_enter_lseek", can't set filter "whenc==SET" # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-ly4rgm1bto8uwc2itpaixjob@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-17 17:26:35 -03:00
Arnaldo Carvalho de Melo	df604bfda6	perf trace: Hook the 'vec' tracepoint argument with the x86 IRQ vectors scnprintf/strtoul Ended up only being useful when filtering multiple irq_vectors tracepoints, as we end up having a tracepoint for each of the entries, i.e.: This will always come with the "RESCHEDULE_VECTOR" in the 'vector' arg: # perf trace --max-events 8 -e irq_vectors:reschedule* 0.000 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE) 0.004 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE) 0.553 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE) 0.556 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE) 1.182 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE) 1.185 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE) 1.203 :29052/29052 irq_vectors:reschedule_entry(vector: RESCHEDULE) 1.206 :29052/29052 irq_vectors:reschedule_exit(vector: RESCHEDULE) # While filtering that value will produce nothing: # perf trace --max-events 8 -e irq_vectors:reschedule* --filter="vector != RESCHEDULE" ^C# Maybe it'll be useful for those other tracepoints: # perf list irq_vectors:vector_* List of pre-defined events (to be used in -e): irq_vectors:vector_activate [Tracepoint event] irq_vectors:vector_alloc [Tracepoint event] irq_vectors:vector_alloc_managed [Tracepoint event] irq_vectors:vector_clear [Tracepoint event] irq_vectors:vector_config [Tracepoint event] irq_vectors:vector_deactivate [Tracepoint event] irq_vectors:vector_free_moved [Tracepoint event] irq_vectors:vector_reserve [Tracepoint event] irq_vectors:vector_reserve_managed [Tracepoint event] irq_vectors:vector_setup [Tracepoint event] irq_vectors:vector_teardown [Tracepoint event] irq_vectors:vector_update [Tracepoint event] # But since we have it done, keep it. This at least served to teach me that all those irq vectors have a entry and an exit tracepoint that I can then use just like with raw_syscalls:sys_{enter,exit}, i.e. pair them, use just a trace__irq_vectors_entry() + trace__irq_vectors_exit() and use the 'vector' arg as I use the 'syscall id' one for syscalls. Then the default for 'perf trace' will include irq_vectors in addition to syscalls. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-wer4cwbbqub3o7sa8h1j3uzb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-15 16:50:13 -03:00
Arnaldo Carvalho de Melo	97c2a7806f	libbeauty: Add a strarray__scnprintf_suffix() method In some cases, like with x86 IRQ vectors, the common part in names is at the end, so a suffix, add a scnprintf function for that. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-agxbj6es2ke3rehwt4gkdw23@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-15 16:01:42 -03:00
Arnaldo Carvalho de Melo	c5e006cdbd	perf trace: Support tracepoint dynamic char arrays Things like: # grep __data_loc /sys/kernel/debug/tracing/events/sched/sched_process_exec/format field:__data_loc char[] filename; offset:8; size:4; signed:1; # That, at that offset (8) and with that size(8) have an integer that contains the real length and offset for the contents of that array. Now this works: # perf trace --max-events 1 -e sched:exec -a 0.000 sed/19441 sched:sched_process_exec(filename: "/usr/bin/sync", pid: 19441 (sync), old_pid: 19441 (sync)) # As when using the libtraceevent based beautifier: # perf trace --libtraceevent --max-events 1 -e sched:exec -a 0.000 sync/19463 sched:sched_process_exec(filename=/usr/bin/sync pid=19463 old_pid=19463) # I.e. that 'filename' is implemented as a dynamic char array. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-950p0m842fe6n7sxsdwqj5i2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-15 13:03:58 -03:00
Arnaldo Carvalho de Melo	7fbfe22cf4	perf trace: Filter own pid to avoid a feedback look in 'perf trace record -a' When doing a system wide 'perf trace record' we need, just like in 'perf trace' live mode, to filter out perf trace's own pid, so set up a tracepoint filter for the raw_syscalls tracepoints right after adding them to the argv array that is set up to then call cmd_record(). Reported-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-uysx5w8f2y5ndoln5cq370tv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-15 13:03:57 -03:00
Arnaldo Carvalho de Melo	b88b14db21	perf trace: Introduce --errno-summary To be used with -S or -s, using just this new option implies -s, examples: # perf trace --errno-summary sleep 1 Summary of events: sleep (10793), 80 events, 93.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ nanosleep 1 0 1000.427 1000.427 1000.427 1000.427 0.00% mmap 8 0 0.026 0.002 0.003 0.005 9.18% close 5 0 0.018 0.001 0.004 0.009 48.97% mprotect 4 0 0.017 0.003 0.004 0.006 16.49% openat 3 0 0.012 0.003 0.004 0.005 9.41% munmap 1 0 0.010 0.010 0.010 0.010 0.00% brk 4 0 0.005 0.001 0.001 0.002 22.77% read 4 0 0.005 0.001 0.001 0.002 22.33% access 1 1 0.004 0.004 0.004 0.004 0.00% ENOENT: 1 fstat 3 0 0.004 0.001 0.001 0.002 17.18% lseek 3 0 0.003 0.001 0.001 0.001 11.62% arch_prctl 2 1 0.002 0.001 0.001 0.001 3.32% EINVAL: 1 execve 1 0 0.000 0.000 0.000 0.000 0.00% # Works as well together with --failure and -S, i.e. collect the stats and show just the syscalls that failed: # perf trace --failure -S --errno-summary sleep 1 0.032 arch_prctl(option: 0x3001, arg2: 0x7fffdb11b580) = -1 EINVAL (Invalid argument) 0.045 access(filename: "/etc/ld.so.preload", mode: R) = -1 ENOENT (No such file or directory) Summary of events: sleep (10806), 80 events, 93.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ nanosleep 1 0 1000.094 1000.094 1000.094 1000.094 0.00% mmap 8 0 0.026 0.002 0.003 0.005 9.06% close 5 0 0.018 0.001 0.004 0.010 49.58% mprotect 4 0 0.017 0.003 0.004 0.006 17.56% openat 3 0 0.014 0.004 0.005 0.006 12.29% munmap 1 0 0.010 0.010 0.010 0.010 0.00% brk 4 0 0.005 0.001 0.001 0.002 22.75% read 4 0 0.005 0.001 0.001 0.002 17.19% access 1 1 0.005 0.005 0.005 0.005 0.00% ENOENT: 1 fstat 3 0 0.004 0.001 0.001 0.002 21.66% lseek 3 0 0.003 0.001 0.001 0.001 11.71% arch_prctl 2 1 0.002 0.001 0.001 0.001 2.66% EINVAL: 1 execve 1 0 0.000 0.000 0.000 0.000 0.00% # Suggested-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-l0mjwczkpouov7lss5zn8d9h@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-15 13:03:49 -03:00
Arnaldo Carvalho de Melo	8eded45fcd	perf trace: Add syscall failure stats to -s/--summary and -S/--with-summary Just like strace has: # trace -s sleep 1 Summary of events: sleep (32370), 80 events, 93.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ nanosleep 1 0 1000.402 1000.402 1000.402 1000.402 0.00% mmap 8 0 0.023 0.002 0.003 0.004 8.49% close 5 0 0.015 0.001 0.003 0.009 51.39% mprotect 4 0 0.014 0.002 0.003 0.005 16.95% openat 3 0 0.013 0.003 0.004 0.005 14.29% munmap 1 0 0.010 0.010 0.010 0.010 0.00% read 4 0 0.005 0.001 0.001 0.002 16.83% brk 4 0 0.004 0.001 0.001 0.002 20.82% access 1 1 0.004 0.004 0.004 0.004 0.00% fstat 3 0 0.003 0.001 0.001 0.001 12.17% lseek 3 0 0.003 0.001 0.001 0.001 11.45% arch_prctl 2 1 0.002 0.001 0.001 0.001 2.30% execve 1 0 0.000 0.000 0.000 0.000 0.00% # # perf trace -S sleep 1 ? ... [continued]: execve()) = 0 0.028 brk(brk: NULL) = 0x559f5bd96000 0.033 arch_prctl(option: 0x3001, arg2: 0x7ffda8b715a0) = -1 EINVAL (Invalid argument) 0.046 access(filename: "/etc/ld.so.preload", mode: R) = -1 ENOENT (No such file or directory) 0.055 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY\|CLOEXEC) = 3 0.060 fstat(fd: 3, statbuf: 0x7ffda8b707a0) = 0 0.062 mmap(addr: NULL, len: 134346, prot: READ, flags: PRIVATE, fd: 3, off: 0) = 0x7f3aedfc4000 0.066 close(fd: 3) = 0 0.079 openat(dfd: CWD, filename: "/lib64/libc.so.6", flags: RDONLY\|CLOEXEC) = 3 0.085 read(fd: 3, buf: 0x7ffda8b70948, count: 832) = 832 0.088 lseek(fd: 3, offset: 792, whence: SET) = 792 0.090 read(fd: 3, buf: 0x7ffda8b70810, count: 68) = 68 0.093 fstat(fd: 3, statbuf: 0x7ffda8b707f0) = 0 0.095 mmap(addr: NULL, len: 8192, prot: READ\|WRITE, flags: PRIVATE\|ANONYMOUS) = 0x7f3aedfc2000 0.101 lseek(fd: 3, offset: 792, whence: SET) = 792 0.103 read(fd: 3, buf: 0x7ffda8b70450, count: 68) = 68 0.105 lseek(fd: 3, offset: 864, whence: SET) = 864 0.107 read(fd: 3, buf: 0x7ffda8b70470, count: 32) = 32 0.110 mmap(addr: NULL, len: 1857472, prot: READ, flags: PRIVATE\|DENYWRITE, fd: 3, off: 0) = 0x7f3aeddfc000 0.114 mprotect(start: 0x7f3aede1e000, len: 1679360, prot: NONE) = 0 0.121 mmap(addr: 0x7f3aede1e000, len: 1363968, prot: READ\|EXEC, flags: PRIVATE\|FIXED\|DENYWRITE, fd: 3, off: 0x22000) = 0x7f3aede1e000 0.127 mmap(addr: 0x7f3aedf6b000, len: 311296, prot: READ, flags: PRIVATE\|FIXED\|DENYWRITE, fd: 3, off: 0x16f000) = 0x7f3aedf6b000 0.131 mmap(addr: 0x7f3aedfb8000, len: 24576, prot: READ\|WRITE, flags: PRIVATE\|FIXED\|DENYWRITE, fd: 3, off: 0x1bb000) = 0x7f3aedfb8000 0.138 mmap(addr: 0x7f3aedfbe000, len: 14272, prot: READ\|WRITE, flags: PRIVATE\|FIXED\|ANONYMOUS) = 0x7f3aedfbe000 0.147 close(fd: 3) = 0 0.158 arch_prctl(option: SET_FS, arg2: 0x7f3aedfc3580) = 0 0.210 mprotect(start: 0x7f3aedfb8000, len: 16384, prot: READ) = 0 0.230 mprotect(start: 0x559f5b27d000, len: 4096, prot: READ) = 0 0.236 mprotect(start: 0x7f3aee00f000, len: 4096, prot: READ) = 0 0.240 munmap(addr: 0x7f3aedfc4000, len: 134346) = 0 0.300 brk(brk: NULL) = 0x559f5bd96000 0.302 brk(brk: 0x559f5bdb7000) = 0x559f5bdb7000 0.305 brk(brk: NULL) = 0x559f5bdb7000 0.310 openat(dfd: CWD, filename: "/usr/lib/locale/locale-archive", flags: RDONLY\|CLOEXEC) = 3 0.315 fstat(fd: 3, statbuf: 0x7f3aedfbdac0) = 0 0.318 mmap(addr: NULL, len: 217750512, prot: READ, flags: PRIVATE, fd: 3, off: 0) = 0x7f3ae0e52000 0.325 close(fd: 3) = 0 0.358 nanosleep(rqtp: 0x7ffda8b714b0, rmtp: NULL) = 0 1000.622 close(fd: 1) = 0 1000.641 close(fd: 2) = 0 1000.664 exit_group(error_code: 0) = ? Summary of events: sleep (722), 80 events, 93.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ nanosleep 1 0 1000.194 1000.194 1000.194 1000.194 0.00% mmap 8 0 0.025 0.002 0.003 0.005 10.17% close 5 0 0.018 0.001 0.004 0.010 50.18% mprotect 4 0 0.016 0.003 0.004 0.006 16.81% openat 3 0 0.011 0.003 0.004 0.004 6.57% munmap 1 0 0.010 0.010 0.010 0.010 0.00% brk 4 0 0.005 0.001 0.001 0.002 20.72% read 4 0 0.005 0.001 0.001 0.002 16.71% access 1 1 0.005 0.005 0.005 0.005 0.00% fstat 3 0 0.004 0.001 0.001 0.002 14.82% lseek 3 0 0.003 0.001 0.001 0.001 11.66% arch_prctl 2 1 0.002 0.001 0.001 0.001 3.59% execve 1 0 0.000 0.000 0.000 0.000 0.00% # Works for system wide, e.g. for 1ms: # perf trace -s -a sleep 0.001 Summary of events: sleep (768), 94 events, 37.9% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ nanosleep 1 0 1.133 1.133 1.133 1.133 0.00% execve 7 6 0.351 0.003 0.050 0.316 88.53% mmap 8 0 0.024 0.002 0.003 0.004 8.86% mprotect 4 0 0.017 0.003 0.004 0.006 16.02% openat 3 0 0.013 0.004 0.004 0.005 8.34% munmap 1 0 0.010 0.010 0.010 0.010 0.00% brk 4 0 0.007 0.001 0.002 0.002 10.99% close 5 0 0.005 0.001 0.001 0.002 11.69% read 5 0 0.005 0.000 0.001 0.002 30.53% access 1 1 0.004 0.004 0.004 0.004 0.00% fstat 3 0 0.004 0.001 0.001 0.002 10.74% lseek 3 0 0.003 0.001 0.001 0.001 10.20% arch_prctl 2 1 0.002 0.001 0.001 0.001 3.34% Web Content (21258), 46 events, 18.5% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ recvmsg 12 12 0.015 0.001 0.001 0.002 8.50% futex 2 0 0.008 0.003 0.004 0.005 27.08% poll 6 0 0.006 0.000 0.001 0.002 22.14% read 2 0 0.006 0.002 0.003 0.003 26.08% write 1 0 0.002 0.002 0.002 0.002 0.00% Web Content (4365), 36 events, 14.5% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ recvmsg 10 10 0.015 0.001 0.002 0.003 11.83% poll 5 0 0.006 0.000 0.001 0.002 28.44% futex 2 0 0.005 0.001 0.003 0.004 48.29% read 1 0 0.003 0.003 0.003 0.003 0.00% Timer (21275), 14 events, 5.6% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ futex 6 1 0.240 0.000 0.040 0.149 64.58% write 1 0 0.008 0.008 0.008 0.008 0.00% Timer (4383), 14 events, 5.6% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ futex 6 2 0.186 0.000 0.031 0.181 96.45% write 1 0 0.010 0.010 0.010 0.010 0.00% Web Content (20354), 28 events, 11.3% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ recvmsg 8 8 0.010 0.001 0.001 0.002 15.24% poll 4 0 0.004 0.000 0.001 0.002 35.68% futex 1 0 0.003 0.003 0.003 0.003 0.00% read 1 0 0.003 0.003 0.003 0.003 0.00% Timer (20371), 10 events, 4.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ futex 4 1 0.077 0.000 0.019 0.075 95.46% write 1 0 0.005 0.005 0.005 0.005 0.00% [root@quaco ~]# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-k7kh2muo5oeg56yx446hnw9v@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-15 08:39:42 -03:00
Jiri Olsa	151ed5d70d	libperf: Adopt perf_mmap__read_event() from tools/perf Move perf_mmap__read_event() from tools/perf to libperf and export it in the perf/mmap.h header. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191007125344.14268-13-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-10 11:49:46 -03:00
Jiri Olsa	32fdc2ca7e	libperf: Adopt perf_mmap__read_done() from tools/perf Move perf_mmap__read_init() from tools/perf to libperf and export it in the perf/mmap.h header. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191007125344.14268-12-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-10 11:45:32 -03:00
Jiri Olsa	7c4d41824f	libperf: Adopt perf_mmap__read_init() from tools/perf Move perf_mmap__read_init() from tools/perf to libperf and export it in perf/mmap.h header. And add pr_debug2()/pr_debug3() macros support, because the code is using them. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191007125344.14268-11-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-10 11:45:21 -03:00
Jiri Olsa	7728fa0cfa	libperf: Adopt perf_mmap__consume() function from tools/perf Move perf_mmap__consume() vrom tools/perf to libperf and export it in the perf/mmap.h header. Move also the needed helpers perf_mmap__write_tail(), perf_mmap__read_head() and perf_mmap__empty(). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191007125344.14268-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-10 11:43:49 -03:00
Arnaldo Carvalho de Melo	728db19886	perf beauty: Introduce strtoul() for x86 MSRs Continuing from the previous cset comment, now that filter expression works: # perf trace -e msr:* --filter="msr!=FS_BASE && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL" --filter-pids 3750 0.000 Timer/5033 msr:write_msr(msr: SYSCALL_MASK, val: 292608) 0.009 Timer/5033 msr:write_msr(msr: LSTAR, val: -1398800368) 0.010 Timer/5033 msr:write_msr(msr: TSC_AUX, val: 4) 0.050 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 45.661 gnome-terminal/12595 msr:write_msr(msr: SYSCALL_MASK, val: 292608) 45.672 gnome-terminal/12595 msr:write_msr(msr: LSTAR, val: -1398800368) 45.675 gnome-terminal/12595 msr:write_msr(msr: TSC_AUX, val: 3) 54.852 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 130.508 Timer/4050 msr:write_msr(msr: SYSCALL_MASK, val: 292608) 130.527 Timer/4050 msr:write_msr(msr: LSTAR, val: -1398800368) 130.531 Timer/4050 msr:write_msr(msr: TSC_AUX, val: 3) 140.924 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 164.738 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 603.578 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 620.809 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 690.115 JS Watchdog/4259 msr:write_msr(msr: SYSCALL_MASK, val: 292608) 690.136 JS Watchdog/4259 msr:write_msr(msr: LSTAR, val: -1398800368) 690.141 JS Watchdog/4259 msr:write_msr(msr: TSC_AUX, val: 3) 690.186 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 759.016 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) ^C[root@quaco ~]# Or look at the first 3 write_msr events for that IA32_TSC_DEADLINE to learn why it happens so often: # perf trace --max-events=3 --max-stack=8 -e msr:* --filter="msr==IA32_TSC_DEADLINE" --filter-pids 3750 0.000 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 19296732550862) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) lapic_next_deadline ([kernel.kallsyms]) clockevents_program_event ([kernel.kallsyms]) hrtimer_interrupt ([kernel.kallsyms]) smp_apic_timer_interrupt ([kernel.kallsyms]) apic_timer_interrupt ([kernel.kallsyms]) cpuidle_enter_state ([kernel.kallsyms]) 32.646 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 19296800134158) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) lapic_next_deadline ([kernel.kallsyms]) clockevents_program_event ([kernel.kallsyms]) hrtimer_start_range_ns ([kernel.kallsyms]) tick_nohz_restart_sched_tick ([kernel.kallsyms]) tick_nohz_idle_exit ([kernel.kallsyms]) do_idle ([kernel.kallsyms]) 32.802 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 19297507436922) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) lapic_next_deadline ([kernel.kallsyms]) clockevents_program_event ([kernel.kallsyms]) hrtimer_try_to_cancel ([kernel.kallsyms]) hrtimer_cancel ([kernel.kallsyms]) tick_nohz_restart_sched_tick ([kernel.kallsyms]) tick_nohz_idle_exit ([kernel.kallsyms]) # And if some of the strings can't be found: # trace -e msr:* --filter="msr!=SPECULATIVE_EXECUTION_PROBLEMS_SOLUTION && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL" --filter-pids 3750 "SPECULATIVE_EXECUTION_PROBLEMS_SOLUTION" not found for "msr" in "msr:read_msr", can't set filter "(msr!=SPECULATIVE_EXECUTION_PROBLEMS_SOLUTION && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL) && (common_pid != 28131 && common_pid != 3750)" # Next step is to automatically wire up the pre-existing strarrays, which there are quite a few. The strtoul() methods will be further enhanced to allow for looking at other arguments in a syscall/tracepoint, just like going from integer to string (scnprintf methods), so that those "val" lines for the msr tracepoints can be properly formatted or even resolved into some string. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-4qaai5iqjgefd11k4ddm7qg8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-09 16:25:02 -03:00
Arnaldo Carvalho de Melo	90df0249c2	perf trace: Expand strings in filters to integers So that one can try things like: # perf trace -e msr:* --filter="msr!=FS_BASE && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL" --filter-pids 3750 That, at this point in the patchset, without any strtoul in place for tracepoint arguments, will result in: No resolver (strtoul) for "msr" in "msr:read_msr", can't set filter "(msr!=FS_BASE && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL) && (common_pid != 25407 && common_pid != 3750)" # See you in the next cset! Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-dx5j70fv2rgkeezd1cb3hv2p@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-09 16:22:16 -03:00
Arnaldo Carvalho de Melo	d0a3a10410	perf trace: Introduce a strtoul() method for 'struct strarrays' And also for 'struct strarray', since its needed to implement strarrays__strtoul(). This just traverses the entries and when finding a match, returns (offset + index), i.e. the value associated with the searched string. E.g. "EFER" (MSR_EFER) returns: # grep -w EFER -B2 /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c #define x86_64_specific_MSRs_offset 0xc0000080 static const char *x86_64_specific_MSRs[] = { [0xc0000080 - x86_64_specific_MSRs_offset] = "EFER", # 0xc0000080 This will be auto-attached to 'struct syscall_arg_fmt' entries associated with strarrays as soon as we add a ->strarray and ->strarrays to 'struct syscall_arg_fmt'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-r2hpaahf8lishyb1owko9vs1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-09 16:11:36 -03:00
Arnaldo Carvalho de Melo	3f41b77843	perf trace: Add a strtoul() method to 'struct syscall_arg_fmt' This will go from a string to a number, so that filter expressions can be constructed with strings and then, before applying the tracepoint filters (or eBPF, in the future) we can map those strings to numbers. The first one will be for 'msr' tracepoint arguments, but real quickly we will be able to reuse all strarrays for that. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-wgqq48agcgr95b8dmn6fygtr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-09 16:06:43 -03:00
Arnaldo Carvalho de Melo	d4097f1937	perf trace: Introduce --filter for tracepoint events Similar to what is in 'perf record', works just like there: # perf trace -e msr:* 328.297 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.302 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.306 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.317 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.322 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.327 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.331 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.336 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.340 :0/0 ^Cmsr:write_msr(msr: FS_BASE, val: 140240388381888) # So, for a system wide trace session looking at the write_msr tracepoint we see a flood of MSR_FS_BASE, we need to get the number for that: # grep FS_BASE /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c [0xc0000100 - x86_64_specific_MSRs_offset] = "FS_BASE", # And then use it in a filter: # perf trace -e msr:* --filter="msr!=0xc0000100" <SNIP> 942.177 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068232) 942.199 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3057135655252) 942.203 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068222) 942.231 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056998373022) 942.241 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068236) <SNIP> # Ok, lets filter that too, too noisy: # grep TSC_DEADLINE /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c [0x000006E0] = "IA32_TSC_DEADLINE", # # perf trace -e msr:* --filter="msr!=0xc0000100 && msr!=0x6e0" -a sleep 0.1 0.000 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 0.066 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) 0.070 CPU 0/KVM/4895 msr:write_msr(msr: 0x830, val: 34359740667) 0.099 CPU 0/KVM/4895 msr:read_msr(msr: IA32_SYSENTER_ESP, val: -2199021993472) 0.100 CPU 0/KVM/4895 msr:read_msr(msr: IA32_APICBASE, val: 4276096000) 0.101 CPU 0/KVM/4895 msr:read_msr(msr: IA32_DEBUGCTLMSR) 0.109 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL) 1.000 :0/0 msr:write_msr(msr: 0x830, val: 17179871485) 18.893 :0/0 msr:write_msr(msr: 0x83f, val: 246) 28.810 :0/0 msr:write_msr(msr: 0x830, val: 68719479037) 40.117 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) 40.127 CPU 0/KVM/4895 msr:read_msr(msr: IA32_DEBUGCTLMSR) 40.139 CPU 0/KVM/4895 msr:write_msr(msr: LSTAR, val: -2130661312) 40.141 CPU 0/KVM/4895 msr:write_msr(msr: SYSCALL_MASK, val: 14080) 40.142 CPU 0/KVM/4895 msr:write_msr(msr: TSC_AUX) 40.144 CPU 0/KVM/4895 msr:write_msr(msr: KERNEL_GS_BASE) 40.147 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL) 40.148 CPU 0/KVM/4895 msr:write_msr(msr: IA32_FLUSH_CMD, val: 1) 40.151 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) ^C # One can combine that with filtering pids as well: # perf trace -e msr:* --filter="msr!=0xc0000100 && msr!=0x6e0" --filter-pids 4895 -a sleep 0.09 0.000 :0/0 msr:write_msr(msr: 0x830, val: 4294969597) 0.291 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608) 0.294 gnome-terminal/2790 msr:write_msr(msr: LSTAR, val: -1935671280) 0.295 gnome-terminal/2790 msr:write_msr(msr: TSC_AUX, val: 6) 10.940 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597) 15.943 gnome-shell/2096 msr:write_msr(msr: 0x830, val: 4294969597) 16.975 :0/0 msr:write_msr(msr: 0x830, val: 4294969597) 19.560 :0/0 msr:write_msr(msr: 0x83f, val: 246) 25.162 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 25.807 JS Watchdog/3635 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) 25.820 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL) 25.941 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597) 26.941 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597) 29.942 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597) 45.313 :0/0 msr:write_msr(msr: 0x83f, val: 246) 56.945 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597) 60.946 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597) 74.096 JS Watchdog/8971 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) 74.130 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL) 79.673 :0/0 msr:write_msr(msr: 0x83f, val: 246) 79.947 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 17179871485) # Or for just a pid, with callchains: # grep SYSCALL_MAS /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c [0xc0000084 - x86_64_specific_MSRs_offset] = "SYSCALL_MASK", # perf trace -e msr:* --filter="msr==0xc0000084" --pid 2790 --call-graph=dwarf 0.000 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) kvm_on_user_return ([kvm]) fire_user_return_notifiers ([kernel.kallsyms]) exit_to_usermode_loop ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) __GI___poll (inlined) 9299.073 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) kvm_on_user_return ([kvm]) fire_user_return_notifiers ([kernel.kallsyms]) exit_to_usermode_loop ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) __GI___poll (inlined) 9348.374 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) kvm_on_user_return ([kvm]) fire_user_return_notifiers ([kernel.kallsyms]) exit_to_usermode_loop ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) __GI___poll (inlined) <SNIP> # Ok, just another form of KVM to emit MSRs :-) Next step: elliminate those greps by getting the filter expression, looking for arg names, then for the arrays associated with it to do a reverse lookup. Also allow those filters to be associated with strace-like syscall names. After that: augment the 'val' arg for 'msr:write_msr' based on the first arg, 'msr'. Then, do that with eBPF too, not just with tracepoint filters. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-95bfe5d4tzy5f66bx49d05rj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-09 11:23:52 -03:00
Arnaldo Carvalho de Melo	c330ef2847	perf trace: Associate the "msr" tracepoint arg name with x86_MSR__scnprintf() So that we can go from: # perf trace -e msr:write_msr --max-stack=16 sleep 1 0.000 sleep/6740 msr:write_msr(msr: 3221225728, val: 139636317451648) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) do_arch_prctl_64 ([kernel.kallsyms]) __x64_sys_arch_prctl ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) init_tls (/usr/lib64/ld-2.29.so) dl_main (/usr/lib64/ld-2.29.so) _dl_sysdep_start (/usr/lib64/ld-2.29.so) _dl_start (/usr/lib64/ld-2.29.so) # To: # perf trace -e msr:write_msr --max-stack=16 sleep 1 0.000 sleep/8519 msr:write_msr(msr: FS_BASE, val: 139878031705472) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) do_arch_prctl_64 ([kernel.kallsyms]) __x64_sys_arch_prctl ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) init_tls (/usr/lib64/ld-2.29.so) dl_main (/usr/lib64/ld-2.29.so) _dl_sysdep_start (/usr/lib64/ld-2.29.so) _dl_start (/usr/lib64/ld-2.29.so) # This, in reverse, will allow for symbolic system call/tracepoint filtering. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-q1q4unmqja5ex7dy0kb5cjaa@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-09 11:23:52 -03:00
Arnaldo Carvalho de Melo	5d88099bc0	perf trace: Allow associating scnprintf routines with well known arg names For instance 'msr' appears in several tracepoints, so we can associate it with a single scnprintf() routine auto-generated from kernel headers, as will be done in followup patches. Start with an empty array of associations. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-89ptht6s5fez82lykuwq1eyb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-09 11:23:52 -03:00
Arnaldo Carvalho de Melo	f11b2803bb	perf trace: Allow choosing how to augment the tracepoint arguments So far we used the libtraceevent printing routines when showing tracepoint arguments, but since 'perf trace' has a lot of beautifiers for syscall arguments, and since some of those can be used to augment tracepoint arguments, add a routine to make use of those beautifiers and allow the user to choose which one to use. The default now is to use the same beautifiers used for the strace-like sys_enter+sys_exit lines, but the user can choose the libtraceevent ones by either using the: perf trace --libtraceevent_print command line option, or by setting: # cat ~/.perfconfig [trace] tracepoint_beautifiers = libtraceevent For instance, here are some examples: # perf trace -e sched:switch,sleep,sched:wakeup,exit,sched:exit sleep 1 0.000 sched:sched_wakeup(comm: "perf", pid: 5273 (perf), prio: 120, success: 1, target_cpu: 6) 0.621 nanosleep(rqtp: 0x7ffdd06d1140, rmtp: NULL) ... 0.628 sched:sched_switch(prev_comm: "sleep", prev_pid: 5273 (sleep), prev_prio: 120, prev_state: 1, next_comm: "swapper/6", next_pid: 0, next_prio: 120) 1000.879 sched:sched_wakeup(comm: "sleep", pid: 5273 (sleep), prio: 120, success: 1, target_cpu: 6) 0.621 ... [continued]: nanosleep()) = 0 1001.026 exit_group(error_code: 0) = ? 1001.216 sched:sched_process_exit(comm: "sleep", pid: 5273 (sleep), prio: 120) # And then using libtraceevent, as before: # perf trace --libtraceevent_print -e sched:switch,sleep,sched:wakeup,exit,sched:exit sleep 1 0.000 sched:sched_wakeup(comm=perf pid=5288 prio=120 target_cpu=001) 0.739 nanosleep(rqtp: 0x7ffeba6c2f40, rmtp: NULL) ... 0.747 sched:sched_switch(prev_comm=sleep prev_pid=5288 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120) 1000.902 sched:sched_wakeup(comm=sleep pid=5288 prio=120 target_cpu=001) 0.739 ... [continued]: nanosleep()) = 0 1001.012 exit_group(error_code: 0) = ? # The new default allocates an array of 'struct syscall_arg_fmt' for the tracepoint arguments and, just like with syscall arguments, tries to find suitable syscall_arg__scnprintf_NAME() routines to augment those tracepoint arguments based on their type (as in the tracefs "format" file), or even in their name + type, for instance arguntents with names ending in "fd" with type "int" get the fd scnprintf beautifier attached, etc. Soon this will take advantage of the kernel BTF information to augment enumerations based on the tracefs "format" type info. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-o8qdluotkcb3b1x2gjqrejcl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo	311baaf93c	perf trace: Enclose all events argument lists with () So that they look a bit like normal strace-like syscall enter+exit lines. They will look even more when we switch from using libtraceevent's tep_print_event() routine in favour of using all the perf beautifiers used by the strace-like syscall enter+exit lines. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-y4fcej6v6u1m644nbxd2r4pg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo	9597945d7f	perf trace: Add array of chars scnprintf beautifier Needed for sched's traceoints prev/next comm, where, unlike with syscalls, we are not dealing with an integer or pointer, but an array straight out from the ring buffer. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-rlll7tmcqe1g4odtaifil5re@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo	888ca854e2	perf trace: Add the syscall_arg_fmt pointer to syscall_arg So that the scnprintf beautifiers can access it, as will be the case with the char array one in the following csets, that needs to know the number of elements in an array. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-01qmjqv6cb1nj1qy4khdexce@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo	3e0c9b2cfa	perf trace: Move some scnprintf methods from syscall to syscall_arg_fmt Since all they operate on is on a syscall_arg_fmt instance, so move them to allow use it from the upcoming tracepoint fprintf routine. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-ynttrs1l75f0x9tk67spd7jd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo	947b843cf5	perf trace: Allocate an array of beautifiers for tracepoint args This will work similar to the syscall args, we'll allocate an array of 'struct syscall_arg_fmt' for the tracepoint args and then init them using the same algorithm used for the defaults for syscall args, i.e. using its types and sometimes names as hints to find the right scnprintf routine to beautify them from numbers into strings. Next step is to stop using libtracevent to printf tracepoints, as we'll have more beautifiers than int provides, modulo perhaps some plugins. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-dcl135relxvf6ljisjg13aqg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo	8d1d4ff5e2	perf trace: Factor out the initialization of syscal_arg_fmt->scnprintf We set the default scnprint routines for the syscall args based on its type or on heuristics based on its names, now we'll use this for tracepoints as well, so move it out of syscall__set_arg_fmts() and into a routine that receive just an array of syscall_arg_fmt entries + the tracepoint format fields list. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-xs3x0zzyes06c7scdsjn01ty@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-07 12:22:18 -03:00
Arnaldo Carvalho de Melo	8bd436b006	perf trace augmented_syscalls: Do not show syscalls when none was asked for When not using augmented syscalls, i.e. not passing thru the command line a eBPF source or object file event that provides the __augmented_syscalls__ BPF_MAP_TYPE_PERF_EVENT_ARRAY, etc, as with: perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c or passing that augmented eBPF source/object via the trace.add_events in .perfconfig file, we were assuming that syscalls were asked for, differing from when not using augmented syscalls at all. This is confusing when using .perfconfig to hide the fact we're using the augmenter, i.e. using: # perf trace -e sched:* sleep 1 Will show both the scheduler tracepoints and the syscalls, where what we want is to show just the scheduler tracepoints. To see the scheduler tracepoints and some specific syscall strace-like formatting, one has to use: # perf trace -e sched:,nanosleep sleep 1 Or, if wanting all the syscalls: # perf trace -e sched: --syscalls sleep 1 This way 'perf trace' can be used to trace just a set of tracepoints while allowing for mixing with strace-like when desired, by simply adding to the mix the name of the syscalls to show in addition to the tracepoints. Fix it so that the behaviour using the eBPF based syscall augmenter is the same as when not using one. Testing: Before this patch, with this ~/.perfconfig: # egrep -B1 ^[[:space:]]+add_events ~/.perfconfig [trace] add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o # That points to this pre-compiled eBPF syscall augmenter: # file /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped And when asking for _only_ sched:sched_switch and sched:sched_wakeup we were unconditionally getting all the syscalls formatted strace-like: # perf trace -e sched:switch,sched:wakeup sleep 1 \|& tail 0.633 fstat(3, 0x7fe11d030ac0) = 0 0.635 mmap(NULL, 217750512, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fe10fec5000 0.643 close(3) = 0 0.668 nanosleep(0x7fff649a3a90, NULL) ... 0.672 sched:sched_switch:prev_comm=sleep prev_pid=4417 prev_prio=120 prev_state=S ==> next_comm=swapper/6 next_pid=0 next_prio=120 1000.822 sched:sched_wakeup:comm=sleep pid=4417 prio=120 target_cpu=006 0.668 ... [continued]: nanosleep()) = 0 1000.923 close(1) = 0 1000.941 close(2) = 0 1000.974 exit_group(0) = ? # After the patch: # perf trace -e sched:switch,sched:wakeup sleep 1 0.000 sched:sched_wakeup:comm=perf pid=5529 prio=120 target_cpu=005 1.186 sched:sched_switch:prev_comm=sleep prev_pid=5529 prev_prio=120 prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120 1001.573 sched:sched_wakeup:comm=sleep pid=5529 prio=120 target_cpu=005 # If we add the "open" syscalls to the mix then the eBPF augmented _will_ be used and these syscalls will be traced together with the specified sched tracepoints: # cd /sys/kernel/debug/tracing/events/syscalls/ # ls -1d sys_enter_open sys_enter_open sys_enter_openat sys_enter_open_by_handle_at sys_enter_open_tree # # perf trace -e open,sched:switch,sched:*wakeup sleep 1 0.000 sched:sched_wakeup:comm=perf pid=5580 prio=120 target_cpu=005 0.590 openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY\|O_CLOEXEC) = 3 0.616 openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY\|O_CLOEXEC) = 3 0.846 openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY\|O_CLOEXEC) = 3 0.891 sched:sched_switch:prev_comm=sleep prev_pid=5580 prev_prio=120 prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120 1001.005 sched:sched_wakeup:comm=sleep pid=5580 prio=120 target_cpu=005 # And as we can see, the pathnames were collected via the eBPF augmenters. If we don't specify anything it'll trace all syscalls: # perf trace sleep 1 \|& tail 0.299 brk(0x5597543a3000) = 0x5597543a3000 0.302 brk(NULL) = 0x5597543a3000 0.307 openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY\|O_CLOEXEC) = 3 0.313 fstat(3, 0x7feece50cac0) = 0 0.315 mmap(NULL, 217750512, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7feec13a1000 0.323 close(3) = 0 0.354 nanosleep(0x7ffe338856e0, NULL) = 0 1000.641 close(1) = 0 1000.655 close(2) = 0 1000.673 exit_group(0) = ? # Ditto if we don't use .perfconfig's trace.add_events but instead pass just the augmenter as a command line event: # vim ~/.perfconfig # egrep -B1 ^[[:space:]]+add_events ~/.perfconfig # perf trace -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o sleep 1 \|& tail 0.294 brk(0x55ae08ec3000) = 0x55ae08ec3000 0.297 brk(NULL) = 0x55ae08ec3000 0.302 openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY\|O_CLOEXEC) = 3 0.309 fstat(3, 0x7f726488fac0) = 0 0.311 mmap(NULL, 217750512, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7257724000 0.319 close(3) = 0 0.347 nanosleep(0x7ffe23643a70, NULL) = 0 1000.560 close(1) = 0 1000.575 close(2) = 0 1000.593 exit_group(0) = ? # As well as that + some syscall names for strace-like formatting: # perf trace -e socket,connect,/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o ssh localhost 0.000 socket(PF_LOCAL, SOCK_STREAM\|CLOEXEC\|NONBLOCK, 0) = 3 0.021 connect(3, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) 0.034 socket(PF_LOCAL, SOCK_STREAM\|CLOEXEC\|NONBLOCK, 0) = 3 0.041 connect(3, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) 0.163 socket(PF_LOCAL, SOCK_STREAM, 0) = 4 0.185 connect(4, { .family: PF_LOCAL, path: /var/lib/sss/pipes/nss }, 110) = 0 0.670 socket(PF_LOCAL, SOCK_STREAM\|CLOEXEC\|NONBLOCK, 0) = 7 0.684 connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) 0.694 socket(PF_LOCAL, SOCK_STREAM\|CLOEXEC\|NONBLOCK, 0) = 7 0.701 connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) 0.994 socket(PF_LOCAL, SOCK_STREAM\|CLOEXEC\|NONBLOCK, 0) = 5 1.006 connect(5, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) 1.014 socket(PF_LOCAL, SOCK_STREAM\|CLOEXEC\|NONBLOCK, 0) = 5 1.022 connect(5, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) 1.068 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 5 1.087 connect(5, { .family: PF_INET, port: 22, addr: 127.0.0.1 }, 16) = 0 24.299 socket(PF_LOCAL, SOCK_STREAM, 0) = 6 24.337 connect(6, { .family: PF_LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, 110) = 0 28.441 socket(PF_LOCAL, SOCK_STREAM, 0) = 6 28.516 connect(6, { .family: PF_LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, 110) = 0 root@localhost's password:^C # Everything works without augmenters: # egrep -B1 ^[[:space:]]+add_events ~/.perfconfig # perf trace sleep 1 \|& tail 0.261 brk(0x5635068ac000) = 0x5635068ac000 0.264 brk(NULL) = 0x5635068ac000 0.268 openat(AT_FDCWD, 0xdce642a0, O_RDONLY\|O_CLOEXEC) = 3 0.275 fstat(3, 0x7f3fdce97ac0) = 0 0.277 mmap(NULL, 217750512, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f3fcfd2c000 0.284 close(3) = 0 0.310 nanosleep(0x7ffdaea6ecd0, NULL) = 0 1000.552 close(1) = 0 1000.565 close(2) = 0 1000.580 exit_group(0) = ? # # perf trace -e connect ssh localhost 0.000 connect(3, 0x58266930, 110) = -1 ENOENT (No such file or directory) 0.022 connect(3, 0x58266af0, 110) = -1 ENOENT (No such file or directory) 0.150 connect(4, 0x58266b00, 110) = 0 0.490 connect(7, 0x58264150, 110) = -1 ENOENT (No such file or directory) 0.505 connect(7, 0x58264300, 110) = -1 ENOENT (No such file or directory) 0.832 connect(5, 0x58266220, 110) = -1 ENOENT (No such file or directory) 0.847 connect(5, 0x582663e0, 110) = -1 ENOENT (No such file or directory) 0.899 connect(5, 0x95ba0630, 16) = 0 25.619 connect(6, 0x58266360, 110) = 0 40.564 connect(6, 0x58266330, 110) = 0 root@localhost's password: ^C # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-624f6jxic04031tnt40va4dd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo	7e035929f3	perf trace: Postpone parsing .perfconfig trace.add_events to after --verbose is processed When we add events via the '[trace]' section in perfconfig the command line options are not yet processed, so when something goes wrong with parsing those events and using --verbose is advised, we end up not getting any more verbosity by doing so. So just copy the trace.add_events string for later processing, after we processed --verbose and the other command line options. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-d6wbnz85ftqljdll6ynjyjd8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo	bcddbfc5c8	perf trace: Generalize the syscall_fmt find routines To allow them to be used with other stuff, such as tracepoints. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-od3gzg77ppqgnnrxqv40fvgx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo	9b2036cd32	perf trace: Separate 'struct syscall_fmt' definition from syscall_fmts variable As this has all the things needed to format tracepoints events, not just syscalls, that, after all, are just tracepoints with a set in stone ABI, i.e. order and number of parameters. For tracepoints we'll create a static struct syscall_fmt tracepoint_fmts[] array and will fill the ->arg[] entries with the beautifier for each positional argument and record the name, then, when we need it, we'll just check that the position has the same name, maybe even type, so that we can do some check that the tracepoint hasn't changed, if it has, we can even reorder things. Keep calling it syscall_fmt but use it as well for tracepoints, do it this way to minimize changes and reuse what is in place for syscalls, we'll see. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-2x1jgiev13zt4njaanlnne0d@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo	206d635aa5	perf trace: Make evlist__set_evsel_handler() affect just entries without a handler Renaming it to evlist__set_default_evsel_handler(), to better reflect what we want to do, which is to set a default handler for events we still haven't set a custom handler, like the ones for "msr:write_msr", etc that are coming soon. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-e1bit7upnpmtsayh8039kfuw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-10-07 12:22:17 -03:00
Arnaldo Carvalho de Melo	ca1252779f	perf evsel: Introduce evsel_fprintf.h We already had evsel_fprintf.c, add its counterpart, so that we can reduce evsel.h a bit more. We needed a new perf_event_attr_fprintf.c file so as to have a separate object to link with the python binding in tools/perf/util/python-ext-sources and not drag symbol_conf, etc into the python binding. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-06bdmt1062d9unzgqmxwlv88@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-09-25 16:26:34 -03:00
Arnaldo Carvalho de Melo	9620bc361a	perf evsel: Remove need for symbol_conf in evsel_fprintf.c So that we an later link it to the python binding without having to drag the symbol object files. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-8823tveyasocnuoelq4qopwf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-09-25 15:06:59 -03:00
Jiri Olsa	80ab2987a0	libperf: Add perf_evlist__poll() function Move perf_evlist__poll() from tools/perf to libperf, it will be used in the following patches. And rename the existing perf's function to evlist__poll(). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-39-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-09-25 09:51:49 -03:00
Jiri Olsa	f4009e7bf7	libperf: Add perf_evlist__add_pollfd() function Move perf_evlist__add_pollfd() from tools/perf to libperf, it will be used in the following patches. Also rename perf's perf_evlist__add_pollfd()/perf_evlist__filter_pollfd() to evlist__add_pollfd()/evlist__filter_pollfd(). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-38-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-09-25 09:51:49 -03:00
Jiri Olsa	515dbe48f6	libperf: Add perf_evlist__first()/last() functions Add perf_evlist__first()/last() functions to libperf, as internal functions and rename perf's origins to evlist__first/last. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-29-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-09-25 09:51:48 -03:00
Jiri Olsa	c976ee11a0	libperf: Move 'nr_mmaps' from 'struct evlist' to 'struct perf_evlist' Moving 'nr_mmaps' from 'struct evlist' to 'struct perf_evlist', it will be used in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-21-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-09-25 09:51:47 -03:00
Arnaldo Carvalho de Melo	e0fcfb086f	perf evlist: Adopt backwards ring buffer state enum As this isn't used at all in mmap.h but in evlist.h, so to cut down the header dependency tree, move it to where it is used. Also add mmap.h to the places using it but previously getting it indirectly via evlist.h. Add missing pthread.h to evlist.h, as it has a pthread_t struct member and was getting the header via mmap.h. Noticed while processing a Jiri's libperf batch touching mmap.h, where almost everything gets rebuilt because evlist.h is so popular, so cut down't this rebuild the world party. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lkml.kernel.org/n/tip-he0uljeftl0xfveh3d6vtode@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-09-25 09:51:45 -03:00
Jiri Olsa	9521b5f2d9	perf tools: Rename perf_evlist__mmap() to evlist__mmap() Rename perf_evlist__mmap() to evlist__mmap(), so we don't have a name clash when we add perf_evlist__mmap() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-09-25 09:51:44 -03:00
Jiri Olsa	a583053299	perf tools: Rename 'struct perf_mmap' to 'struct mmap' Rename 'struct perf_evlist' to 'struct evlist', so we don't have a name clash when we add 'struct perf_mmap' to libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-09-25 09:51:44 -03:00
Mamatha Inamdar	6ef81c55a2	perf session: Return error code for perf_session__new() function on failure This patch is to return error code of perf_new_session function on failure instead of NULL. Test Results: Before Fix: $ perf c2c report -input failed to open nput: No such file or directory $ echo $? 0 $ After Fix: $ perf c2c report -input failed to open nput: No such file or directory $ echo $? 254 $ Committer notes: Fix 'perf tests topology' case, where we use that TEST_ASSERT_VAL(..., session), i.e. we need to pass zero in case of failure, which was the case before when NULL was returned by perf_session__new() for failure, but now we need to negate the result of IS_ERR(session) to respect that TEST_ASSERT_VAL) expectation of zero meaning failure. Reported-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com> Signed-off-by: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com> Acked-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shawn Landden <shawn@git.icu> Cc: Song Liu <songliubraving@fb.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com> Link: http://lore.kernel.org/lkml/20190822071223.17892.45782.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-09-20 15:58:11 -03:00
Arnaldo Carvalho de Melo	ea49e01cfa	perf tools: Move event synthesizing routines to separate header Those are the only routines using the perf_event__handler_t typedef and are all related, so move to a separate header to reduce the header dependency tree, lots of places were getting event.h and even stdio.h, limits.h indirectly, so fix those as well. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-yvx9u1mf7baq6cu1abfhbqgs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-09-20 09:19:22 -03:00
Arnaldo Carvalho de Melo	4a3cec8494	perf dsos: Move the dsos struct and its methods to separate source files So that we can reduce the header dependency tree further, in the process noticed that lots of places were getting even things like build-id routines and 'struct perf_tool' definition indirectly, so fix all those too. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-ti0btma9ow5ndrytyoqdk62j@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-08-31 22:24:10 -03:00
Arnaldo Carvalho de Melo	8520a98dba	perf debug: Remove needless include directives from debug.h All we need there is a forward declaration for 'union perf_event', so remove it from there and add missing header directives in places using things from this indirect include. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-7ftk0ztstqub1tirjj8o8xbl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-08-31 19:10:19 -03:00
Arnaldo Carvalho de Melo	c1a604dff4	perf tools: Remove needless perf.h include directive from headers Its not needed there, add it to the places that need it and were getting it via those headers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-5yulx1u16vyd0zmrbg1tjhju@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-08-29 17:38:32 -03:00
Arnaldo Carvalho de Melo	2da39f1cc3	perf evlist: Remove needless util.h from evlist.h There is no need for that util/util.h include there and, remove it, pruning the include tree, fix the fallout by adding necessary headers to places that were getting needed includes indirectly from evlist.h -> util.h. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-s9f7uve8wvykr5itcm7m7d8q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-08-28 17:19:35 -03:00
Igor Lubashev	d06e5fad8c	perf tools: Warn that perf_event_paranoid can restrict kernel symbols Warn that /proc/sys/kernel/perf_event_paranoid can also restrict kernel symbols. Signed-off-by: Igor Lubashev <ilubashe@akamai.com> Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: James Morris <jmorris@namei.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1566869956-7154-6-git-send-email-ilubashe@akamai.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-08-28 17:19:28 -03:00
Arnaldo Carvalho de Melo	aeb00b1aea	perf record: Move record_opts and other record decls out of perf.h And into a separate util/record.h, to better isolate things and make sure that those who use record_opts and the other moved declarations are explicitly including the necessary header. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-31q8mei1qkh74qvkl9nwidfq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-08-26 11:58:22 -03:00
Jiri Olsa	a2f354e3ab	libperf: Add perf_thread_map__nr/perf_thread_map__pid functions So it's part of libperf library as basic functions operating on perf_thread_map objects. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190822111141.25823-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-08-22 17:16:57 -03:00
Arnaldo Carvalho de Melo	22ac4318ad	perf trace: Add --switch-on/--switch-off events Just like with 'perf script': # perf trace -e sched:,syscalls:sleep* sleep 1 0.000 :28345/28345 sched:sched_waking:comm=perf pid=28346 prio=120 target_cpu=005 0.005 :28345/28345 sched:sched_wakeup:perf:28346 [120] success=1 CPU:005 0.383 sleep/28346 sched:sched_process_exec:filename=/usr/bin/sleep pid=28346 old_pid=28346 0.613 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=607375 [ns] vruntime=23289041218 [ns] 0.689 sleep/28346 syscalls:sys_enter_nanosleep:rqtp: 0x7ffc491789b0 0.693 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=72021 [ns] vruntime=23289113239 [ns] 0.694 sleep/28346 sched:sched_switch:sleep:28346 [120] S ==> swapper/5:0 [120] 1000.787 :0/0 sched:sched_waking:comm=sleep pid=28346 prio=120 target_cpu=005 1000.824 :0/0 sched:sched_wakeup:sleep:28346 [120] success=1 CPU:005 1000.908 sleep/28346 syscalls:sys_exit_nanosleep:0x0 1001.218 sleep/28346 sched:sched_process_exit:comm=sleep pid=28346 prio=120 # perf trace -e sched:,syscalls:sleep* --switch-on=syscalls:sys_enter_nanosleep sleep 1 0.000 sleep/28349 sched:sched_stat_runtime:comm=sleep pid=28349 runtime=603036 [ns] vruntime=23873537697 [ns] 0.001 sleep/28349 sched:sched_switch:sleep:28349 [120] S ==> swapper/4:0 [120] 1000.392 :0/0 sched:sched_waking:comm=sleep pid=28349 prio=120 target_cpu=004 1000.443 :0/0 sched:sched_wakeup:sleep:28349 [120] success=1 CPU:004 1000.540 sleep/28349 syscalls:sys_exit_nanosleep:0x0 1000.852 sleep/28349 sched:sched_process_exit:comm=sleep pid=28349 prio=120 # perf trace -e sched:,syscalls:sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep sleep 1 0.000 sleep/28352 sched:sched_stat_runtime:comm=sleep pid=28352 runtime=610543 [ns] vruntime=24811686681 [ns] 0.001 sleep/28352 sched:sched_switch:sleep:28352 [120] S ==> swapper/0:0 [120] 1000.397 :0/0 sched:sched_waking:comm=sleep pid=28352 prio=120 target_cpu=000 1000.440 :0/0 sched:sched_wakeup:sleep:28352 [120] success=1 CPU:000 # # perf trace -e sched:,syscalls:sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep --show-on-off sleep 1 0.000 sleep/28367 syscalls:sys_enter_nanosleep:rqtp: 0x7fffd1a25fc0 0.004 sleep/28367 sched:sched_stat_runtime:comm=sleep pid=28367 runtime=628760 [ns] vruntime=22170052672 [ns] 0.005 sleep/28367 sched:sched_switch:sleep:28367 [120] S ==> swapper/2:0 [120] 1000.367 :0/0 sched:sched_waking:comm=sleep pid=28367 prio=120 target_cpu=002 1000.412 :0/0 sched:sched_wakeup:sleep:28367 [120] success=1 CPU:002 1000.512 sleep/28367 syscalls:sys_exit_nanosleep:0x0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-t3ngpt1brcc1fm9gep9gxm4q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-08-15 12:26:21 -03:00
Leo Yan	3e70008a60	perf trace: Fix segmentation fault when access syscall info on arm64 'perf trace' reports the segmentation fault as below on Arm64: # perf trace -e string -e augmented_raw_syscalls.c LLVM: dumping tools/perf/examples/bpf/augmented_raw_syscalls.o perf: Segmentation fault Obtained 12 stack frames. perf(sighandler_dump_stack+0x47) [0xaaaaac96ac87] linux-vdso.so.1(+0x5b7) [0xffffadbeb5b7] /lib/aarch64-linux-gnu/libc.so.6(strlen+0x10) [0xfffface7d5d0] /lib/aarch64-linux-gnu/libc.so.6(_IO_vfprintf+0x1ac7) [0xfffface49f97] /lib/aarch64-linux-gnu/libc.so.6(__vsnprintf_chk+0xc7) [0xffffacedfbe7] perf(scnprintf+0x97) [0xaaaaac9ca3ff] perf(+0x997bb) [0xaaaaac8e37bb] perf(cmd_trace+0x28e7) [0xaaaaac8ec09f] perf(+0xd4a13) [0xaaaaac91ea13] perf(main+0x62f) [0xaaaaac8a147f] /lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0xe3) [0xfffface22d23] perf(+0x57723) [0xaaaaac8a1723] Segmentation fault This issue is introduced by commit `30a910d7d3` ("perf trace: Preallocate the syscall table"), it allocates trace->syscalls.table[] array and the element count is 'trace->sctbl->syscalls.nr_entries'; but on Arm64, the system call number is not continuously used; e.g. the syscall maximum id is 436 but the real entries is only 281. So the table is allocated with 'nr_entries' as the element count, but it accesses the table with the syscall id, which might be out of the bound of the array and cause the segmentation fault. This patch allocates trace->syscalls.table[] with the element count is 'trace->sctbl->syscalls.max_id + 1', this allows any id to access the table without out of the bound. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Fixes: `30a910d7d3` ("perf trace: Preallocate the syscall table") Link: http://lkml.kernel.org/r/20190809104752.27338-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-08-12 16:26:02 -03:00
Jiri Olsa	88761fa1f1	libperf: Adopt simplified perf_evsel__close() function from tools/perf Add perf_evsel__close() function to libperf while keeping a tools/perf specific evsel__close() to free ids. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-64-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:46 -03:00
Jiri Olsa	03617c22e3	libperf: Add threads to struct perf_evlist Move threads from tools/perf's evlist to libperf's perf_evlist struct. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-56-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:45 -03:00
Jiri Olsa	1fc632cef4	libperf: Move perf_event_attr field from perf's evsel to libperf's perf_evsel Move the perf_event_attr struct fron 'struct evsel' to 'struct perf_evsel'. Committer notes: Fixed up these: tools/perf/arch/arm/util/auxtrace.c tools/perf/arch/arm/util/cs-etm.c tools/perf/arch/arm64/util/arm-spe.c tools/perf/arch/s390/util/auxtrace.c tools/perf/util/cs-etm.c Also cc1: warnings being treated as errors tests/sample-parsing.c: In function 'do_test': tests/sample-parsing.c:162: error: missing initializer tests/sample-parsing.c:162: error: (near initialization for 'evsel.core.cpus') struct evsel evsel = { .needs_swap = false, - .core.attr = { - .sample_type = sample_type, - .read_format = read_format, + .core = { + . attr = { + .sample_type = sample_type, + .read_format = read_format, + }, [perfbuilder@a70e4eeb5549 /]$ gcc --version \|& head -1 gcc (GCC) 4.4.7 Also we don't need to include perf_event.h in tools/perf/lib/include/perf/evsel.h, forward declaring 'struct perf_event_attr' is enough. And this even fixes the build in some systems where things are used somewhere down the include path from perf_event.h without defining __always_inline. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-43-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:45 -03:00
Jiri Olsa	6484d2f9dc	libperf: Add nr_entries to struct perf_evlist Move nr_entries count from 'struct perf' to into perf_evlist struct. Committer notes: Fix tools/perf/arch/s390/util/auxtrace.c case. And also the comment in tools/perf/util/annotate.h. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-42-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:45 -03:00
Jiri Olsa	ce9036a6e3	libperf: Include perf_evlist in evlist object Include perf_evlist in the evlist object, will continue to move other generic things into libperf's perf_evlist. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-37-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:44 -03:00
Jiri Olsa	b27c4ece72	libperf: Include perf_evsel in evsel object Including perf_evsel in evsel object, will continue to move other generic things into libperf's perf_evsel struct. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-36-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:44 -03:00
Jiri Olsa	e74676deba	perf evlist: Rename perf_evlist__disable() to evlist__disable() Rename perf_evlist__disable() to evlist__disable(), so we don't have a name clash when we add perf_evlist__disable() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-23-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:43 -03:00
Jiri Olsa	1c87f1654c	perf evlist: Rename perf_evlist__enable() to evlist__enable() Rename perf_evlist__enable() to evlist__enable(), so we don't have a name clash when we add perf_evlist__enable() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-22-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:43 -03:00
Jiri Olsa	474ddc4c46	perf evlist: Rename perf_evlist__open() to evlist__open() Rename perf_evlist__open() to evlist__open(), so we don't have a name clash when we add perf_evlist__open() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-20-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:43 -03:00
Jiri Olsa	9a10bb2289	perf evsel: Rename perf_evsel__disable() to evsel__disable() Renaming perf_evsel__disable() to evsel__disable(), so we don't have a name clash when we add perf_evsel__disable() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-17-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:43 -03:00
Jiri Olsa	a1cf3a75d3	perf evlist: Rename perf_evlist__add() to evlist__add() Rename perf_evlist__add() to evlist__add(), so we don't have a name clash when we add perf_evlist__add() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-13-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:43 -03:00
Jiri Olsa	365c3ae745	perf evsel: Rename perf_evsel__new() to evsel__new() Rename perf_evsel__new() to evsel__new(), so we don't have a name clash when we add perf_evsel__new() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-12-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:43 -03:00
Jiri Olsa	5eb2dd2ade	perf evsel: Rename perf_evsel__delete() to evsel__delete() Remame perf_evsel__delete() to evsel__delete(), so we don't have a name clash when we add perf_evsel__delete() in libperf. Also renaming perf_evsel__delete_priv() to evsel__delete_priv(). Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-11-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:43 -03:00
Jiri Olsa	c12995a554	perf evlist: Rename perf_evlist__delete() to evlist__delete() Rename perf_evlist__delete() to evlist__delete(), so we don't have a name clash when we add perf_evlist__delete() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:43 -03:00
Jiri Olsa	0f98b11c61	perf evlist: Rename perf_evlist__new() to evlist__new() Rename perf_evlist__new() to evlist__new(), so we don't have a name clash when we add perf_evlist__new() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:43 -03:00
Jiri Olsa	63503dba87	perf evlist: Rename struct perf_evlist to struct evlist Rename struct perf_evlist to struct evlist, so we don't have a name clash when we add struct perf_evlist in libperf. Committer notes: Added fixes to build on arm64, from Jiri and from me (tools/perf/util/cs-etm.c) Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Jiri Olsa	32dcd021d0	perf evsel: Rename struct perf_evsel to struct evsel Rename struct perf_evsel to struct evsel, so we don't have a name clash when we add struct perf_evsel in libperf. Committer notes: Added fixes for arm64, provided by Jiri. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	e4b00e930b	perf trace: Add "sendfile64" alias to the "sendfile" syscall We were looking in tracefs for: /sys/kernel/debug/tracing/events/syscalls/sys_enter_sendfile/format when what is there is just /sys/kernel/debug/tracing/events/syscalls/sys_enter_sendfile/format Its the same id, 40 in x86_64, so just add an alias and let the existing logic take care of that. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-km2hmg7hru6u4pawi5fi903q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	ad4153f964	perf trace: Reuse BPF augmenters from syscalls with similar args signature We have an augmenter for the "open" syscall, which has just one pointer, in the first argument, a "const char *", so any other syscall that has just one pointer and that is the first can reuse the "open" BPF augmenter program. Even more, syscalls that get two pointers with the first being a string can reuse "open"'s BPF augmenter till we have an augmenter that better matches that syscall with two pointers. With this the few augmenters we have, for open (first arg is a string), openat (2nd arg is a string), renameat (2nd and 4th are strings) can be reused by a lot of syscalls, ditto for "bind" reusing "connect" because both have the 2nd argument as a sockaddr and the 3rd as its len. Lets see how this makes the "bind" syscall reuse the "connect" BPF prog augmenter found in tools/perf/examples/bpf/augmented_raw_syscalls.c: # perf trace -e bind,connect systemctl restart sshd connect(3, { .family: PF_LOCAL, path: /run/systemd/private }, 23) = 0 # Oh, it just connects to some daemon, so we better do it system wide and then stop/start sshd: # perf trace -e bind,connect systemctl/10124 connect(3, { .family: PF_LOCAL, path: /run/systemd/private }, 23) = 0 sshd/10102 connect(7, { .family: PF_LOCAL, path: /dev/log }, 110) = 0 systemctl/10126 connect(3, { .family: PF_LOCAL, path: /run/systemd/private }, 23) = 0 systemd/10128 ... [continued]: connect()) = 0 (sshd)/10128 connect(3, { .family: PF_LOCAL, path: /run/systemd/journal/stdout }, 30) ... sshd/10128 bind(3, { .family: PF_NETLINK }, 12) = 0 sshd/10128 connect(4, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) sshd/10128 connect(3, { .family: PF_INET6, port: 22, addr: :: }, 28) = 0 sshd/10128 connect(3, { .family: PF_UNSPEC }, 16) = 0 sshd/10128 connect(3, { .family: PF_INET, port: 22, addr: 0.0.0.0 }, 16) = 0 sshd/10128 connect(3, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) sshd/10128 connect(3, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) sshd/10128 connect(5, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) sshd/10128 connect(5, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) sshd/10128 bind(4, { .family: PF_INET, port: 22, addr: 0.0.0.0 }, 16) = 0 sshd/10128 connect(6, { .family: PF_LOCAL, path: /dev/log }, 110) = 0 sshd/10128 bind(6, { .family: PF_INET6, port: 22, addr: :: }, 28) = 0 sshd/10128 connect(7, { .family: PF_LOCAL, path: /dev/log }, 110) = 0 ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-zfley2ghs4nim1uq4nu6ed3l@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	30a910d7d3	perf trace: Preallocate the syscall table We'll continue reading its details from tracefs as we need it, but preallocate the whole thing otherwise we may realloc and end up with pointers to the previous buffer. I.e. in an upcoming algorithm we'll look for syscalls that have function signatures that are similar to a given syscall to see if we can reuse its BPF augmenter, so we may be at syscall 42, having a 'struct syscall' pointing to that slot in trace->syscalls.table[] and try to read the slot for an yet unread syscall, which would realloc that table to read the info for syscall 43, say, which would trigger a realoc of trace->syscalls.table[], and then the pointer we had for syscall 42 would be pointing to the previous block of memory. b00m. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-m3cjzzifibs13imafhkk77a0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	b8b1033fca	perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages There are holes in syscall tables with IDs not associated with any syscall, mark those when trying to read information for syscalls, which could happen when iterating thru all syscalls from 0 to the highest numbered syscall id. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-cku9mpcrcsqaiq0jepu86r68@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	5d2bd88975	perf trace: Forward error codes when trying to read syscall info We iterate thru the syscall table produced from the kernel syscall tables reading info, propagate the error and add to the debug message. This helps in fixing further bugs, such as failing to read the "sendfile" syscall info when it really should try the aliasm "sendfile64". Problems reading syscall 40: 2 (No such file or directory)(sendfile) information # grep sendfile /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c [40] = "sendfile", # I.e. in the tracefs format file for the syscall tracepoints we have it as sendfile64: # find /sys -type f -name format \| grep sendfile /sys/kernel/debug/tracing/events/syscalls/sys_enter_sendfile64/format /sys/kernel/debug/tracing/events/syscalls/sys_exit_sendfile64/format # But as "sendfile" in the file used to build the syscall table used in perf: $ grep sendfile arch/x86/entry/syscalls/syscall_64.tbl 40 common sendfile __x64_sys_sendfile64 $ So we need to add, in followup patches, aliases in 'perf trace' syscall data structures to cope with thie. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-w3eluap63x9je0bb8o3t79tz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	247dd65b90	perf trace beauty: Beautify bind's sockaddr arg By reusing the "connect" BPF collector. Testing it system wide and stopping/starting sshd: # perf trace -e bind LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o DNS Res~er #18/15132 bind(243, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #19/4833 bind(247, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #19/4833 bind(238, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #18/15132 bind(243, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #18/10327 bind(258, { .family: PF_NETLINK }, 12) = 0 :6507/6507 bind(24, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #19/4833 bind(238, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #18/15132 bind(242, { .family: PF_NETLINK }, 12) = 0 sshd/6514 bind(3, { .family: PF_NETLINK }, 12) = 0 sshd/6514 bind(5, { .family: PF_INET, port: 22, addr: 0.0.0.0 }, 16) = 0 sshd/6514 bind(7, { .family: PF_INET6, port: 22, addr: :: }, 28) = 0 DNS Res~er #18/10327 bind(229, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #18/15132 bind(231, { .family: PF_NETLINK }, 12) = 0 DNS Res~er #19/4833 bind(229, { .family: PF_NETLINK }, 12) = 0 ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-m2hmxqrckxxw2ciki0tu889u@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:42 -03:00
Arnaldo Carvalho de Melo	ef969ca64d	perf trace beauty: Do not try to use the fd->pathname beautifier for bind/connect fd arg Doesn't make sense and also we now beautify the sockaddr, which provides enough info: # trace -e close,socket,connec* ssh www.bla.com <SNIP> close(5) = 0 socket(PF_INET, SOCK_DGRAM\|CLOEXEC\|NONBLOCK, IPPROTO_IP) = 5 connect(5, { .family: PF_INET, port: 53, addr: 192.168.44.1 }, 16) = 0 close(5) = 0 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 5 ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-h9drpb7ail808d2mh4n7tla4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	79d725cdf2	perf trace beauty: Disable fd->pathname when close() not enabled As we invalidate the fd->pathname table in the SCA_CLOSE_FD beautifier, if we don't have it we may end up keeping an fd->pathname association that then gets misprinted. The previous behaviour continues when the close() syscall is enabled, which may still be a a problem if we lose records (i.e. we may lose a 'close' record and then get that fd reused by socket()) but then the tool will notify that records are being lost and the user will be warned that some of the heuristics will fall apart. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-b7t6h8sq9lebemvfy2zh3qq1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	1d86275225	perf trace beauty: Make connect's addrlen be printed as an int, not hex # perf trace -e connec* ssh www.bla.com connect(3</var/lib/sss/mc/passwd>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) connect(3</var/lib/sss/mc/passwd>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) connect(4<socket:[16610959]>, { .family: PF_LOCAL, path: /var/lib/sss/pipes/nss }, 110) = 0 connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) connect(7, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) connect(5, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) connect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) connect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_INET, port: 53, addr: 192.168.44.1 }, 16) = 0 connect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_INET, port: 22, addr: 146.112.61.108 }, 16) = 0 connect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_INET6, port: 22, addr: ::ffff:146.112.61.108 }, 28) = 0 ^Cconnect(5</usr/lib64/libnss_mdns4_minimal.so.2>, { .family: PF_INET, port: 22, addr: 146.112.61.108 }, 16) = -1 (unknown) (INTERNAL ERROR: strerror_r(512, [buf], 128)=22) # Argh, the SCA_FD needs to invalidate its cache when close is done... It works if the 'close' syscall is not filtered out ;-\ # perf trace -e close,connec* ssh www.bla.com close(3) = 0 close(3</usr/lib64/libpcre2-8.so.0.8.0>) = 0 close(3) = 0 close(3</usr/lib64/libkrb5.so.3.3>) = 0 close(3</usr/lib64/libkrb5.so.3.3>) = 0 close(3) = 0 close(3</usr/lib64/libk5crypto.so.3.1>) = 0 close(3</usr/lib64/libk5crypto.so.3.1>) = 0 close(3</usr/lib64/libcom_err.so.2.1>) = 0 close(3</usr/lib64/libcom_err.so.2.1>) = 0 close(3) = 0 close(3</usr/lib64/libkrb5support.so.0.1>) = 0 close(3</usr/lib64/libkrb5support.so.0.1>) = 0 close(3</usr/lib64/libkeyutils.so.1.8>) = 0 close(3</usr/lib64/libkeyutils.so.1.8>) = 0 close(3) = 0 close(3) = 0 close(3) = 0 close(3) = 0 close(4) = 0 close(3) = 0 close(3) = 0 connect(3</etc/nsswitch.conf>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) close(3</etc/nsswitch.conf>) = 0 connect(3</usr/lib64/libnss_sss.so.2>, { .family: PF_LOCAL, path: /var/run/nscd/socket }, 110) = -1 ENOENT (No such file or directory) close(3</usr/lib64/libnss_sss.so.2>) = 0 close(3</usr/lib64/libnss_sss.so.2>) = 0 close(3) = 0 close(3) = 0 connect(4<socket:[16616519]>, { .family: PF_LOCAL, path: /var/lib/sss/pipes/nss }, 110) = 0 ^C # Will disable this beautifier when 'close' is filtered out... Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-ekuiciyx4znchvy95c8p1yyi@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	8b8044e5c9	perf trace: Look for default name for entries in the syscalls prog array I.e. just look for "!syscalls:sys_enter_" or "exit_" plus the syscall name, that way we need just to add entries to the augmented_raw_syscalls.c BPF source to add handlers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-6xavwddruokp6ohs7tf4qilb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	8d5da2649d	perf augmented_raw_syscalls: Support copying two string syscall args Starting with the renameat and renameat2 syscall, that both receive as second and fourth parameters a pathname: # perf trace -e rename* mv one ANOTHER LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o mv: cannot stat 'one': No such file or directory renameat2(AT_FDCWD, "one", AT_FDCWD, "ANOTHER", RENAME_NOREPLACE) = -1 ENOENT (No such file or directory) # Since the per CPU scratch buffer map has space for two maximum sized pathnames, the verifier is satisfied that there will be no overrun. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-x2uboyg5kx2wqeru288209b6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	236dd58388	perf augmented_raw_syscalls: Add handler for "openat" I.e. for a syscall that has its second argument being a string, its difficult these days to find 'open' being used in the wild :-) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-yf3kbzirqrukd3fb2sp5qx4p@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	b119970aa5	perf trace: Handle raw_syscalls:sys_enter just like the BPF_OUTPUT augmented event So, we use a PERF_COUNT_SW_BPF_OUTPUT to output the augmented sys_enter payload, i.e. to output more than just the raw syscall args, and if something goes wrong when handling an unfiltered syscall, we bail out and just return 1 in the bpf program associated with raw_syscalls:sys_enter, meaning, don't filter that tracepoint, in which case what will appear in the perf ring buffer isn't the BPF_OUTPUT event, but the original raw_syscalls:sys_enter event with its normal payload. Now that we're switching to using a bpf_tail_call + BPF_MAP_TYPE_PROG_ARRAY we're going to use this in the common case, so a bug where raw_syscalls:sys_enter wasn't being handled by trace__sys_enter() surfaced and for that case, instead of using the strace-like augmenter (trace__sys_enter()), we continued to use the normal generic tracepoint handler: (gdb) p evsel $2 = (struct perf_evsel ) 0xc03e40 (gdb) p evsel->name $3 = 0xbc56c0 "raw_syscalls:sys_enter" (gdb) p ((struct perf_evsel ) 0xc03e40)->name $4 = 0xbc56c0 "raw_syscalls:sys_enter" (gdb) p ((struct perf_evsel ) 0xc03e40)->handler $5 = (void ) 0x495eb3 <trace__event_handler> This resulted in this: 0.027 raw_syscalls:sys_enter:NR 12 (0, 7fcfcac64c9b, 4d, 7fcfcac64c9b, 7fcfcac6ce00, 19) ... [continued]: brk()) = 0x563b88677000 I.e. only the sys_exit tracepoint was being properly handled, but since the sys_enter went to the generic trace__event_handler() we printed it using libtraceevent's formatter instead of 'perf trace's strace-like one. Fix it by setting trace__sys_enter() as the handler for raw_syscalls:sys_enter and setup the tp_field tracepoint field accessors. Now, to test it we just make raw_syscalls:sys_enter return 1 right after checking if the pid is filtered, making it not use bpf_perf_output_event() but rather ask for the tracepoint not to be filtered and the result is the expected one: brk(NULL) = 0x556f42d6e000 I.e. raw_syscalls:sys_enter returns 1, gets handled by trace__sys_enter() and gets it combined with the raw_syscalls:sys_exit in a strace-like way. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-0mkocgk31nmy0odknegcby4z@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	3803a22931	perf trace: Put the per-syscall entry/exit prog_array BPF map infrastructure in place I.e. look for "syscalls_sys_enter" and "syscalls_sys_exit" BPF maps of type PROG_ARRAY and populate it with the handlers as specified per syscall, for now only 'open' is wiring it to something, in time all syscalls that need to copy arguments entering a syscall or returning from one will set these to the right handlers, reusing when possible pre-existing ones. Next step is to use bpf_tail_call() into that. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-t0p4u43i9vbpzs1xtowna3gb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	6ff8fff456	perf trace: Allow specifying the bpf prog to augment specific syscalls This is a step in the direction of being able to use a BPF_MAP_TYPE_PROG_ARRAY to handle syscalls that need to copy pointer payloads in addition to the raw tracepoint syscall args. There is a first example in tools/perf/examples/bpf/augmented_raw_syscalls.c for the 'open' syscall. Next step is to introduce the prog array map and use this 'open' augmenter, then use that augmenter in other syscalls that also only copy the first arg as a string, and then show how to use with a syscall that reads more than one filename, like 'rename', etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-pys4v57x5qqrybb4cery2mc8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	5834da7f10	perf trace: Add BPF handler for unaugmented syscalls Will be used to assign to syscalls that don't need augmentation, i.e. those with just integer args. All syscalls will be in a BPF_MAP_TYPE_PROG_ARRAY, and the bpf_tail_call() keyed by the syscall id will either find nothing in place, which means the syscall is being filtered, or a function that will either add things like filenames to the ring buffer, right after the raw syscall args, or be this unaugmented handler that will just return 1, meaning don't filter the original raw_syscalls:sys_{enter,exit} tracepoint. For now it is not really being used, this is just leg work to break the patch into smaller pieces. It introduces a trace__find_bpf_program_by_title() helper that in turn uses libbpf's bpf_object__find_program_by_title() on the BPF object with the __augmented_syscalls__ map. "title" is how libbpf calls the SEC() argument for functions, i.e. the ELF section that follows a convention to specify what BPF program (a function with this SEC() marking) should be connected to which tracepoint, kprobes, etc. In perf anything that is of the form SEC("sys:event_name") will be connected to that tracepoint by perf's BPF loader. In this case its something that will be bpf_tail_call()ed from either the "raw_syscalls:sys_enter" or "raw_syscall:sys_exit" tracepoints, so its named "!raw_syscalls:unaugmented" to convey that idea, i.e. its not going to be directly attached to a tracepoint, thus it starts with a "!". Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-meucpjx2u0slpkayx56lxqq6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00
Arnaldo Carvalho de Melo	83e69b92b1	perf trace: Order -e syscalls table The ev_qualifier is an array with the syscall ids passed via -e on the command line, sort it as we'll search it when setting up the BPF_MAP_TYPE_PROG_ARRAY. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-c8hprylp3ai6e0z9burn2r3s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-07-29 18:34:41 -03:00

1 2 3 4 5 ...

718 Commits