723257 Commits

Author SHA1 Message Date
Jin Yao
74cd5815d9 perf tool: Improve bash command line auto-complete for multiple events with comma
perf has perf-completion.sh to define command line auto-completion in
bash/zsh.

For record/stat -e it works for single events, but isn't working when
specifying multiple events with comma.

It would be very useful if it could be fixed to make it easier by
supporting multiple events, comma separated.

With this patch, the result can be like this:

1. Support the events returned from 'perf list --raw-dump'

root@skl:/tmp# perf stat -e cpu/cache<TAB>
cpu/cache-misses/      cpu/cache-references/

root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/branch-<TAB>
cpu/branch-instructions/  cpu/branch-misses/

root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/branch-i<TAB>
root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/branch-instructions/

2. Support the events listed in /sys/bus/event_source/devices/cpu/events

root@skl:/tmp# perf stat -e cycle<TAB>
cycle_activity.cycles_l1d_miss  cycle_activity.stalls_l3_miss
cycle_activity.cycles_l2_miss   cycle_activity.stalls_mem_any
cycle_activity.cycles_l3_miss   cycle_activity.stalls_total
cycle_activity.cycles_mem_any   cycles-ct
cycle_activity.stalls_l1d_miss  cycles-t
cycle_activity.stalls_l2_miss

root@skl:/tmp# perf stat -e cycles-<TAB>
cycles-ct  cycles-t

root@skl:/tmp# perf stat -e cycles-t,cpu/c<TAB>
cpu/cache-misses/      cpu/cpu-cycles/        cpu/cycles-t/
cpu/cache-references/  cpu/cycles-ct/

root@skl:/tmp# perf stat -e cycles-t,cpu/cache-<TAB>
cpu/cache-misses/      cpu/cache-references/

root@skl:/tmp# perf stat -e cycles-t,cpu/cache-misses/

3. Support the uppercase event which is with prefix "cpu/"

root@skl:/tmp# perf stat -e cpu/c<TAB>
cpu/cache-misses/      cpu/cpu-cycles/        cpu/cycles-t/
cpu/cache-references/  cpu/cycles-ct/

root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/C<TAB>
cpu/CACHE-MISSES/      cpu/CPU-CYCLES/        cpu/CYCLES-T/
cpu/CACHE-REFERENCES/  cpu/CYCLES-CT/

root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/CACHE-REFERENCES/

Note that:

a) This patch only supports bash.

b) It doesn't support the cases like {},{} or {...,...}.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1513848370-8098-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:59 -03:00
Kim Phillips
f1031c8d33 perf probe arm64: Fix symbol fixup issues due to ELF type
On an arm64 machine running a CONFIG_RANDOMIZE_BASE=y kernel, perf
kernel symbol resolution fails.  Debugging saw symsrc_init calling the
default elf__needs_adjust_symbols() where checks for an ET_DYN (3)
ehdr.e_type failed when they should have succeeded.

Fix by adopting powerpc version of the weak elf__needs_adjust_symbols()
function, as done in commit d2332098331f ("perf probe ppc: Fix symbol
fixup issues due to ELF type").

Prior to this patch, perf test 1 would fail:

  $ sudo oldperf test -v 1 |& head
   1: vmlinux symtab matches kallsyms                       :
  test child forked, pid 33374
  Looking at the vmlinux_path (8 entries long)
  Using /usr/lib/debug/boot/vmlinux for symbols
  ERR : 0xfffe0000100f1000: do_undefinstr not on kallsyms
  ERR : 0xfffe0000100f1320: do_sysinstr not on kallsyms
  ERR : 0xfffe0000100f13b0: do_debug_exception not on kallsyms
  ERR : 0xfffe0000100f1498: do_mem_abort not on kallsyms
  ERR : 0xfffe0000100f1580: do_sp_pc_abort not on kallsyms
  ...

After applying this patch, perf test 1 now succeeds:

  $ sudo ./newperf test -v 1 |& head
   1: vmlinux symtab matches kallsyms                       :
  test child forked, pid 33378
  Looking at the vmlinux_path (8 entries long)
  Using /usr/lib/debug/boot/vmlinux for symbols
  WARN: 0xffff000008081000: diff name v: do_undefinstr k: __exception_text_start
  WARN: 0xffff0000080819e8: diff name v: __irqentry_text_end k: __softirqentry_text_start
  WARN: 0xffff000008081d08: diff name v: __entry_text_start k: __softirqentry_text_end
  WARN: 0xffff00000809db5c: diff name v: flush_icache_range k: __flush_cache_user_range
  WARN: 0xffff000008101908: diff name v: sys_ni_syscall k: sys_vm86old
  ...

Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20171214175242.e30450f17f93ad675d968fa3@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:58 -03:00
Mengting Zhang
ca8000684e perf evsel: Enable ignore_missing_thread for pid option
While monitoring a multithread process with pid option, perf sometimes
may return sys_perf_event_open failure with 3(No such process) if any of
the process's threads die before we open the event. However, we want
perf continue monitoring the remaining threads and do not exit with
error.

Here, the patch enables perf_evsel::ignore_missing_thread for -p option
to ignore complete failure if any of threads die before we open the event.
But it may still return sys_perf_event_open failure with 22(Invalid) if we
monitors several event groups.

        sys_perf_event_open: pid 28960  cpu 40  group_fd 118202  flags 0x8
        sys_perf_event_open: pid 28961  cpu 40  group_fd 118203  flags 0x8
        WARNING: Ignored open failure for pid 28962
        sys_perf_event_open: pid 28962  cpu 40  group_fd [118203]  flags 0x8
        sys_perf_event_open failed, error -22

That is because when we ignore a missing thread, we change the thread_idx
without dealing with its fds, FD(evsel, cpu, thread). Then get_group_fd()
may return a wrong group_fd for the next thread and sys_perf_event_open()
return with 22.

        sys_perf_event_open(){
           ...
           if (group_fd != -1)
               perf_fget_light()//to get corresponding group_leader by group_fd
           ...
           if (group_leader)
              if (group_leader->ctx->task != ctx->task)//should on the same task
                   goto err_context
           ...
        }

This patch also fixes this bug by introducing perf_evsel__remove_fd() and
update_fds to allow removing fds for the missing thread.

Changes since v1:
- Change group_fd__remove() into a more genetic way without changing code logic
- Remove redundant condition

Changes since v2:
- Use a proper function name and add some comment.
- Multiline comment style fixes.

Committer testing:

Before this patch the recently added 'perf stat --per-thread' for system
wide counting would race while enumerating all threads using /proc:

  [root@jouet ~]# perf stat --per-thread
  failed to parse CPUs map: No such file or directory

   Usage: perf stat [<options>] [<command>]

      -C, --cpu <cpu>       list of cpus to monitor in system-wide
      -a, --all-cpus        system-wide collection from all CPUs
  [root@jouet ~]# perf stat --per-thread
  failed to parse CPUs map: No such file or directory

   Usage: perf stat [<options>] [<command>]

      -C, --cpu <cpu>       list of cpus to monitor in system-wide
      -a, --all-cpus        system-wide collection from all CPUs
  [root@jouet ~]#

When, say, the kernel was being built, so lots of shortlived threads,
after this patch this doesn't happen.

Signed-off-by: Mengting Zhang <zhangmengting@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Cheng Jian <cj.chengjian@huawei.com>
Cc: Li Bin <huawei.libin@huawei.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1513148513-6974-1-git-send-email-zhangmengting@huawei.com
[ Remove one use 'evlist' alias variable ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:58 -03:00
Hendrik Brueckner
a9a3f1d18a perf s390: Always build with -fPIC
On s390, object files must be compiled with position-indepedent code in
order to be incrementally linked or linked to shared libraries.

Therefore, add -fPIC to the CFLAGS for s390 to ensure each object file
is built properly.

Reported-by: Jonathan Hermann <jonathan.hermann@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux s390 list <linux-s390@vger.kernel.org>
Link: https://lkml.kernel.org/r/20171207080951.GC4889@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:57 -03:00
Arnaldo Carvalho de Melo
922991c2b1 Revert "perf s390: Always build with -fPIC"
This one made x86 always build with -fPIC, when the intention was for
s390 to be built that way, due to a rebase mistake.

Reported-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
This reverts commit 1dc4ddf112a408e607a073d951b962b6c6e2bd6c.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:57 -03:00
Michael Petlan
69b5c95340 perf test shell: Fix check open filename arg using 'perf trace'
Commit f231af789b11 ("perf test shell: Fix check open filename arg using
'perf trace' on s390x") added an exception for s390x to use openat()
instead of open() in the test that intercepts a open syscall to look for
the filename argument as obtained by the vfs_getname 'perf probe' it
puts in place at the getname_flags kernel function.

Its not just s390x that uses openat() instead of open(), so use 'perf
list' to look for the syscall:sys_enter_open(at)? present in the system
being tested instead of checking if the system is s390x.

In fact Namhyung pointed out that glibc 2.26 changed this behaviour, as
described in https://lwn.net/Articles/738694/, so systems where glibc is
>= 2.26 will need this patch for this test to work, which already took
place in some distros for architectures such as s390x, while Fedora 26
x86_64 is at glibc 2.25, i.e. still uses open().

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Link: https://lkml.kernel.org/r/ab23fe42-1080-a46b-503e-744e097f414f@linux.vnet.ibm.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
LPU-Reference: 1275675985.12835754.1513095723265.JavaMail.zimbra@redhat.com
Link: https://lkml.kernel.org/n/tip-j2wbz9av1rw3thr3t0g4dtuk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:56 -03:00
Jiri Olsa
f9d8adb345 perf evsel: Fix swap for samples with raw data
When we detect a different endianity we swap event before processing.
It's tricky for samples because we have no idea what's inside. We treat
it as an array of u64s, swap them and later on we swap back parts which
are different.

We mangle this way also the tracepoint raw data, which ends up in report
showing wrong data:

  1.95%  comm=Q^B pid=29285 prio=16777216 target_cpu=000
  1.67%  comm=l^B pid=0 prio=16777216 target_cpu=000

Luckily the traceevent library handles the endianity by itself (thank
you Steven!), so we can pass the RAW data directly in the other
endianity.

  2.51%  comm=beah-rhts-task pid=1175 prio=120 target_cpu=002
  2.23%  comm=kworker/0:0 pid=11566 prio=120 target_cpu=000

The fix is basically to swap back the raw data if different endianity is
detected.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20171129184346.3656-1-jolsa@kernel.org
[ Add util/memswap.c to python-ext-sources to link missing mem_bswap_64() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:56 -03:00
Masami Hiramatsu
c588d15812 perf probe: Support escaped character in parser
Support the special characters escaped by '\' in parser.  This allows
user to specify versions directly like below.

  =====
  # ./perf probe -x /lib64/libc-2.25.so malloc_get_state\\@GLIBC_2.2.5
  Added new event:
    probe_libc:malloc_get_state (on malloc_get_state@GLIBC_2.2.5 in /usr/lib64/libc-2.25.so)

  You can now use it in all perf tools, such as:

	  perf record -e probe_libc:malloc_get_state -aR sleep 1

  =====

Or, you can use separators in source filename, e.g.

  =====
  # ./perf probe -x /opt/test/a.out foo+bar.c:3
  Semantic error :There is non-digit character in offset.
    Error: Command Parse Error.
  =====

Usually "+" in source file cause parser error, but

  =====
  # ./perf probe -x /opt/test/a.out foo\\+bar.c:4
  Added new event:
    probe_a:main         (on @foo+bar.c:4 in /opt/test/a.out)

  You can now use it in all perf tools, such as:

	  perf record -e probe_a:main -aR sleep 1
  =====

escaped "\+" allows you to specify that.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151309111236.18107.5634753157435343410.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:55 -03:00
Masami Hiramatsu
1e9f9e8af0 perf string: Add {strdup,strpbrk}_esc()
To support the special characters escaped by '\' in 'perf probe' event parser.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151275052163.24652.18205979384585484358.stgit@devbox
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:55 -03:00
Masami Hiramatsu
4b3a2716dd perf probe: Find versioned symbols from map
Commit d80406453ad4 ("perf symbols: Allow user probes on versioned
symbols") allows user to find default versioned symbols (with "@@") in
map. However, it did not enable normal versioned symbol (with "@") for
perf-probe.  E.g.

  =====
  # ./perf probe -x /lib64/libc-2.25.so malloc_get_state
  Failed to find symbol malloc_get_state in /usr/lib64/libc-2.25.so
    Error: Failed to add events.
  =====

This solves above issue by improving perf-probe symbol search function,
as below.

  =====
  # ./perf probe -x /lib64/libc-2.25.so malloc_get_state
  Added new event:
    probe_libc:malloc_get_state (on malloc_get_state in /usr/lib64/libc-2.25.so)

  You can now use it in all perf tools, such as:

	  perf record -e probe_libc:malloc_get_state -aR sleep 1

  # ./perf probe -l
    probe_libc:malloc_get_state (on malloc_get_state@GLIBC_2.2.5 in /usr/lib64/libc-2.25.so)
  =====

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151275049269.24652.1639103455496216255.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:54 -03:00
Masami Hiramatsu
e63c625a1e perf probe: Add __return suffix for return events
Add __return suffix for function return events automatically. Without
this, user have to give --force option and will see the number suffix
for each event like "function_1", which is not easy to recognize.
Instead, this adds __return suffix to it automatically.  E.g.

  =====
  # ./perf probe -x /lib64/libc-2.25.so 'malloc*%return'
  Added new events:
    probe_libc:malloc_printerr__return (on malloc*%return in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_consolidate__return (on malloc*%return in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_check__return (on malloc*%return in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_hook_ini__return (on malloc*%return in /usr/lib64/libc-2.25.so)
    probe_libc:malloc__return (on malloc*%return in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_trim__return (on malloc*%return in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_usable_size__return (on malloc*%return in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_stats__return (on malloc*%return in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_info__return (on malloc*%return in /usr/lib64/libc-2.25.so)
    probe_libc:mallochook__return (on malloc*%return in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_get_state__return (on malloc*%return in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_set_state__return (on malloc*%return in /usr/lib64/libc-2.25.so)

  You can now use it in all perf tools, such as:

	  perf record -e probe_libc:malloc_set_state__return -aR sleep 1

  =====

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151275046418.24652.6696011972866498489.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:54 -03:00
Masami Hiramatsu
a3110cd9d0 perf probe: Cut off the version suffix from event name
Cut off the version suffix (e.g. @GLIBC_2.2.5 etc.) from automatic
generated event name. This fixes wildcard event adding like below case;

  =====
  # perf probe -x /lib64/libc-2.25.so malloc*
  Internal error: "malloc_get_state@GLIBC_2" is wrong event name.
    Error: Failed to add events.
  =====

This failure was caused by a versioned suffix symbol.

With this fix, perf probe automatically cuts the suffix after @ as
below.

  =====
  # ./perf probe -x /lib64/libc-2.25.so malloc*
  Added new events:
    probe_libc:malloc_printerr (on malloc* in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_consolidate (on malloc* in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_check (on malloc* in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_hook_ini (on malloc* in /usr/lib64/libc-2.25.so)
    probe_libc:malloc    (on malloc* in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_trim (on malloc* in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_usable_size (on malloc* in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_stats (on malloc* in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_info (on malloc* in /usr/lib64/libc-2.25.so)
    probe_libc:mallochook (on malloc* in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_get_state (on malloc* in /usr/lib64/libc-2.25.so)
    probe_libc:malloc_set_state (on malloc* in /usr/lib64/libc-2.25.so)

  You can now use it in all perf tools, such as:

	  perf record -e probe_libc:malloc_set_state -aR sleep 1

  =====

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: bhargavb <bhargavaramudu@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/None
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:53 -03:00
Masami Hiramatsu
9f5c6d8777 perf probe: Add warning message if there is unexpected event name
This improve the error message so that user can know event-name error
before writing new events to kprobe-events interface.

E.g.
   ======
   #./perf probe -x /lib64/libc-2.25.so malloc_get_state*
   Internal error: "malloc_get_state@GLIBC_2" is an invalid event name.
     Error: Failed to add events.
   ======

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151275040665.24652.5188568529237584489.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:53 -03:00
Arnaldo Carvalho de Melo
4e8fbc1c97 perf env: Adopt perf_env__arch() from the annotate code
And use it in the libunwind case, with both passing a valid perf_env to
extract the arch to be normalized from and passing NULL with the same
semantic as in the annotate code: to get it from uname() uts.machine.

Now the code to generate per arch errno translation tables (int/string)
can use it to decode perf.data files recorded in a different arch than
that where 'perf trace' (or any other analysis tool) runs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-p2epffgash69w38kvj3ntpc9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:52 -03:00
Arnaldo Carvalho de Melo
3285debaf5 perf annotate: Use perf_env when obtaining the arch name
Paving the way to reuse these routines in other areas, like when
generating errno tables.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-rh1qv051vb8gfdcswskrn53h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:51 -03:00
Arnaldo Carvalho de Melo
5449f13c55 perf annotate: Get the cpuid from evsel->evlist->env in symbol__annotate()
To reduce its function signature, since we get this from 'evsel' which
is already one of its arguments.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-070eap7t6uicg9c3w086xy2z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:51 -03:00
Hendrik Brueckner
901bb0280b perf trace: Use generated syscall table on s390 too
This should speed up accessing new system calls introduced with the
kernel rather than waiting for libaudit updates to include them.

It also enables users to specify wildcards, for example, perf trace -e
'open*', just like was already possible on x86.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1512635281-20733-2-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-htplh3nbrivi7g3cffbh4fsu@git.kernel.org
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:50 -03:00
Hendrik Brueckner
164a747f1a perf s390: Generate system call table from asm/unistd.h
This should speed up accessing new system calls introduced with
the kernel rather than waiting for libaudit updates to include
them.

Committer testing:

  $ rm -rf /tmp/build/perf
  $ mkdir /tmp/build/perf
  $ make srctree=/home/acme/git/perf -C tools/perf/arch/s390 OUTPUT=/tmp/build/perf/ archheaders
  make: Entering directory '/home/acme/git/perf/tools/perf/arch/s390'
  /bin/sh '/home/acme/git/perf/tools/perf/arch/s390/entry/syscalls//mksyscalltbl' 'cc' /home/acme/git/perf/tools/arch/s390/include/uapi/asm/unistd.h > /tmp/build/perf/arch/s390/include/generated/asm/syscalls_64.c
  make: Leaving directory '/home/acme/git/perf/tools/perf/arch/s390'
  $ head -5 /tmp/build/perf/arch/s390/include/generated/asm/syscalls_64.c
  static const char *syscalltbl_s390_64[] = {
	[1] = "exit",
	[2] = "fork",
	[3] = "read",
	[4] = "write",
  $ tail -5 /tmp/build/perf/arch/s390/include/generated/asm/syscalls_64.c
	[378] = "s390_guarded_storage",
	[379] = "statx",
	[380] = "s390_sthyi",
  };
  #define SYSCALLTBL_S390_64_MAX_ID 380
  $

Now to plug this into 'perf trace' proper.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1512635281-20733-2-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-h5km60rdg3rqxvsys85q50l3@git.kernel.org
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:50 -03:00
Hendrik Brueckner
7af7919f0f tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h
Will be used for generating the syscall id/string translation table.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1512635281-20733-2-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-vjfbfvgjrnqnbdluqd7leo98@git.kernel.org
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:49 -03:00
Pravin Shedge
3315d14f8e perf perf: Remove duplicate includes
These duplicate includes have been found with scripts/checkincludes.pl
but they have been removed manually to avoid removing false positives.

Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1512582204-6493-1-git-send-email-pravin.shedge4linux@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:49 -03:00
Jiri Olsa
378811ac30 perf test: Handle properly readdir DT_UNKNOWN
Some system can return DT_UNKNOWN in readdir's struct dirent::d_type and
we must handle it properly. In this case we can directly check if the
entity we found is directory and skip it.

Reported-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171206174535.25380-1-jolsa@kernel.org
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:48 -03:00
Jiri Olsa
06c3f2aa9f perf utils: Move is_directory() to path.h
So that it can be used more widely, like in the next patch, when it will
be used to fix a bug in 'perf test' handling of dirent.d_type ==
DT_UNKNOWN.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171206174535.25380-1-jolsa@kernel.org
[ Split from a larger patch, removed needless includes in path.h ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:48 -03:00
Jin Yao
29734550c9 perf stat: Resort '--per-thread' result
There are many threads reported if we enable '--per-thread'
globally.

1. Most of the threads are not counted or counting value 0.
This patch removes these threads.

2. We also resort the threads in display according to the
counting value. It's useful for user to see the hottest
threads easily.

For example, the new results would be:

root@skl:/tmp# perf stat --per-thread
^C
 Performance counter stats for 'system wide':

            perf-24165              4.302433      cpu-clock (msec)          #    0.001 CPUs utilized
          vmstat-23127              1.562215      cpu-clock (msec)          #    0.000 CPUs utilized
      irqbalance-2780               0.827851      cpu-clock (msec)          #    0.000 CPUs utilized
            sshd-23111              0.278308      cpu-clock (msec)          #    0.000 CPUs utilized
        thermald-2841               0.230880      cpu-clock (msec)          #    0.000 CPUs utilized
            sshd-23058              0.207306      cpu-clock (msec)          #    0.000 CPUs utilized
     kworker/0:2-19991              0.133983      cpu-clock (msec)          #    0.000 CPUs utilized
   kworker/u16:1-18249              0.125636      cpu-clock (msec)          #    0.000 CPUs utilized
       rcu_sched-8                  0.085533      cpu-clock (msec)          #    0.000 CPUs utilized
   kworker/u16:2-23146              0.077139      cpu-clock (msec)          #    0.000 CPUs utilized
           gmain-2700               0.041789      cpu-clock (msec)          #    0.000 CPUs utilized
     kworker/4:1-15354              0.028370      cpu-clock (msec)          #    0.000 CPUs utilized
     kworker/6:0-17528              0.023895      cpu-clock (msec)          #    0.000 CPUs utilized
    kworker/4:1H-1887               0.013209      cpu-clock (msec)          #    0.000 CPUs utilized
     kworker/5:2-31362              0.011627      cpu-clock (msec)          #    0.000 CPUs utilized
      watchdog/0-11                 0.010892      cpu-clock (msec)          #    0.000 CPUs utilized
     kworker/3:2-12870              0.010220      cpu-clock (msec)          #    0.000 CPUs utilized
     ksoftirqd/0-7                  0.008869      cpu-clock (msec)          #    0.000 CPUs utilized
      watchdog/1-14                 0.008476      cpu-clock (msec)          #    0.000 CPUs utilized
      watchdog/7-50                 0.002944      cpu-clock (msec)          #    0.000 CPUs utilized
      watchdog/3-26                 0.002893      cpu-clock (msec)          #    0.000 CPUs utilized
      watchdog/4-32                 0.002759      cpu-clock (msec)          #    0.000 CPUs utilized
      watchdog/2-20                 0.002429      cpu-clock (msec)          #    0.000 CPUs utilized
      watchdog/6-44                 0.001491      cpu-clock (msec)          #    0.000 CPUs utilized
      watchdog/5-38                 0.001477      cpu-clock (msec)          #    0.000 CPUs utilized
       rcu_sched-8                        10      context-switches          #    0.117 M/sec
   kworker/u16:1-18249                     7      context-switches          #    0.056 M/sec
            sshd-23111                     4      context-switches          #    0.014 M/sec
          vmstat-23127                     4      context-switches          #    0.003 M/sec
            perf-24165                     4      context-switches          #    0.930 K/sec
     kworker/0:2-19991                     3      context-switches          #    0.022 M/sec
   kworker/u16:2-23146                     3      context-switches          #    0.039 M/sec
     kworker/4:1-15354                     2      context-switches          #    0.070 M/sec
     kworker/6:0-17528                     2      context-switches          #    0.084 M/sec
            sshd-23058                     2      context-switches          #    0.010 M/sec
     ksoftirqd/0-7                         1      context-switches          #    0.113 M/sec
      watchdog/0-11                        1      context-switches          #    0.092 M/sec
      watchdog/1-14                        1      context-switches          #    0.118 M/sec
      watchdog/2-20                        1      context-switches          #    0.412 M/sec
      watchdog/3-26                        1      context-switches          #    0.346 M/sec
      watchdog/4-32                        1      context-switches          #    0.362 M/sec
      watchdog/5-38                        1      context-switches          #    0.677 M/sec
      watchdog/6-44                        1      context-switches          #    0.671 M/sec
      watchdog/7-50                        1      context-switches          #    0.340 M/sec
    kworker/4:1H-1887                      1      context-switches          #    0.076 M/sec
        thermald-2841                      1      context-switches          #    0.004 M/sec
           gmain-2700                      1      context-switches          #    0.024 M/sec
      irqbalance-2780                      1      context-switches          #    0.001 M/sec
     kworker/3:2-12870                     1      context-switches          #    0.098 M/sec
     kworker/5:2-31362                     1      context-switches          #    0.086 M/sec
   kworker/u16:1-18249                     2      cpu-migrations            #    0.016 M/sec
   kworker/u16:2-23146                     2      cpu-migrations            #    0.026 M/sec
       rcu_sched-8                         1      cpu-migrations            #    0.012 M/sec
            sshd-23058                     1      cpu-migrations            #    0.005 M/sec
            perf-24165             8,833,385      cycles                    #    2.053 GHz
          vmstat-23127             1,702,699      cycles                    #    1.090 GHz
      irqbalance-2780                739,847      cycles                    #    0.894 GHz
            sshd-23111               269,506      cycles                    #    0.968 GHz
        thermald-2841                204,556      cycles                    #    0.886 GHz
            sshd-23058               158,780      cycles                    #    0.766 GHz
     kworker/0:2-19991               112,981      cycles                    #    0.843 GHz
   kworker/u16:1-18249               100,926      cycles                    #    0.803 GHz
       rcu_sched-8                    74,024      cycles                    #    0.865 GHz
   kworker/u16:2-23146                55,984      cycles                    #    0.726 GHz
           gmain-2700                 34,278      cycles                    #    0.820 GHz
     kworker/4:1-15354                20,665      cycles                    #    0.728 GHz
     kworker/6:0-17528                16,445      cycles                    #    0.688 GHz
     kworker/5:2-31362                 9,492      cycles                    #    0.816 GHz
      watchdog/3-26                    8,695      cycles                    #    3.006 GHz
    kworker/4:1H-1887                  8,238      cycles                    #    0.624 GHz
      watchdog/4-32                    7,580      cycles                    #    2.747 GHz
     kworker/3:2-12870                 7,306      cycles                    #    0.715 GHz
      watchdog/2-20                    7,274      cycles                    #    2.995 GHz
      watchdog/0-11                    6,988      cycles                    #    0.642 GHz
     ksoftirqd/0-7                     6,376      cycles                    #    0.719 GHz
      watchdog/1-14                    5,340      cycles                    #    0.630 GHz
      watchdog/5-38                    4,061      cycles                    #    2.749 GHz
      watchdog/6-44                    3,976      cycles                    #    2.667 GHz
      watchdog/7-50                    3,418      cycles                    #    1.161 GHz
          vmstat-23127             2,511,699      instructions              #    1.48  insn per cycle
            perf-24165             1,829,908      instructions              #    0.21  insn per cycle
      irqbalance-2780              1,190,204      instructions              #    1.61  insn per cycle
        thermald-2841                143,544      instructions              #    0.70  insn per cycle
            sshd-23111               128,138      instructions              #    0.48  insn per cycle
            sshd-23058                57,654      instructions              #    0.36  insn per cycle
       rcu_sched-8                    44,063      instructions              #    0.60  insn per cycle
   kworker/u16:1-18249                42,551      instructions              #    0.42  insn per cycle
     kworker/0:2-19991                25,873      instructions              #    0.23  insn per cycle
   kworker/u16:2-23146                21,407      instructions              #    0.38  insn per cycle
           gmain-2700                 13,691      instructions              #    0.40  insn per cycle
     kworker/4:1-15354                12,964      instructions              #    0.63  insn per cycle
     kworker/6:0-17528                10,034      instructions              #    0.61  insn per cycle
     kworker/5:2-31362                 5,203      instructions              #    0.55  insn per cycle
     kworker/3:2-12870                 4,866      instructions              #    0.67  insn per cycle
    kworker/4:1H-1887                  3,586      instructions              #    0.44  insn per cycle
     ksoftirqd/0-7                     3,463      instructions              #    0.54  insn per cycle
      watchdog/0-11                    3,135      instructions              #    0.45  insn per cycle
      watchdog/1-14                    3,135      instructions              #    0.59  insn per cycle
      watchdog/2-20                    3,135      instructions              #    0.43  insn per cycle
      watchdog/3-26                    3,135      instructions              #    0.36  insn per cycle
      watchdog/4-32                    3,135      instructions              #    0.41  insn per cycle
      watchdog/5-38                    3,135      instructions              #    0.77  insn per cycle
      watchdog/6-44                    3,135      instructions              #    0.79  insn per cycle
      watchdog/7-50                    3,135      instructions              #    0.92  insn per cycle
          vmstat-23127               539,181      branches                  #  345.139 M/sec
            perf-24165               375,364      branches                  #   87.245 M/sec
      irqbalance-2780                262,092      branches                  #  316.593 M/sec
        thermald-2841                 31,611      branches                  #  136.915 M/sec
            sshd-23111                21,874      branches                  #   78.596 M/sec
            sshd-23058                10,682      branches                  #   51.528 M/sec
       rcu_sched-8                     8,693      branches                  #  101.633 M/sec
   kworker/u16:1-18249                 7,891      branches                  #   62.808 M/sec
     kworker/0:2-19991                 5,761      branches                  #   42.998 M/sec
   kworker/u16:2-23146                 4,099      branches                  #   53.138 M/sec
     kworker/4:1-15354                 2,755      branches                  #   97.110 M/sec
           gmain-2700                  2,638      branches                  #   63.127 M/sec
     kworker/6:0-17528                 2,216      branches                  #   92.739 M/sec
     kworker/5:2-31362                 1,132      branches                  #   97.360 M/sec
     kworker/3:2-12870                 1,081      branches                  #  105.773 M/sec
    kworker/4:1H-1887                    725      branches                  #   54.887 M/sec
     ksoftirqd/0-7                       707      branches                  #   79.716 M/sec
      watchdog/0-11                      652      branches                  #   59.860 M/sec
      watchdog/1-14                      652      branches                  #   76.923 M/sec
      watchdog/2-20                      652      branches                  #  268.423 M/sec
      watchdog/3-26                      652      branches                  #  225.372 M/sec
      watchdog/4-32                      652      branches                  #  236.318 M/sec
      watchdog/5-38                      652      branches                  #  441.435 M/sec
      watchdog/6-44                      652      branches                  #  437.290 M/sec
      watchdog/7-50                      652      branches                  #  221.467 M/sec
          vmstat-23127                 8,960      branch-misses             #    1.66% of all branches
      irqbalance-2780                  3,047      branch-misses             #    1.16% of all branches
            perf-24165                 2,876      branch-misses             #    0.77% of all branches
            sshd-23111                 1,843      branch-misses             #    8.43% of all branches
        thermald-2841                  1,444      branch-misses             #    4.57% of all branches
            sshd-23058                 1,379      branch-misses             #   12.91% of all branches
   kworker/u16:1-18249                   982      branch-misses             #   12.44% of all branches
       rcu_sched-8                       893      branch-misses             #   10.27% of all branches
   kworker/u16:2-23146                   578      branch-misses             #   14.10% of all branches
     kworker/0:2-19991                   376      branch-misses             #    6.53% of all branches
           gmain-2700                    280      branch-misses             #   10.61% of all branches
     kworker/6:0-17528                   196      branch-misses             #    8.84% of all branches
     kworker/4:1-15354                   187      branch-misses             #    6.79% of all branches
     kworker/5:2-31362                   123      branch-misses             #   10.87% of all branches
      watchdog/0-11                       95      branch-misses             #   14.57% of all branches
      watchdog/4-32                       89      branch-misses             #   13.65% of all branches
     kworker/3:2-12870                    80      branch-misses             #    7.40% of all branches
      watchdog/3-26                       61      branch-misses             #    9.36% of all branches
    kworker/4:1H-1887                     60      branch-misses             #    8.28% of all branches
      watchdog/2-20                       52      branch-misses             #    7.98% of all branches
     ksoftirqd/0-7                        47      branch-misses             #    6.65% of all branches
      watchdog/1-14                       46      branch-misses             #    7.06% of all branches
      watchdog/7-50                       13      branch-misses             #    1.99% of all branches
      watchdog/5-38                        8      branch-misses             #    1.23% of all branches
      watchdog/6-44                        7      branch-misses             #    1.07% of all branches

       3.695150786 seconds time elapsed

root@skl:/tmp# perf stat --per-thread -M IPC,CPI
^C

 Performance counter stats for 'system wide':

          vmstat-23127             2,000,783      inst_retired.any          #      1.5 IPC
        thermald-2841              1,472,670      inst_retired.any          #      1.3 IPC
            sshd-23111               977,374      inst_retired.any          #      1.2 IPC
            perf-24163               483,779      inst_retired.any          #      0.2 IPC
           gmain-2700                341,213      inst_retired.any          #      0.9 IPC
            sshd-23058               148,891      inst_retired.any          #      0.8 IPC
    rtkit-daemon-3288                 71,210      inst_retired.any          #      0.7 IPC
   kworker/u16:1-18249                39,562      inst_retired.any          #      0.3 IPC
       rcu_sched-8                    14,474      inst_retired.any          #      0.8 IPC
     kworker/0:2-19991                 7,659      inst_retired.any          #      0.2 IPC
     kworker/4:1-15354                 6,714      inst_retired.any          #      0.8 IPC
    rtkit-daemon-3289                  4,839      inst_retired.any          #      0.3 IPC
     kworker/6:0-17528                 3,321      inst_retired.any          #      0.6 IPC
     kworker/5:2-31362                 3,215      inst_retired.any          #      0.5 IPC
     kworker/7:2-23145                 3,173      inst_retired.any          #      0.7 IPC
    kworker/4:1H-1887                  1,719      inst_retired.any          #      0.3 IPC
      watchdog/0-11                    1,479      inst_retired.any          #      0.3 IPC
      watchdog/1-14                    1,479      inst_retired.any          #      0.3 IPC
      watchdog/2-20                    1,479      inst_retired.any          #      0.4 IPC
      watchdog/3-26                    1,479      inst_retired.any          #      0.4 IPC
      watchdog/4-32                    1,479      inst_retired.any          #      0.3 IPC
      watchdog/5-38                    1,479      inst_retired.any          #      0.3 IPC
      watchdog/6-44                    1,479      inst_retired.any          #      0.7 IPC
      watchdog/7-50                    1,479      inst_retired.any          #      0.7 IPC
   kworker/u16:2-23146                 1,408      inst_retired.any          #      0.5 IPC
            perf-24163             2,249,872      cpu_clk_unhalted.thread
          vmstat-23127             1,352,455      cpu_clk_unhalted.thread
        thermald-2841              1,161,140      cpu_clk_unhalted.thread
            sshd-23111               807,827      cpu_clk_unhalted.thread
           gmain-2700                375,535      cpu_clk_unhalted.thread
            sshd-23058               194,071      cpu_clk_unhalted.thread
   kworker/u16:1-18249               114,306      cpu_clk_unhalted.thread
    rtkit-daemon-3288                103,547      cpu_clk_unhalted.thread
     kworker/0:2-19991                46,550      cpu_clk_unhalted.thread
       rcu_sched-8                    18,855      cpu_clk_unhalted.thread
    rtkit-daemon-3289                 17,549      cpu_clk_unhalted.thread
     kworker/4:1-15354                 8,812      cpu_clk_unhalted.thread
     kworker/5:2-31362                 6,812      cpu_clk_unhalted.thread
    kworker/4:1H-1887                  5,270      cpu_clk_unhalted.thread
     kworker/6:0-17528                 5,111      cpu_clk_unhalted.thread
     kworker/7:2-23145                 4,667      cpu_clk_unhalted.thread
      watchdog/0-11                    4,663      cpu_clk_unhalted.thread
      watchdog/1-14                    4,663      cpu_clk_unhalted.thread
      watchdog/4-32                    4,626      cpu_clk_unhalted.thread
      watchdog/5-38                    4,403      cpu_clk_unhalted.thread
      watchdog/3-26                    3,936      cpu_clk_unhalted.thread
      watchdog/2-20                    3,850      cpu_clk_unhalted.thread
   kworker/u16:2-23146                 2,654      cpu_clk_unhalted.thread
      watchdog/6-44                    2,017      cpu_clk_unhalted.thread
      watchdog/7-50                    2,017      cpu_clk_unhalted.thread
          vmstat-23127             2,000,783      inst_retired.any          #      0.7 CPI
        thermald-2841              1,472,670      inst_retired.any          #      0.8 CPI
            sshd-23111               977,374      inst_retired.any          #      0.8 CPI
            perf-24163               495,037      inst_retired.any          #      4.7 CPI
           gmain-2700                341,213      inst_retired.any          #      1.1 CPI
            sshd-23058               148,891      inst_retired.any          #      1.3 CPI
    rtkit-daemon-3288                 71,210      inst_retired.any          #      1.5 CPI
   kworker/u16:1-18249                39,562      inst_retired.any          #      2.9 CPI
       rcu_sched-8                    14,474      inst_retired.any          #      1.3 CPI
     kworker/0:2-19991                 7,659      inst_retired.any          #      6.1 CPI
     kworker/4:1-15354                 6,714      inst_retired.any          #      1.3 CPI
    rtkit-daemon-3289                  4,839      inst_retired.any          #      3.6 CPI
     kworker/6:0-17528                 3,321      inst_retired.any          #      1.5 CPI
     kworker/5:2-31362                 3,215      inst_retired.any          #      2.1 CPI
     kworker/7:2-23145                 3,173      inst_retired.any          #      1.5 CPI
    kworker/4:1H-1887                  1,719      inst_retired.any          #      3.1 CPI
      watchdog/0-11                    1,479      inst_retired.any          #      3.2 CPI
      watchdog/1-14                    1,479      inst_retired.any          #      3.2 CPI
      watchdog/2-20                    1,479      inst_retired.any          #      2.6 CPI
      watchdog/3-26                    1,479      inst_retired.any          #      2.7 CPI
      watchdog/4-32                    1,479      inst_retired.any          #      3.1 CPI
      watchdog/5-38                    1,479      inst_retired.any          #      3.0 CPI
      watchdog/6-44                    1,479      inst_retired.any          #      1.4 CPI
      watchdog/7-50                    1,479      inst_retired.any          #      1.4 CPI
   kworker/u16:2-23146                 1,408      inst_retired.any          #      1.9 CPI
            perf-24163             2,302,323      cycles
          vmstat-23127             1,352,455      cycles
        thermald-2841              1,161,140      cycles
            sshd-23111               807,827      cycles
           gmain-2700                375,535      cycles
            sshd-23058               194,071      cycles
   kworker/u16:1-18249               114,306      cycles
    rtkit-daemon-3288                103,547      cycles
     kworker/0:2-19991                46,550      cycles
       rcu_sched-8                    18,855      cycles
    rtkit-daemon-3289                 17,549      cycles
     kworker/4:1-15354                 8,812      cycles
     kworker/5:2-31362                 6,812      cycles
    kworker/4:1H-1887                  5,270      cycles
     kworker/6:0-17528                 5,111      cycles
     kworker/7:2-23145                 4,667      cycles
      watchdog/0-11                    4,663      cycles
      watchdog/1-14                    4,663      cycles
      watchdog/4-32                    4,626      cycles
      watchdog/5-38                    4,403      cycles
      watchdog/3-26                    3,936      cycles
      watchdog/2-20                    3,850      cycles
   kworker/u16:2-23146                 2,654      cycles
      watchdog/6-44                    2,017      cycles
      watchdog/7-50                    2,017      cycles

       2.175726600 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:47 -03:00
Jin Yao
1d9f8d1b82 perf stat: Remove --per-thread pid/tid limitation
Currently, if we execute 'perf stat --per-thread' without specifying
pid/tid, perf will return error.

root@skl:/tmp# perf stat --per-thread
The --per-thread option is only available when monitoring via -p -t options.
    -p, --pid <pid>       stat events on existing process id
    -t, --tid <tid>       stat events on existing thread id

This patch removes this limitation. If no pid/tid specified, it returns
all threads (get threads from /proc).

Note that it doesn't support cpu_list yet so if it's a cpu_list case,
then skip.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-11-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:47 -03:00
Jin Yao
73c0ca1eee perf thread_map: Enumerate all threads from /proc
This patch calls thread_map__new_all_cpus() to enumerate all threads
from /proc if per-thread flag is enabled.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-10-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:46 -03:00
Jin Yao
14e72a21c7 perf stat: Update or print per-thread stats
If the stats pointer in stat_config structure is not null, it will
update the per-thread stats or print the per-thread stats on this
buffer.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-9-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:46 -03:00
Jin Yao
56739444d8 perf stat: Allocate shadow stats buffer for threads
After perf_evlist__create_maps() being executed, we can get all threads
from /proc. And via thread_map__nr(), we can also get the number of
threads.

With the number of threads, the patch allocates a buffer which will
record the shadow stats for these threads.

The buffer pointer is saved in stat_config.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-8-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:45 -03:00
Jin Yao
6a1e2c5c26 perf stat: Remove a set of shadow stats static variables
In previous patches, we have reconstructed the code and let it not
access the static variables directly.

This patch removes these static variables.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-7-git-send-email-yao.jin@linux.intel.com
[ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:44 -03:00
Jin Yao
e0128b30db perf stat: Print per-thread shadow stats
The function perf_stat__print_shadow_stats() is called to print the
shadow stats on a set of static variables.

But the static variables are the limitations to support
per-thread shadow stats.

This patch lets the perf_stat__print_shadow_stats() support
to print the shadow stats from a input parameter 'st'.

It will not directly get value from static variable. Instead,
it now uses runtime_stat_avg() and runtime_stat_n() to get and
compute the values.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-6-git-send-email-yao.jin@linux.intel.com
[ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:44 -03:00
Jin Yao
1fcd03946b perf stat: Update per-thread shadow stats
The functions perf_stat__update_shadow_stats() is called to update the
shadow stats on a set of static variables.

But the static variables are the limitations to be extended to support
per-thread shadow stats.

This patch lets the perf_stat__update_shadow_stats() support to update
the shadow stats on a input parameter 'st' and uses
update_runtime_stat() to update the stats. It will not directly update
the static variables as before.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-5-git-send-email-yao.jin@linux.intel.com
[ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:43 -03:00
Jin Yao
8efb2df128 perf stat: Create the runtime_stat init/exit function
It mainly initializes and releases the rblist which is defined in struct
runtime_stat.

For the original rblist 'runtime_saved_values', it's still kept there
for keeping the patch bisectable.

The rblist 'runtime_saved_values' will be removed in later patch at
switching time.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-4-git-send-email-yao.jin@linux.intel.com
[ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:43 -03:00
Jin Yao
49cd456af1 perf stat: Extend rbtree to support per-thread shadow stats
Previously the rbtree was used to link generic metrics.

This patches adds new ctx/type/stat into rbtree keys because we will use
this rbtree to maintain shadow metrics to replace original a couple of
static arrays for supporting per-thread shadow stats.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:42 -03:00
Jin Yao
e5fcc2abc3 perf stat: Define a structure for per-thread shadow stats
Perf has a set of static variables to record the runtime shadow metrics
stats.

While if we want to record the runtime shadow stats for per-thread, it
will be the limitation. This patch creates a structure and the next
patches will use this structure to update the runtime shadow stats for
per-thread.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512482591-4646-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-27 12:15:42 -03:00
Ingo Molnar
faaf95677f Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-18 18:13:00 +01:00
Ingo Molnar
2e36463525 perf/urgent fixes:
- Fix up build in hardened environments, such as fedora 27 (Jiri Olsa)
 
 - Do not include header files from the kernel sources for the s/390 arch,
   fixing the detached tarball building (Arnaldo Carvalho de Melo)
 
 - Allow again using asm.h when building for the 'bpf' clang target,
   guarding x86 specific bits under ifndef __BPF__ (Arnaldo Carvalho de Melo)
 
 - Generate correct debug information for inlined code when generating
   ELF images for JITted java programs (Ben Gainey)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEELb9bqkb7Te0zijNb1lAW81NSqkAFAlo315MACgkQ1lAW81NS
 qkA4xQ//XEk2+WydXH24kDL3GeoJU2stzM1A5v8nprjwE1d8MJbvpVLMwD/eLl8d
 /CxloxznycajG8gZR37+Yqkb0IiMTkb1sQ4WgkY4rwngTjWA9I4MYYE7YgHF6cNc
 bqjzSWpiKxis0qp1GfHNkpLF3T6bxCx/3Y2uIJnKnwfWepQESPmWJaPf3jv2Knv1
 Q8Ppcc6ObO7PO3HBnWAw9pImRniwRPq8uQ05yt8wGHBCai/84cyAYuAJuIR2+Eiy
 FStMge5AZpJvgoA6TvswyPkvvMJ2U7wJz0Ox0nPRN5piMXr9/6FvYcbYMnnZ/V8L
 BdJ09OTXw7JmO42t9hyCRyrD/dCJB8cOmjvaWOpnSvAq5bCGaZsCz0L6JOsrI0xv
 jmotSOy9j+pCgTQCIhPMRP0w8sidCTBomX2q5s5A3ZuhvI3mLn7295b6/XaaFh6A
 FbqH5aqtLhbcDnj2i0900J0PR0RTRuXK7ywGbxzMrAY9zg13+ckEJazBfxgZBdyo
 cWVLXhIRwtgfHkuVH6A4hCkyrSmIug+a/TXOCeu1U5zlP+6HVuERgSF9037qwfiE
 IZabRqdQEAgelrqwo4ldjlW5IDsYczu7OJz4iWGlpF5m9pHGEuuAJ6cbLO8Rt6pl
 VPnmNmDwAaOvaGGA4LrnlHFjBdKSN1DLCBNiEvW7+lsnO9IFumM=
 =IZXG
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo-4.15-20171218' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

- Fix up build in hardened environments, such as fedora 27 (Jiri Olsa)

- Do not include header files from the kernel sources for the s/390 arch,
  fixing the detached tarball building (Arnaldo Carvalho de Melo)

- Allow again using asm.h when building for the 'bpf' clang target,
  guarding x86 specific bits under ifndef __BPF__ (Arnaldo Carvalho de Melo)

- Generate correct debug information for inlined code when generating
  ELF images for JITted java programs (Ben Gainey)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-18 18:10:32 +01:00
Arnaldo Carvalho de Melo
ca26cffa4e x86/asm: Allow again using asm.h when building for the 'bpf' clang target
Up to f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
we were able to use x86 headers to build to the 'bpf' clang target, as
done by the BPF code in tools/perf/.

With that commit, we ended up with following failure for 'perf test LLVM', this
is because "clang ... -target bpf ..." fails since 4.0 does not have bpf inline
asm support and 6.0 does not recognize the register 'esp', fix it by guarding
that part with an #ifndef __BPF__, that is defined by clang when building to
the "bpf" target.

  # perf test -v LLVM
  37: LLVM search and compile                               :
  37.1: Basic BPF llvm compile                              :
  --- start ---
  test child forked, pid 25526
  Kernel build dir is set to /lib/modules/4.14.0+/build
  set env: KBUILD_DIR=/lib/modules/4.14.0+/build
  unset env: KBUILD_OPTS
  include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
  set env: NR_CPUS=4
  set env: LINUX_VERSION_CODE=0x40e00
  set env: CLANG_EXEC=/usr/local/bin/clang
  set env: CLANG_OPTIONS=-xc
  set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
  set env: WORKING_DIR=/lib/modules/4.14.0+/build
  set env: CLANG_SOURCE=-
  llvm compiling command template: echo '/*
   * bpf-script-example.c
   * Test basic LLVM building
   */
  #ifndef LINUX_VERSION_CODE
  # error Need LINUX_VERSION_CODE
  # error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
  #endif
  #define BPF_ANY 0
  #define BPF_MAP_TYPE_ARRAY 2
  #define BPF_FUNC_map_lookup_elem 1
  #define BPF_FUNC_map_update_elem 2

  static void *(*bpf_map_lookup_elem)(void *map, void *key) =
	  (void *) BPF_FUNC_map_lookup_elem;
  static void *(*bpf_map_update_elem)(void *map, void *key, void *value, int flags) =
	  (void *) BPF_FUNC_map_update_elem;

  struct bpf_map_def {
	  unsigned int type;
	  unsigned int key_size;
	  unsigned int value_size;
	  unsigned int max_entries;
  };

  #define SEC(NAME) __attribute__((section(NAME), used))
  struct bpf_map_def SEC("maps") flip_table = {
	  .type = BPF_MAP_TYPE_ARRAY,
	  .key_size = sizeof(int),
	  .value_size = sizeof(int),
	  .max_entries = 1,
  };

  SEC("func=SyS_epoll_wait")
  int bpf_func__SyS_epoll_wait(void *ctx)
  {
	  int ind =0;
	  int *flag = bpf_map_lookup_elem(&flip_table, &ind);
	  int new_flag;
	  if (!flag)
		  return 0;
	  /* flip flag and store back */
	  new_flag = !*flag;
	  bpf_map_update_elem(&flip_table, &ind, &new_flag, BPF_ANY);
	  return new_flag;
  }
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  ' | $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
  test child finished with 0
  ---- end ----
  LLVM search and compile subtest 0: Ok
  37.2: kbuild searching                                    :
  --- start ---
  test child forked, pid 25950
  Kernel build dir is set to /lib/modules/4.14.0+/build
  set env: KBUILD_DIR=/lib/modules/4.14.0+/build
  unset env: KBUILD_OPTS
  include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
  set env: NR_CPUS=4
  set env: LINUX_VERSION_CODE=0x40e00
  set env: CLANG_EXEC=/usr/local/bin/clang
  set env: CLANG_OPTIONS=-xc
  set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
  set env: WORKING_DIR=/lib/modules/4.14.0+/build
  set env: CLANG_SOURCE=-
  llvm compiling command template: echo '/*
   * bpf-script-test-kbuild.c
   * Test include from kernel header
   */
  #ifndef LINUX_VERSION_CODE
  # error Need LINUX_VERSION_CODE
  # error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
  #endif
  #define SEC(NAME) __attribute__((section(NAME), used))

  #include <uapi/linux/fs.h>
  #include <uapi/asm/ptrace.h>

  SEC("func=vfs_llseek")
  int bpf_func__vfs_llseek(void *ctx)
  {
	  return 0;
  }

  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  ' | $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
  In file included from <stdin>:12:
  In file included from /home/acme/git/linux/arch/x86/include/uapi/asm/ptrace.h:5:
  In file included from /home/acme/git/linux/include/linux/compiler.h:242:
  In file included from /home/acme/git/linux/arch/x86/include/asm/barrier.h:5:
  In file included from /home/acme/git/linux/arch/x86/include/asm/alternative.h:10:
  /home/acme/git/linux/arch/x86/include/asm/asm.h:145:50: error: unknown register name 'esp' in asm
  register unsigned long current_stack_pointer asm(_ASM_SP);
                                                   ^
  /home/acme/git/linux/arch/x86/include/asm/asm.h:44:18: note: expanded from macro '_ASM_SP'
  #define _ASM_SP         __ASM_REG(sp)
                          ^
  /home/acme/git/linux/arch/x86/include/asm/asm.h:27:32: note: expanded from macro '__ASM_REG'
  #define __ASM_REG(reg)         __ASM_SEL_RAW(e##reg, r##reg)
                                 ^
  /home/acme/git/linux/arch/x86/include/asm/asm.h:18:29: note: expanded from macro '__ASM_SEL_RAW'
  # define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(a)
                              ^
  /home/acme/git/linux/arch/x86/include/asm/asm.h:11:32: note: expanded from macro '__ASM_FORM_RAW'
  # define __ASM_FORM_RAW(x)     #x
                                 ^
  <scratch space>:4:1: note: expanded from here
  "esp"
  ^
  1 error generated.
  ERROR:	unable to compile -
  Hint:	Check error message shown above.
  Hint:	You can also pre-compile it into .o using:
     		  clang -target bpf -O2 -c -
     	  with proper -I and -D options.
  Failed to compile test case: 'kbuild searching'
  test child finished with -1
  ---- end ----
  LLVM search and compile subtest 1: FAILED!

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/r/20171128175948.GL3298@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-18 11:56:22 -03:00
Arnaldo Carvalho de Melo
10b9baa701 tools arch s390: Do not include header files from the kernel sources
Long ago we decided to be verbotten including files in the kernel git
sources from tools/ living source code, to avoid disturbing kernel
development (and perf's and other tools/) when, say, a kernel hacker
adds something, tests everything but tools/ and have tools/ build
broken.

This got broken recently by s/390, fix it by copying
arch/s390/include/uapi/asm/perf_regs.h to tools/arch/s390/include/uapi/asm/,
making this one be used by means of <asm/perf_regs.h> and updating
tools/perf/check_headers.sh to make sure we are notified when the
original changes, so that we can check if anything is needed on the
tooling side.

This would have been caught by the 'tarkpg' test entry in:

$ make -C tools/perf build-test

When run on a s/390 build system or container.

Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: f704ef44602f ("s390/perf: add support for perf_regs and libdw")
Link: https://lkml.kernel.org/n/tip-n57139ic0v9uffx8wdqi3d8a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-18 11:56:13 -03:00
Ben Gainey
ca58d7e64b perf jvmti: Generate correct debug information for inlined code
tools/perf/jvmti is broken in so far as it generates incorrect debug
information. Specifically it attributes all debug lines to the original
method being output even in the case that some code is being inlined
from elsewhere.  This patch fixes the issue.

To test (from within linux/tools/perf):

export JDIR=/usr/lib/jvm/java-8-openjdk-amd64/
make
cat << __EOF > Test.java
public class Test
{
    private StringBuilder b = new StringBuilder();

    private void loop(int i, String... args)
    {
        for (String a : args)
            b.append(a);

        long hc = b.hashCode() * System.nanoTime();

        b = new StringBuilder();
        b.append(hc);

        System.out.printf("Iteration %d = %d\n", i, hc);
    }

    public void run(String... args)
    {
        for (int i = 0; i < 10000; ++i)
        {
            loop(i, args);
        }
    }

    public static void main(String... args)
    {
        Test t = new Test();
        t.run(args);
    }
}
__EOF
$JDIR/bin/javac Test.java
./perf record -F 10000 -g -k mono $JDIR/bin/java -agentpath:`pwd`/libperf-jvmti.so Test
./perf inject --jit -i perf.data -o perf.data.jitted
./perf annotate -i perf.data.jitted --stdio | grep Test\.java: | sort -u

Before this patch, Test.java line numbers get reported that are greater
than the number of lines in the Test.java file.  They come from the
source file of the inlined function, e.g. java/lang/String.java:1085.
For further validation one can examine those lines in the JDK source
distribution and confirm that they map to inlined functions called by
Test.java.

After this patch, the filename of the inlined function is output
rather than the incorrect original source filename.

Signed-off-by: Ben Gainey <ben.gainey@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Colin King <colin.king@canonical.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 598b7c6919c7 ("perf jit: add source line info support")
Link: http://lkml.kernel.org/r/20171122182541.d25599a3eb1ada3480d142fa@arm.com
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-18 11:54:08 -03:00
Jiri Olsa
61fb26a6a2 perf tools: Fix up build in hardened environments
On Fedora systems the perl and python CFLAGS/LDFLAGS include the
hardened specs from redhat-rpm-config package. We apply them only for
perl/python objects, which makes them not compatible with the rest of
the objects and the build fails with:

  /usr/bin/ld: perf-in.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -f
+PIC
  /usr/bin/ld: libperf.a(libperf-in.o): relocation R_X86_64_32S against `.text' can not be used when making a shared object; recompile w
+ith -fPIC
  /usr/bin/ld: final link failed: Nonrepresentable section on output
  collect2: error: ld returned 1 exit status
  make[2]: *** [Makefile.perf:507: perf] Error 1
  make[1]: *** [Makefile.perf:210: sub-make] Error 2
  make: *** [Makefile:69: all] Error 2

Mainly it's caused by perl/python objects being compiled with:

  -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1

which prevent the final link impossible, because it will check
for 'proper' objects with following option:

  -specs=/usr/lib/rpm/redhat/redhat-hardened-ld

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20171204082437.GC30564@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-18 11:54:08 -03:00
Jiri Olsa
5cfee7a357 perf tools: Use shell function for perl cflags retrieval
Using the shell function for perl CFLAGS retrieval instead of back
quotes (``). Both execute shell with the command, but the latter is more
explicit and seems to be the preferred way.

Also we don't have any other use of the back quotes in perf Makefiles.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171108102739.30338-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-12-18 11:54:08 -03:00
Ingo Molnar
1d2a7de8e9 Linux 4.15-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJaNy81AAoJEHm+PkMAQRiGq2YH/1C1so18qErhPosdfeLIXLbA
 iC9XcIvkPuMfjDw4EfSWOzhKnzgqGuc8q/Vzz0ulDreNVUb52nBeRy69QgNoZBTB
 NkLdrUKBnlArvRhBXToQGW/s1eI/gobuHBJb7/fbpvsUtPYcDE2nUXAEsMlagn5L
 BMHNzE3TByaWj0SqJtZAZvaQN2MdWV8ArHBPaC+MtR2C1VJIyl0mT9CdCu2NpTES
 +FncKJ6/qplSBNSUJSfYmFLfEKVcQxvHMi1kp9jOGlVjPM3cOPKRpv8x69x/IPoB
 3l82AikL+Ju0738oJ0Fp/IhfGUqpXz+FwUz1JmCdrcOby75RHomJuJCUBTtjXA4=
 =lYkx
 -----END PGP SIGNATURE-----

Merge tag 'v4.15-rc4' into perf/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-18 06:26:07 +01:00
Linus Torvalds
1291a0d504 Linux 4.15-rc4 2017-12-17 18:59:59 -08:00
Kees Cook
779f4e1c6c Revert "exec: avoid RLIMIT_STACK races with prlimit()"
This reverts commit 04e35f4495dd560db30c25efca4eecae8ec8c375.

SELinux runs with secureexec for all non-"noatsecure" domain transitions,
which means lots of processes end up hitting the stack hard-limit change
that was introduced in order to fix a race with prlimit(). That race fix
will need to be redesigned.

Reported-by: Laura Abbott <labbott@redhat.com>
Reported-by: Tomáš Trnka <trnka@scm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-17 14:26:25 -08:00
Linus Torvalds
f8940a0f20 Merge branch 'WIP.x86-pti.base-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull Page Table Isolation (PTI) v4.14 backporting base tree from Ingo Molnar:
 "This tree contains the v4.14 PTI backport preparatory tree, which
  consists of four merges of upstream trees and 7 cherry-picked commits,
  which the upcoming PTI work depends on"

NOTE! The resulting tree is exactly the same as the original base tree
(ie the diff between this commit and its immediate first parent is
empty).

The only reason for this merge is literally to have a common point for
the actual PTI changes so that the commits can be shared in both the
4.15 and 4.14 trees.

* 'WIP.x86-pti.base-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
  locking/barriers: Convert users of lockless_dereference() to READ_ONCE()
  locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()
  bpf: fix build issues on um due to mising bpf_perf_event.h
  perf/x86: Enable free running PEBS for REGS_USER/INTR
  x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD
  x86/cpufeature: Add User-Mode Instruction Prevention definitions
2017-12-17 13:57:08 -08:00
Linus Torvalds
6ba64feff6 Merge branch 'WIP.x86-pti.base.prep-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull Page Table Isolation (PTI) preparatory tree from Ingo Molnar:
 "This does a rename to free up linux/pti.h to be used by the upcoming
  page table isolation feature"

* 'WIP.x86-pti.base.prep-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  drivers/misc/intel/pti: Rename the header file to free up the namespace
2017-12-17 13:54:31 -08:00
Linus Torvalds
2ffb448ccb Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
 "A single bugfix which prevents arbitrary sigev_notify values in
  posix-timers"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  posix-timer: Properly check sigevent->sigev_notify
2017-12-17 13:48:50 -08:00
Linus Torvalds
c43727908f dmaengine fixes for v4.15-rc4
Here are fixes for this round
 - Fix for disable clk on error path in fsl-edma driver
 - Disable clk fail fix in jz4740 driver
 - Fix long pending bug in dmatest driver for dangling pointer
 - Fix potential NULL pointer dereference in at_hdmac driver
 - Error handling path in ioat driver
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaNp2PAAoJEHwUBw8lI4NHNfEP/RcERW0adeh4HXIges3FMW9N
 cqhRQpnUWmYIOTsfrWDH6WyR7pmEBOLryBzZXQ6Y8I3AOW4FTwBOG/W8NqFj+aiA
 JjeKRkXdpHD1eZpav72FUUhf4inb7BTtpGN5lBEFFk+gA3rWLscUw33OJLZmxsb5
 Vi82pJARfst42XB/jZUBDq+YNYlgG+NspYgUWvS5EnNXJJmy+47c5TS9EQUovctM
 X0yrKNomSW2qEO/IEXTUvXtArxpmxBrZI1q9RU21JJvF3izElqYO9C3OhGijM27u
 ERHWM1iStOxOl7zrzp8JJmVswdHyW/rT5uSlhl4689S2UZEJrqBGPo67Nh5w0Isi
 Nk+2X8I7Z/tNByggXASHX/xXymQGJF/mxAs1KypnSE0DpCJQi7oJAG6UY7lbIOz8
 CRGRt0gM8Xe/GrZa+7ZXmc7BoCrNPaGmW070XX3BoMcqT9m+SFOD2DhT351AVoOI
 uIPw5gAw0OCYhzkPP1c6AZPdLLFlVGrmPARRto/Af7w4Fnw4UlWGQ/+d6oQlvEyG
 G0kPxKE3e6uHXOizgm3GkbqpQWjTz649JMNw+xVey9e/pSVXORYIIomPmwahbh8x
 7Rfojaccq3w0YYG58z/Q5+WmzWf0jU//G/Cq5LsOm/5ghZTYRna7olA6lxFcn+ab
 6WQBW1rV2y8XUpDl4ScB
 =2egq
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-fix-4.15-rc4' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine fixes from Vinod Koul:
 "This time consisting of fixes in a bunch of drivers and the dmatest
  module:

   - Fix for disable clk on error path in fsl-edma driver
   - Disable clk fail fix in jz4740 driver
   - Fix long pending bug in dmatest driver for dangling pointer
   - Fix potential NULL pointer dereference in at_hdmac driver
   - Error handling path in ioat driver"

* tag 'dmaengine-fix-4.15-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: fsl-edma: disable clks on all error paths
  dmaengine: jz4740: disable/unprepare clk if probe fails
  dmaengine: dmatest: move callback wait queue to thread context
  dmaengine: at_hdmac: fix potential NULL pointer dereference in atc_prep_dma_interleaved
  dmaengine: ioat: Fix error handling path
2017-12-17 13:28:49 -08:00
Arnd Bergmann
b9f5fb1800 cramfs: fix MTD dependency
With CONFIG_MTD=m and CONFIG_CRAMFS=y, we now get a link failure:

  fs/cramfs/inode.o: In function `cramfs_mount': inode.c:(.text+0x220): undefined reference to `mount_mtd'
  fs/cramfs/inode.o: In function `cramfs_mtd_fill_super':
  inode.c:(.text+0x6d8): undefined reference to `mtd_point'
  inode.c:(.text+0xae4): undefined reference to `mtd_unpoint'

This adds a more specific Kconfig dependency to avoid the broken
configuration.

Alternatively we could make CRAMFS itself depend on "MTD || !MTD" with a
similar result.

Fixes: 99c18ce580c6 ("cramfs: direct memory access support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-17 12:20:58 -08:00
Linus Torvalds
73d080d374 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "The alloc_super() one is a regression in this merge window, lazytime
  thing is older..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  VFS: Handle lazytime in do_mount()
  alloc_super(): do ->s_umount initialization earlier
2017-12-17 12:18:35 -08:00
Linus Torvalds
1c6b942d7d Fix a regression which caused us to fail to interpret symlinks in very
ancient ext3 file system images.  Also fix two xfstests failures, one
 of which could cause a OOPS, plus an additional bug fix caught by fuzz
 testing.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAlo1y3EACgkQ8vlZVpUN
 gaNFOQf/bMf6ynai1dGGRwef+UcT874NZ2Hqm+UqI6pxusz0ZeKWm8HWfPfg31Fa
 o+OnUsZ7NXFBIHyfXKFJzdOgutjZ5eY0vMu+NrlyBdd6W+ZcHwn1PvQsLapFYvqK
 Rt+8nWTKqtnksSfh0vyODmUYgItOULOPPepjnIPm/Pd0DinJwo0GY/8MzLkz4SpX
 g6R60ou0ToEYNqBXAKIBnZ4aq8KWMtCMGcD270U5eAm/63Pt4riRwJbjITxZPAH1
 wKzivP4Ce5ce8W2g2/6mFFlBFWvtlB491T+BsgHUEv3OLze+kYS2PcxQthhEmBR8
 zeZ2o2/0tTxejE//cyJ4gCe3fYGRDg==
 =xqLC
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Fix a regression which caused us to fail to interpret symlinks in very
  ancient ext3 file system images.

  Also fix two xfstests failures, one of which could cause an OOPS, plus
  an additional bug fix caught by fuzz testing"

* tag 'ext4_for_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix crash when a directory's i_size is too small
  ext4: add missing error check in __ext4_new_inode()
  ext4: fix fdatasync(2) after fallocate(2) operation
  ext4: support fast symlinks from ext3 file systems
2017-12-17 12:14:33 -08:00