Commit Graph

10036 Commits

Author SHA1 Message Date
Jiri Olsa
1cd6472e3f tools build: Make fixdep parsing wait for last target
The fixdep tool, among other things, replaces the target of the object
in the gcc generated dependency output file.

The parsing code assumes there's only single target in the rule but this
is not always the case as described in here:

  https://gcc.gnu.org/ml/gcc-help/2016-11/msg00099.html

Make the fixdep code smart enough to skip all the possible targets.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Foley <pefoley2@pefoley.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20161201130025.GA16430@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05 15:51:42 -03:00
Kim Phillips
0fcb1da4ab perf annotate: AArch64 support
This is a regex converted version from the original:

	https://lkml.org/lkml/2016/5/19/461

Add basic support to recognise AArch64 assembly. This allows perf to
identify AArch64 instructions that branch to other parts within the
same function, thereby properly annotating them.

Rebased onto new cross-arch annotation bits:

	https://lkml.org/lkml/2016/11/25/546

Sample output:

security_file_permission  vmlinux
  5.80 │    ← ret                                                  ▒
       │70:   ldr    w0, [x21,#68]                                 ▒
  4.44 │    ↓ tbnz   d0                                            ▒
       │      mov    w0, #0x24                       // #36        ▒
  1.37 │      ands   w0, w22, w0                                   ▒
       │    ↑ b.eq   60                                            ▒
  1.37 │    ↓ tbnz   e4                                            ▒
       │      mov    w19, #0x20000                   // #131072    ▒
  1.02 │    ↓ tbz    ec                                            ▒
       │90:┌─→ldr    x3, [x21,#24]                                 ▒
  1.37 │   │  add    x21, x21, #0x10                               ▒
       │   │  mov    w2, w19                                       ▒
  1.02 │   │  mov    x0, x21                                       ▒
       │   │  mov    x1, x3                                        ▒
  1.71 │   │  ldr    x20, [x3,#48]                                 ▒
       │   │→ bl     __fsnotify_parent                             ▒
  0.68 │   │↑ cbnz   60                                            ▒
       │   │  mov    x2, x21                                       ▒
  1.37 │   │  mov    w1, w19                                       ▒
       │   │  mov    x0, x20                                       ▒
  0.68 │   │  mov    w5, #0x0                        // #0         ▒
       │   │  mov    x4, #0x0                        // #0         ▒
  1.71 │   │  mov    w3, #0x1                        // #1         ▒
       │   │→ bl     fsnotify                                      ▒
  1.37 │   │↑ b      60                                            ▒
       │d0:│  mov    w0, #0x0                        // #0         ▒
       │   │  ldp    x19, x20, [sp,#16]                            ▒
       │   │  ldp    x21, x22, [sp,#32]                            ▒
       │   │  ldp    x29, x30, [sp],#48                            ▒
       │   │← ret                                                  ▒
       │e4:│  mov    w19, #0x10000                   // #65536     ▒
       │   └──b      90                                            ◆
       │ec:   brk    #0x800                                        ▒
Press 'h' for help on key bindings

Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Signed-off-by: Chris Ryder <chris.ryder@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20161130092344.012e18e3e623bea395162f95@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:03:19 -03:00
Kim Phillips
859afa6ca9 perf annotate: Use arch->objdump.comment_char in dec__parse()
Presume neglected in commit 786c1b5 "perf annotate: Start supporting
cross arch annotation".  This doesn't fix a bug since none of the
affected arches support parsing dec/inc instructions yet.

Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Ryder <chris.ryder@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20161130092333.1cca5dd2c77e1790d61c1e9c@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:03:18 -03:00
David Ahern
46690a8051 perf report: Add option to specify time window of interest
Add option to allow user to control analysis window. e.g., collect data
for time window and analyze a segment of interest within that window.

Committer notes:

Testing it:

Using the perf.data file captured via 'perf kmem record':

  # perf report --header-only
  # ========
  # captured on: Tue Nov 29 16:01:53 2016
  # hostname : jouet
  # os release : 4.8.8-300.fc25.x86_64
  # perf version : 4.9.rc6.g5a6aca
  # arch : x86_64
  # nrcpus online : 4
  # nrcpus avail : 4
  # cpudesc : Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
  # cpuid : GenuineIntel,6,61,4
  # total memory : 20254660 kB
  # cmdline : /home/acme/bin/perf kmem record usleep 1
  # event : name = kmem:kmalloc, , id = { 931980, 931981, 931982, 931983 }, type = 2, size = 112, config = 0x1b9, { sample_period, sample_freq } = 1, sample_typ
  # event : name = kmem:kmalloc_node, , id = { 931984, 931985, 931986, 931987 }, type = 2, size = 112, config = 0x1b7, { sample_period, sample_freq } = 1, sampl
  # event : name = kmem:kfree, , id = { 931988, 931989, 931990, 931991 }, type = 2, size = 112, config = 0x1b5, { sample_period, sample_freq } = 1, sample_type
  # event : name = kmem:kmem_cache_alloc, , id = { 931992, 931993, 931994, 931995 }, type = 2, size = 112, config = 0x1b8, { sample_period, sample_freq } = 1, s
  # event : name = kmem:kmem_cache_alloc_node, , id = { 931996, 931997, 931998, 931999 }, type = 2, size = 112, config = 0x1b6, { sample_period, sample_freq } =
  # event : name = kmem:kmem_cache_free, , id = { 932000, 932001, 932002, 932003 }, type = 2, size = 112, config = 0x1b4, { sample_period, sample_freq } = 1, sa
  # HEADER_CPU_TOPOLOGY info available, use -I to display
  # HEADER_NUMA_TOPOLOGY info available, use -I to display
  # pmu mappings: cpu = 4, intel_pt = 7, intel_bts = 6, uncore_arb = 13, cstate_pkg = 15, breakpoint = 5, uncore_cbox_1 = 12, power = 9, software = 1, uncore_im
  # HEADER_CACHE info available, use -I to display
  # missing features: HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT
  # ========
  #
  # # Looking at just the histogram entries for the first event:
  #
  # perf report  | head -33
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 40  of event 'kmem:kmalloc'
  # Event count (approx.): 40
  #
  # Overhead  Trace output
  # ........  ...............................................................................................................
  #
    37.50%  call_site=ffffffffb91ad3c7 ptr=0xffff88895fc05000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL
    10.00%  call_site=ffffffffb9258416 ptr=0xffff888a1dc61f00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO
     7.50%  call_site=ffffffffb9258416 ptr=0xffff888a2640ac00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO
     2.50%  call_site=ffffffffb92759ba ptr=0xffff888a26776000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb9276864 ptr=0xffff8886f6b82600 bytes_req=136 bytes_alloc=192 gfp_flags=GFP_KERNEL|__GFP_ZERO
     2.50%  call_site=ffffffffb9276903 ptr=0xffff888aefcf0460 bytes_req=32 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb92ad0ce ptr=0xffff888756c98a00 bytes_req=392 bytes_alloc=512 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb92ad0ce ptr=0xffff888756c9ba00 bytes_req=504 bytes_alloc=512 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb92ad301 ptr=0xffff888a31747600 bytes_req=128 bytes_alloc=128 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb92ad511 ptr=0xffff888a9d26a2a0 bytes_req=28 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c11a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c12c0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c1540 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c15a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c15e0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c16e0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c1c20 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff888a9d26a2a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb9373e66 ptr=0xffff8889f1931240 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO
     2.50%  call_site=ffffffffb9373e66 ptr=0xffff8889f1931980 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO
     2.50%  call_site=ffffffffb9373e66 ptr=0xffff8889f1931a00 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO

  #
  # # And then limiting using the example for 'perf kmem stat --time' used
  # # in the previous changeset committer note we see that there were no
  # # kmem:kmalloc in that last part of the file, but there were some
  # # kmem:kmem_cache_alloc ones:
  #
  # perf report --time 20119.782088, --stdio
  #
  # Total Lost Samples: 0
  #
  # Samples: 0  of event 'kmem:kmalloc'
  # Event count (approx.): 0
  #
  # Overhead  Trace output
  # ........  ............
  #

  # Samples: 0  of event 'kmem:kmalloc_node'
  # Event count (approx.): 0
  #
  # Overhead  Trace output
  # ........  ............
  #

  # Samples: 0  of event 'kmem:kfree'
  # Event count (approx.): 0
  #
  # Overhead  Trace output
  # ........  ............
  #

  # Samples: 8  of event 'kmem:kmem_cache_alloc'
  # Event count (approx.): 8
  #
  # Overhead  Trace output
  # ........  ..................................................................................................................
  #
    75.00%  call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
    12.50%  call_site=ffffffffb90ad33a ptr=0xffff8889f071f6e0 bytes_req=160 bytes_alloc=160 gfp_flags=GFP_ATOMIC|__GFP_NOTRACK
    12.50%  call_site=ffffffffb9287cc1 ptr=0xffff8889b12722d8 bytes_req=104 bytes_alloc=104 gfp_flags=GFP_NOFS|__GFP_ZERO
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-7-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:03:10 -03:00
David Ahern
2a865bd8dd perf kmem: Add option to specify time window of interest
Add option to allow user to control analysis window. e.g., collect data
for time window and analyze a segment of interest within that window.

Committer notes:

Testing it:

  # perf kmem record usleep 1
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 1.540 MB perf.data (2049 samples) ]
  # perf evlist
  kmem:kmalloc
  kmem:kmalloc_node
  kmem:kfree
  kmem:kmem_cache_alloc
  kmem:kmem_cache_alloc_node
  kmem:kmem_cache_free
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  #
  # # Use 'perf script' to get a first approach, select a chunk for then using
  # # with 'perf kmem stat --time'
  #
  # perf script | tail -15
    usleep 9889 [0] 20119.782088:  kmem:kmem_cache_free: (selinux_file_free_security+0x27) call_site=ffffffffb936aa07 ptr=0xffff888a1df49fc0
      perf 9888 [3] 20119.782088:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782089: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782090:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782090: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
    usleep 9889 [0] 20119.782091: kmem:kmem_cache_alloc: (__sigqueue_alloc+0x4a) call_site=ffffffffb90ad33a ptr=0xffff8889f071f6e0 bytes_req=160 bytes_alloc=160 gfp_flags=GFP_ATOMIC|__GFP_NOTRACK
      perf 9888 [3] 20119.782091:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782093:  kmem:kmem_cache_free: (__sigqueue_free.part.17+0x33) call_site=ffffffffb90ad3f3 ptr=0xffff8889f071f6e0
      perf 9888 [3] 20119.782098: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782098:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782099: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782100: kmem:kmem_cache_alloc: (alloc_buffer_head+0x21) call_site=ffffffffb9287cc1 ptr=0xffff8889b12722d8 bytes_req=104 bytes_alloc=104 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782101:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782102: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782103:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
  #
  # # stats for the whole perf.data file, i.e. no interval specified
  #
  # perf kmem stat

  SUMMARY (SLAB allocator)
  ========================
  Total bytes requested: 172,628
  Total bytes allocated: 173,088
  Total bytes freed:     161,280
  Net total bytes allocated: 11,808
  Total bytes wasted on internal fragmentation: 460
  Internal fragmentation: 0.265761%
  Cross CPU allocations: 0/851
  #
  # # stats for an end open interval, after a certain time:
  #
  # perf kmem stat --time 20119.782088,

  SUMMARY (SLAB allocator)
  ========================
  Total bytes requested: 552
  Total bytes allocated: 552
  Total bytes freed:     448
  Net total bytes allocated: 104
  Total bytes wasted on internal fragmentation: 0
  Internal fragmentation: 0.000000%
  Cross CPU allocations: 0/8
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-6-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:03:02 -03:00
David Ahern
853b740711 perf sched timehist: Add option to specify time window of interest
Add option to allow user to control analysis window. e.g., collect data
for time window and analyze a segment of interest within that window.

Committer notes:

Testing it:

  # perf sched record -a usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.593 MB perf.data (25 samples) ]
  #
  # perf sched timehist | head -18
  Samples do not have callchains.
          time    cpu   task name       wait time  sch delay  run time
                        [tid/pid]          (msec)     (msec)    (msec)
  ------------- ------  --------------- ---------  ---------  --------
   19818.635579 [0002]  <idle>              0.000      0.000     0.000
   19818.635613 [0000]  perf[9116]          0.000      0.000     0.000
   19818.635676 [0000]  <idle>              0.000      0.000     0.063
   19818.635678 [0000]  rcuos/2[29]         0.000      0.002     0.001
   19818.635696 [0002]  perf[9117]          0.000      0.004     0.116
   19818.635702 [0000]  <idle>              0.001      0.000     0.024
   19818.635709 [0002]  migration/2[25]     0.000      0.003     0.012
   19818.636263 [0000]  usleep[9117]        0.005      0.000     0.560
   19818.636316 [0000]  <idle>              0.560      0.000     0.053
   19818.636358 [0002]  <idle>              0.129      0.000     0.649
   19818.636358 [0000]  usleep[9117]        0.053      0.002     0.042
  #

  # perf sched timehist --time 19818.635696,
  Samples do not have callchains.
           time    cpu  task name       wait time  sch delay  run time
                        [tid/pid]          (msec)     (msec)    (msec)
  ------------- ------  ---------------  --------  --------- ---------
   19818.635696 [0002]  perf[9117]          0.000      0.120     0.000
   19818.635702 [0000]  <idle>              0.019      0.000     0.006
   19818.635709 [0002]  migration/2[25]     0.000      0.003     0.012
   19818.636263 [0000]  usleep[9117]        0.005      0.000     0.560
   19818.636316 [0000]  <idle>              0.560      0.000     0.053
   19818.636358 [0002]  <idle>              0.129      0.000     0.649
   19818.636358 [0000]  usleep[9117]        0.053      0.002     0.042
  #
  # perf sched timehist --time 19818.635696,19818.635709
  Samples do not have callchains.
           time    cpu  task name       wait time  sch delay  run time
                        [tid/pid]          (msec)     (msec)    (msec)
  ------------- ------  --------------- ---------  --------- ---------
   19818.635696 [0002]  perf[9117]          0.000      0.120     0.000
   19818.635702 [0000]  <idle>              0.019      0.000     0.006
   19818.635709 [0002]  migration/2[25]     0.000      0.003     0.012
   19818.635709 [0000]  usleep[9117]        0.005      0.000     0.006
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:02:52 -03:00
David Ahern
a91f4c473f perf script: Add option to specify time window of interest
Add option to allow user to control analysis window. e.g., collect data
for some amount of time and analyze a segment of interest within that
window.

Committer notes:

Testing it:

  # perf evlist -v
  cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
  #
  # perf script --hide-call-graph | head -15
    swapper    0 [0] 9693.370039:      1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370044:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370046:      7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370048:    126 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370049:   2701 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370051:  58823 cycles:ppp: ffffffffb90cd2e0 idle_cpu (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370059:      1 cycles:ppp: ffffffffb91a713a ctx_resched (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370062:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370064:     13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370065:    250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370067:   5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux)
       perf 5124 [2] 9693.370076:      1 cycles:ppp: ffffffffb91a76c1 __perf_event_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
       perf 5124 [2] 9693.370091:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
       perf 5124 [2] 9693.370095:      3 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
  #
  # perf script --hide-call-graph --time ,9693.370048
    swapper    0 [0] 9693.370039:      1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370044:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370046:      7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
  # perf script --hide-call-graph --time 9693.370064,9693.370076
    swapper    0 [1] 9693.370064:     13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370065:    250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370067:   5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux)
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-4-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:02:45 -03:00
David Ahern
c284d669a2 perf tools: Move parse_nsec_time to time-utils.c
Code move only; no functional change intended.

Committer notes:

Fix the build on Ubuntu 16.04 x86-64 cross-compiling to S/390, with this
set of auto-detected features:

  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  ...                          gtk2: [ OFF ]
  ...                      libaudit: [ OFF ]
  ...                        libbfd: [ OFF ]
  ...                        libelf: [ on  ]
  ...                       libnuma: [ OFF ]
  ...        numa_num_possible_cpus: [ OFF ]
  ...                       libperl: [ OFF ]
  ...                     libpython: [ OFF ]
  ...                      libslang: [ OFF ]
  ...                     libcrypto: [ OFF ]
  ...                     libunwind: [ OFF ]
  ...            libdw-dwarf-unwind: [ on  ]
  ...                          zlib: [ on  ]
  ...                          lzma: [ OFF ]
  ...                     get_cpuid: [ OFF ]
  ...                           bpf: [ on  ]

Where it was failing with:

    CC       /tmp/build/perf/util/time-utils.o
  util/time-utils.c: In function 'parse_nsec_time':
  util/time-utils.c:17:13: error: implicit declaration of function 'strtoul' [-Werror=implicit-function-declaration]
    time_sec = strtoul(str, &end, 10);
               ^
  util/time-utils.c:17:2: error: nested extern declaration of 'strtoul' [-Werror=nested-externs]
    time_sec = strtoul(str, &end, 10);
    ^
  util/time-utils.c: In function 'perf_time__parse_str':
  util/time-utils.c:93:2: error: implicit declaration of function 'free' [-Werror=implicit-function-declaration]
    free(str);
    ^
  util/time-utils.c:93:2: error: incompatible implicit declaration of built-in function 'free' [-Werror]
  util/time-utils.c:93:2: note: include '<stdlib.h>' or provide a declaration of 'free'

Do as suggested and add a '#include <stdlib.h>' to get the free() and strtoul()
declarations and fix the build.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:02:39 -03:00
David Ahern
fdf9dc4b34 perf tools: Add time-based utility functions
Add function to parse a user time string of the form <start>,<stop>
where start and stop are time in sec.nsec format. Both start and stop
times are optional.

Add function to determine if a sample time is within a given time
time window of interest.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:02:32 -03:00
David Ahern
64eff7d9c4 perf script: Add option to stop printing callchain
Allow user to specify list of symbols which cause the dump of callchains
to stop at that symbol.

Committer notes:

Testing it:

  # perf record -ag usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.177 MB perf.data (33 samples) ]
  #
  # # Without it:
  #
  # perf script
  swapper   0 [000]  9693.370039:          1 cycles:ppp:
                  2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                 137ffeb start_kernel ([kernel.vmlinux].init.text)
                 137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
                 137f419 x86_64_start_kernel ([kernel.vmlinux].init.text)

  swapper   0 [000]  9693.370044:          1 cycles:ppp:
                  20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                 137ffeb start_kernel ([kernel.vmlinux].init.text)
                 137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
  #
  # # Using it to see just what are the calls from the 'remote_function' function:
  #
  # perf script --stop-bt remote_function
  swapper   0 [000]  9693.370039:          1 cycles:ppp:
                  2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)

  swapper   0 [000]  9693.370044:          1 cycles:ppp:
                  20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480104021-36275-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-29 13:06:19 -03:00
David Ahern
aa58e9afb6 perf kmem stat: Track memory freed
Track freed memory as well as allocations and show the net in the
summary.

Committer notes:

Testing it:

  # perf kmem record usleep 1
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 1.626 MB perf.data (4208 samples) ]
  [root@jouet ~]# perf kmem stat --slab

  SUMMARY (SLAB allocator)
  ========================
  Total bytes requested: 234,011
  Total bytes allocated: 234,504
  Total bytes freed:     213,328                                 <------
  Net total bytes allocated: 21,176
  Total bytes wasted on internal fragmentation: 493
  Internal fragmentation: 0.210231%
  Cross CPU allocations: 4/1,963
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480110133-37039-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-29 12:50:32 -03:00
Arnaldo Carvalho de Melo
030910c085 perf test: Remove "test" and similar strings from test descriptions
Having "test" in almost all test descriptions is redundant, simplify it
removing and rewriting tests with such descriptions.

End result:

  # perf test
   1: vmlinux symtab matches kallsyms            : Ok
   2: Detect openat syscall event                : Ok
   3: Detect openat syscall event on all cpus    : Ok
   4: Read samples using the mmap interface      : Ok
   5: Parse event definition strings             : Ok
   6: PERF_RECORD_* events & perf_sample fields  : Ok
   7: Parse perf pmu format                      : Ok
   8: DSO data read                              : Ok
   9: DSO data cache                             : Ok
  10: DSO data reopen                            : Ok
  11: Roundtrip evsel->name                      : Ok
  12: Parse sched tracepoints fields             : Ok
  13: syscalls:sys_enter_openat event fields     : Ok
  14: Setup struct perf_event_attr               : Ok
  15: Match and link multiple hists              : Ok
  16: 'import perf' in python                    : Ok
  17: Breakpoint overflow signal handler         : Ok
  18: Breakpoint overflow sampling               : Ok
  19: Number of exit events of a simple workload : Ok
  20: Software clock events period values        : Ok
  21: Object code reading                        : Ok
  22: Sample parsing                             : Ok
  23: Use a dummy software event to keep tracking: Ok
  24: Parse with no sample_id_all bit set        : Ok
  25: Filter hist entries                        : Ok
  26: Lookup mmap thread                         : Ok
  27: Share thread mg                            : Ok
  28: Sort output of hist entries                : Ok
  29: Cumulate child hist entries                : Ok
  30: Track with sched_switch                    : Ok
  31: Filter fds with revents mask in a fdarray  : Ok
  32: Add fd to a fdarray, making it autogrow    : Ok
  33: kmod_path__parse                           : Ok
  34: Thread map                                 : Ok
  35: LLVM search and compile                    :
  35.1: Basic BPF llvm compile                    : Ok
  35.2: kbuild searching                          : Ok
  35.3: Compile source for BPF prologue generation: Ok
  35.4: Compile source for BPF relocation         : Ok
  36: Session topology                           : Ok
  37: BPF filter                                 :
  37.1: Basic BPF filtering                      : Ok
  37.2: BPF prologue generation                  : Ok
  37.3: BPF relocation checker                   : Ok
  38: Synthesize thread map                      : Ok
  39: Synthesize cpu map                         : Ok
  40: Synthesize stat config                     : Ok
  41: Synthesize stat                            : Ok
  42: Synthesize stat round                      : Ok
  43: Synthesize attr update                     : Ok
  44: Event times                                : Ok
  45: Read backward ring buffer                  : Ok
  46: Print cpu map                              : Ok
  47: Probe SDT events                           : Ok
  48: is_printable_array                         : Ok
  49: Print bitmap                               : Ok
  50: perf hooks                                 : Ok
  51: x86 rdpmc                                  : Ok
  52: Convert perf time to TSC                   : Ok
  53: DWARF unwind                               : Ok
  54: x86 instruction decoder - new instructions : Ok
  55: Intel cqm nmi context read                 : Skip
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-rx2lbfcrrio2yx1fxcljqy0e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-29 12:46:11 -03:00
Wang Nan
a074865e60 perf tools: Introduce perf hooks
Perf hooks allow hooking user code at perf events. They can be used for
manipulation of BPF maps, taking snapshot and reporting results. In this
patch two perf hook points are introduced: record_start and record_end.

To avoid buggy user actions, a SIGSEGV signal handler is introduced into
'perf record'. It turns off perf hook if it causes a segfault and report
an error to help debugging.

A test case for perf hook is introduced.

Test result:
  $ ./buildperf/perf test -v hook
  50: Test perf hooks                                          :
  --- start ---
  test child forked, pid 10311
  SIGSEGV is observed as expected, try to recover.
  Fatal error (SEGFAULT) in perf hook 'test'
  test child finished with 0
  ---- end ----
  Test perf hooks: Ok

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-5-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-29 12:13:27 -03:00
Wang Nan
5a6acad17d tools lib bpf: Retrive bpf_map through offset of bpf_map_def
Add a new API to libbpf, caller is able to get bpf_map through the
offset of bpf_map_def to 'maps' section.

The API will be used to help jitted perf hook code find fd of a map.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-4-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-29 12:10:19 -03:00
Wang Nan
10931d2413 tools lib bpf: Add private field for bpf_object
Similar to other classes defined in libbpf.h (map and program), allow
'object' class has its own private data.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-3-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-29 12:09:41 -03:00
Wang Nan
9742da0150 tools lib bpf: Add missing BPF functions
Add more BPF map operations to libbpf. Also add bpf_obj_{pin,get}(). They
can be used on not only BPF maps but also BPF programs.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Joe Stringer <joe@ovn.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161126070354.141764-2-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-29 12:09:36 -03:00
David Ahern
aa07df6eb5 perf trace: Update tid/pid filtering option to leverage symbol_conf
Leverage pid/tid filtering done by symbol_conf hooks.

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1480091392-35645-1-git-send-email-dsa@cumulusnetworks.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 16:04:22 -03:00
David Ahern
350f54fab2 perf sched timehist: Handle cpu migration events
Add handlers for sched:sched_migrate_task event. Total number of
migrations is added to summary display and -M/--migrations can be used
to show migration events.

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1480091321-35591-1-git-send-email-dsa@cumulusnetworks.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 16:00:22 -03:00
Arnaldo Carvalho de Melo
5252b1aeab perf annotate: Show invalid jump offset in error message
To help in debugging when the wrong offset is being used, like in:

       │13d98: ↓ jne    13dd1 <lzma_lzma_preset@@XZ_5.0+0x28e1>

That is the full line from objdump, and it seems what should be used is
13dd1, not 28e1.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4nc0marsgst1ft6inmvqber7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 15:56:34 -03:00
Arnaldo Carvalho de Melo
9484b86e9c perf ui helpline: Provide a printf variant
To print some values, like in the annotation code with invalid jump
offsets.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-1vk0g5twas2ioswn1mmvnvwq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 15:49:16 -03:00
Eric Leblond
4708bbda5c tools lib bpf: Fix maps resolution
It is not correct to assimilate the elf data of the maps section to an
array of map definition. In fact the sizes differ. The offset provided
in the symbol section has to be used instead.

This patch fixes a bug causing a elf with two maps not to load
correctly.

Wang Nan added:

This patch requires a name for each BPF map, so array of BPF maps is not
allowed. This restriction is reasonable, because kernel verifier forbid
indexing BPF map from such array unless the index is a fixed value, but
if the index is fixed why not merging it into name?

For example:

Program like this:
  ...
  unsigned long cpu = get_smp_processor_id();
  int *pval = map_lookup_elem(&map_array[cpu], &key);
  ...

Generates bytecode like this:

0: (b7) r1 = 0
1: (63) *(u32 *)(r10 -4) = r1
2: (b7) r1 = 680997
3: (63) *(u32 *)(r10 -8) = r1
4: (85) call 8
5: (67) r0 <<= 4
6: (18) r1 = 0x112dd000
8: (0f) r0 += r1
9: (bf) r2 = r10
10: (07) r2 += -4
11: (bf) r1 = r0
12: (85) call 1

Where instruction 8 is the computation, 8 and 11 render r1 to an invalid
value for function map_lookup_elem, causes verifier report error.

Signed-off-by: Eric Leblond <eric@regit.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Wang Nan <wangnan0@huawei.com>
[ Merge bpf_object__init_maps_name into bpf_object__init_maps.
  Fix segfault for buggy BPF script Validate obj->maps ]
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-5-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 11:27:33 -03:00
Wang Nan
d6be16719e perf tools: Add missing struct definition in probe_event.h
Commit 0b3c2264ae ("perf symbols: Fix kallsyms perf test on ppc64le")
refers struct symbol in probe_event.h, but forgets to include its
definition.  Gcc will complain about it when that definition is not
added, by sheer luck, by some other header included before
probe_event.h.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-4-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 11:25:46 -03:00
Wang Nan
3dbe46c524 perf record: Fix segfault when running with suid and kptr_restrict is 1
Before this patch perf panics if kptr_restrict is set to 1 and perf is
owned by root with suid set:

  $ whoami
  wangnan
  $ ls -l ./perf
  -rwsr-xr-x 1 root root 19781908 Sep 21 19:29 /home/wangnan/perf
  $ cat /proc/sys/kernel/kptr_restrict
  1
  $ cat /proc/sys/kernel/perf_event_paranoid
  -1
  $ ./perf record -a
  Segmentation fault (core dumped)
  $

The reason is that perf assumes it is allowed to read kptr from
/proc/kallsyms when euid is root, but in fact the kernel doesn't allow
reading kptr when euid and uid do not match with each other:

  $ cp /bin/cat .
  $ sudo chown root:root ./cat
  $ sudo chmod u+s ./cat
  $ cat /proc/kallsyms | grep do_fork
  0000000000000000 T _do_fork          <--- kptr is hidden even euid is root
  $ sudo cat /proc/kallsyms | grep do_fork
  ffffffff81080230 T _do_fork

See lib/vsprintf.c for kernel side code.

This patch fixes this problem by checking both uid and euid.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-3-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 11:11:10 -03:00
Wang Nan
d18acd15c6 perf tools: Fix kernel version error in ubuntu
On ubuntu the internal kernel version code is different from what can
be retrived from uname:

 $ uname -r
 4.4.0-47-generic
 $ cat /lib/modules/`uname -r`/build/include/generated/uapi/linux/version.h
 #define LINUX_VERSION_CODE 263192
 #define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))
 $ cat /lib/modules/`uname -r`/build/include/generated/utsrelease.h
 #define UTS_RELEASE "4.4.0-47-generic"
 #define UTS_UBUNTU_RELEASE_ABI 47
 $ cat /proc/version_signature
 Ubuntu 4.4.0-47.68-generic 4.4.24

The macro LINUX_VERSION_CODE is set to 4.4.24 (263192 == 0x40418), but
`uname -r` reports 4.4.0.

This mismatch causes LINUX_VERSION_CODE macro passed to BPF script become
an incorrect value, results in magic failure in BPF loading:

 $ sudo ./buildperf/perf record -e ./tools/perf/tests/bpf-script-example.c ls
 event syntax error: './tools/perf/tests/bpf-script-example.c'
                      \___ Failed to load program for unknown reason

According to Ubuntu document (https://wiki.ubuntu.com/Kernel/FAQ), the
correct kernel version can be retrived through /proc/version_signature, which
is ubuntu specific.

This patch checks the existance of /proc/version_signature, and returns
version number through parsing this file instead of uname. Version string
is untouched (value returns from uname) because `uname -r` is required
to be consistence with path of kbuild directory in /lib/module.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/20161115040617.69788-2-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 11:00:30 -03:00
Namhyung Kim
8388deb3ba perf sched timehist: Enlarge max stack depth by 2
When it records callchains, they will always have 2 scheduler functions
(__schedule + schedule or __schedule + preempt_schedule) and get
ignored.  So it should collect 2 more functions to show the expected
number of callchains to user.

Committer Notes:

Example of final result, using the same perf.data file as in the
previous cset comment, but this time redirecting the output of 'perf
sched timehist' to a file instead of copy'n'pasting from xterm:

  [root@jouet experimental]# perf sched timehist > /tmp/bla
  [root@jouet experimental]# cat /tmp/bla
      time  cpu task name        wait time sch delay run time
                 [tid/pid]            (msec) (msec) (msec)
  -------- ----  -------------------- ------ ------ -----
  6.494998 [01] <idle>                0.000  0.000  0.000
  6.495027 [02] perf[519]             0.000  0.000  0.000 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- sys_poll
  6.495096 [03] <idle>                0.000  0.000  0.000
  6.495100 [03] rcuos/0[9]            0.000  0.005  0.003 rcu_nocb_kthread <- kthread <- ret_from_fork
  6.495113 [01] perf[520]             0.000  0.008  0.114 preempt_schedule_common <- _cond_resched <- wait_for_completion <- stop_one_cpu <- sched_exec <- do_execveat_common.isra.35
  6.495121 [00] <idle>                0.000  0.000  0.000
  6.495129 [01] migration/1[17]       0.000  0.003  0.016 smpboot_thread_fn <- kthread <- ret_from_fork
  6.496085 [02] <idle>                0.000  0.000  1.057
  6.496096 [02] kworker/u16:1[31169]  0.000  0.004  0.011 worker_thread <- kthread <- ret_from_fork
  6.496096 [03] <idle>                0.003  0.000  0.996
  6.496169 [02] <idle>                0.011  0.000  0.072
  6.496171 [00] ls[520]               0.008  0.000  1.049 do_exit <- do_group_exit <- [unknown] <- entry_SYSCALL_64_fastpath
  6.496172 [03] gnome-terminal-[4391] 0.000  0.003  0.076 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- sys_poll

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161124011114.7102-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 10:50:57 -03:00
Namhyung Kim
cdeb01bf78 perf sched timehist: Mark schedule function in callchains
The sched_switch event always captured from the scheduler function.  So
it'd be great omit them from the callchain.  This patch marks the
functions to be omitted by later patch.

Committer notes:

Testing it:

Before:

  [root@jouet experimental]# perf sched record -g ls
  Dockerfile  perf.data  x-mips64
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.355 MB perf.data (29 samples) ]
  [root@jouet experimental]# perf sched timehist
      time  cpu  task name         wait time sch delay run time
                 [tid/pid]             (msec) (msec) (msec)
  ----------- -----  ----------------- ------ ------ ------
  6.494998 [001] <idle>                0.000  0.000  0.000
  6.495027 [002] perf[519]             0.000  0.000  0.000 __schedule <- schedule <- schedule_hrtimeout_range_clock <- schedule_hrtimeou
  6.495096 [003] <idle>                0.000  0.000  0.000
  6.495100 [003] rcuos/0[9]            0.000  0.005  0.003 __schedule <- schedule <- rcu_nocb_kthread <- kthread <- ret_from_fork
  6.495113 [001] perf[520]             0.000  0.008  0.114 __schedule <- preempt_schedule_common <- _cond_resched <- wait_for_completion
  6.495121 [000] <idle>                0.000  0.000  0.000
  6.495129 [001] migration/1[17]       0.000  0.003  0.016 __schedule <- schedule <- smpboot_thread_fn <- kthread <- ret_from_fork
  6.496085 [002] <idle>                0.000  0.000  1.057
  6.496096 [002] kworker/u16:1[31169]  0.000  0.004  0.011 __schedule <- schedule <- worker_thread <- kthread <- ret_from_fork
  6.496096 [003] <idle>                0.003  0.000  0.996
  6.496169 [002] <idle>                0.011  0.000  0.072
  6.496171 [000] ls[520]               0.008  0.000  1.049 __schedule <- schedule <- do_exit <- do_group_exit <- [unknown]
  6.496172 [003] gnome-terminal-[4391] 0.000  0.003  0.076 __schedule <- schedule <- schedule_hrtimeout_range_clock <- schedule_hrtimeo

After:

  [root@jouet experimental]# perf sched timehist
      time  cpu  task name         wait time sch delay run time
                 [tid/pid]            (msec)  (msec)  (msec)
  ----------- -----  ----------------- -----  -----  ------
  6.494998 [001] <idle>                0.000  0.000  0.000
  6.495027 [002] perf[519]             0.000  0.000  0.000 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_t
  6.495096 [003] <idle>                0.000  0.000  0.000
  6.495100 [003] rcuos/0[9]            0.000  0.005  0.003 rcu_nocb_kthread <- kthread <- ret_from_fork
  6.495113 [001] perf[520]             0.000  0.008  0.114 preempt_schedule_common <- _cond_resched <- wait_for_completion <- stop_one_c
  6.495121 [000] <idle>                0.000  0.000  0.000
  6.495129 [001] migration/1[17]       0.000  0.003  0.016 smpboot_thread_fn <- kthread <- ret_from_fork
  6.496085 [002] <idle>                0.000  0.000  1.057
  6.496096 [002] kworker/u16:1[31169]  0.000  0.004  0.011 worker_thread <- kthread <- ret_from_fork
  6.496096 [003] <idle>                0.003  0.000  0.996
  6.496169 [002] <idle>                0.011  0.000  0.072
  6.496171 [000] ls[520]               0.008  0.000  1.049 do_exit <- do_group_exit <- [unknown]
  6.496172 [003] gnome-terminal-[4391] 0.000  0.003  0.076 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_
  [root@jouet experimental]#

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161124011114.7102-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 10:49:43 -03:00
Namhyung Kim
2d9bbf6eb3 perf callchain: Add option to skip ignore symbol when printing callchains
For tracepoint events, callchains always contain certain functions.
Sometimes it'd be better to skip those functions as they have no value.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161124011114.7102-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 10:49:38 -03:00
Ravi Bangoria
dbdebdc538 perf annotate: Initial PowerPC support
Support the PowerPC architecture using the ins_ops association
method.

Committer notes:

Testing it with a perf.data file collected on a PowerPC machine and
cross-annotated on a x86_64 workstation, using the associated vmlinux
file:

$ perf report -i perf.data.f22vm.powerdev --vmlinux vmlinux.powerpc
  .ktime_get  vmlinux.powerpc
        │      clrldi r9,r28,63
   8.57 │   ┌──bne    e0                   <- TUI cursor positioned here
        │54:│  lwsync
   2.86 │   │  std    r2,40(r1)
        │   │  ld     r9,144(r31)
        │   │  ld     r3,136(r31)
        │   │  ld     r30,184(r31)
        │   │  ld     r10,0(r9)
        │   │  mtctr  r10
        │   │  ld     r2,8(r9)
   8.57 │   │→ bctrl
        │   │  ld     r2,40(r1)
        │   │  ld     r10,160(r31)
        │   │  ld     r5,152(r31)
        │   │  lwz    r7,168(r31)
        │   │  ld     r9,176(r31)
   8.57 │   │  lwz    r6,172(r31)
        │   │  lwsync
   2.86 │   │  lwz    r8,128(r31)
        │   │  cmpw   cr7,r8,r28
   2.86 │   │↑ bne    48
        │   │  subf   r10,r10,r3
        │   │  mr     r3,r29
        │   │  and    r10,r10,r5
   2.86 │   │  mulld  r10,r10,r7
        │   │  add    r9,r10,r9
        │   │  srd    r9,r9,r6
        │   │  add    r9,r9,r30
        │   │  std    r9,0(r29)
        │   │  addi   r1,r1,144
        │   │  ld     r0,16(r1)
        │   │  ld     r28,-32(r1)
        │   │  ld     r29,-24(r1)
        │   │  ld     r30,-16(r1)
        │   │  mtlr   r0
        │   │  ld     r31,-8(r1)
        │   │← blr
   5.71 │e0:└─→mr     r1,r1
  11.43 │      mr     r2,r2
  11.43 │      lwz    r28,128(r31)
  Press 'h' for help on key bindings

  $ perf report -i perf.data.f22vm.powerdev --header-only
  # ========
  # captured on: Thu Nov 24 12:40:38 2016
  # hostname : pdev-f22-qemu
  # os release : 4.4.10-200.fc22.ppc64
  # perf version : 4.9.rc1.g6298ce
  # arch : ppc64
  # nrcpus online : 48
  # nrcpus avail : 48
  # cpudesc : POWER7 (architected), altivec supported
  # cpuid : 74,513
  # total memory : 4158976 kB
  # cmdline : /home/ravi/Workspace/linux/tools/perf/perf record -a
  # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1
  # HEADER_CPU_TOPOLOGY info available, use -I to display
  # HEADER_NUMA_TOPOLOGY info available, use -I to display
  # pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5
  # missing features: HEADER_TRACING_DATA HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT HEADER_CACHE
  # ========
  #
  $

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/n/tip-tbjnp40ddoxxl474uvhwi6g4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 10:38:56 -03:00
Arnaldo Carvalho de Melo
acc9bfb5fa perf annotate: Improve support for ARM
By using arch->init() to set up some regular expressions to associate
ins_ops to ARM instructions, ditching that old table that has
instructions not present on ARM.

Take advantage of having an arch->init() to hide more arm specific stuff
from the common code, like the objdump details.

The regular expressions comes from a patch written by Kim Phillips.

Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-77m7lufz9ajjimkrebtg5ead@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 10:38:56 -03:00
Arnaldo Carvalho de Melo
0781ea9234 perf annotate: Allow arches to have a init routine and a priv area
Arches like ARM will want to use regular expressions when deciding what
instructions to associate with what ins_ops, provide infrastructure for
that.

Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-7dmnk9el2ipu3nxog092k9z5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 10:38:55 -03:00
Arnaldo Carvalho de Melo
2a1ff812c4 perf annotate: Introduce alternative method of keeping instructions table
Some arches may want to dynamically populate the table using regular
expressions on the instruction names to associate them with a set of
parsing/formatting/etc functions (struct ins_ops), so provide a fallback
for when the ins__find() method fails.

That fall back will be able to resize the arch->instructions, setting
arch->nr_instructions appropriately, helper functions to associate an
ins_ops to an instruction name, growing the arch->instructions if needed
and resorting it are provided, all the arch specific callback needs to
do is to decide if the missing instruction should be added to
arch->instructions with a ins_ops association.

Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-auu13yradxf7g5dgtpnzt97a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 10:38:44 -03:00
Arnaldo Carvalho de Melo
75b49202d8 perf annotate: Remove duplicate 'name' field from disasm_line
The disasm_line::name field is always equal to ins::name, being used
just to locate the instruction's ins_ops from the per-arch instructions
table.

Eliminate this duplication, nuking that field and instead make
ins__find() return an ins_ops, store it in disasm_line::ins.ops, and
keep just in disasm_line::ins.name what was in disasm_line::name, this
way we end up not keeping a reference to entries in the per-arch
instructions table.

This in turn will help supporting multiple ways to manage the per-arch
instructions table, allowing resorting that array, for instance, when
the entries will move after references to its addresses were made. The
same problem is avoided when one grows the array with realloc.

So architectures simply keeping a constant array will work as well as
architectures building the table using regular expressions or other
logic that involves resorting the table.

Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vr899azvabnw9gtuepuqfd9t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 10:24:16 -03:00
Ingo Molnar
47414424c5 perf/core improvements and fixes:
New tool:
 
 - 'perf sched timehist' provides an analysis of scheduling events.
 
   Example usage:
       perf sched record -- sleep 1
       perf sched timehist
 
   By default it shows the individual schedule events, including the wait
   time (time between sched-out and next sched-in events for the task), the
   task scheduling delay (time between wakeup and actually running) and run
   time for the task:
 
         time    cpu  task name         wait time  sch delay  run time
                      [tid/pid]            (msec)     (msec)    (msec)
     -------- ------  ----------------  ---------  ---------  --------
     1.874569 [0011]  gcc[31949]            0.014      0.000     1.148
     1.874591 [0010]  gcc[31951]            0.000      0.000     0.024
     1.874603 [0010]  migration/10[59]      3.350      0.004     0.011
     1.874604 [0011]  <idle>                1.148      0.000     0.035
     1.874723 [0005]  <idle>                0.016      0.000     1.383
     1.874746 [0005]  gcc[31949]            0.153      0.078     0.022
   ...
 
   Times are in msec.usec. (David Ahern, Namhyung Kim)
 
 Improvements:
 
 - Make 'perf c2c report' support -f/--force, to allow skipping the
   ownership check for root users, for instance, just like the other
   tools (Jiri Olsa)
 
 - Allow sorting cachelines by total number of HITMs, in addition to
   local and remote numbers (Jiri Olsa)
 
 Fixes:
 
 - Make sure errors aren't suppressed by the TUI reset at the end of
   a 'perf c2c report' session (Jiri Olsa)
 
 Infrastructure:
 
 - Initial work on having the annotate code better support multiple
   architectures, including the ability to cross-annotate, i.e. to
   annotate perf.data files collected on an ARM system on a x86_64
   workstation (Arnaldo Carvalho de Melo, Ravi Bangoria, Kim Phillips)
 
 - Use USECS_PER_SEC instead of hard coded number in libtraceevent (Steven Rostedt)
 
 - Add retrieval of preempt count and latency flags in libtraceevent (Steven Rostedt)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYNbr5AAoJENZQFvNTUqpAnq0P/1SkKcxUdjXHt59P9s1GH1W2
 VDDGdRMVG8IkhzNpVX7ojQ48rC/04e/QooFaASoMV9ySUI1V5aDi1JjcpSSqvEw7
 I4DobaJLwebqUJUP2LteoNAuX0UVq6jWUXFDCzeN9yAfoQ9qTNgejLtOACrQd32n
 l4FxyFvfrdhmy4I95Aa+1VaBGEOwzXmkr0h7DcGenYoKsO6lPJ/WtBhVtqvcq26G
 PtYhD2UZMmDhLfPy6kZffIfNtkJExeSqVkdoHYtt9cpvVO6JZdjfHVsvHc6TxW4f
 GXnHEC65Q7Gu2xRLPdaNYDXD9C7LZcOITnIwKt9GfCx2RV6nhVT2H7qnZM0xMP1l
 +362wIx9KJ628l/Q7SWQTjnL2a2yG4sCqNluSQizokYlUXvKOHfDzwT3TRy9QzVz
 H+mCL4f7eb8rZINRswVi7hi/KeQnLpUgNbJe9XCLdsCdA/lJeJ4kUcU52Nnx/Kp5
 nX7A+6KFthijJuAS0dFLsyi+t8Ln7TeeoDJ6n1REVwp7zNUBj+yQtOPNFKsPnaAq
 VFDpSkBxMHOC8vW2Dz1x7zkINjLsoOsc1Z3E5slc/ZAKfKeKyukCd0YDZitvIwuf
 67daqhoUtw4Gu9M5hKGx2jGy5osMlY9zzSBe/nENZGzcoLPBrHhCuV/w3IOKzLjY
 9EoFDSM2l34ihMGZliSa
 =gL8a
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-20161123' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

New tool:

- 'perf sched timehist' provides an analysis of scheduling events.

  Example usage:
      perf sched record -- sleep 1
      perf sched timehist

  By default it shows the individual schedule events, including the wait
  time (time between sched-out and next sched-in events for the task), the
  task scheduling delay (time between wakeup and actually running) and run
  time for the task:

        time    cpu  task name         wait time  sch delay  run time
                     [tid/pid]            (msec)     (msec)    (msec)
    -------- ------  ----------------  ---------  ---------  --------
    1.874569 [0011]  gcc[31949]            0.014      0.000     1.148
    1.874591 [0010]  gcc[31951]            0.000      0.000     0.024
    1.874603 [0010]  migration/10[59]      3.350      0.004     0.011
    1.874604 [0011]  <idle>                1.148      0.000     0.035
    1.874723 [0005]  <idle>                0.016      0.000     1.383
    1.874746 [0005]  gcc[31949]            0.153      0.078     0.022
  ...

  Times are in msec.usec. (David Ahern, Namhyung Kim)

Improvements:

- Make 'perf c2c report' support -f/--force, to allow skipping the
  ownership check for root users, for instance, just like the other
  tools (Jiri Olsa)

- Allow sorting cachelines by total number of HITMs, in addition to
  local and remote numbers (Jiri Olsa)

Fixes:

- Make sure errors aren't suppressed by the TUI reset at the end of
  a 'perf c2c report' session (Jiri Olsa)

Infrastructure changes:

- Initial work on having the annotate code better support multiple
  architectures, including the ability to cross-annotate, i.e. to
  annotate perf.data files collected on an ARM system on a x86_64
  workstation (Arnaldo Carvalho de Melo, Ravi Bangoria, Kim Phillips)

- Use USECS_PER_SEC instead of hard coded number in libtraceevent (Steven Rostedt)

- Add retrieval of preempt count and latency flags in libtraceevent (Steven Rostedt)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-24 05:09:31 +01:00
Ingo Molnar
69e6cdd0cf Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-24 05:09:08 +01:00
David Ahern
a407b0678b perf sched timehist: Add -V/--cpu-visual option
The -V option provides a visual aid for sched switches by cpu:

  $ perf sched timehist -V
             time    cpu  0123456789abc  task name              b/n time  sch delay   run time
                                         [tid/pid]                (msec)     (msec)     (msec)
  --------------- ------  -------------  --------------------  ---------  ---------  ---------
  ...
   2412598.429696 [0009]           i     <idle>                    0.000      0.000      0.000
   2412598.429767 [0002]    s            perf[7219]                0.000      0.000      0.000
   2412598.429783 [0009]           s     perf[7220]                0.000      0.006      0.087
   2412598.429794 [0010]            i    <idle>                    0.000      0.000      0.000
   2412598.429795 [0009]           s     migration/9[53]           0.000      0.003      0.011
   2412598.430370 [0010]            s    sleep[7220]               0.011      0.000      0.576
   2412598.432584 [0003]     i           <idle>                    0.000      0.000      0.000
  ...

Committer notes:

'i' marks idle time, 's' are scheduler events.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161116060634.28477-8-namhyung@kernel.org
[ Add documentation based on above commit message ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:09 -03:00
David Ahern
6c973c9085 perf sched timehist: Add call graph options
If callchains were recorded they are appended to the line with a default stack depth of 5:

  1.874569 [0011] gcc[31949]       0.014 0.000 1.148 wait_for_completion_killable <- do_fork <- sys_vfork <- stub_vfork <- __vfork
  1.874591 [0010] gcc[31951]       0.000 0.000 0.024 __cond_resched <- _cond_resched <- wait_for_completion <- stop_one_cpu <- sched_exec
  1.874603 [0010] migration/10[59] 3.350 0.004 0.011 smpboot_thread_fn <- kthread <- ret_from_fork
  1.874604 [0011] <idle>           1.148 0.000 0.035 cpu_startup_entry <- start_secondary
  1.874723 [0005] <idle>           0.016 0.000 1.383 cpu_startup_entry <- start_secondary
  1.874746 [0005] gcc[31949]       0.153 0.078 0.022 do_wait sys_wait4 <- system_call_fastpath <- __GI___waitpid

 --no-call-graph can be used to not show the callchains. --max-stack is used
to control the number of frames shown (default of 5). -x/--excl options can
be used to collapse redundant callchains to get more relevant data on screen.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161116060634.28477-7-namhyung@kernel.org
[ Add documentation based on above commit message ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:09 -03:00
David Ahern
fc1469f1b2 perf sched timehist: Add -w/--wakeups option
The -w option is to show wakeup events with timehist.

  $ perf sched timehist -w
             time    cpu  task name              b/n time  sch delay   run time
                          [tid/pid]                (msec)     (msec)     (msec)
  --------------- ------  --------------------  ---------  ---------  ---------
   2412598.429689 [0002]  perf[7219]                                             awakened: perf[7220]
   2412598.429696 [0009]  <idle>                    0.000      0.000      0.000
   2412598.429767 [0002]  perf[7219]                0.000      0.000      0.000
   2412598.429780 [0009]  perf[7220]                                             awakened: migration/9[53]
  ...

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161116060634.28477-6-namhyung@kernel.org
[ Add documentation based on above commit message ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:08 -03:00
David Ahern
52df138caa perf sched timehist: Add summary options
The -s/--summary option is to show process runtime statistics.  And the
 -S/--with-summary option is to show the stats with the normal output.

  $ perf sched timehist -s

  Runtime summary
                            comm  parent   sched-in     run-time    min-run     avg-run     max-run  stddev
                                            (count)       (msec)     (msec)      (msec)      (msec)       %
  ---------------------------------------------------------------------------------------------------------
                  ksoftirqd/0[3]       2          2        0.011      0.004       0.005       0.006   14.87
                  rcu_preempt[7]       2         11        0.071      0.002       0.006       0.017   20.23
                  watchdog/0[11]       2          1        0.002      0.002       0.002       0.002    0.00
                  watchdog/1[12]       2          1        0.004      0.004       0.004       0.004    0.00
  ...

  Terminated tasks:
                     sleep[7220]    7219          3        0.770      0.087       0.256       0.576   62.28

  Idle stats:
      CPU  0 idle for   2352.006  msec
      CPU  1 idle for   2764.497  msec
      CPU  2 idle for   2998.229  msec
      CPU  3 idle for   2967.800  msec

      Total number of unique tasks: 52
  Total number of context switches: 2532
             Total run time (msec): 218.036

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161116060634.28477-5-namhyung@kernel.org
[ Add documentation from last commit, so that docs comes with the cset that introduces the feature ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:08 -03:00
David Ahern
49394a2a24 perf sched timehist: Introduce timehist command
'perf sched timehist' provides an analysis of scheduling events.

Example usage:
    perf sched record -- sleep 1
    perf sched timehist

By default it shows the individual schedule events, including the wait
time (time between sched-out and next sched-in events for the task), the
task scheduling delay (time between wakeup and actually running) and run
time for the task:

            time    cpu  task name             wait time  sch delay   run time
                         [tid/pid]                (msec)     (msec)     (msec)
  -------------- ------  --------------------  ---------  ---------  ---------
    79371.874569 [0011]  gcc[31949]                0.014      0.000      1.148
    79371.874591 [0010]  gcc[31951]                0.000      0.000      0.024
    79371.874603 [0010]  migration/10[59]          3.350      0.004      0.011
    79371.874604 [0011]  <idle>                    1.148      0.000      0.035
    79371.874723 [0005]  <idle>                    0.016      0.000      1.383
    79371.874746 [0005]  gcc[31949]                0.153      0.078      0.022
...

Times are in msec.usec.

Committer note:

Add above explanation as the 'perf sched timehist' entry for 'man
perf-sched'.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161116060634.28477-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:07 -03:00
Namhyung Kim
69b7e48070 perf evsel: Support printing callchains with arrows
The EVSEL__PRINT_CALLCHAIN_ARROW options can be used to print callchains
with arrows for readability.  It will be used 'sched timehist' command
like below:

    __schedule <- schedule <- schedule_timeout <- rcu_gp_kthread <- kthread <- ret_from_fork
    __schedule <- schedule <- schedule_timeout <- rcu_gp_kthread <- kthread <- ret_from_fork
    __schedule <- schedule <- worker_thread <- kthread <- ret_from_fork

Suggested-and-Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161116060634.28477-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:07 -03:00
Namhyung Kim
a8763445f6 perf symbols: Print symbol offsets conditionally
The __symbol__fprintf_symname_offs() always shows symbol offsets.  So
there's no difference between 'perf script -F ip,sym' and 'perf script
-F ip,sym,symoff'.  I don't think it's a desired behavior..

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161116060634.28477-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:06 -03:00
Jiri Olsa
3a5bfab60e perf c2c: Support cascading options
Adding support for cascading options added by Namhyung in:

  commit 369a247897 ("tools lib subcmd: Support cascading options")

This way the report and record command share options with with c2c
command and can save some option duplicates. For now it's the 'v'
option.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1479764011-10732-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:06 -03:00
Jiri Olsa
d940baccc9 perf c2c report: Display total HITMs on default
Currently we display the cacheline list sorted on remote HITMs by
default.

The problem is that they might not be always counted and 'perf c2c
report' displays an empty output. Thus it's more convenient to display
and sort the cacheline list based on the total of HITMs and have the
best change to see data in the default report run.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1479764011-10732-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:05 -03:00
Jiri Olsa
dba8ab9379 perf c2c report: Add struct c2c_stats::tot_hitm field
Count total number of HITMs in a special field. This will ease up
addition of total HITM sorting into c2c report in the following patch.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1479764011-10732-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:05 -03:00
Jiri Olsa
b7ac4f9f3b perf c2c report: Add -f/--force option
Adding -f/--force option to go through ownership validation:

  $ sudo perf c2c report
  File perf.data not owned by current user or root (use -f to override)
  $
  $ sudo perf c2c report -f
  < c2c report output >
  $

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1479764011-10732-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:04 -03:00
Jiri Olsa
e8c5fe101e perf c2c report: Setup browser after opening perf.data
Because of the early browser switch we won't get possible error
messages, as it will clear the screen right after showing the message,
e.g.:

Before:

   $ sudo perf c2c report -d lcl
   $

After:
   $ sudo perf c2c report -d lcl
   File perf.data not owned by current user or root (use -f to override)
   $
   $ ls -la perf.data
   -rw-------. 1 acme acme 26648 Nov 22 15:11 perf.data
   $

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1479764011-10732-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:04 -03:00
Jiri Olsa
7b4b82bced perf tools: Show event fd in debug output
It is useful for debug to see file descriptors for each event.

Before:

  $ perf stat -vvv -e cycles,cache-misses ls
  ...
  sys_perf_event_open: pid 12146  cpu -1  group_fd -1  flags 0x8
  ...
  sys_perf_event_open: pid 12146  cpu -1  group_fd 3  flags 0x8
  sys_perf_event_open failed, error -13

Now:

  $ perf stat -vvv -e cycles,cache-misses ls
  ...
  sys_perf_event_open: pid 12858  cpu -1  group_fd -1  flags 0x8 = 3
  ...
  sys_perf_event_open: pid 12858  cpu -1  group_fd 3  flags 0x8
  sys_perf_event_open failed, error -13

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1479764011-10732-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:03 -03:00
Steven Rostedt
c52d9e4e67 tools lib traceevent: Add retrieval of preempt count and latency flags
Add a way to retrieve the preempt count as well as the latency flags from a
pevent_record.

  int pevent_data_preempt_count(pevent, record);

returns the preempt count of a record.

  int pevent_data_flags(pevent, record);

returns the latency flags for a record.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20161122113158.03a010a8@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:03 -03:00
Steven Rostedt
bb5a7316b9 tools lib traceevent: Use USECS_PER_SEC instead of hardcoded number
Instead of using 1000000, use the define in time64.h instead.

Also remove the the duplicate defines for NSECS_PER_SEC.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20161121114149.67111981@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:02 -03:00
Linus Torvalds
20afa6e2f9 ACPI fixes for v4.9-rc6
- Revert a recent ACPICA cleanup that attempted to get rid of all
    FADT version 2 legacy, but broke ACPI thermal management on at
    least one system (Rafael Wysocki).
 
  - Fix cross-compiled builds of ACPI tools that stopped working
    after a recent cleanup related to the handling of header files
    in ACPICA (Lv Zheng).
 
  - Fix a locking issue in the PCC channel initialization code that
    invokes devm_request_irq() under a spinlock (among other things)
    and causes lockdep to complain (Hoan Tran).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJYL2zcAAoJEILEb/54YlRx344P/1TDKMXiyWh8JmAAWFAPdK49
 jZpRS+hD15droQuhpBLz7KJDpmAU/KKOg+dRrtl4NpYMf+ApWtkBUblUYE/8d/sv
 L9WSoTx6KlxhSI+Av5BqaCyIYUwMtn4NRT2MP8lz9eQ1A91M+RWvdKm/8XzsppXc
 spCVi0NxO8qRIJjnFgkYdLRvGYLVMmCTkf2ptxnao5SnmyE/wefhZCW/bZFFEAiY
 QIIXkCZrdJ1PH0mre7P2+CvmAzdlxl02A/3aZquNTjDD33KBGvhQcvASGxrnAvs+
 QG4EF29cHO2XAxPo50PLkn5kE+Fef3ulSi2hpKIOqdayRFmCPEFneHed9E1G5hNA
 05nzMbw9nvz9QGFnqaWDbebfhueJ6ztqxgnqPb8j4YXTsr8P8giGj3Djk2eawCX+
 F67OCm0c0NZpo63zqAsCd3WC9s2MOKNntnRGq4dJ2xNfC6qk0+eJLCDaCxoKoOn4
 oMwYU+AWgjxK/vsXKL0RNQfd1xkuy3E4/HCy1knPkpYpJpC29qqFwcKH8DbLig/+
 PZ1yRC2ZZ1EqxUpRjEoGuU9g2ElkY6pqjTGh1qquB9PiYQrTK/HYGjnW23eKb1tH
 iUj/kSO486nlUP1hOk77BOUoKOBjDrt9o520WNHn3yLDknljZsUcG2QtmlO6fmpS
 BLxyCWXWCJcsPyEcGSoY
 =9ZhJ
 -----END PGP SIGNATURE-----

Merge tag 'acpi-4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "They fix an ACPI thermal management regression introduced by a recent
  FADT handling cleanup, an ACPI tools build issue introduced by a
  recent ACPICA commit and a PCC mailbox initialization bug causing
  lockdep to complain loudly.

  Specifics:

   - Revert a recent ACPICA cleanup that attempted to get rid of all
     FADT version 2 legacy, but broke ACPI thermal management on at
     least one system (Rafael Wysocki).

   - Fix cross-compiled builds of ACPI tools that stopped working after
     a recent cleanup related to the handling of header files in ACPICA
     (Lv Zheng).

   - Fix a locking issue in the PCC channel initialization code that
     invokes devm_request_irq() under a spinlock (among other things)
     and causes lockdep to complain (Hoan Tran)"

* tag 'acpi-4.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  tools/power/acpi: Remove direct kernel source include reference
  mailbox: PCC: Fix lockdep warning when request PCC channel
  Revert "ACPICA: FADT support cleanup"
2016-11-18 17:21:58 -08:00