984262 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo
c1bd8a2b9f Merge branch 'perf/urgent' into perf/core
To get some fixes that didn't made into 5.11.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-16 11:52:16 -03:00
Leo Yan
a89dbc9b98 perf arm-spe: Set sample's data source field
The sample structure contains the field 'data_src' which is used to
tell the data operation attributions, e.g. operation type is loading or
storing, cache level, it's snooping or remote accessing, etc.  At the
end, the 'data_src' will be parsed by perf mem/c2c tools to display
human readable strings.

This patch is to fill the 'data_src' field in the synthesized samples
base on different types.  Currently perf tool can display statistics for
L1/L2/L3 caches but it doesn't support the 'last level cache'.  To fit
to current implementation, 'data_src' field uses L3 cache for last level
cache.

Before this commit, perf mem report looks like this:
  # Samples: 75K of event 'l1d-miss'
  # Total weight : 75951
  # Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
  #
  # Overhead  Samples  Local Weight  Memory access  Symbol                  Shared Object  Data Symbol             Data Object  Snoop  TLB access
  # ........  .......  ............  .............  ......................  .............  ......................  ...........  .....  ..........
  #
      81.56%    61945  0             N/A            [.] 0x00000000000009d8  serial_c       [.] 0000000000000000    [unknown]    N/A    N/A
      18.44%    14003  0             N/A            [.] 0x0000000000000828  serial_c       [.] 0000000000000000    [unknown]    N/A    N/A

Now on a system with Arm SPE, addresses and access types are displayed:

  # Samples: 75K of event 'l1d-miss'
  # Total weight : 75951
  # Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
  #
  # Overhead  Samples  Local Weight  Memory access  Symbol                  Shared Object  Data Symbol             Data Object  Snoop  TLB access
  # ........  .......  ............  .............  ......................  .............  ......................  ...........  .....  ..........
  #
       0.43%      324  0             L1 miss        [.] 0x00000000000009d8  serial_c       [.] 0x0000ffff80794e00  anon         N/A    Walker hit
       0.42%      322  0             L1 miss        [.] 0x00000000000009d8  serial_c       [.] 0x0000ffff80794580  anon         N/A    Walker hit

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20210211133856.2137-6-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-16 11:52:00 -03:00
Leo Yan
e55ed3423c perf arm-spe: Synthesize memory event
The memory event can deliver two benefits:

- The first benefit is the memory event can give out global view for
  memory accessing, rather than organizing events with scatter mode
  (e.g. uses separate event for L1 cache, last level cache, etc) which
  which can only display a event for single memory type, memory events
  include all memory accessing so it can display the data accessing
  cross memory levels in the same view;

- The second benefit is the sample generation might introduce a big
  overhead and need to wait for long time for Perf reporting, we can
  specify itrace option '--itrace=M' to filter out other events and only
  output memory events, this can significantly reduce the overhead
  caused by generating samples.

This patch is to enable memory event for Arm SPE.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20210211133856.2137-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-16 11:52:00 -03:00
Leo Yan
54f7815efe perf arm-spe: Fill address info for samples
To properly handle memory and branch samples, this patch divides into
two functions for generating samples: arm_spe__synth_mem_sample() is for
synthesizing memory and TLB samples; arm_spe__synth_branch_sample() is
to synthesize branch samples.

Arm SPE backend decoder has passed virtual and physical address through
packets, the address info is stored into the synthesize samples in the
function arm_spe__synth_mem_sample().

Committer notes:

Fixed this:

  36    46.77 fedora:27                     : FAIL clang version 5.0.2 (tags/RELEASE_502/final)

    util/arm-spe.c:269:34: error: missing field 'pid' initializer [-Werror,-Wmissing-field-initializers]
            struct perf_sample sample = { 0 };
                                            ^
    util/arm-spe.c:288:34: error: missing field 'pid' initializer [-Werror,-Wmissing-field-initializers]
            struct perf_sample sample = { 0 };

By using = { .ip = 0, };

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20210211133856.2137-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-16 11:51:08 -03:00
Jianlin Lv
105f75ebf9 perf probe: Fix kretprobe issue caused by GCC bug
Perf failed to add a kretprobe event with debuginfo of vmlinux which is
compiled by gcc with -fpatchable-function-entry option enabled.  The
same issue with kernel module.

Issue:

  # perf probe  -v 'kernel_clone%return $retval'
  ......
  Writing event: r:probe/kernel_clone__return _text+599624 $retval
  Failed to write event: Invalid argument
    Error: Failed to add events. Reason: Invalid argument (Code: -22)

  # cat /sys/kernel/debug/tracing/error_log
  [156.75] trace_kprobe: error: Retprobe address must be an function entry
  Command: r:probe/kernel_clone__return _text+599624 $retval
                                        ^

  # llvm-dwarfdump  vmlinux |grep  -A 10  -w 0x00df2c2b
  0x00df2c2b:   DW_TAG_subprogram
                DW_AT_external  (true)
                DW_AT_name      ("kernel_clone")
                DW_AT_decl_file ("/home/code/linux-next/kernel/fork.c")
                DW_AT_decl_line (2423)
                DW_AT_decl_column       (0x07)
                DW_AT_prototyped        (true)
                DW_AT_type      (0x00dcd492 "pid_t")
                DW_AT_low_pc    (0xffff800010092648)
                DW_AT_high_pc   (0xffff800010092b9c)
                DW_AT_frame_base        (DW_OP_call_frame_cfa)

  # cat /proc/kallsyms |grep kernel_clone
  ffff800010092640 T kernel_clone
  # readelf -s vmlinux |grep -i kernel_clone
  183173: ffff800010092640  1372 FUNC    GLOBAL DEFAULT    2 kernel_clone

  # objdump -d vmlinux |grep -A 10  -w \<kernel_clone\>:
  ffff800010092640 <kernel_clone>:
  ffff800010092640:       d503201f        nop
  ffff800010092644:       d503201f        nop
  ffff800010092648:       d503233f        paciasp
  ffff80001009264c:       a9b87bfd        stp     x29, x30, [sp, #-128]!
  ffff800010092650:       910003fd        mov     x29, sp
  ffff800010092654:       a90153f3        stp     x19, x20, [sp, #16]

The entry address of kernel_clone converted by debuginfo is _text+599624
(0x92648), which is consistent with the value of DW_AT_low_pc attribute.
But the symbolic address of kernel_clone from /proc/kallsyms is
ffff800010092640.

This issue is found on arm64, -fpatchable-function-entry=2 is enabled when
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y;
Just as objdump displayed the assembler contents of kernel_clone,
GCC generate 2 NOPs  at the beginning of each function.

kprobe_on_func_entry detects that (_text+599624) is not the entry address
of the function, which leads to the failure of adding kretprobe event.

  kprobe_on_func_entry
  ->_kprobe_addr
  ->kallsyms_lookup_size_offset
  ->arch_kprobe_on_func_entry		// FALSE

The cause of the issue is that the first instruction in the compile unit
indicated by DW_AT_low_pc does not include NOPs.
This issue exists in all gcc versions that support
-fpatchable-function-entry option.

I have reported it to the GCC community:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98776

Currently arm64 and PA-RISC may enable fpatchable-function-entry option.
The kernel compiled with clang does not have this issue.

FIX:

This GCC issue only cause the registration failure of the kretprobe event
which doesn't need debuginfo. So, stop using debuginfo for retprobe.
map will be used to query the probe function address.

Signed-off-by: Jianlin Lv <Jianlin.Lv@arm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: clang-built-linux@googlegroups.com
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20210210062646.2377995-1-Jianlin.Lv@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-12 18:34:25 -03:00
Nicholas Fraser
77771a9701 perf symbols: Fix return value when loading PE DSO
The first time dso__load() was called on a PE file it always returned -1
error. This caused the first call to map__find_symbol() to always fail
on a PE file so the first sample from each PE file always had symbol
<unknown>. Subsequent samples succeed however because the DSO is already
loaded.

This fixes dso__load() to return 0 when successfully loading a DSO with
libbfd.

Fixes: eac9a4342e5447ca ("perf symbols: Try reading the symbol table with libbfd")
Signed-off-by: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Huw Davies <huw@codeweavers.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Ulrich Czekalla <uczekalla@codeweavers.com>
Link: http://lore.kernel.org/lkml/1671b43b-09c3-1911-dbf8-7f030242fbf7@codeweavers.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-12 18:21:02 -03:00
Nicholas Fraser
00a3423492 perf symbols: Make dso__load_bfd_symbols() load PE files from debug cache only
dso__load_bfd_symbols() attempts to load a DSO at its original path,
then closes it and loads the file in the debug cache. This is incorrect.
It should ignore the original file and work with only the debug cache.

The original file may have changed or may not even exist, for example if
the debug cache has been transferred to another machine via "perf
archive".

This fix makes it only load the file in the debug cache.

Further notes from Nicholas:

dso__load_bfd_symbols() is called in a loop from dso__load() for a variety
of paths. These are generated by the various DSO_BINARY_TYPEs in the
binary_type_symtab list at the top of util/symbol.c. In each case the
debugfile passed to dso__load_bfd_symbols() is the path to try.

One of those iterations (the first one I believe) passes the original path
as the debugfile. If the file still exists at the original path, this is
the one that ends up being used in case the debugcache was deleted or the
PE file doesn't have a build-id.

A later iteration (BUILD_ID_CACHE) passes debugfile as the file in the
debugcache if it has a build-id. Even if the file was previously loaded at
its original path, (if I understand correctly) this load will override it
so the debugcache file ends up being used.

Committer notes:

So if it fails to find in the cache, it will eventually hope for the
best and look at the path in the local filesystem, which in many cases
is enough.

At some point we need to switch from this "hope for the best" approach
to one that warns the user that there is no guarantee, if no buildid is
present, that just by looking at the pathname the symbolisation will
work.

Signed-off-by: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Huw Davies <huw@codeweavers.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
Cc: Ulrich Czekalla <uczekalla@codeweavers.com>
Link: http://lore.kernel.org/lkml/e58e1237-94ab-e1c9-a7b9-473531906954@codeweavers.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-12 18:18:09 -03:00
Leo Yan
97ae666ae0 perf arm-spe: Store operation type in packet
This patch is to store operation type in packet structure.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20210211133856.2137-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-12 17:43:24 -03:00
Leo Yan
265cfb9586 perf arm-spe: Store memory address in packet
This patch is to store virtual and physical memory addresses in packet,
which will be used for memory samples.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210211133856.2137-2-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-12 17:43:21 -03:00
Leo Yan
845d3a65c3 perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC
This patch is to enable sample type PERF_SAMPLE_DATA_SRC for Arm SPE in
the perf data, when output the tracing data, it tells tools that it
contains data source in the memory event.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210211133856.2137-1-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-12 17:37:46 -03:00
Ian Rogers
e73b0d586e perf env: Remove unneeded internal/cpumap inclusions
Minor cleanup.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210211183914.4093187-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-12 17:35:48 -03:00
Ian Rogers
b1cdc7d33f perf tools: Remove unused xyarray.c as it was moved to tools/lib/perf
Migrated to libperf in:

  4b247fa7314ce482 ("libperf: Adopt xyarray class from perf")

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210212043803.365993-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-12 17:25:17 -03:00
Dmitry Safonov
96de68fff5 perf symbols: Use (long) for iterator for bfd symbols
GCC (GCC) 8.4.0 20200304 fails to build perf with:
: util/symbol.c: In function 'dso__load_bfd_symbols':
: util/symbol.c:1626:16: error: comparison of integer expressions of different signednes
:   for (i = 0; i < symbols_count; ++i) {
:                 ^
: util/symbol.c:1632:16: error: comparison of integer expressions of different signednes
:    while (i + 1 < symbols_count &&
:                 ^
: util/symbol.c:1637:13: error: comparison of integer expressions of different signednes
:    if (i + 1 < symbols_count &&
:              ^
: cc1: all warnings being treated as errors

It's unlikely that the symtable will be that big, but the fix is an
oneliner and as perf has CORE_CFLAGS += -Wextra, which makes build to
fail together with CORE_CFLAGS += -Werror

Fixes: eac9a4342e54 ("perf symbols: Try reading the symbol table with libbfd")
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Jacek Caban <jacek@codeweavers.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Link: http://lore.kernel.org/lkml/20210209145148.178702-1-dima@arista.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 20:51:29 -03:00
Martin Liška
1f0e6edcd9 perf annotate: Fix jump parsing for C++ code.
Considering the following testcase:

  int
  foo(int a, int b)
  {
     for (unsigned i = 0; i < 1000000000; i++)
       a += b;
     return a;
  }

  int main()
  {
     foo (3, 4);
     return 0;
  }

'perf annotate' displays:

  86.52 │40055e: → ja   40056c <foo(int, int)+0x26>
  13.37 │400560:   mov  -0x18(%rbp),%eax
        │400563:   add  %eax,-0x14(%rbp)
        │400566:   addl $0x1,-0x4(%rbp)
   0.11 │40056a: → jmp  400557 <foo(int, int)+0x11>
        │40056c:   mov  -0x14(%rbp),%eax
        │40056f:   pop  %rbp

and the 'ja 40056c' does not link to the location in the function.  It's
caused by fact that comma is wrongly parsed, it's part of function
signature.

With my patch I see:

  86.52 │   ┌──ja   26
  13.37 │   │  mov  -0x18(%rbp),%eax
        │   │  add  %eax,-0x14(%rbp)
        │   │  addl $0x1,-0x4(%rbp)
   0.11 │   │↑ jmp  11
        │26:└─→mov  -0x14(%rbp),%eax

and 'o' output prints:

  86.52 │4005┌── ↓ ja   40056c <foo(int, int)+0x26>
  13.37 │4005│0:   mov  -0x18(%rbp),%eax
        │4005│3:   add  %eax,-0x14(%rbp)
        │4005│6:   addl $0x1,-0x4(%rbp)
   0.11 │4005│a: ↑ jmp  400557 <foo(int, int)+0x11>
        │4005└─→   mov  -0x14(%rbp),%eax

On the contrary, compiling the very same file with gcc -x c, the parsing
is fine because function arguments are not displayed:

  jmp  400543 <foo+0x1d>

Committer testing:

Before:

  $ cat cpp_args_annotate.c
  int
  foo(int a, int b)
  {
     for (unsigned i = 0; i < 1000000000; i++)
       a += b;
     return a;
  }

  int main()
  {
     foo (3, 4);
     return 0;
  }
  $ gcc --version |& head -1
  gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9)
  $ gcc -g cpp_args_annotate.c -o cpp_args_annotate
  $ perf record ./cpp_args_annotate
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.275 MB perf.data (7188 samples) ]
  $ perf annotate --stdio2 foo
  Samples: 7K of event 'cycles:u', 4000 Hz, Event count (approx.): 7468429289, [percent: local period]
  foo() /home/acme/c/cpp_args_annotate
  Percent
              0000000000401106 <foo>:
              foo():
              int
              foo(int a, int b)
              {
                push %rbp
                mov  %rsp,%rbp
                mov  %edi,-0x14(%rbp)
                mov  %esi,-0x18(%rbp)
              for (unsigned i = 0; i < 1000000000; i++)
                movl $0x0,-0x4(%rbp)
              ↓ jmp  1d
              a += b;
   13.45  13:   mov  -0x18(%rbp),%eax
                add  %eax,-0x14(%rbp)
              for (unsigned i = 0; i < 1000000000; i++)
                addl $0x1,-0x4(%rbp)
    0.09  1d:   cmpl $0x3b9ac9ff,-0x4(%rbp)
   86.46      ↑ jbe  13
              return a;
                mov  -0x14(%rbp),%eax
              }
                pop  %rbp
              ← retq
  $

I.e. works for C, now lets switch to C++:

  $ g++ -g cpp_args_annotate.c -o cpp_args_annotate
  $ perf record ./cpp_args_annotate
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.268 MB perf.data (6976 samples) ]
  $ perf annotate --stdio2 foo
  Samples: 6K of event 'cycles:u', 4000 Hz, Event count (approx.): 7380681761, [percent: local period]
  foo() /home/acme/c/cpp_args_annotate
  Percent
              0000000000401106 <foo(int, int)>:
              foo(int, int):
              int
              foo(int a, int b)
              {
                push %rbp
                mov  %rsp,%rbp
                mov  %edi,-0x14(%rbp)
                mov  %esi,-0x18(%rbp)
              for (unsigned i = 0; i < 1000000000; i++)
                movl $0x0,-0x4(%rbp)
                cmpl $0x3b9ac9ff,-0x4(%rbp)
   86.53      → ja   40112c <foo(int, int)+0x26>
              a += b;
   13.32        mov  -0x18(%rbp),%eax
    0.00        add  %eax,-0x14(%rbp)
              for (unsigned i = 0; i < 1000000000; i++)
                addl $0x1,-0x4(%rbp)
    0.15      → jmp  401117 <foo(int, int)+0x11>
              return a;
                mov  -0x14(%rbp),%eax
              }
                pop  %rbp
              ← retq
  $

Reproduced.

Now with this patch:

Reusing the C++ built binary, as we can see here:

  $ readelf -wi cpp_args_annotate | grep producer
    <c>   DW_AT_producer    : (indirect string, offset: 0x2e): GNU C++14 10.2.1 20201125 (Red Hat 10.2.1-9) -mtune=generic -march=x86-64 -g
  $

And furthermore:

  $ file cpp_args_annotate
  cpp_args_annotate: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=4fe3cab260204765605ec630d0dc7a7e93c361a9, for GNU/Linux 3.2.0, with debug_info, not stripped
  $ perf buildid-list -i cpp_args_annotate
  4fe3cab260204765605ec630d0dc7a7e93c361a9
  $ perf buildid-list | grep cpp_args_annotate
  4fe3cab260204765605ec630d0dc7a7e93c361a9 /home/acme/c/cpp_args_annotate
  $

It now works:

  $ perf annotate --stdio2 foo
  Samples: 6K of event 'cycles:u', 4000 Hz, Event count (approx.): 7380681761, [percent: local period]
  foo() /home/acme/c/cpp_args_annotate
  Percent
              0000000000401106 <foo(int, int)>:
              foo(int, int):
              int
              foo(int a, int b)
              {
                push %rbp
                mov  %rsp,%rbp
                mov  %edi,-0x14(%rbp)
                mov  %esi,-0x18(%rbp)
              for (unsigned i = 0; i < 1000000000; i++)
                movl $0x0,-0x4(%rbp)
          11:   cmpl $0x3b9ac9ff,-0x4(%rbp)
   86.53      ↓ ja   26
              a += b;
   13.32        mov  -0x18(%rbp),%eax
    0.00        add  %eax,-0x14(%rbp)
              for (unsigned i = 0; i < 1000000000; i++)
                addl $0x1,-0x4(%rbp)
    0.15      ↑ jmp  11
              return a;
          26:   mov  -0x14(%rbp),%eax
              }
                pop  %rbp
              ← retq
  $

Signed-off-by: Martin Liška <mliska@suse.cz>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Link: http://lore.kernel.org/lkml/13e1a405-edf9-e4c2-4327-a9b454353730@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 18:23:53 -03:00
Kees Cook
6edfd0ebb8 perf tools: Replace lkml.org links with lore
As started by commit 05a5f51ca566 ("Documentation: Replace lkml.org
links with lore"), replace lkml.org links with lore to better use a
single source that's more likely to stay available long-term.

Signed-off-by: Kees Kook <keescook@chromium.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: http://lore.kernel.org/lkml/20210210234220.2401035-1-keescook@chromium.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 12:54:27 -03:00
Arnaldo Carvalho de Melo
fc52336288 tools headers UAPI: Sync linux/prctl.h with the kernel sources
To pick a new prctl introduced in:

  36a6c843fd0d8e02 ("entry: Use different define for selector variable in SUD")

That don't result in any changes in tooling:

  $ tools/perf/trace/beauty/prctl_option.sh > before
  $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
  $ tools/perf/trace/beauty/prctl_option.sh > after
  $ diff -u before after

Just silences this perf tools build warning:

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 12:50:52 -03:00
Jiri Olsa
dec34515b5 perf tests: Add daemon 'lock' test
Add a test for the perf daemon 'lock' command ensuring only one instance
of daemon can run over one base directory.

Committer testing:

  [root@five ~]# perf test -v daemon
  76: daemon operations                                               :
  --- start ---
  test child forked, pid 793255
  test daemon list
  test daemon reconfig
  test daemon stop
  test daemon signal
  signal 12 sent to session 'test [793506]'
  signal 12 sent to session 'test [793506]'
  test daemon ping
  test daemon lock
  test child finished with 0
  ---- end ----
  daemon operations: Ok
  [root@five ~]#

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-25-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:19:53 -03:00
Jiri Olsa
63551dc771 perf tests: Add daemon 'ping' command test
Add a test for the perf daemon 'ping' command. The tests verifies the
ping command gets proper answer from sessions.

Committer testing:

  [root@five ~]# perf test daemon
  76: daemon operations                                               : Ok
  [root@five ~]# perf test -v daemon
  76: daemon operations                                               :
  --- start ---
  test child forked, pid 792143
  test daemon list
  test daemon reconfig
  test daemon stop
  test daemon signal
  signal 12 sent to session 'test [792415]'
  signal 12 sent to session 'test [792415]'
  test daemon ping
  test child finished with 0
  ---- end ----
  daemon operations: Ok
  [root@five ~]#

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-24-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:19:53 -03:00
Jiri Olsa
f32102aa33 perf tests: Add daemon 'signal' command test
Add a test for the perf daemon 'signal' command. The test sends a signal
to configured sessions and verifies the perf data files were generated
accordingly.

  Committer testing:

  [root@five ~]# perf test daemon
  76: daemon operations                                               : Ok
  [root@five ~]# perf test -v daemon
  76: daemon operations                                               :
  --- start ---
  test child forked, pid 790017
  test daemon list
  test daemon reconfig
  test daemon stop
  test daemon signal
  signal 12 sent to session 'test [790268]'
  signal 12 sent to session 'test [790268]'
  test child finished with 0
  ---- end ----
  daemon operations: Ok
  [root@five ~]#

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-23-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:19:53 -03:00
Jiri Olsa
f624f6d0f6 perf tests: Add daemon 'stop' command test
Add a test for the perf daemon 'stop' command. The test stops the daemon
and verifies all the configured sessions are properly terminated.

Committer testing:

  [root@five ~]# time perf test daemon
  76: daemon operations                                               : Ok
  [root@five ~]# time perf test -v daemon
  76: daemon operations                                               :
  --- start ---
  test child forked, pid 788560
  test daemon list
  test daemon reconfig
  test daemon stop
  test child finished with 0
  ---- end ----
  daemon operations: Ok
  #

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-22-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:19:53 -03:00
Jiri Olsa
91a17d6f63 perf tests: Add daemon reconfig test
Add a test for daemon reconfiguration. The test changes the
configuration file and checks that the session is changed properly.

Committer testing:

  [root@five ~]# perf test daemon
  76: daemon operations                                               : Ok
  [root@five ~]# time perf test daemon
  76: daemon operations                                               : Ok

  real	0m6.055s
  user	0m0.174s
  sys	0m0.147s
  [root@five ~]# time perf test -v daemon
  76: daemon operations                                               :
  --- start ---
  test child forked, pid 786863
  test daemon list
  test daemon reconfig
  test child finished with 0
  ---- end ----
  daemon operations: Ok

  real	0m6.127s
  user	0m0.222s
  sys	0m0.165s
  [root@five ~]#

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-21-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:19:53 -03:00
Jiri Olsa
2291bb915b perf tests: Add daemon 'list' command test
Add test for basic perf daemon listing via the CSV output mode (-x
option).

Check that the configured sessions display expected values.

Committer testing:

  [root@five ~]# perf test daemon
  76: daemon operations                                               : Ok
  [root@five ~]#
  [root@five ~]# perf test -v daemon
  76: daemon operations                                               :
  --- start ---
  test child forked, pid 785037
  test daemon list
  test child finished with 0
  ---- end ----
  daemon operations: Ok
  [root@five ~]#

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:19:53 -03:00
Jiri Olsa
13fb3b9f5b perf daemon: Add examples to man page
Add usage examples to the man page.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-19-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:19:52 -03:00
Jiri Olsa
5bdee4f051 perf daemon: Add up time for daemon/session list
Display up time for both daemon and sessions.

Example:

  # cat ~/.perfconfig
  [daemon]
  base=/opt/perfdata

  [session-cycles]
  run = -m 10M -e cycles --overwrite --switch-output -a

  [session-sched]
  run = -m 20M -e sched:* --overwrite --switch-output -a

Starting the daemon:

  # perf daemon start

Get the details with up time:

  # perf daemon -v
  [778315:daemon] base: /opt/perfdata
    output:  /opt/perfdata/output
    lock:    /opt/perfdata/lock
    up:      15 minutes
  [778316:cycles] perf record -m 20M -e cycles --overwrite --switch-output -a
    base:    /opt/perfdata/session-cycles
    output:  /opt/perfdata/session-cycles/output
    control: /opt/perfdata/session-cycles/control
    ack:     /opt/perfdata/session-cycles/ack
    up:      10 minutes
  [778317:sched] perf record -m 20M -e sched:* --overwrite --switch-output -a
    base:    /opt/perfdata/session-sched
    output:  /opt/perfdata/session-sched/output
    control: /opt/perfdata/session-sched/control
    ack:     /opt/perfdata/session-sched/ack
    up:      2 minutes

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-18-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:19:52 -03:00
Jiri Olsa
6d6162d51c perf daemon: Use control to stop session
Use the 'stop' control command to stop perf record session.  If that
fails, fall back to current SIGTERM/SIGKILL pair.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-17-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:19:52 -03:00
Jiri Olsa
edcaa47958 perf daemon: Add 'ping' command
Add a 'ping' command to verify that the 'perf record' session is up and
operational.

It's used in the following patches via test code to make sure 'perf
record' is ready to receive signals.

Example:

  # cat ~/.perfconfig
  [daemon]
  base=/opt/perfdata

  [session-cycles]
  run = -m 10M -e cycles --overwrite --switch-output -a

  [session-sched]
  run = -m 20M -e sched:* --overwrite --switch-output -a

Start the daemon:

  # perf daemon start

Ping all sessions:

  # perf daemon ping
  OK   cycles
  OK   sched

Ping specific session:

  # perf daemon ping --session sched
  OK   sched

Committer notes:

Fixed up bug pointed by clang:

Buggy:

  if (!pollfd.revents & POLLIN)

Correct code:

  if (!(pollfd.revents & POLLIN))

clang warning:

  builtin-daemon.c:560:6: error: logical not is only applied to the left hand side of this bitwise operator [-Werror,-Wlogical-not-parentheses]
          if (!pollfd.revents & POLLIN) {
              ^               ~
  builtin-daemon.c:560:6: note: add parentheses after the '!' to evaluate the bitwise operator first

Also use designated initialized with pollfd, i.e.:

  struct pollfd pollfd = { .events = POLLIN, };

Instead of:

  struct pollfd pollfd = { 0, };

To get past:

    builtin-daemon.c:510:30: error: missing field 'events' initializer [-Werror,-Wmissing-field-initializers]
            struct pollfd pollfd = { 0, };
                                        ^
    1 error generated.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-16-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:19:52 -03:00
Jiri Olsa
6a6d1804a1 perf daemon: Set control fifo for session
Setup control fifos for session and add --control option to session
arguments.

Example:

  # cat ~/.perfconfig
  [daemon]
  base=/opt/perfdata

  [session-cycles]
  run = -m 10M -e cycles --overwrite --switch-output -a

  [session-sched]
  run = -m 20M -e sched:* --overwrite --switch-output -a

Starting the daemon:

  # perf daemon start

Use can list control fifos with (control and ack files):

  # perf daemon -v
  [776459:daemon] base: /opt/perfdata
    output:  /opt/perfdata/output
    lock:    /opt/perfdata/lock
  [776460:cycles] perf record -m 20M -e cycles --overwrite --switch-output -a
    base:    /opt/perfdata/session-cycles
    output:  /opt/perfdata/session-cycles/output
    control: /opt/perfdata/session-cycles/control
    ack:     /opt/perfdata/session-cycles/ack
  [776461:sched] perf record -m 20M -e sched:* --overwrite --switch-output -a
    base:    /opt/perfdata/session-sched
    output:  /opt/perfdata/session-sched/output
    control: /opt/perfdata/session-sched/control
    ack:     /opt/perfdata/session-sched/ack

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-15-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:19:52 -03:00
Jiri Olsa
8c98be6c36 perf daemon: Allow only one daemon over base directory
Add 'lock' file under daemon base and flock it, so only one perf daemon
can run on top of it.

Each daemon tries to create and lock BASE/lock file, if it's successful
we are sure we're the only daemon running over the BASE.

Once daemon is finished, file descriptor to lock file is closed and lock
is released.

Example:

  # cat ~/.perfconfig
  [daemon]
  base=/opt/perfdata

  [session-cycles]
  run = -m 10M -e cycles --overwrite --switch-output -a

  [session-sched]
  run = -m 20M -e sched:* --overwrite --switch-output -a

Starting the daemon:

  # perf daemon start

And try once more:

  # perf daemon start
  failed: another perf daemon (pid 775594) owns /opt/perfdata

will end up with an error, because there's already one running
on top of /opt/perfdata.

Committer notes:

Provide lockf(F_TLOCK) when not available, i.e. transform:

  lockf(fd, F_TLOCK, 0);

into:

  flock(fd, LOCK_EX | LOCK_NB);

Which should be equivalent.

Noticed when cross building to some odd Android NDK.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-14-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:16:56 -03:00
Jiri Olsa
23c5831e2e perf daemon: Add 'stop' command
Add 'perf daemon stop' command to stop daemon process and all running
sessions.

Example:

  # cat ~/.perfconfig
  [daemon]
  base=/opt/perfdata

  [session-cycles]
  run = -m 10M -e cycles --overwrite --switch-output -a

  [session-sched]
  run = -m 20M -e sched:* --overwrite --switch-output -a

Start the daemon:

  # perf daemon start

Stop the daemon

  # perf daemon stop

Daemon is not running, nothing to connect to:

  # perf daemon
  connect error: Connection refused

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-13-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:02:54 -03:00
Jiri Olsa
2d6914cd59 perf daemon: Add 'signal' command
Allow the 'perf daemon' to send SIGUSR2 to all running sessions or just
to a specific session.

Example:

  # cat ~/.perfconfig
  [daemon]
  base=/opt/perfdata

  [session-cycles]
  run = -m 10M -e cycles --overwrite --switch-output -a

  [session-sched]
  run = -m 20M -e sched:* --overwrite --switch-output -a

Start the daemon:

  # perf daemon start

Send signal to all running sessions:

  # perf daemon signal
  signal 12 sent to session 'cycles [773738]'
  signal 12 sent to session 'sched [773739]'

Or to specific one:

  # perf daemon signal --session sched
  signal 12 sent to session 'sched [773739]'

And verify signals were delivered and perf.data dumped:

  # cat /opt/perfdata/session-cycles/output
  rounding mmap pages size to 32M (8192 pages)
  [ perf record: dump data: Woken up 1 times ]
  [ perf record: Dump perf.data.2021010220382490 ]

  # car /opt/perfdata/session-sched/output
  rounding mmap pages size to 32M (8192 pages)
  [ perf record: dump data: Woken up 1 times ]
  [ perf record: Dump perf.data.2021010220382489 ]
  [ perf record: dump data: Woken up 1 times ]
  [ perf record: Dump perf.data.2021010220393745 ]

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-12-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:02:54 -03:00
Jiri Olsa
b325f7be25 perf daemon: Add 'list' command
Add a 'list' command to display all running sessions.  It's the default
command if no other command is specified.

Example:

  # cat ~/.perfconfig
  [daemon]
  base=/opt/perfdata

  [session-cycles]
  run = -m 10M -e cycles --overwrite --switch-output -a

  [session-sched]
  run = -m 20M -e sched:* --overwrite --switch-output -a

Start the daemon:

  # perf daemon start

List sessions:

  # perf daemon
  [771394:daemon] base: /opt/perfdata
  [771395:cycles] perf record -m 10M -e cycles --overwrite --switch-output -a
  [771396:sched] perf record -m 20M -e sched:* --overwrite --switch-output -a

List sessions with more info:

  # perf daemon -v
  [771394:daemon] base: /opt/perfdata
    output:  /opt/perfdata/output
  [771395:cycles] perf record -m 10M -e cycles --overwrite --switch-output -a
    base:    /opt/perfdata/session-cycles
    output:  /opt/perfdata/session-cycles/output
  [771396:sched] perf record -m 20M -e sched:* --overwrite --switch-output -a
    base:    /opt/perfdata/session-sched
    output:  /opt/perfdata/session-sched/output

The 'output' file is perf record output for specific session.

Note you have to stop all running perf processes manually at this point,
stop command is coming in following patches.

Committer notes:

Fixup union initialization to overcome this in multiple older systems:

  22    15.74 debian:8                      : FAIL gcc version 4.9.2 (Debian 4.9.2-10+deb8u2)

    builtin-daemon.c: In function 'send_cmd_list':
    builtin-daemon.c:1386:2: error: missing initializer for field 'csv_sep' of 'struct <anonymous>' [-Werror=missing-field-initializers]
      };
      ^
    builtin-daemon.c:641:8: note: 'csv_sep' declared here
       char csv_sep;
            ^
    cc1: all warnings being treated as errors

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-11-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:02:54 -03:00
Jiri Olsa
12c1a415eb perf daemon: Add signalfd support
Use a signalfd fd to track SIGCHLD signals as notifications for perf
session termination.

This way we don't need to actively check for child status, being
notified if there's change.

Suggested-by: Alexei Budankov <abudankov@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:02:54 -03:00
Jiri Olsa
88adb1194c perf daemon: Add background support
Add support to put the daemon process in the background.

It's now enabled by default and -f option is added to keep the daemon
process on the console for debugging.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:02:54 -03:00
Jiri Olsa
3cda062520 perf daemon: Add config file change check
Add support to detect changes to the daemon's config file triggering a
re-read of the configuration when that happens.

Use a inotify file descriptor plugged into the main fdarray object for
polling.

Example:

  # cat ~/.perfconfig
  [daemon]
  base=/opt/perfdata

  [session-cycles]
  run = -m 10M -e cycles --overwrite --switch-output -a

Starting the daemon:

  # perf daemon start

Check sessions:

  # perf daemon
  [772262:daemon] base: /opt/perfdata
  [772263:cycles] perf record -m 10M -e cycles --overwrite --switch-output -a

Change '-m 10M' to '-m 20M', and check daemon log:

  # tail -f /opt/perfdata/output
  [2021-01-02 20:31:41.234045] daemon started (pid 772262)
  [2021-01-02 20:31:41.235072] reconfig: ruining session [cycles:772263]: -m 10M -e cycles --overwrite --switch-output -a
  [2021-01-02 20:32:08.310137] reconfig: session 'cycles' killed
  [2021-01-02 20:32:08.310847] reconfig: ruining session [cycles:772338]: -m 20M -e cycles --overwrite --switch-output -a

And the session list:

  # perf daemon
  [772262:daemon] base: /opt/perfdata
  [772338:cycles] perf record -m 20M -e cycles --overwrite --switch-output -a

Note the changed '-m 20M' option is in place.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:02:54 -03:00
Jiri Olsa
c0666261ff perf daemon: Add config file support
Adding support to configure daemon with config file.

Each client or server invocation of perf daemon needs to know the
base directory, where all sessions data is stored.

The base is defined with:

  daemon.base
    Base path for daemon data. All sessions data are stored under
    this path.

The daemon allows to create record sessions. Each session is a
record command spawned and monitored by perf daemon.

The session is defined with:

  session-<NAME>.run
    Defines new record session for daemon. The value is record's
    command line without the 'record' keyword.

Example:

  # cat ~/.perfconfig
  [daemon]
  base=/opt/perfdata

  [session-cycles]
  run = -m 10M -e cycles --overwrite --switch-output -a

  [session-sched]
  run = -m 20M -e sched:* --overwrite --switch-output -a

The example above defines '/opt/perfdata' as the base directory and 2
record sessions.

  # perf daemon start
  [2021-01-28 19:47:33.454413] daemon started (pid 16015)
  [2021-01-28 19:47:33.455910] reconfig: ruining session [cycles:16016]: -m 10M -e cycles --overwrite --switch-output -a
  [2021-01-28 19:47:33.456599] reconfig: ruining session [sched:16017]: -m 20M -e sched:* --overwrite --switch-output -a

  # ps -ef | grep perf
  ... perf daemon start
  ... /home/jolsa/.../perf record -m 20M -e cycles --overwrite --switch-output -a
  ... /home/jolsa/.../perf record -m 20M -e sched:* --overwrite --switch-output -a

The base directory is populated with:

  # find /opt/perfdata/
  /opt/perfdata/
  /opt/perfdata/control                    <- control socket
  /opt/perfdata/session-cycles             <- data for session 'cycles':
  /opt/perfdata/session-cycles/output      <-   perf record output
  /opt/perfdata/session-cycles/perf.data   <-   perf data
  /opt/perfdata/session-sched              <- ditto for session 'sched'
  /opt/perfdata/session-sched/output
  /opt/perfdata/session-sched/perf.data

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 10:02:54 -03:00
Jiri Olsa
90b0aad8f6 perf daemon: Add client socket support
Add support for client socket side that will be used to send commands to
the daemon server socket.

This patch adds only the core support, all commands using this
functionality are coming in the following patches.

Committer notes:

Hat to patch patch it to deal with this in some systems:

  cc1: warnings being treated as errors
  builtin-daemon.c: In function 'send_cmd':  MKDIR    /tmp/build/perf/bench/

  builtin-daemon.c:1368: error: ignoring return value of 'fwrite', declared with attribute warn_unused_result
    MKDIR    /tmp/build/perf/tests/
  make[3]: *** [/tmp/build/perf/builtin-daemon.o] Error 1

And also to not leak the 'line' buffer allocated by getline(), since you
initialized line to NULL and len to zero, man page says:

  If *lineptr is set to NULL and *n is set 0 before the call,
  then getline() will allocate a buffer for storing the line.
  This buffer should be freed by the user program even if
  getline() failed.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20210208200908.1019149-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11 09:52:28 -03:00
Linus Torvalds
291009f656 Power management fixes for 5.11-rc8
Address a performance regression related to scale-invariance on x86
 that may prevent turbo CPU frequencies from being used in certain
 workloads on systems using acpi-cpufreq as the CPU performance
 scaling driver and schedutil as the scaling governor.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmAkGloSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxCNkP/3uQly0iE4WdsxiBlWgF6zQH5PkezzSu
 vXY0E2NbzAlUpke3zDIZ6zkN6DfG1yAl4vpsVzy5N/kTzFnaFPbPH3ylZ2x/oKBZ
 Rd9yl0uz13UR4txkY49ZRF3c3vMhFoGzNfIXYjMQDGevyfarMLdpR96GFbpFTCT5
 I5gZfDtOuAwpXY+Mr+UplTu7PTrmkf2jfQ/T/b+jog3NAjqODwnLT8dwIOTZnuPk
 vbCOhb5vsiUHaqilrKkuGS5TzGsb/KCa6k4kaf7WhoFiU99KKcaZRLca/4FlJuVj
 Q4rgSrtPsbvhG2vmucprunrsyt21JQMDnERqMlPcEls/c0ONgS4fMc5YJlO6KgZZ
 Mlu01f/oE84jQ//0Y3LVi6v6w+yOiBi1Ie9yD8wnOkn6c+r6sWWbvd5Kg/guGnwi
 LLSdemslw4r0ltimFmWD5I86ZXDJ1gwU9iuv+SdxoyppHHwOOAu5l/FhEgNvuWbl
 LeuLrl7BhYTbN40ouKivoQ8smTpI0EmZX2MRm+l5NV4hHQ+8df16Rt7hoFvAdI2L
 VJe/i0sgOcdCVnSovxQ8WeuSMGQtqrgFC4B/9+q6WOgAGIAFtyHQUtpHzB+XsIRo
 P2VAmxcdJsqrmnbtoxxopMcqov5cAYI5PPJO/4yR+Z+gOBjeBvEJaxDF/shFT2NO
 iaAQXEYLIPW9
 =iOYP
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "Address a performance regression related to scale-invariance on x86
  that may prevent turbo CPU frequencies from being used in certain
  workloads on systems using acpi-cpufreq as the CPU performance scaling
  driver and schedutil as the scaling governor"

* tag 'pm-5.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: ACPI: Update arch scale-invariance max perf ratio if CPPC is not there
  cpufreq: ACPI: Extend frequency tables to cover boost frequencies
2021-02-10 12:03:35 -08:00
Linus Torvalds
a3961497bd ACPI fix for 5.11-rc8
Revert a problematic ACPICA commit that changed the code to attempt
 to update memory regions which may be read-only on some systems (Ard
 Biesheuvel).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmAkGeESHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx90EQAJvCLtbXhf+AsQbYiDhGwH0VE4EvDjqJ
 M3qZL8QDeXx+HRqicdCevpdukk3CAosm468yudwrEIpk7vUTtW2vgjTeVNuH3O4n
 Yz7Aw5EXfOO7LGXpZ+oyYPNpPZ9DK5BugpZW6C85qlBPHGzTem7IkoEOEem7lvUV
 QwahrwZ4D20/blgunGGt5jx8ilp9YA9wQl0FUaoiAfH7ydoA4YTbFqdcG2jEe9Xb
 yQvpLPoD50QTO0Gc1R86C0rehNfVQaXI9L5Nf8w5LU0UPGL2fTWkNwDFga8XvrH/
 ZkIVWIFYBF6cgiOaB5TaZ/ufWw7BXe8c2X8lEdn2uJjZ4CXyKBVxAtdiqk+Bf57o
 mMszQDwf9hot+fXpq3TDeDhu8KwoLMpHITGxUzZ2Y50wtsb/vwkT45x9PWmUObTz
 fXvKbvGWZ/QFABsAGlBCAXUsTsUhkCnBuZ7icbbdG+HAqLg3/QJHvORg7pFYNCJH
 wlfChKpoH9n69VIo5Ae5NFxp678rW616E3N1yTF+0OK9uKhVqu7PxqRd9qWLOG4F
 Fm/fJB9XtIOzNaVmSqMd8YONpFv1Kf450d8uC3D0QPrCSE9OPl3b+JnqEfhxx2xl
 sjcT0UKewwm+H0pTzk/FdYqW1GpTV/5uWu+onhfZvVGePuJCRSJ+URw7FuhUfmiE
 iTAT8mV3zo6u
 =0hUo
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "Revert a problematic ACPICA commit that changed the code to attempt to
  update memory regions which may be read-only on some systems (Ard
  Biesheuvel)"

* tag 'acpi-5.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "ACPICA: Interpreter: fix memory leak by using existing buffer"
2021-02-10 11:58:21 -08:00
Linus Torvalds
708c2e4181 dmaengine fixes-2 for v5.11
Some late fixes for dmaengine:
  - Core: fix channel device_node deletion
  - Driver fixes for:
    - dw: revert of runtime pm enabling
    - idxd: device state fix, interrupt completion and list corruption
    - ti: resource leak
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmAjrkQACgkQfBQHDyUj
 g0foZA/+Iqpi6fU0Dth4bdoJa5HO63a62G5nrhofF/vH681GMaazNj46byol3vuA
 Gc0/EZ2UtIkEY29ix0XaHkksQrsqn/Q5E4QK+5u9x32DHf3jvtbOblOSIBCdr//3
 i+uc/K90ot4ERtNvwiPxQGWjS7rF+6BvHItRDxOaiele0Uvf18/VGn2x7fH5vNeK
 GqtZK47E11y5UhqpJAiwcgNAhKXC6I6s/tP0pidyWuXWeqVm+usr6Pun9YExMJQm
 N+kiR8eJoh5F0N9KAg3rOppxf4iEblvgh2vfMgcNC63GdeWB2x1OMgizAXjE136K
 HAvcp/3rQf76tUhjZkr/YZaNB7wCqzCRRcgQ/xyhSJt24yswfv9NFGVHd2ltkfx9
 Yp+rl8ZC0dSvdGR3ECF9z98MzRbBPgu+TCW/50/Hh42Va0FJZbXyY45hpfR9qPe2
 hiXwQkJ8IKH7C8BpDKA8vMlJc4xhbNsYW0GaSyoAUzhaStwTHcKNB4+5Xeia55e3
 RR2OPJXl+y3jywcO15fmFdNIRsSvRVGYioFH0NzneaVVIlbQk5hRqADNMWelnwiA
 DJc21v7yurHeCh3lefn5Aml10n986S1b7XNPA7Ls+2FMmJeIt2vrKqvmKLhHVANY
 bvSKEXda2pAvb3zw2fCcCuPq6KUdJAfDrB0oorIlRBM3IkY+B5Y=
 =j/zV
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-fix2-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine fixes from Vinod Koul:
 "Some late fixes for dmaengine:

  Core:
   - fix channel device_node deletion

  Driver fixes:
   - dw: revert of runtime pm enabling
   - idxd: device state fix, interrupt completion and list corruption
   - ti: resource leak

* tag 'dmaengine-fix2-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
  dmaengine dw: Revert "dmaengine: dw: Enable runtime PM"
  dmaengine: idxd: check device state before issue command
  dmaengine: ti: k3-udma: Fix a resource leak in an error handling path
  dmaengine: move channel device_node deletion to driver
  dmaengine: idxd: fix misc interrupt completion
  dmaengine: idxd: Fix list corruption in description completion
2021-02-10 11:51:25 -08:00
Linus Torvalds
6016bf19b3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
 "Another pile of networing fixes:

   1) ath9k build error fix from Arnd Bergmann

   2) dma memory leak fix in mediatec driver from Lorenzo Bianconi.

   3) bpf int3 kprobe fix from Alexei Starovoitov.

   4) bpf stackmap integer overflow fix from Bui Quang Minh.

   5) Add usb device ids for Cinterion MV31 to qmi_qwwan driver, from
      Christoph Schemmel.

   6) Don't update deleted entry in xt_recent netfilter module, from
      Jazsef Kadlecsik.

   7) Use after free in nftables, fix from Pablo Neira Ayuso.

   8) Header checksum fix in flowtable from Sven Auhagen.

   9) Validate user controlled length in qrtr code, from Sabyrzhan
      Tasbolatov.

  10) Fix race in xen/netback, from Juergen Gross,

  11) New device ID in cxgb4, from Raju Rangoju.

  12) Fix ring locking in rxrpc release call, from David Howells.

  13) Don't return LAPB error codes from x25_open(), from Xie He.

  14) Missing error returns in gsi_channel_setup() from Alex Elder.

  15) Get skb_copy_and_csum_datagram working properly with odd segment
      sizes, from Willem de Bruijn.

  16) Missing RFS/RSS table init in enetc driver, from Vladimir Oltean.

  17) Do teardown on probe failure in DSA, from Vladimir Oltean.

  18) Fix compilation failures of txtimestamp selftest, from Vadim
      Fedorenko.

  19) Limit rx per-napi gro queue size to fix latency regression, from
      Eric Dumazet.

  20) dpaa_eth xdp fixes from Camelia Groza.

  21) Missing txq mode update when switching CBS off, in stmmac driver,
      from Mohammad Athari Bin Ismail.

  22) Failover pending logic fix in ibmvnic driver, from Sukadev
      Bhattiprolu.

  23) Null deref fix in vmw_vsock, from Norbert Slusarek.

  24) Missing verdict update in xdp paths of ena driver, from Shay
      Agroskin.

  25) seq_file iteration fix in sctp from Neil Brown.

  26) bpf 32-bit src register truncation fix on div/mod, from Daniel
      Borkmann.

  27) Fix jmp32 pruning in bpf verifier, from Daniel Borkmann.

  28) Fix locking in vsock_shutdown(), from Stefano Garzarella.

  29) Various missing index bound checks in hns3 driver, from Yufeng Mo.

  30) Flush ports on .phylink_mac_link_down() in dsa felix driver, from
      Vladimir Oltean.

  31) Don't mix up stp and mrp port states in bridge layer, from Horatiu
      Vultur.

  32) Fix locking during netif_tx_disable(), from Edwin Peer"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
  bpf: Fix 32 bit src register truncation on div/mod
  bpf: Fix verifier jmp32 pruning decision logic
  bpf: Fix verifier jsgt branch analysis on max bound
  vsock: fix locking in vsock_shutdown()
  net: hns3: add a check for index in hclge_get_rss_key()
  net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx()
  net: hns3: add a check for queue_id in hclge_reset_vf_queue()
  net: dsa: felix: implement port flushing on .phylink_mac_link_down
  switchdev: mrp: Remove SWITCHDEV_ATTR_ID_MRP_PORT_STAT
  bridge: mrp: Fix the usage of br_mrp_port_switchdev_set_state
  net: watchdog: hold device global xmit lock during tx disable
  netfilter: nftables: relax check for stateful expressions in set definition
  netfilter: conntrack: skip identical origin tuple in same zone only
  vsock/virtio: update credit only if socket is not closed
  net: fix iteration for sctp transport seq_files
  net: ena: Update XDP verdict upon failure
  net/vmw_vsock: improve locking in vsock_connect_timeout()
  net/vmw_vsock: fix NULL pointer dereference
  ibmvnic: Clear failover_pending if unable to schedule
  net: stmmac: set TxQ mode back to DCB after disabling CBS
  ...
2021-02-10 11:33:39 -08:00
Linus Torvalds
4b16b656b1 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "14 patches.

  Subsystems affected by this patch series: mm (kasan, mremap, tmpfs,
  selftests, memcg, and slub), MAINTAINERS, squashfs, nilfs2, and
  firmware"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  nilfs2: make splice write available again
  mm, slub: better heuristic for number of cpus when calculating slab order
  Revert "mm: memcontrol: avoid workload stalls when lowering memory.high"
  MAINTAINERS: update Andrey Ryabinin's email address
  selftests/vm: rename file run_vmtests to run_vmtests.sh
  tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha
  tmpfs: disallow CONFIG_TMPFS_INODE64 on s390
  mm/mremap: fix BUILD_BUG_ON() error in get_extent
  firmware_loader: align .builtin_fw to 8
  kasan: fix stack traces dependency for HW_TAGS
  squashfs: add more sanity checks in xattr id lookup
  squashfs: add more sanity checks in inode lookup
  squashfs: add more sanity checks in id lookup
  squashfs: avoid out of bounds writes in decompressors
2021-02-10 11:22:41 -08:00
Joachim Henke
a35d8f016e nilfs2: make splice write available again
Since 5.10, splice() or sendfile() to NILFS2 return EINVAL.  This was
caused by commit 36e2c7421f02 ("fs: don't allow splice read/write
without explicit ops").

This patch initializes the splice_write field in file_operations, like
most file systems do, to restore the functionality.

Link: https://lkml.kernel.org/r/1612784101-14353-1-git-send-email-konishi.ryusuke@gmail.com
Signed-off-by: Joachim Henke <joachim.henke@t-systems.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>	[5.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-10 11:19:58 -08:00
Vlastimil Babka
3286222fc6 mm, slub: better heuristic for number of cpus when calculating slab order
When creating a new kmem cache, SLUB determines how large the slab pages
will based on number of inputs, including the number of CPUs in the
system.  Larger slab pages mean that more objects can be allocated/free
from per-cpu slabs before accessing shared structures, but also
potentially more memory can be wasted due to low slab usage and
fragmentation.  The rough idea of using number of CPUs is that larger
systems will be more likely to benefit from reduced contention, and also
should have enough memory to spare.

Number of CPUs used to be determined as nr_cpu_ids, which is number of
possible cpus, but on some systems many will never be onlined, thus
commit 045ab8c9487b ("mm/slub: let number of online CPUs determine the
slub page order") changed it to nr_online_cpus().  However, for kmem
caches created early before CPUs are onlined, this may lead to
permamently low slab page sizes.

Vincent reports a regression [1] of hackbench on arm64 systems:

  "I'm facing significant performances regression on a large arm64
   server system (224 CPUs). Regressions is also present on small arm64
   system (8 CPUs) but in a far smaller order of magnitude

   On 224 CPUs system : 9 iterations of hackbench -l 16000 -g 16
   v5.11-rc4 : 9.135sec (+/- 0.45%)
   v5.11-rc4 + revert this patch: 3.173sec (+/- 0.48%)
   v5.10: 3.136sec (+/- 0.40%)"

Mel reports a regression [2] of hackbench on x86_64, with lockstat suggesting
page allocator contention:

  "i.e. the patch incurs a 7% to 32% performance penalty. This bisected
   cleanly yesterday when I was looking for the regression and then
   found the thread.

   Numerous caches change size. For example, kmalloc-512 goes from
   order-0 (vanilla) to order-2 with the revert.

   So mostly this is down to the number of times SLUB calls into the
   page allocator which only caches order-0 pages on a per-cpu basis"

Clearly num_online_cpus() doesn't work too early in bootup.  We could
change the order dynamically in a memory hotplug callback, but runtime
order changing for existing kmem caches has been already shown as
dangerous, and removed in 32a6f409b693 ("mm, slub: remove runtime
allocation order changes").

It could be resurrected in a safe manner with some effort, but to fix
the regression we need something simpler.

We could use num_present_cpus() that should be the number of physically
present CPUs even before they are onlined.  That would work for PowerPC
[3], which triggered the original commit, but that still doesn't work on
arm64 [4] as explained in [5].

So this patch tries to determine the best available value without
specific arch knowledge.

 - num_present_cpus() if the number is larger than 1, as that means the
   arch is likely setting it properly

 - nr_cpu_ids otherwise

This should fix the reported regressions while also keeping the effect
of 045ab8c9487b for PowerPC systems.  It's possible there are
configurations where num_present_cpus() is 1 during boot while
nr_cpu_ids is at the same time bloated, so these (if they exist) would
keep the large orders based on nr_cpu_ids as was before 045ab8c9487b.

[1] https://lore.kernel.org/linux-mm/CAKfTPtA_JgMf_+zdFbcb_V9rM7JBWNPjAz9irgwFj7Rou=xzZg@mail.gmail.com/
[2] https://lore.kernel.org/linux-mm/20210128134512.GF3592@techsingularity.net/
[3] https://lore.kernel.org/linux-mm/20210123051607.GC2587010@in.ibm.com/
[4] https://lore.kernel.org/linux-mm/CAKfTPtAjyVmS5VYvU6DBxg4-JEo5bdmWbngf-03YsY18cmWv_g@mail.gmail.com/
[5] https://lore.kernel.org/linux-mm/20210126230305.GD30941@willie-the-truck/

Link: https://lkml.kernel.org/r/20210208134108.22286-1-vbabka@suse.cz
Fixes: 045ab8c9487b ("mm/slub: let number of online CPUs determine the slub page order")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jann Horn <jannh@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-10 11:19:27 -08:00
David S. Miller
b8776f14a4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-02-10

The following pull-request contains BPF updates for your *net* tree.

We've added 5 non-merge commits during the last 8 day(s) which contain
a total of 3 files changed, 22 insertions(+), 21 deletions(-).

The main changes are:

1) Fix missed execution of kprobes BPF progs when kprobe is firing via
   int3, from Alexei Starovoitov.

2) Fix potential integer overflow in map max_entries for stackmap on
   32 bit archs, from Bui Quang Minh.

3) Fix a verifier pruning and a insn rewrite issue related to 32 bit ops,
   from Daniel Borkmann.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
c# Please enter a commit message to explain why this merge is necessary,
2021-02-09 18:55:17 -08:00
Johannes Weiner
e82553c10b Revert "mm: memcontrol: avoid workload stalls when lowering memory.high"
This reverts commit 536d3bf261a2fc3b05b3e91e7eef7383443015cf, as it can
cause writers to memory.high to get stuck in the kernel forever,
performing page reclaim and consuming excessive amounts of CPU cycles.

Before the patch, a write to memory.high would first put the new limit
in place for the workload, and then reclaim the requested delta.  After
the patch, the kernel tries to reclaim the delta before putting the new
limit into place, in order to not overwhelm the workload with a sudden,
large excess over the limit.  However, if reclaim is actively racing
with new allocations from the uncurbed workload, it can keep the write()
working inside the kernel indefinitely.

This is causing problems in Facebook production.  A privileged
system-level daemon that adjusts memory.high for various workloads
running on a host can get unexpectedly stuck in the kernel and
essentially turn into a sort of involuntary kswapd for one of the
workloads.  We've observed that daemon busy-spin in a write() for
minutes at a time, neglecting its other duties on the system, and
expending privileged system resources on behalf of a workload.

To remedy this, we have first considered changing the reclaim logic to
break out after a couple of loops - whether the workload has converged
to the new limit or not - and bound the write() call this way.  However,
the root cause that inspired the sequence change in the first place has
been fixed through other means, and so a revert back to the proven
limit-setting sequence, also used by memory.max, is preferable.

The sequence was changed to avoid extreme latencies in the workload when
the limit was lowered: the sudden, large excess created by the limit
lowering would erroneously trigger the penalty sleeping code that is
meant to throttle excessive growth from below.  Allocating threads could
end up sleeping long after the write() had already reclaimed the delta
for which they were being punished.

However, erroneous throttling also caused problems in other scenarios at
around the same time.  This resulted in commit b3ff92916af3 ("mm, memcg:
reclaim more aggressively before high allocator throttling"), included
in the same release as the offending commit.  When allocating threads
now encounter large excess caused by a racing write() to memory.high,
instead of entering punitive sleeps, they will simply be tasked with
helping reclaim down the excess, and will be held no longer than it
takes to accomplish that.  This is in line with regular limit
enforcement - i.e.  if the workload allocates up against or over an
otherwise unchanged limit from below.

With the patch breaking userspace, and the root cause addressed by other
means already, revert it again.

Link: https://lkml.kernel.org/r/20210122184341.292461-1-hannes@cmpxchg.org
Fixes: 536d3bf261a2 ("mm: memcontrol: avoid workload stalls when lowering memory.high")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Tejun Heo <tj@kernel.org>
Acked-by: Chris Down <chris@chrisdown.name>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: <stable@vger.kernel.org>	[5.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-09 17:26:44 -08:00
Andrey Ryabinin
a0c2eb0a43 MAINTAINERS: update Andrey Ryabinin's email address
Update my email, @virtuozzo.com will stop working shortly.

Link: https://lkml.kernel.org/r/20210204223904.3824-1-ryabinin.a.a@gmail.com
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-09 17:26:44 -08:00
Rong Chen
d52db80084 selftests/vm: rename file run_vmtests to run_vmtests.sh
Commit c2aa8afc36fa has renamed run_vmtests in Makefile, but the file
still uses the old name.

The kernel test robot reported the following issue:

  # selftests: vm: run_vmtests.sh
  # Warning: file run_vmtests.sh is missing!
  not ok 1 selftests: vm: run_vmtests.sh

Link: https://lkml.kernel.org/r/20210205085507.1479894-1-rong.a.chen@intel.com
Fixes: c2aa8afc36fa (selftests/vm: rename run_vmtests --> run_vmtests.sh)
Signed-off-by: Rong Chen <rong.a.chen@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-09 17:26:44 -08:00
Seth Forshee
ad69c389ec tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha
As with s390, alpha is a 64-bit architecture with a 32-bit ino_t.  With
CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers and
display "inode64" in the mount options, whereas passing "inode64" in the
mount options will fail.  This leads to erroneous behaviours such as
this:

  # mkdir mnt
  # mount -t tmpfs nodev mnt
  # mount -o remount,rw mnt
  mount: /home/ubuntu/mnt: mount point not mounted or bad option.

Prevent CONFIG_TMPFS_INODE64 from being selected on alpha.

Link: https://lkml.kernel.org/r/20210208215726.608197-1-seth.forshee@canonical.com
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: <stable@vger.kernel.org>	[5.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-09 17:26:44 -08:00
Seth Forshee
b85a7a8bb5 tmpfs: disallow CONFIG_TMPFS_INODE64 on s390
Currently there is an assumption in tmpfs that 64-bit architectures also
have a 64-bit ino_t.  This is not true on s390 which has a 32-bit ino_t.
With CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers
and display "inode64" in the mount options, but passing the "inode64"
mount option will fail.  This leads to the following behavior:

  # mkdir mnt
  # mount -t tmpfs nodev mnt
  # mount -o remount,rw mnt
  mount: /home/ubuntu/mnt: mount point not mounted or bad option.

As mount sees "inode64" in the mount options and thus passes it in the
options for the remount.

So prevent CONFIG_TMPFS_INODE64 from being selected on s390.

Link: https://lkml.kernel.org/r/20210205230620.518245-1-seth.forshee@canonical.com
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: Hugh Dickins <hughd@google.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: <stable@vger.kernel.org>	[5.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-09 17:26:44 -08:00
Arnd Bergmann
a30a29091b mm/mremap: fix BUILD_BUG_ON() error in get_extent
clang can't evaluate this function argument at compile time when the
function is not inlined, which leads to a link time failure:

  ld.lld: error: undefined symbol: __compiletime_assert_414
  >>> referenced by mremap.c
  >>>               mremap.o:(get_extent) in archive mm/built-in.a

Mark the function as __always_inline to avoid it.

Link: https://lkml.kernel.org/r/20201230154104.522605-1-arnd@kernel.org
Fixes: 9ad9718bfa41 ("mm/mremap: calculate extent in one place")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Brian Geffon <bgeffon@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-09 17:26:44 -08:00