Commit Graph

442198 Commits

Author SHA1 Message Date
Namhyung Kim
6fe8c26d7a perf top: Add --fields option to specify output fields
The --fields option is to allow user setup output field in any order.
It can receive any sort keys and following (hpp) fields:

  overhead, overhead_sys, overhead_us, sample and period

If guest profiling is enabled, overhead_guest_{sys,us} will be
available too.

More more information, please see previous patch "perf report:
Add -F option to specify output fields"

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-15-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:36 +02:00
Namhyung Kim
c0f1527b7e perf report/tui: Fix a bug when --fields/sort is given
The hists__filter_entries() function is called when down arrow key is
pressed for navigating through the entries in TUI.  It has a check for
filtering out entries that have very small overhead (under min_pcnt).

However it just assumed the entries are sorted by the overhead so when
it saw such a small overheaded entry, it just stopped navigating as an
optimization.  But it's not true anymore due to new --fields and
--sort optoin behavior and this case users cannot go down to a next
entry if ther's an entry with small overhead in-between.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-14-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:36 +02:00
Namhyung Kim
202e7a6d16 perf tools: Add ->sort() member to struct sort_entry
Currently, what the sort_entry does is just identifying hist entries
so that they can be grouped properly.  However, with -F option
support, it indeed needs to sort entries appropriately to be shown to
users.  So add ->sort() member to do it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-13-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:35 +02:00
Namhyung Kim
a7d945bc91 perf report: Add -F option to specify output fields
The -F/--fields option is to allow user setup output field in any
order.  It can receive any sort keys and following (hpp) fields:

  overhead, overhead_sys, overhead_us, sample and period

If guest profiling is enabled, overhead_guest_{sys,us} will be
available too.

The output fields also affect sort order unless you give -s/--sort
option.  And any keys specified on -s option, will also be added to
the output field list automatically.

  $ perf report -F sym,sample,overhead
  ...
  #                     Symbol       Samples  Overhead
  # ..........................  ............  ........
  #
    [.] __cxa_atexit                       2     2.50%
    [.] __libc_csu_init                    4     5.00%
    [.] __new_exitfn                       3     3.75%
    [.] _dl_check_map_versions             1     1.25%
    [.] _dl_name_match_p                   4     5.00%
    [.] _dl_setup_hash                     1     1.25%
    [.] _dl_sysdep_start                   1     1.25%
    [.] _init                              5     6.25%
    [.] _setjmp                            6     7.50%
    [.] a                                  8    10.00%
    [.] b                                  8    10.00%
    [.] brk                                1     1.25%
    [.] c                                  8    10.00%

Note that, the example output above is captured after applying next
patch which fixes sort/comparing behavior.

Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-12-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:35 +02:00
Namhyung Kim
22af969e8c perf tools: Call perf_hpp__init() before setting up GUI browsers
So that it can be set properly prior to set up output fields.  That
makes easy to handle/warn errors during the setup since it doesn't
need to be bothered with the GUI.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-11-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:35 +02:00
Namhyung Kim
512ae1bd6a perf tools: Consolidate management of default sort orders
The perf uses different default sort orders for different use-cases,
and this was scattered throughout the code.  Add get_default_sort_
order() function to handle this and change initial value of sort_order
to NULL to distinguish it from user-given one.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1400480762-22852-10-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:35 +02:00
Namhyung Kim
a2ce067e55 perf tools: Allow hpp fields to be sort keys
Add overhead{,_sys,_us,_guest_sys,_guest_us}, sample and period sort
keys so that they can be selected with --sort/-s option.

  $ perf report -s period,comm --stdio
  ...
  # Overhead        Period          Command
  # ........  ............  ...............
  #
      47.06%           152          swapper
      13.93%            45  qemu-system-arm
      12.38%            40         synergys
       3.72%            12          firefox
       2.48%             8            xchat

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-9-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:34 +02:00
Namhyung Kim
fb821c9e71 perf ui: Get rid of callback from __hpp__fmt()
The callback was used by TUI for determining color of folded sign
using percent of first field/column. But it cannot be used anymore
since it now support dynamic reordering of output field.

So move the logic to the hist_browser__show_entry().

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-8-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:34 +02:00
Namhyung Kim
26d8b33827 perf tools: Consolidate output field handling to hpp format routines
Until now the hpp and sort functions do similar jobs different ways.
Since the sort functions converted/wrapped to hpp formats it can do
the job in a uniform way.

The perf_hpp__sort_list has a list of hpp formats to sort entries and
the perf_hpp__list has a list of hpp formats to print output result.

To have a backward compatibility, it automatically adds 'overhead'
field in front of sort list.  And then all of fields in sort list
added to the output list (if it's not already there).

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/n/tip-7g3h86woz2sckg3h1lj42ygj@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:34 +02:00
Namhyung Kim
043ca389a3 perf tools: Use hpp formats to sort final output
Convert output sorting function to use ->sort hpp functions.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-6-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:34 +02:00
Namhyung Kim
f156d84e42 perf tools: Support event grouping in hpp ->sort()
Move logic of hist_entry__sort_on_period to __hpp__sort() in order to
support event group report.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-5-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:34 +02:00
Namhyung Kim
093f0ef34c perf tools: Use hpp formats to sort hist entries
It wrapped sort entries to hpp functions, so using the hpp sort list
to sort entries.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-4-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:34 +02:00
Namhyung Kim
8b536999cd perf tools: Convert sort entries to hpp formats
This is a preparation of consolidating management of output field and
sort keys.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-3-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:34 +02:00
Namhyung Kim
bc18b7f2e3 perf tools: Add ->cmp(), ->collapse() and ->sort() to perf_hpp_fmt
Those function pointers will be used to sort report output based on
the selected fields.  This is a preparation of later change.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:33 +02:00
Ingo Molnar
6480c56130 perf/core improvements and fixes:
. Add libdw DWARF post unwind support for ARM (Jean Pihet)
 
 . Consolidate types.h for ARM and ARM64 (Jean Pihet)
 
 . Fix possible null pointer dereference in session.c (Masanari Iida)
 
 . Cleanup, remove unused variables in map_switch_event() (Dongsheng Yang)
 
 . Remove nr_state_machine_bugs in perf latency (Dongsheng Yang)
 
 . Remove usage of trace_sched_wakeup(.success) (Peter Zijlstra)
 
 Signed-off-by: Jiri Olsa <jolsa@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTefKDAAoJEPZqUSBWB3s9KhsP/A+vkQ2ClJsCTK9ADSRA1NSx
 wnsXjE4UzgjQeaqSz+6tm/2BbckHhkzt2hcrQ690Gm0cqvY+WJ8FNxxiEmFXDgFG
 TCISeYjqpao7V1XgQX+deyIJSYD3TP/EzTUbpoJWZXIImKGs4mCP/D5C+5SNnXm2
 NREvZeHErqcrRiha6Qa9AwdoVG/uQAQKaZeEQKIMIhH7aiqgRTbGrNbobcZ5R9bT
 dZCE1Zq3ns5/Vco7HAsu0Ij6SvmETNb2OPdGrBb7zr2tNRXFKVyZAGfG+rN8G915
 Dt5fioE3s0kQMYpq8PHDcbkBH1HNbfz+QujJp7OKThC0zl6lH8Tyx8Oj2L8C7UyI
 J1vL87HLjxRvTb1EgTw7vR+n2te4vNgTd1TdXjAQyxhTU0V2AnhI+8ZQQUoKPvOX
 exQlt9ftL8IEC4EyKCyx6B5dRNIK7RVLFPKevf7a39QVEtqTbutMqu9u0I/+1PxZ
 s/v3sgLKKzVLjo3Bfsne0ZMJ+a9eJ8vz2vi63UCw44QaFXCl1VmMvoZaJxOMGnvN
 Mjlg5Vlgl0tSGToJkq0nF7mlb9iKrAYsUUWO9ND7Ya4B8C79u482oJDXI0YqH76g
 rCQlPwLbAcJF17+9z9kol2RvJolFgQnnZ52ypSYErJoKSMTdVhZ0HaaAsO4R6uKz
 RoHcyunF7ZNAmeOnzOg5
 =l453
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core

Pull perf/core improvements and fixes from Jiri Olsa:

  * Add libdw DWARF post unwind support for ARM (Jean Pihet)

  * Consolidate types.h for ARM and ARM64 (Jean Pihet)

  * Fix possible null pointer dereference in session.c (Masanari Iida)

  * Cleanup, remove unused variables in map_switch_event() (Dongsheng Yang)

  * Remove nr_state_machine_bugs in perf latency (Dongsheng Yang)

  * Remove usage of trace_sched_wakeup(.success) (Peter Zijlstra)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-05-20 08:36:09 +02:00
Stephane Eranian
722e76e60f fix Haswell precise store data source encoding
This patch fixes a bug in  precise_store_data_hsw() whereby
it would set the data source memory level to the wrong value.

As per the the SDM Vol 3b Table 18-41 (Layout of Data Linear
Address Information in PEBS Record), when status bit 0 is set
this is a L1 hit, otherwise this is a L1 miss.

This patch encodes the memory level according to the specification.

In V2, we added the filtering on the store events.
Only the following events produce L1 information:
 * MEM_UOPS_RETIRED.STLB_MISS_STORES
 * MEM_UOPS_RETIRED.LOCK_STORES
 * MEM_UOPS_RETIRED.SPLIT_STORES
 * MEM_UOPS_RETIRED.ALL_STORES

Cc: mingo@elte.hu
Cc: acme@ghostprotocols.net
Cc: jolsa@redhat.com
Cc: jmario@redhat.com
Cc: ak@linux.intel.com
Tested-and-Reviewed-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140515155644.GA3884@quad
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-05-19 21:52:59 +09:00
Peter Zijlstra
643fd0b9f5 perf: Fix perf_event_open(.flags) test
Vince noticed that we test the (unsigned long) flags field against an
(unsigned int) constant. This would allow setting the high bits on 64bit
platforms and not get an error.

There is nothing that uses the high bits, so it should be entirely
harmless, but we don't want userspace to accidentally set them anyway,
so fix the constants.

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140423102254.GL11096@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-05-19 21:52:59 +09:00
Borislav Petkov
12665b35b0 perf/events/core: Drop unused variable after cleanup
... in 3a497f4863 ("perf: Simplify perf_event_exit_task_context()")

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1399720259-28275-1-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-05-19 21:52:58 +09:00
Jean Pihet
97eac381b1 perf tools: Add libdw DWARF post unwind support for ARM
Adding libdw DWARF post unwind support, which is part
of elfutils-devel/libdw-dev package from version 0.158.

The new code is contained in unwin-libdw.c object, and
implements unwind__get_entries unwind interface function.

Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1400229672-16104-4-git-send-email-jean.pihet@linaro.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-16 11:39:29 +02:00
Jean Pihet
90fa9deb32 perf tests: Add dwarf unwind test on ARM
Adding dwarf unwind test, that setups live machine data over
the perf test thread and does the remote unwind.

Need to use -fno-optimize-sibling-calls for test compilation,
otherwise 'krava_*' function calls are optimized into jumps
and omitted from the stack unwind.

So far it was enabled only for x86.

Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1400229672-16104-3-git-send-email-jean.pihet@linaro.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-16 11:39:18 +02:00
Jean Pihet
3418f9667e perf tests: Introduce perf_regs_load function on ARM
Introducing perf_regs_load function, which is going
to be used for dwarf unwind test in following patches.

It takes single argument as a pointer to the regs dump
buffer and populates it with current registers values.

Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1400229672-16104-2-git-send-email-jean.pihet@linaro.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-16 11:39:05 +02:00
Jean Pihet
21a8b756b8 perf tools: Consolidate types.h for ARM and ARM64
Prevents a build breakage since commit d944c4eebc
'tools: Consolidate types.h'

Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Link: http://lkml.kernel.org/r/1400229672-16104-1-git-send-email-jean.pihet@linaro.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-16 11:38:57 +02:00
Masanari Iida
c5765ece8a perf session: Fix possible null pointer dereference in session.c
cppcheck detected following warning:
[tools/perf/util/session.c:1628] -> [tools/perf/util/session.c:1632]:
 (warning) Possible null pointer dereference: session - otherwise it
 is redundant to check it against null.

In order to avoide null pointer, check the pointer before use.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Link: http://lkml.kernel.org/r/1400087618-13628-1-git-send-email-standby24x7@gmail.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-16 09:18:51 +02:00
Dongsheng Yang
9d372ca59b perf sched: Cleanup, remove unused variables in map_switch_event()
In map_switch_event(), we don't care the previous process currently,
this patch remove the infomation we get but not used.

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/1400218625-14613-1-git-send-email-yangds.fnst@cn.fujitsu.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-16 09:17:50 +02:00
Dongsheng Yang
67d6259dd0 perf sched: Remove nr_state_machine_bugs in perf latency
As we do not use .success in sched_wakeup event any more, then
we can not guarantee that the task when wakeup event happen is
out of run queue. So the message of nr_state_machine_bugs is
not correct.

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/1399945101-21736-1-git-send-email-yangds.fnst@cn.fujitsu.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-16 09:17:36 +02:00
Oleg Nesterov
b02ef20a9f uprobes/x86: Fix the wrong ->si_addr when xol triggers a trap
If the probed insn triggers a trap, ->si_addr = regs->ip is technically
correct, but this is not what the signal handler wants; we need to pass
the address of the probed insn, not the address of xol slot.

Add the new arch-agnostic helper, uprobe_get_trap_addr(), and change
fill_trap_info() and math_error() to use it. !CONFIG_UPROBES case in
uprobes.h uses a macro to avoid include hell and ensure that it can be
compiled even if an architecture doesn't define instruction_pointer().

Test-case:

	#include <signal.h>
	#include <stdio.h>
	#include <unistd.h>

	extern void probe_div(void);

	void sigh(int sig, siginfo_t *info, void *c)
	{
		int passed = (info->si_addr == probe_div);
		printf(passed ? "PASS\n" : "FAIL\n");
		_exit(!passed);
	}

	int main(void)
	{
		struct sigaction sa = {
			.sa_sigaction	= sigh,
			.sa_flags	= SA_SIGINFO,
		};

		sigaction(SIGFPE, &sa, NULL);

		asm (
			"xor %ecx,%ecx\n"
			".globl probe_div; probe_div:\n"
			"idiv %ecx\n"
		);

		return 0;
	}

it fails if probe_div() is probed.

Note: show_unhandled_signals users should probably use this helper too,
but we need to cleanup them first.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
2014-05-14 13:57:28 +02:00
Oleg Nesterov
0eb14833d5 x86/traps: Kill DO_ERROR_INFO()
Now that DO_ERROR_INFO() doesn't differ from DO_ERROR() we can remove
it and use DO_ERROR() instead.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2014-05-14 13:57:28 +02:00
Oleg Nesterov
1c326c4dfe x86/traps: Shift fill_trap_info() from DO_ERROR_INFO() to do_error_trap()
Move the callsite of fill_trap_info() into do_error_trap() and remove
the "siginfo_t *info" argument.

This obviously breaks DO_ERROR() which passed info == NULL, we simply
change fill_trap_info() to return "siginfo_t *" and add the "default"
case which returns SEND_SIG_PRIV.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2014-05-14 13:57:27 +02:00
Oleg Nesterov
958d3d7298 x86/traps: Introduce fill_trap_info(), simplify DO_ERROR_INFO()
Extract the fill-siginfo code from DO_ERROR_INFO() into the new helper,
fill_trap_info().

It can calculate si_code and si_addr looking at trapnr, so we can remove
these arguments from DO_ERROR_INFO() and simplify the source code. The
generated code is the same, __builtin_constant_p(trapnr) == T.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2014-05-14 13:57:27 +02:00
Oleg Nesterov
dff0796e53 x86/traps: Introduce do_error_trap()
Move the common code from DO_ERROR() and DO_ERROR_INFO() into the new
helper, do_error_trap(). This simplifies define's and shaves 527 bytes
from traps.o.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2014-05-14 13:57:27 +02:00
Oleg Nesterov
38cad57be9 x86/traps: Use SEND_SIG_PRIV instead of force_sig()
force_sig() is just force_sig_info(SEND_SIG_PRIV). Imho it should die,
we have too many ugly "send signal" helpers.

And do_trap() looks just ugly because it uses force_sig_info() or
force_sig() depending on info != NULL.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2014-05-14 13:57:26 +02:00
Oleg Nesterov
5e1b05beec x86/traps: Make math_error() static
Trivial, make math_error() static.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2014-05-14 13:57:26 +02:00
Denys Vlasenko
1ea30fb645 uprobes/x86: Fix scratch register selection for rip-relative fixups
Before this patch, instructions such as div, mul, shifts with count
in CL, cmpxchg are mishandled.

This patch adds vex prefix handling. In particular, it avoids colliding
with register operand encoded in vex.vvvv field.

Since we need to avoid two possible register operands, the selection of
scratch register needs to be from at least three registers.

After looking through a lot of CPU docs, it looks like the safest choice
is SI,DI,BX. Selecting BX needs care to not collide with implicit use of
BX by cmpxchg8b.

Test-case:

	#include <stdio.h>

	static const char *const pass[] = { "FAIL", "pass" };

	long two = 2;
	void test1(void)
	{
		long ax = 0, dx = 0;
		asm volatile("\n"
	"			xor	%%edx,%%edx\n"
	"			lea	2(%%edx),%%eax\n"
	// We divide 2 by 2. Result (in eax) should be 1:
	"	probe1:		.globl	probe1\n"
	"			divl	two(%%rip)\n"
	// If we have a bug (eax mangled on entry) the result will be 2,
	// because eax gets restored by probe machinery.
		: "=a" (ax), "=d" (dx) /*out*/
		: "0" (ax), "1" (dx) /*in*/
		: "memory" /*clobber*/
		);
		dprintf(2, "%s: %s\n", __func__,
			pass[ax == 1]
		);
	}

	long val2 = 0;
	void test2(void)
	{
		long old_val = val2;
		long ax = 0, dx = 0;
		asm volatile("\n"
	"			mov	val2,%%eax\n"     // eax := val2
	"			lea	1(%%eax),%%edx\n" // edx := eax+1
	// eax is equal to val2. cmpxchg should store edx to val2:
	"	probe2:		.globl  probe2\n"
	"			cmpxchg %%edx,val2(%%rip)\n"
	// If we have a bug (eax mangled on entry), val2 will stay unchanged
		: "=a" (ax), "=d" (dx) /*out*/
		: "0" (ax), "1" (dx) /*in*/
		: "memory" /*clobber*/
		);
		dprintf(2, "%s: %s\n", __func__,
			pass[val2 == old_val + 1]
		);
	}

	long val3[2] = {0,0};
	void test3(void)
	{
		long old_val = val3[0];
		long ax = 0, dx = 0;
		asm volatile("\n"
	"			mov	val3,%%eax\n"  // edx:eax := val3
	"			mov	val3+4,%%edx\n"
	"			mov	%%eax,%%ebx\n" // ecx:ebx := edx:eax + 1
	"			mov	%%edx,%%ecx\n"
	"			add	$1,%%ebx\n"
	"			adc	$0,%%ecx\n"
	// edx:eax is equal to val3. cmpxchg8b should store ecx:ebx to val3:
	"	probe3:		.globl  probe3\n"
	"			cmpxchg8b val3(%%rip)\n"
	// If we have a bug (edx:eax mangled on entry), val3 will stay unchanged.
	// If ecx:edx in mangled, val3 will get wrong value.
		: "=a" (ax), "=d" (dx) /*out*/
		: "0" (ax), "1" (dx) /*in*/
		: "cx", "bx", "memory" /*clobber*/
		);
		dprintf(2, "%s: %s\n", __func__,
			pass[val3[0] == old_val + 1 && val3[1] == 0]
		);
	}

	int main(int argc, char **argv)
	{
		test1();
		test2();
		test3();
		return 0;
	}

Before this change all tests fail if probe{1,2,3} are probed.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2014-05-14 13:57:25 +02:00
Denys Vlasenko
50204c6f6d uprobes/x86: Simplify rip-relative handling
It is possible to replace rip-relative addressing mode with addressing
mode of the same length: (reg+disp32). This eliminates the need to fix
up immediate and correct for changing instruction length.

And we can kill arch_uprobe->def.riprel_target.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Reviewed-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2014-05-14 13:57:25 +02:00
Oleg Nesterov
29dedee0e6 uprobes: Add mem_cgroup_charge_anon() into uprobe_write_opcode()
Hugh says:

    The one I noticed was that it forgets all about memcg (because
    it was copied from KSM, and there the replacement page has already
    been charged to a memcg). See how mm/memory.c do_anonymous_page()
    does a mem_cgroup_charge_anon().

Hopefully not a big problem, uprobes is a system-wide thing and only
root can insert the probes. But I agree, should be fixed anyway.

Add mem_cgroup_{un,}charge_anon() into uprobe_write_opcode(). To simplify
the error handling (and avoid the new "uncharge" label) the patch also
moves anon_vma_prepare() up before we alloc/charge the new page.

While at it fix the comment about ->mmap_sem, it is held for write.

Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2014-05-14 13:57:24 +02:00
Peter Zijlstra
0680ee7db1 perf tools: Remove usage of trace_sched_wakeup(.success)
trace_sched_wakeup(.success) is a dead argument and has been for ages,
the only reason its still there is because of brain dead software, which
apparently includes perf tools

There's a few more instances in pearly snake shit, but that's not
supported as far as I care anyhow, so let that bitrot.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140512181946.GG13467@laptop.programming.kicks-ass.net
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-12 21:13:44 +02:00
Ingo Molnar
26f273802b perf/core improvements and fixes:
. Propagate exit status of a command line workload for
   record command (Namhyung Kim)
 
 . Use tid for finding thread (Namhyung Kim)
 
 . Clarify the output of perf sched map plus small sched
   command fixies (Dongsheng Yang)
 
 Signed-off-by: Jiri Olsa <jolsa@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTcJFuAAoJEPZqUSBWB3s9hjkP/08VG633605baNo554GFCe3l
 jUDg1gXOmCB82FIj8zUrdoF7VCD2Yz5KNaDSWLihJlUOCDkyOliufMufNYhtVz5G
 XKjxzd8+wq+y4yu6fZBg4SDs4LrPCGrktIuwCgnX2Jns5Ps9CpRxUgMNIA1/Ee3J
 y6ELc3zdYQzOVcuiqH0FemVckzGTSRCQ26JbMrdFFuBCaPu8iEAXQQis8InRTyyA
 RDcQb54TfjnQQAFj7WX8QF2X9wXhv7xAA2547ICLBJ5wb+Tf26K8vEsxTLHGBKvM
 0bGd1BkMp52WY2ZEVzdPz4kVnMUGwSrjhQJqCnnv+DErko3UaBSH+u6xVaNUymWY
 UgJsywEVGYtanHOjrQ6u9ESi9tdsY4d6ykTtZHfydXRUJgwfW194cYwGZz+fGXbs
 3tdAksax6TrytXzPhC2tIfUElvHeu7BpmefDIe8+fwyLsSJhuCyM+DE7yBe+65Jl
 a4g8BFkVSCx8nLhKsjQZoieXO51GD9zQHkOGzsJGkYCFqSlnXPXULEuESNrAlZjK
 CnTUfJJo+Oc2n5KyqzLDLwMq09dF8orVCZI0k9V56AZJwBYXlBA31dBdK220zFlM
 SGQBGOZI5I+lV1cMNECvGnYgi+m0rCxpOsjsoJdHRKxwoAIGUzezWpf44Vl1QRZv
 76aFWLxv9DWXhhwRylD0
 =felX
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core

Pull perf/core improvements and fixes from Jiri Olsa:

  * Propagate exit status of a command line workload for
    record command (Namhyung Kim)

  * Use tid for finding thread (Namhyung Kim)

  * Clarify the output of perf sched map plus small sched
    command fixies (Dongsheng Yang)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-05-12 17:57:48 +02:00
Namhyung Kim
13ce34df11 perf tools: Use tid for finding thread
I believe that passing pid (instead of tid) as the 3rd arg of the
machine__find*_thread() was to find a main thread so that it can
search proper map group for symbols.  However with the map sharing
patch applied, it now can do it in any thread.

It fixes a bug when each thread has different name, it only reports a
main thread for samples in other threads.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1399856202-26221-1-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-12 11:09:50 +02:00
Namhyung Kim
bac1e4d103 perf tools: Get rid of on_exit() feature test
The on_exit() function was only used in perf record but it's gone in
previous patch.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Stephane Eranian <eranian@google.com>
Cc: Bernhard Rosenkraenzer <Bernhard.Rosenkranzer@linaro.org>
Cc: Irina Tirdea <irina.tirdea@intel.com>
Link: http://lkml.kernel.org/r/1399855645-25815-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-12 11:09:50 +02:00
Namhyung Kim
4560471053 perf record: Propagate exit status of a command line workload
Currently perf record doesn't propagate the exit status of a workload
given by the command line.  But sometimes it'd useful if it's
propagated so that a monitoring script can handle errors
appropriately.

To do that, it moves most of logic out of the exit handlers and run
them directly in the __cmd_record().  The only thing needs to be done
in the handler is propagating terminating signal so that the shell can
terminate its loop properly when Ctrl-C was pressed.  Also it cleaned
up the resource management code in record__exit().

With this change, perf record returns the child exit status in case of
normal termination and send signal to itself when terminated by signal.

Example run of Stephane's case:

  $ perf record true && echo yes || echo no
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.013 MB perf.data (~589 samples) ]
  yes

  $ perf record false && echo yes || echo no
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.013 MB perf.data (~589 samples) ]
  no

Jiri's case (error in parent):

  $ perf record -m 10G true && echo yes || echo no
  rounding mmap pages size to 17179869184 bytes (4194304 pages)
  failed to mmap with 12 (Cannot allocate memory)
  no

  $ ulimit -n 6
  $ perf record sleep 1 && echo yes || echo no
  failed to create 'go' pipe: Too many open files
  Couldn't run the workload!
  no

And Peter's case (interrupted by signal):

  $ while :; do perf record sleep 1; done
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.014 MB perf.data (~593 samples) ]

Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1399855645-25815-1-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-12 11:09:49 +02:00
Dongsheng
6bcab4e1ea perf tools: Clarify the output of perf sched map.
In output of perf sched map, any shortname of thread will be explained
at the first time when it appear.

Example:
              *A0       228836.978985 secs A0 => perf:23032
          *.   A0       228836.979016 secs B0 => swapper:0
           .  *C0       228836.979099 secs C0 => migration/3:22
  *A0      .   C0       228836.979115 secs
   A0      .  *.        228836.979115 secs

But B0, which is explained as swapper:0 did not appear in the
left part of output. Instead, we use '.' as the shortname of
swapper:0. So the comment of "B0 => swapper:0" is not easy to
understand.

This patch clarify the output of perf sched map with not allocating
one letter-number shortname for swapper:0 and print ". => swapper:0"
as the explanation for swapper:0.

Example:
              *A0       228836.978985 secs A0 => perf:23032
          * .  A0       228836.979016 secs .  => swapper:0
            . *B0       228836.979099 secs B0 => migration/3:22
  *A0       .  B0       228836.979115 secs
   A0       . * .       228836.979115 secs
   A0     *C0   .       228836.979225 secs C0 => ksoftirqd/2:18
   A0     *D0   .       228836.979236 secs D0 => rcu_sched:7

Signed-off-by: Dongsheng <yangds.fnst@cn.fujitsu.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1399354741-19522-1-git-send-email-yangds.fnst@cn.fujitsu.com
[ small style fixes to make checkpatch happy ]
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-12 11:09:05 +02:00
Dongsheng
e936e8e459 perf tools: Adapt the TASK_STATE_TO_CHAR_STR to new value in kernel space.
Currently, TASK_STATE_TO_CHAR_STR in kernel space is already expanded to RSDTtZXxKWP,
but it is still RSDTtZX in perf sched tool.

This patch update TASK_STATE_TO_CHAR_STR to the new value in kernel space.

Signed-off-by: Dongsheng <yangds.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/6d2f55dc1e02c1e29a5d70bfeb9d6e8863caf2aa.1399273302.git.yangds.fnst@cn.fujitsu.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-12 10:01:49 +02:00
Dongsheng
7fff959783 perf tools: Add missing event for perf sched record.
We should record and process sched:sched_wakeup_new event in
perf sched tool, but currently, there is the process function
for it, without recording it in record subcommand.

This patch add -e sched:sched_wakeup_new to perf sched record.

Signed-off-by: Dongsheng <yangds.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/710c6edd2162b2cea1711443f54de47c0210d9fd.1399273302.git.yangds.fnst@cn.fujitsu.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-12 10:01:41 +02:00
Vineet Gupta
69902c718c kprobes: Ensure blacklist data is aligned
ARC Linux (not supporting native unaligned access) was failing
to boot because __start_kprobe_blacklist was not aligned.

This was because per generated vmlinux.lds it was emitted right
next to .rodata with strings etc hence could be randomly
unaligned.

Fix that by ensuring a word alignment. While 4 would suffice for
32bit arches and problem at hand, it is probably better to put 8.

| Path: (null) CPU: 0 PID: 1 Comm: swapper Not tainted
| 3.15.0-rc3-next-20140430 #2
| task: 8f044000 ti: 8f01e000 task.ti: 8f01e000
|
| [ECR   ]: 0x00230400 => Misaligned r/w from 0x800fb0d3
| [EFA   ]: 0x800fb0d3
| [BLINK ]: do_one_initcall+0x86/0x1bc
| [ERET  ]: init_kprobes+0x52/0x120

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: <torvalds@linux-foundation.org>
Cc: <rusty@rustcorp.com.au>
Cc: <rdunlap@infradead.org>
Cc: <jeremy@goop.org>
Cc: <arnd@arndb.de>
Cc: <dl9pf@gmx.de>
Cc: <sparse@chrisli.org>
Cc: <anil.s.keshavamurthy@intel.com>
Cc: <davem@davemloft.net>
Cc: <ananth@in.ibm.com>
Cc: <masami.hiramatsu.pt@hitachi.com>
Cc: <chrisw@sous-sol.org>
Cc: <akataria@vmware.com>
Cc: anton Kolesov <Anton.Kolesov@synopsys.com>
Link: http://lkml.kernel.org/r/5361DB14.7010406@synopsys.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-05-07 21:04:57 +02:00
Peter Zijlstra
3a497f4863 perf: Simplify perf_event_exit_task_context()
Instead of jumping through hoops to make sure to find (and exit) each
event, do it the simple straight fwd way.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-tij931199thfkys8vbnokdpf@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-05-07 13:44:19 +02:00
Peter Zijlstra
683ede43dd perf: Rework free paths
Primarily make perf_event_release_kernel() into put_event(), this will
allow kernel space to create per-task inherited events, and is safer
in general.

Also, document the free_event() assumptions.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-rk9pvr6e1d0559lxstltbztc@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-05-07 13:44:19 +02:00
Peter Zijlstra
63342411ef perf: Validate locking assumption
Document and validate the locking assumption of event_sched_in().

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-sybq1publ9xt5no77cwvi0eo@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-05-07 13:44:18 +02:00
Peter Zijlstra
15a2d4de0e perf: Always destroy groups on exit
Commit 38b435b16c ("perf: Fix tear-down of inherited group events")
states that we need to destroy groups for inherited events, but it
doesn't make any sense to not also destroy groups for normal events.

And while it usually makes no difference (the normal events won't
leak, and its very likely all the group events will die in quick
succession) it does make the code more consistent and closes a
potential hole for trouble.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-426egt8zmsm12d2q8k2xz4tt@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-05-07 13:44:18 +02:00
Peter Zijlstra
1f4ee5038f perf: Ensure consistent inherit state in groups
Make sure all events in a group have the same inherit state. It was
possible for group leaders to have inherit set while sibling events
would not have inherit set.

In this case we'd still inherit the siblings, leading to some
non-fatal weirdness.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-r32tt8yldvic3jlcghd3g35u@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-05-07 13:44:17 +02:00
Ingo Molnar
37b16beaa9 Merge branch 'perf/urgent' into perf/core, to avoid conflicts
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-05-07 13:39:22 +02:00