linux/include/asm-generic
Steven Rostedt 1f0d69a9fc tracing: profile likely and unlikely annotations
Impact: new unlikely/likely profiler

Andrew Morton recently suggested having an in-kernel way to profile
likely and unlikely macros. This patch achieves that goal.

When configured, every(*) likely and unlikely macro gets a counter attached
to it. When the condition is hit, the hit and misses of that condition
are recorded. These numbers can later be retrieved by:

  /debugfs/tracing/profile_likely    - All likely markers
  /debugfs/tracing/profile_unlikely  - All unlikely markers.

# cat /debug/tracing/profile_unlikely | head
 correct incorrect  %        Function                  File              Line
 ------- ---------  -        --------                  ----              ----
    2167        0   0 do_arch_prctl                  process_64.c         832
       0        0   0 do_arch_prctl                  process_64.c         804
    2670        0   0 IS_ERR                         err.h                34
   71230     5693   7 __switch_to                    process_64.c         673
   76919        0   0 __switch_to                    process_64.c         639
   43184    33743  43 __switch_to                    process_64.c         624
   12740    64181  83 __switch_to                    process_64.c         594
   12740    64174  83 __switch_to                    process_64.c         590

# cat /debug/tracing/profile_unlikely | \
  awk '{ if ($3 > 25) print $0; }' |head -20
   44963    35259  43 __switch_to                    process_64.c         624
   12762    67454  84 __switch_to                    process_64.c         594
   12762    67447  84 __switch_to                    process_64.c         590
    1478      595  28 syscall_get_error              syscall.h            51
       0     2821 100 syscall_trace_leave            ptrace.c             1567
       0        1 100 native_smp_prepare_cpus        smpboot.c            1237
   86338   265881  75 calc_delta_fair                sched_fair.c         408
  210410   108540  34 calc_delta_mine                sched.c              1267
       0    54550 100 sched_info_queued              sched_stats.h        222
   51899    66435  56 pick_next_task_fair            sched_fair.c         1422
       6       10  62 yield_task_fair                sched_fair.c         982
    7325     2692  26 rt_policy                      sched.c              144
       0     1270 100 pre_schedule_rt                sched_rt.c           1261
    1268    48073  97 pick_next_task_rt              sched_rt.c           884
       0    45181 100 sched_info_dequeued            sched_stats.h        177
       0       15 100 sched_move_task                sched.c              8700
       0       15 100 sched_move_task                sched.c              8690
   53167    33217  38 schedule                       sched.c              4457
       0    80208 100 sched_info_switch              sched_stats.h        270
   30585    49631  61 context_switch                 sched.c              2619

# cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }'
   39900    36577  47 pick_next_task                 sched.c              4397
   20824    15233  42 switch_mm                      mmu_context_64.h     18
       0        7 100 __cancel_work_timer            workqueue.c          560
     617    66484  99 clocksource_adjust             timekeeping.c        456
       0   346340 100 audit_syscall_exit             auditsc.c            1570
      38   347350  99 audit_get_context              auditsc.c            732
       0   345244 100 audit_syscall_entry            auditsc.c            1541
      38     1017  96 audit_free                     auditsc.c            1446
       0     1090 100 audit_alloc                    auditsc.c            862
    2618     1090  29 audit_alloc                    auditsc.c            858
       0        6 100 move_masked_irq                migration.c          9
       1      198  99 probe_sched_wakeup             trace_sched_switch.c 58
       2        2  50 probe_wakeup                   trace_sched_wakeup.c 227
       0        2 100 probe_wakeup_sched_switch      trace_sched_wakeup.c 144
    4514     2090  31 __grab_cache_page              filemap.c            2149
   12882   228786  94 mapping_unevictable            pagemap.h            50
       4       11  73 __flush_cpu_slab               slub.c               1466
  627757   330451  34 slab_free                      slub.c               1731
    2959    61245  95 dentry_lru_del_init            dcache.c             153
     946     1217  56 load_elf_binary                binfmt_elf.c         904
     102       82  44 disk_put_part                  genhd.h              206
       1        1  50 dst_gc_task                    dst.c                82
       0       19 100 tcp_mss_split_point            tcp_output.c         1126

As you can see by the above, there's a bit of work to do in rethinking
the use of some unlikelys and likelys. Note: the unlikely case had 71 hits
that were more than 25%.

Note:  After submitting my first version of this patch, Andrew Morton
  showed me a version written by Daniel Walker, where I picked up
  the following ideas from:

  1)  Using __builtin_constant_p to avoid profiling fixed values.
  2)  Using __FILE__ instead of instruction pointers.
  3)  Using the preprocessor to stop all profiling of likely
       annotations from vsyscall_64.c.

Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar
for their feed back on this patch.

(*) Not ever unlikely is recorded, those that are used by vsyscalls
 (a few of them) had to have profiling disabled.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Theodore Tso <tytso@mit.edu>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 11:52:02 +01:00
..
bitops bitops: use __fls for fls64 on 64-bit archs 2008-04-26 19:21:16 +02:00
4level-fixup.h add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
atomic.h Christoph has moved 2008-07-04 10:40:04 -07:00
audit_change_attr.h [PATCH] fix missing ifdefs in syscall classes hookup for generic targets 2006-09-22 17:48:56 -07:00
audit_dir_write.h [PATCH] fix missing ifdefs in syscall classes hookup for generic targets 2006-09-22 17:48:56 -07:00
audit_read.h
audit_signal.h [PATCH] add SIGNAL syscall class (v3) 2007-05-11 05:38:25 -04:00
audit_write.h
bitops.h remove __KERNEL__ tests of unexported headers under asm-generic/ 2008-04-30 08:29:54 -07:00
bug.h fix WARN() for PPC 2008-10-20 15:29:57 -07:00
cmpxchg-local.h Add cmpxchg_local to asm-generic for per cpu atomic operations 2008-02-07 08:42:30 -08:00
cmpxchg.h Add cmpxchg_local to asm-generic for per cpu atomic operations 2008-02-07 08:42:30 -08:00
cputime.h taskstats scaled time cleanup 2008-02-06 10:41:00 -08:00
device.h Driver core: add dev_archdata to struct device 2006-12-01 14:52:01 -08:00
div64.h rename div64_64 to div64_u64 2008-05-01 08:03:58 -07:00
dma-coherent.h generic: per-device coherent dma allocator 2008-06-30 12:51:05 +02:00
dma-mapping-broken.h dma-mapping: add the device argument to dma_mapping_error() 2008-07-26 12:00:03 -07:00
dma-mapping.h dma-mapping: add the device argument to dma_mapping_error() 2008-07-26 12:00:03 -07:00
emergency-restart.h
errno-base.h
errno.h
fcntl.h Introduce O_CLOEXEC 2007-07-16 09:05:45 -07:00
futex.h remove __KERNEL__ tests of unexported headers under asm-generic/ 2008-04-30 08:29:54 -07:00
gpio.h gpiolib: request/free hooks 2008-10-16 11:21:40 -07:00
ide_iops.h
int-l64.h types: add C99-style constructors to <asm-generic/int-*.h> 2008-05-02 16:18:42 -07:00
int-ll64.h asm-generic/int-ll64.h: always provide __{s,u}64 2008-07-25 10:53:27 -07:00
ioctl.h Make ioctl.h compatible with userland 2008-08-12 16:07:31 -07:00
iomap.h generic: add ioremap_wc() interface wrapper 2008-04-24 23:40:47 +02:00
irq_regs.h IRQ: Maintain regs pointer globally rather than passing to IRQ handlers 2006-10-05 15:10:12 +01:00
Kbuild types: create <asm-generic/int-*.h> 2008-05-02 16:18:19 -07:00
Kbuild.asm Fix conditional export of kvh.h and a.out.h to userspace. 2008-09-05 15:44:31 +01:00
kdebug.h asm-generic: define DIE_OOPS in asm-generic 2008-10-27 11:39:03 +01:00
libata-portmap.h libata-portmap: Remove unused definitions 2007-10-12 14:55:37 -04:00
local.h local_t: architecture independent extension 2007-05-08 11:15:20 -07:00
memory_model.h Fix __pfn_to_page(pfn) for CONFIG_DISCONTIGMEM=y 2008-11-08 10:02:48 -08:00
mm_hooks.h [PATCH] x86: PARAVIRT: add hooks to intercept mm creation and destruction 2007-05-02 19:27:14 +02:00
mman.h [PATCH] Remove final references to deprecated "MAP_ANON" page protection flag 2007-02-11 10:51:17 -08:00
mutex-dec.h mutex: speed up generic mutex implementations 2008-10-23 09:18:20 -07:00
mutex-null.h fix file specification in comments 2006-10-03 23:01:26 +02:00
mutex-xchg.h mutex: speed up generic mutex implementations 2008-10-23 09:18:20 -07:00
page.h remove __KERNEL__ tests of unexported headers under asm-generic/ 2008-04-30 08:29:54 -07:00
pci-dma-compat.h dma-mapping: add the device argument to dma_mapping_error() 2008-07-26 12:00:03 -07:00
pci.h
percpu.h percpu: fix DEBUG_PREEMPT per_cpu checking 2008-02-23 12:09:28 -08:00
pgtable-nopmd.h include/asm-generic/pgtable-nopmd.h: macros are noxious, reason #435 2008-07-28 16:30:21 -07:00
pgtable-nopud.h add mm argument to pte/pmd/pud/pgd_free 2008-02-05 09:44:18 -08:00
pgtable.h mm: fix build on non-mmu machines 2008-07-15 13:58:40 -07:00
poll.h Consolidate asm/poll.h 2007-05-11 08:29:34 -07:00
resource.h sched: SCHED_FIFO/SCHED_RR watchdog timer 2008-01-25 21:08:27 +01:00
rtc.h rtc: use bcd2bin/bin2bcd 2008-10-20 08:52:41 -07:00
sections.h lib: Correct printk %pF to work on all architectures 2008-09-09 11:51:15 -07:00
siginfo.h signals: demultiplexing SIGTRAP signal 2008-09-23 13:26:52 +02:00
signal.h
statfs.h Make <asm-generic/statfs.h> suitable for 64-bit platforms. 2008-09-04 09:46:08 +01:00
syscall.h tracehook: comment pasto fixes 2008-09-05 14:39:38 -07:00
termios.h tty: let architectures override the user/kernel macros. 2008-02-08 09:22:24 -08:00
tlb.h asm-generic/tlb.h: remove <linux/quicklist.h> 2008-02-04 16:48:00 +01:00
topology.h x86: change _node_to_cpumask_ptr to return const ptr 2008-07-13 19:11:58 +02:00
uaccess.h
vmlinux.lds.h tracing: profile likely and unlikely annotations 2008-11-12 11:52:02 +01:00
xor.h