linux/kernel/trace
Steven Rostedt f431b634f2 tracing/syscalls: Allow archs to ignore tracing compat syscalls
The tracing of ia32 compat system calls has been a bit of a pain as they
use different system call numbers than the 64bit equivalents.

I wrote a simple 'lls' program that lists files. I compiled it as an i686
ELF binary and ran it on an x86_64 box. This is the result:

echo 0 > /debug/tracing/tracing_on
echo 1 > /debug/tracing/events/syscalls/enable
echo 1 > /debug/tracing/tracing_on ; ./lls ; echo 0 > /debug/tracing/tracing_on

grep lls /debug/tracing/trace

[.. skipping calls before TS_COMPAT is set ...]

             lls-1127  [005] d...   936.409188: sys_recvfrom(fd: 0, ubuf: 4d560fc4, size: 0, flags: 8048034, addr: 8, addr_len: f7700420)
             lls-1127  [005] d...   936.409190: sys_recvfrom -> 0x8a77000
             lls-1127  [005] d...   936.409211: sys_lgetxattr(pathname: 0, name: 1000, value: 3, size: 22)
             lls-1127  [005] d...   936.409215: sys_lgetxattr -> 0xf76ff000
             lls-1127  [005] d...   936.409223: sys_dup2(oldfd: 4d55ae9b, newfd: 4)
             lls-1127  [005] d...   936.409228: sys_dup2 -> 0xfffffffffffffffe
             lls-1127  [005] d...   936.409236: sys_newfstat(fd: 4d55b085, statbuf: 80000)
             lls-1127  [005] d...   936.409242: sys_newfstat -> 0x3
             lls-1127  [005] d...   936.409243: sys_removexattr(pathname: 3, name: ffcd0060)
             lls-1127  [005] d...   936.409244: sys_removexattr -> 0x0
             lls-1127  [005] d...   936.409245: sys_lgetxattr(pathname: 0, name: 19614, value: 1, size: 2)
             lls-1127  [005] d...   936.409248: sys_lgetxattr -> 0xf76e5000
             lls-1127  [005] d...   936.409248: sys_newlstat(filename: 3, statbuf: 19614)
             lls-1127  [005] d...   936.409249: sys_newlstat -> 0x0
             lls-1127  [005] d...   936.409262: sys_newfstat(fd: f76fb588, statbuf: 80000)
             lls-1127  [005] d...   936.409279: sys_newfstat -> 0x3
             lls-1127  [005] d...   936.409279: sys_close(fd: 3)
             lls-1127  [005] d...   936.421550: sys_close -> 0x200
             lls-1127  [005] d...   936.421558: sys_removexattr(pathname: 3, name: ffcd00d0)
             lls-1127  [005] d...   936.421560: sys_removexattr -> 0x0
             lls-1127  [005] d...   936.421569: sys_lgetxattr(pathname: 4d564000, name: 1b1abc, value: 5, size: 802)
             lls-1127  [005] d...   936.421574: sys_lgetxattr -> 0x4d564000
             lls-1127  [005] d...   936.421575: sys_capget(header: 4d70f000, dataptr: 1000)
             lls-1127  [005] d...   936.421580: sys_capget -> 0x0
             lls-1127  [005] d...   936.421580: sys_lgetxattr(pathname: 4d710000, name: 3000, value: 3, size: 812)
             lls-1127  [005] d...   936.421589: sys_lgetxattr -> 0x4d710000
             lls-1127  [005] d...   936.426130: sys_lgetxattr(pathname: 4d713000, name: 2abc, value: 3, size: 32)
             lls-1127  [005] d...   936.426141: sys_lgetxattr -> 0x4d713000
             lls-1127  [005] d...   936.426145: sys_newlstat(filename: 3, statbuf: f76ff3f0)
             lls-1127  [005] d...   936.426146: sys_newlstat -> 0x0
             lls-1127  [005] d...   936.431748: sys_lgetxattr(pathname: 0, name: 1000, value: 3, size: 22)

Obviously I'm not calling newfstat with an fd of 4d55b085. The decoded
calls are incorrect and confusing, because the ia32 syscall numbers are
being looked up as if they were the 64bit ones.
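
As a rough illustration (not part of the patch), the mismatch can be
sketched as a small lookup table. It assumes the standard i386 and x86_64
syscall numbering; the struct and function names are hypothetical, and the
i386 column is only my reading of what the 32-bit binary most likely
requested for each entry in the trace above.

```c
#include <stddef.h>

/* One row per syscall number that plausibly appears in the trace above. */
struct nr_names {
	int nr;
	const char *x86_64_name;	/* name the tracer printed (64-bit table) */
	const char *i386_name;		/* name in the 32-bit table (assumed intent) */
};

static const struct nr_names nr_table[] = {
	{   3, "close",       "read"     },
	{   5, "newfstat",    "open"     },
	{   6, "newlstat",    "close"    },
	{  33, "dup2",        "access"   },
	{  45, "recvfrom",    "brk"      },
	{ 125, "capget",      "mprotect" },
	{ 192, "lgetxattr",   "mmap2"    },
	{ 197, "removexattr", "fstat64"  },
};

/* Linear search; returns NULL for numbers not in this small sample. */
static const struct nr_names *nr_lookup(int nr)
{
	size_t i;

	for (i = 0; i < sizeof(nr_table) / sizeof(nr_table[0]); i++)
		if (nr_table[i].nr == nr)
			return &nr_table[i];
	return NULL;
}

/* What the syscall tracer reports when it uses the 64-bit table. */
const char *decode_as_x86_64(int nr)
{
	const struct nr_names *e = nr_lookup(nr);

	return e ? e->x86_64_name : NULL;
}

/* What the same number means to an i686 binary. */
const char *decode_as_i386(int nr)
{
	const struct nr_names *e = nr_lookup(nr);

	return e ? e->i386_name : NULL;
}
```

Read this way, "sys_newfstat(fd: 4d55b085, statbuf: 80000) -> 0x3" looks
like an open() that returned fd 3, and the sys_lgetxattr lines look like
mmap2() calls.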

Other efforts have been made to fix this:

https://lkml.org/lkml/2012/3/26/367

But the real solution is to rewrite the syscall internals and come up
with a proper fix, one that doesn't require all the kludges that the
current solution has.

Thus for now, instead of outputting incorrect data, simply ignore them.
With this patch, the same test now produces:

 #> grep lls /debug/tracing/trace
 #>

Compat system calls simply are not traced. If users need compat
syscalls, then they should just use the raw syscall tracepoints.

For an architecture to have its compat syscalls ignored, it must
define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS (done in asm/ftrace.h) and also
define an arch_trace_is_compat_syscall() function that returns true
if the syscall the current task is making is a compat syscall and so
should not be traced.
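
A minimal userspace sketch of that opt-in pattern follows. It is not the
kernel code: struct mock_regs, traced_syscall_nr() and should_trace() are
hypothetical stand-ins, and only ARCH_TRACE_IGNORE_COMPAT_SYSCALLS and
arch_trace_is_compat_syscall() are names taken from the text above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the register/task state the kernel would inspect. */
struct mock_regs {
	long nr;	/* syscall number the task entered with */
	bool compat;	/* true if the task came in through the 32-bit path */
};

/* An architecture opts in by defining the macro and the predicate. */
#define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS 1

static inline bool arch_trace_is_compat_syscall(const struct mock_regs *regs)
{
	return regs->compat;
}

#ifdef ARCH_TRACE_IGNORE_COMPAT_SYSCALLS
/* Opted in: report -1 for compat syscalls so the tracer skips them. */
long traced_syscall_nr(const struct mock_regs *regs)
{
	if (arch_trace_is_compat_syscall(regs))
		return -1;
	return regs->nr;
}
#else
/* Not opted in: every syscall number is passed through unchanged. */
long traced_syscall_nr(const struct mock_regs *regs)
{
	return regs->nr;
}
#endif

/* The tracer records an event only for a valid (non-negative) number. */
bool should_trace(const struct mock_regs *regs)
{
	return traced_syscall_nr(regs) >= 0;
}
```

With the macro defined, a native task passes through with its number
intact, while a compat task is reported as -1 and skipped. The check lives
entirely on the tracing side, so the syscall path itself is untouched.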

I want to stress that this change does not affect actual syscalls in any
way, shape or form. It is only used within the tracing system and
doesn't interfere with the syscall logic at all. The changes are
consolidated nicely into trace_syscalls.c and asm/ftrace.h.

I had to make one small modification to asm/thread_info.h, and that was
to remove the include of asm/ftrace.h. Because asm/ftrace.h now requires
current_thread_info(), that include was causing include hell. It was
added back in 2008 when the function graph tracer was added:

 commit caf4b323 "tracing, x86: add low level support for ftrace return tracing"

It does not need to be included there.

Link: http://lkml.kernel.org/r/1360703939.21867.99.camel@gandalf.local.home

Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-02-12 17:46:28 -05:00
blktrace.c tracing: Use this_cpu_ptr per-cpu helper 2013-01-21 13:22:30 -05:00
ftrace.c tracing: Avoid unnecessary multiple recursion checks 2013-01-22 23:38:01 -05:00
Kconfig tracing: Make a snapshot feature available from userspace 2013-01-30 11:02:06 -05:00
Makefile trace: Stop compiling in trace_clock unconditionally 2012-09-13 22:52:08 -04:00
power-traces.c perf: Clean up power events by introducing new, more generic ones 2011-01-04 08:16:54 +01:00
ring_buffer_benchmark.c tracing: Use NUMA allocation for per-cpu ring buffer pages 2011-06-14 22:04:39 -04:00
ring_buffer.c ring-buffer: Add stats field for amount read from trace ring buffer 2013-01-30 11:01:53 -05:00
rpm-traces.c PM / Runtime: Introduce trace points for tracing rpm_* functions 2011-09-27 22:53:27 +02:00
trace_branch.c tracing: Cache comms only after an event occurred 2012-10-31 16:45:31 -04:00
trace_clock.c tracing: Use sched_clock_cpu for trace_clock_global 2013-01-30 11:02:05 -05:00
trace_entries.h Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent 2012-03-24 08:19:09 +01:00
trace_event_perf.c perf/core improvements and fixes: 2012-08-21 11:27:00 +02:00
trace_events_filter_test.h tracing/filter: Add startup tests for events filter 2011-08-19 14:35:59 -04:00
trace_events_filter.c tracing: Replace strict_strto* with kstrto* 2012-10-31 16:45:23 -04:00
trace_events.c tracing: Remove the extra 4 bytes of padding in events 2013-01-21 21:05:41 -05:00
trace_export.c tracing: Do not enable function event with enable 2012-05-10 15:55:43 -04:00
trace_functions_graph.c tracing/fgraph: Adjust fgraph depth before calling trace return callback 2013-01-29 17:30:31 -05:00
trace_functions.c tracing: Fix unsigned int compare of zero in recursion check 2013-01-24 07:52:34 -05:00
trace_irqsoff.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-12-13 12:00:02 -08:00
trace_kdb.c kdb,ftdump: Remove reference to internal kdb include 2010-10-22 15:34:11 -05:00
trace_kprobe.c tracing: Use irq_work for wake ups and remove *_nowake_*() functions 2012-11-02 10:21:52 -04:00
trace_mmiotrace.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
trace_nop.c
trace_output.c tracing: Format non-nanosec times from tsc clock without a decimal point. 2012-11-13 15:48:40 -05:00
trace_output.h
trace_printk.c tracing: Add percpu buffers for trace_printk() 2012-04-23 21:15:55 -04:00
trace_probe.c tracing: Replace strict_strto* with kstrto* 2012-10-31 16:45:23 -04:00
trace_probe.h tracing: Provide trace events interface for uprobes 2012-05-07 14:30:17 +02:00
trace_sched_switch.c tracing: Use irq_work for wake ups and remove *_nowake_*() functions 2012-11-02 10:21:52 -04:00
trace_sched_wakeup.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-12-13 12:00:02 -08:00
trace_selftest_dynamic.c ftrace: Add self-tests for multiple function trace users 2011-05-18 19:24:51 -04:00
trace_selftest.c ftrace: Fix function tracing recursion self test 2013-01-22 23:37:58 -05:00
trace_stack.c tracing: Remove unneeded checks from the stack tracer 2012-11-19 15:07:13 -05:00
trace_stat.c
trace_stat.h
trace_syscalls.c tracing/syscalls: Allow archs to ignore tracing compat syscalls 2013-02-12 17:46:28 -05:00
trace_uprobe.c tracing: Verify target file before registering a uprobe event 2013-01-21 13:22:31 -05:00
trace.c tracing: Init current_trace to nop_trace and remove NULL checks 2013-02-01 18:38:47 -05:00
trace.h tracing: Make a snapshot feature available from userspace 2013-01-30 11:02:06 -05:00