linux/kernel/trace
Steven Rostedt c32e827b25 tracing: add raw trace point recording infrastructure
Impact: lower overhead tracing

The current event tracer can automatically pick up trace points
that are registered with the TRACE_FORMAT macro. But it required
a printf format string and parsing. Although, this adds the ability
to get guaranteed information like task names and such, it took
a hit in overhead processing. This processing can add about 500-1000
nanoseconds overhead, but in some cases that too is considered
too much and we want to shave off as much from this overhead as
possible.

Tom Zanussi recently posted tracing patches to lkml that are based
on a nice idea about capturing the data via C structs using
STRUCT_ENTER, STRUCT_EXIT type of macros.

I liked that method very much, but did not like the implementation
that required a developer to add data/code in several disjoint
locations.

This patch extends the event_tracer macros to do a similar "raw C"
approach that Tom Zanussi did. But instead of having the developers
needing to tweak a bunch of code all over the place, they can do it
all in one macro - preferably placed near the code that it is
tracing. That makes it much more likely that tracepoints will be
maintained on an ongoing basis by the code they modify.

The new macro TRACE_EVENT_FORMAT is created for this approach. (Note,
a developer may still utilize the more low level DECLARE_TRACE macros
if they don't care about getting their traces automatically in the event
tracer.)

They can also use the existing TRACE_FORMAT if they don't need to code
the tracepoint in C, but just want to use the convenience of printf.

So if the developer wants to "hardwire" a tracepoint in the fastest
possible way, and wants to acquire their data via a user space utility
in a raw binary format, or wants to see it in the trace output but not
sacrifice any performance, then they can implement the faster but
more complex TRACE_EVENT_FORMAT macro.

Here's what usage looks like:

  TRACE_EVENT_FORMAT(name,
	TPPROTO(proto),
	TPARGS(args),
	TPFMT(fmt, fmt_args),
	TRACE_STUCT(
		TRACE_FIELD(type1, item1, assign1)
		TRACE_FIELD(type2, item2, assign2)
			[...]
	),
	TPRAWFMT(raw_fmt)
	);

Note name, proto, args, and fmt, are all identical to what TRACE_FORMAT
uses.

 name: is the unique identifier of the trace point
 proto: The proto type that the trace point uses
 args: the args in the proto type
 fmt: printf format to use with the event printf tracer
 fmt_args: the printf argments to match fmt

 TRACE_STRUCT starts the ability to create a structure.
 Each item in the structure is defined with a TRACE_FIELD

  TRACE_FIELD(type, item, assign)

 type: the C type of item.
 item: the name of the item in the stucture
 assign: what to assign the item in the trace point callback

 raw_fmt is a way to pretty print the struct. It must match
  the order of the items are added in TRACE_STUCT

 An example of this would be:

 TRACE_EVENT_FORMAT(sched_wakeup,
	TPPROTO(struct rq *rq, struct task_struct *p, int success),
	TPARGS(rq, p, success),
	TPFMT("task %s:%d %s",
	      p->comm, p->pid, success?"succeeded":"failed"),
	TRACE_STRUCT(
		TRACE_FIELD(pid_t, pid, p->pid)
		TRACE_FIELD(int, success, success)
	),
	TPRAWFMT("task %d success=%d")
	);

 This creates us a unique struct of:

 struct {
	pid_t		pid;
	int		success;
 };

 And the way the call back would assign these values would be:

	entry->pid = p->pid;
	entry->success = success;

The nice part about this is that the creation of the assignent is done
via macro magic in the event tracer.  Once the TRACE_EVENT_FORMAT is
created, the developer will then have a faster method to record
into the ring buffer. They do not need to worry about the tracer itself.

The developer would only need to touch the files in include/trace/*.h

Again, I would like to give special thanks to Tom Zanussi for this
nice idea.

Idea-from: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-28 03:09:32 -05:00
..
blktrace.c Merge branch 'linus' into tracing/blktrace 2009-02-19 09:00:35 +01:00
events.c tracing: add raw trace point recording infrastructure 2009-02-28 03:09:32 -05:00
ftrace.c Merge branch 'tip/x86/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-02-22 18:12:01 +01:00
Kconfig tracing: add event trace infrastructure 2009-02-24 21:54:05 -05:00
kmemtrace.c tracing: Introduce trace_buffer_{lock_reserve,unlock_commit} 2009-02-06 01:01:41 +01:00
Makefile tracing: implement trace_clock_*() APIs 2009-02-26 18:44:06 +01:00
ring_buffer.c tracing: implement trace_clock_*() APIs 2009-02-26 18:44:06 +01:00
trace_boot.c tracing: Introduce trace_buffer_{lock_reserve,unlock_commit} 2009-02-06 01:01:41 +01:00
trace_branch.c tracing: remove unneeded variable 2009-02-10 12:32:18 -05:00
trace_clock.c tracing: implement trace_clock_*() APIs 2009-02-26 18:44:06 +01:00
trace_events_stage_1.h tracing: add raw trace point recording infrastructure 2009-02-28 03:09:32 -05:00
trace_events_stage_2.h tracing: add raw trace point recording infrastructure 2009-02-28 03:09:32 -05:00
trace_events_stage_3.h tracing: add raw trace point recording infrastructure 2009-02-28 03:09:32 -05:00
trace_events.c tracing: add raw trace point recording infrastructure 2009-02-28 03:09:32 -05:00
trace_functions_graph.c tracing/function-graph-tracer: fix merge 2009-02-19 13:01:37 +01:00
trace_functions.c tracing/core: use appropriate waiting on trace_pipe 2009-02-18 01:40:20 +01:00
trace_hw_branches.c tracing/hw-branch-tracing: convert bts-tracer mutex to a spinlock 2009-02-25 09:16:01 +01:00
trace_irqsoff.c tracing: fix typing mistake in hint message and comments 2009-02-17 12:38:24 -05:00
trace_mmiotrace.c mmiotrace: count events lost due to not recording 2009-02-15 20:02:42 +01:00
trace_nop.c trace: Call tracing_reset_online_cpus before tracer->init() 2009-02-06 01:01:41 +01:00
trace_output.c Merge branch 'tip/tracing/core/devel' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-02-09 10:35:12 +01:00
trace_output.h trace: make the trace_event callbacks return enum print_line_t 2009-02-04 20:48:39 +01:00
trace_power.c tracing: convert c/p state power tracer to use tracepoints 2009-02-13 09:06:18 -05:00
trace_sched_switch.c tracing/core: use appropriate waiting on trace_pipe 2009-02-18 01:40:20 +01:00
trace_sched_wakeup.c tracing/core: use appropriate waiting on trace_pipe 2009-02-18 01:40:20 +01:00
trace_selftest_dynamic.c
trace_selftest.c Merge branches 'tracing/blktrace', 'tracing/ftrace' and 'tracing/urgent' into tracing/core 2009-02-19 10:20:17 +01:00
trace_stack.c trace: better use of stack_trace_enabled for boot up code 2008-12-18 12:56:56 +01:00
trace_stat.c tracing: fix typing mistake in hint message and comments 2009-02-17 12:38:24 -05:00
trace_stat.h tracing/ftrace: separate events tracing and stats tracing engine 2009-01-14 12:11:37 +01:00
trace_sysprof.c tracing: fix typing mistake in hint message and comments 2009-02-17 12:38:24 -05:00
trace_workqueue.c trace_workqueue: use percpu data for workqueue stat 2009-01-20 13:06:59 +01:00
trace.c tracing: add interface to write into current tracer buffer 2009-02-28 03:06:44 -05:00
trace.h tracing: add raw trace point recording infrastructure 2009-02-28 03:09:32 -05:00