2008-05-12 21:20:42 +02:00
/*
* ring buffer based function tracer
*
* Copyright ( C ) 2007 - 2008 Steven Rostedt < srostedt @ redhat . com >
* Copyright ( C ) 2008 Ingo Molnar < mingo @ redhat . com >
*
* Originally taken from the RT patch by :
* Arnaldo Carvalho de Melo < acme @ redhat . com >
*
* Based on code from the latency_tracer , that is :
* Copyright ( C ) 2004 - 2006 Ingo Molnar
2012-12-06 10:39:54 +01:00
* Copyright ( C ) 2004 Nadia Yvette Chambers
2008-05-12 21:20:42 +02:00
*/
2008-12-01 22:20:19 -05:00
# include <linux/ring_buffer.h>
2009-10-18 00:52:28 +02:00
# include <generated/utsrelease.h>
2008-12-01 22:20:19 -05:00
# include <linux/stacktrace.h>
# include <linux/writeback.h>
2008-05-12 21:20:42 +02:00
# include <linux/kallsyms.h>
# include <linux/seq_file.h>
2008-07-30 22:36:46 -04:00
# include <linux/notifier.h>
2008-12-01 22:20:19 -05:00
# include <linux/irqflags.h>
2012-11-01 20:54:21 -04:00
# include <linux/irq_work.h>
2008-05-12 21:20:42 +02:00
# include <linux/debugfs.h>
2008-05-12 21:20:43 +02:00
# include <linux/pagemap.h>
2008-05-12 21:20:42 +02:00
# include <linux/hardirq.h>
# include <linux/linkage.h>
# include <linux/uaccess.h>
2008-12-01 22:20:19 -05:00
# include <linux/kprobes.h>
2008-05-12 21:20:42 +02:00
# include <linux/ftrace.h>
# include <linux/module.h>
# include <linux/percpu.h>
2008-12-01 22:20:19 -05:00
# include <linux/splice.h>
2008-07-30 22:36:46 -04:00
# include <linux/kdebug.h>
2009-03-27 14:22:10 +01:00
# include <linux/string.h>
tracing: Consolidate protection of reader access to the ring buffer
At the beginning, access to the ring buffer was fully serialized
by trace_types_lock. Patch d7350c3f4569 gives more freedom to readers,
and patch b04cc6b1f6 adds code to protect trace_pipe and cpu#/trace_pipe.
But actually it is not enough, ring buffer readers are not always
read-only, they may consume data.
This patch makes accesses to trace, trace_pipe, trace_pipe_raw
cpu#/trace, cpu#/trace_pipe and cpu#/trace_pipe_raw serialized.
And removes tracing_reader_cpumask which is used to protect trace_pipe.
Details:
Ring buffer serializes readers, but it is low level protection.
The validity of the events (returned by ring_buffer_peek(), etc.)
is not protected by the ring buffer.
The content of events may become garbage if we allow another process to consume
these events concurrently:
A) the page of the consumed events may become a normal page
(not reader page) in ring buffer, and this page will be rewritten
by the events producer.
B) The page of the consumed events may become a page for splice_read,
and this page will be returned to system.
This patch adds trace_access_lock() and trace_access_unlock() primitives.
These primitives allow multi process access to different cpu ring buffers
concurrently.
These primitives don't distinguish read-only and read-consume access.
Multi read-only access is also serialized.
And we don't use these primitives when we open files,
we only use them when we read files.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B447D52.1050602@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-01-06 20:08:50 +08:00
# include <linux/rwsem.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 17:04:11 +09:00
# include <linux/slab.h>
2008-05-12 21:20:42 +02:00
# include <linux/ctype.h>
# include <linux/init.h>
2008-05-12 21:20:49 +02:00
# include <linux/poll.h>
2012-03-01 22:06:48 -05:00
# include <linux/nmi.h>
2008-05-12 21:20:42 +02:00
# include <linux/fs.h>
2008-05-12 21:20:51 +02:00
2008-05-12 21:20:42 +02:00
# include "trace.h"
2008-12-23 23:24:12 -05:00
# include "trace_output.h"
2008-05-12 21:20:42 +02:00
2009-03-11 13:42:01 -04:00
/*
* On boot up , the ring buffer is set to the minimum size , so that
* we do not waste memory on systems that are not using tracing .
*/
2009-07-01 10:47:05 +08:00
int ring_buffer_expanded ;
2009-03-11 13:42:01 -04:00
2008-12-06 03:41:33 +01:00
/*
* We need to change this state when a selftest is running .
2008-12-04 23:47:35 +01:00
* A selftest will look into the ring - buffer to count the
* entries inserted during the selftest although some concurrent
2009-03-05 10:24:48 +01:00
* insertions into the ring - buffer such as trace_printk could occur
2008-12-04 23:47:35 +01:00
* at the same time , giving false positive or negative results .
*/
2008-12-06 03:41:33 +01:00
static bool __read_mostly tracing_selftest_running ;
2008-12-04 23:47:35 +01:00
2009-02-02 21:38:32 -05:00
/*
* If a tracer is running , we do not want to run SELFTEST .
*/
2009-07-01 10:47:05 +08:00
bool __read_mostly tracing_selftest_disabled ;
2009-02-02 21:38:32 -05:00
2008-11-17 19:23:42 +01:00
/* For tracers that don't implement custom flags */
static struct tracer_opt dummy_tracer_opt [ ] = {
{ }
} ;
static struct tracer_flags dummy_tracer_flags = {
. val = 0 ,
. opts = dummy_tracer_opt
} ;
static int dummy_set_flag ( u32 old_flags , u32 bit , int set )
{
return 0 ;
}
2008-11-05 16:05:44 -05:00
2012-10-11 12:14:25 -04:00
/*
* To prevent the comm cache from being overwritten when no
* tracing is active , only save the comm when a trace event
* occurred .
*/
static DEFINE_PER_CPU ( bool , trace_cmdline_save ) ;
2012-11-01 20:54:21 -04:00
/*
* When a reader is waiting for data , then this variable is
* set to true .
*/
static bool trace_wakeup_needed ;
static struct irq_work trace_work_wakeup ;
2008-11-05 16:05:44 -05:00
/*
* Kill all tracing for good ( never come back ) .
* It is initialized to 1 but will turn to zero if the initialization
* of the tracer is successful . But that is the only place that sets
* this back to zero .
*/
2009-02-10 19:44:12 +01:00
static int tracing_disabled = 1 ;
2008-11-05 16:05:44 -05:00
2009-10-07 19:17:45 -04:00
DEFINE_PER_CPU ( int , ftrace_cpu_disabled ) ;
2008-10-01 00:29:53 -04:00
2010-08-05 09:22:23 -05:00
cpumask_var_t __read_mostly tracing_buffer_mask ;
2008-05-12 21:21:00 +02:00
2008-10-23 19:26:08 -04:00
/*
* ftrace_dump_on_oops - variable to dump ftrace buffer on oops
*
* If there is an oops ( or kernel panic ) and the ftrace_dump_on_oops
* is set , then ftrace_dump is called . This will output the contents
* of the ftrace buffers to the console . This is very useful for
* capturing traces that lead to crashes and outputting it to a
* serial console .
*
* It is default off , but you can enable it with either specifying
* " ftrace_dump_on_oops " in the kernel command line , or setting
2010-04-18 19:08:41 +02:00
* / proc / sys / kernel / ftrace_dump_on_oops
* Set 1 if you want to dump buffers of all CPUs
* Set 2 if you want to dump the buffer of the CPU that triggered oops
2008-10-23 19:26:08 -04:00
*/
2010-04-18 19:08:41 +02:00
enum ftrace_dump_mode ftrace_dump_on_oops ;
2008-10-23 19:26:08 -04:00
2009-02-02 21:38:32 -05:00
static int tracing_set_tracer ( const char * buf ) ;
2009-09-18 14:06:47 +08:00
# define MAX_TRACER_SIZE 100
static char bootup_tracer_buf [ MAX_TRACER_SIZE ] __initdata ;
2009-02-02 21:38:32 -05:00
static char * default_bootup_tracer ;
2008-11-01 19:57:37 +01:00
2009-10-14 20:50:32 +02:00
static int __init set_cmdline_ftrace ( char * str )
2008-11-01 19:57:37 +01:00
{
2009-09-18 14:06:47 +08:00
/* strlcpy always NUL-terminates, unlike strncpy with a full-size count */
strlcpy ( bootup_tracer_buf , str , MAX_TRACER_SIZE ) ;
2009-02-02 21:38:32 -05:00
default_bootup_tracer = bootup_tracer_buf ;
2009-03-11 13:42:01 -04:00
/* We are using ftrace early, expand it */
ring_buffer_expanded = 1 ;
2008-11-01 19:57:37 +01:00
return 1 ;
}
2009-10-14 20:50:32 +02:00
__setup ( " ftrace= " , set_cmdline_ftrace ) ;
2008-11-01 19:57:37 +01:00
2008-10-23 19:26:08 -04:00
static int __init set_ftrace_dump_on_oops ( char * str )
{
2010-04-18 19:08:41 +02:00
if ( * str + + ! = ' = ' | | ! * str ) {
ftrace_dump_on_oops = DUMP_ALL ;
return 1 ;
}
if ( ! strcmp ( " orig_cpu " , str ) ) {
ftrace_dump_on_oops = DUMP_ORIG ;
return 1 ;
}
return 0 ;
2008-10-23 19:26:08 -04:00
}
__setup ( " ftrace_dump_on_oops " , set_ftrace_dump_on_oops ) ;
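The parsing done by set_ftrace_dump_on_oops() can be tried out in user space. The following is only a sketch: parse_dump_on_oops() and the enum values ending in _SKETCH are hypothetical names, and the input string is assumed to be whatever follows "ftrace_dump_on_oops" on the command line (empty, "=", or "=value").

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's enum ftrace_dump_mode values. */
enum dump_mode_sketch {
	DUMP_NONE_SKETCH,
	DUMP_ALL_SKETCH,
	DUMP_ORIG_SKETCH
};

/*
 * Mirrors the logic of set_ftrace_dump_on_oops(): a bare option or an
 * empty value selects "all CPUs", "=orig_cpu" selects the CPU that
 * triggered the oops, and anything else is rejected (returns 0).
 */
static int parse_dump_on_oops(const char *str, enum dump_mode_sketch *mode)
{
	assert(str && mode);

	if (*str++ != '=' || !*str) {
		*mode = DUMP_ALL_SKETCH;
		return 1;
	}

	if (!strcmp("orig_cpu", str)) {
		*mode = DUMP_ORIG_SKETCH;
		return 1;
	}

	return 0;
}
```

Note that an unrecognized value leaves the mode untouched, which matches the original function returning 0 without assigning ftrace_dump_on_oops.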
2008-05-12 21:20:44 +02:00
2012-11-01 22:56:07 -04:00
static char trace_boot_options_buf [ MAX_TRACER_SIZE ] __initdata ;
static char * trace_boot_options __initdata ;
static int __init set_trace_boot_options ( char * str )
{
/* strlcpy always NUL-terminates, unlike strncpy with a full-size count */
strlcpy ( trace_boot_options_buf , str , MAX_TRACER_SIZE ) ;
trace_boot_options = trace_boot_options_buf ;
/* A __setup handler returns 1 once it has consumed the option */
return 1 ;
}
__setup ( " trace_options= " , set_trace_boot_options ) ;
2009-03-30 13:48:00 +08:00
unsigned long long ns2usecs ( cycle_t nsec )
2008-05-12 21:20:42 +02:00
{
nsec + = 500 ;
do_div ( nsec , 1000 ) ;
return nsec ;
}
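ns2usecs() adds 500 before dividing by 1000, which rounds to the nearest microsecond instead of truncating. A minimal user-space sketch of the difference, using plain 64-bit division in place of do_div() (the function names here are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Round-to-nearest conversion, the same arithmetic as ns2usecs(). */
static uint64_t ns_to_usecs_rounded(uint64_t nsec)
{
	/* Guard against wrap-around when adding the rounding bias. */
	assert(nsec <= UINT64_MAX - 500);

	/* Adding half the divisor makes the integer division round half up. */
	return (nsec + 500) / 1000;
}

/* Truncating conversion, the same arithmetic as nsecs_to_usecs(). */
static uint64_t ns_to_usecs_truncated(uint64_t nsec)
{
	return nsec / 1000;
}
```

For example, 1500 ns becomes 2 us with the rounded version and 1 us with the truncating one.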
2008-05-12 21:21:00 +02:00
/*
* The global_trace is the descriptor that holds the tracing
* buffers for the live tracing . For each CPU , it contains
* a linked list of pages that will store trace entries . The
* page descriptor of the pages in the memory is used to hold
* the linked list by linking the lru item in the page descriptor
* to each of the pages in the buffer per CPU .
*
* For each active CPU there is a data field that holds the
* pages for the buffer for that CPU . Each CPU has the same number
* of pages allocated for its buffer .
*/
2008-05-12 21:20:42 +02:00
static struct trace_array global_trace ;
static DEFINE_PER_CPU ( struct trace_array_cpu , global_trace_cpu ) ;
2009-09-02 14:17:06 -04:00
int filter_current_check_discard ( struct ring_buffer * buffer ,
struct ftrace_event_call * call , void * rec ,
2009-04-08 03:15:54 -05:00
struct ring_buffer_event * event )
{
2009-09-02 14:17:06 -04:00
return filter_check_discard ( call , rec , buffer , event ) ;
2009-04-08 03:15:54 -05:00
}
2009-04-10 18:12:50 -04:00
EXPORT_SYMBOL_GPL ( filter_current_check_discard ) ;
2009-04-08 03:15:54 -05:00
2009-03-17 17:22:06 -04:00
cycle_t ftrace_now ( int cpu )
{
u64 ts ;
/* Early boot up does not have a buffer yet */
if ( ! global_trace . buffer )
return trace_clock_local ( ) ;
ts = ring_buffer_time_stamp ( global_trace . buffer , cpu ) ;
ring_buffer_normalize_time_stamp ( global_trace . buffer , cpu , & ts ) ;
return ts ;
}
2008-05-12 21:20:42 +02:00
2008-05-12 21:21:00 +02:00
/*
* The max_tr is used to snapshot the global_trace when a maximum
* latency is reached . Some tracers will use this to store a maximum
* trace while it continues examining live traces .
*
* The buffers for the max_tr are set up the same as the global_trace .
* When a snapshot is taken , the linked list of the max_tr is swapped
* with the linked list of the global_trace and the buffers are reset for
* the global_trace so the tracing can continue .
*/
2008-05-12 21:20:42 +02:00
static struct trace_array max_tr ;
2009-10-29 22:34:13 +09:00
static DEFINE_PER_CPU ( struct trace_array_cpu , max_tr_data ) ;
2008-05-12 21:20:42 +02:00
ftrace: restructure tracing start/stop infrastructure
Impact: change where tracing is started up and stopped
Currently, when a new tracer is selected via echo'ing a tracer name into
the current_tracer file, the startup is only done if tracing_enabled is
set to one. If tracing_enabled is changed to zero (by echo'ing 0 into
the tracing_enabled file) a full shutdown is performed.
The full startup and shutdown of a tracer can be expensive and the
user can lose out traces when echo'ing in 0 to the tracing_enabled file,
because the process takes too long. There can also be places that
the user would like to start and stop the tracer several times and
doing the full startup and shutdown of a tracer might be too expensive.
This patch performs the full startup and shutdown when a tracer is
selected. It also adds a way to do a quick start or stop of a tracer.
The quick version is just a flag that prevents the tracing from
taking place, but the overhead of the code is still there.
For example, the startup of a tracer may enable tracepoints, or enable
the function tracer. The stop and start will just set a flag to
have the tracer ignore the calls when the tracepoint or function trace
is called. The overhead of the tracer may still be present when
the tracer is stopped, but no tracing will occur. Setting the tracer
to the 'nop' tracer (or any other tracer) will perform the shutdown
of the tracer which will disable the tracepoint or disable the
function tracer.
The tracing_enabled file will simply start or stop tracing.
This change is all internal. The end result for the user should be the same
as before. If tracing_enabled is not set, no trace will happen.
If tracing_enabled is set, then the trace will happen. The tracing_enabled
variable is static between tracers. Enabling tracing_enabled and
going to another tracer will keep tracing_enabled enabled. Same
is true with disabling tracing_enabled.
This patch will now provide a fast start/stop method to the users
for enabling or disabling tracing.
Note: There were two methods to the struct tracer that were never
used: The methods start and stop. These were to be used as a hook
to the reading of the trace output, but ended up not being
necessary. These two methods are now used to enable the start
and stop of each tracer, in case the tracer needs to do more than
just not write into the buffer. For example, the irqsoff tracer
must stop recording max latencies when tracing is stopped.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-05 16:05:44 -05:00
int tracing_is_enabled ( void )
{
2012-05-11 14:25:30 -04:00
return tracing_is_on ( ) ;
2008-11-05 16:05:44 -05:00
}
2008-05-12 21:21:00 +02:00
/*
2008-09-29 23:02:41 -04:00
* trace_buf_size is the size in bytes that is allocated
* for a buffer . Note , the number of bytes is always rounded
* to page size .
2008-07-30 22:36:46 -04:00
*
* This number is purposely set to a low number of 16384 entries .
* If the dump on oops happens , it will be much appreciated
* to not have to wait for all that output . Anyway this can be
* boot time and run time configurable .
2008-05-12 21:21:00 +02:00
*/
2008-09-29 23:02:41 -04:00
# define TRACE_BUF_SIZE_DEFAULT 1441792UL /* 16384 * 88 (sizeof(entry)) */
2008-07-30 22:36:46 -04:00
2008-09-29 23:02:41 -04:00
static unsigned long trace_buf_size = TRACE_BUF_SIZE_DEFAULT ;
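The default size is 16384 entries of 88 bytes each, and the comment above notes that the byte count is rounded to the page size. The arithmetic can be checked with a small sketch; the 4096-byte page size and all names ending in _SKETCH are assumptions made for illustration only.

```c
#include <assert.h>

#define NR_ENTRIES_SKETCH	16384UL
#define ENTRY_SIZE_SKETCH	88UL	/* sizeof(entry) per the comment above */
#define PAGE_SIZE_SKETCH	4096UL	/* assumed page size */

/* Round a byte count up to the next multiple of the (assumed) page size. */
static unsigned long round_up_to_page_sketch(unsigned long bytes)
{
	/* Guard against overflow when adding the rounding bias. */
	assert(bytes <= (unsigned long)-1 - (PAGE_SIZE_SKETCH - 1));

	return (bytes + PAGE_SIZE_SKETCH - 1) / PAGE_SIZE_SKETCH * PAGE_SIZE_SKETCH;
}
```

With these values, 16384 * 88 = 1441792 bytes, which is exactly 352 pages of 4096 bytes, so the default needs no rounding under this assumption.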
2008-05-12 21:20:42 +02:00
2008-05-12 21:21:00 +02:00
/* trace_types holds a linked list of available tracers. */
2008-05-12 21:20:42 +02:00
static struct tracer * trace_types __read_mostly ;
2008-05-12 21:21:00 +02:00
/* current_trace points to the tracer that is currently active */
2008-05-12 21:20:42 +02:00
static struct tracer * current_trace __read_mostly ;
2008-05-12 21:21:00 +02:00
/*
* trace_types_lock is used to protect the trace_types list .
*/
2008-05-12 21:20:42 +02:00
static DEFINE_MUTEX ( trace_types_lock ) ;
2008-05-12 21:21:00 +02:00
2010-01-06 20:08:50 +08:00
/*
* serialize the access of the ring buffer
*
* The ring buffer serializes readers, but that is only low level protection.
* The validity of the events (returned by ring_buffer_peek(), etc.)
* is not protected by the ring buffer.
*
* The content of events may become garbage if we allow another process to
* consume these events concurrently:
* A) the page of the consumed events may become a normal page
* (not a reader page) in the ring buffer, and this page will be rewritten
* by the events producer.
* B) the page of the consumed events may become a page for splice_read,
* and this page will be returned to the system.
*
* These primitives allow multiple processes to access different cpu ring
* buffers concurrently.
*
* These primitives don't distinguish read-only and read-consume access.
* Multiple read-only accesses are also serialized.
*/
# ifdef CONFIG_SMP
static DECLARE_RWSEM ( all_cpu_access_lock ) ;
static DEFINE_PER_CPU ( struct mutex , cpu_access_lock ) ;
static inline void trace_access_lock ( int cpu )
{
if ( cpu = = TRACE_PIPE_ALL_CPU ) {
/* gain it for accessing the whole ring buffer. */
down_write ( & all_cpu_access_lock ) ;
} else {
/* gain it for accessing a cpu ring buffer. */
/* Firstly block other trace_access_lock(TRACE_PIPE_ALL_CPU). */
down_read ( & all_cpu_access_lock ) ;
/* Secondly block other access to this @cpu ring buffer. */
mutex_lock ( & per_cpu ( cpu_access_lock , cpu ) ) ;
}
}
static inline void trace_access_unlock ( int cpu )
{
if ( cpu = = TRACE_PIPE_ALL_CPU ) {
up_write ( & all_cpu_access_lock ) ;
} else {
mutex_unlock ( & per_cpu ( cpu_access_lock , cpu ) ) ;
up_read ( & all_cpu_access_lock ) ;
}
}
static inline void trace_access_lock_init ( void )
{
int cpu ;
for_each_possible_cpu ( cpu )
mutex_init ( & per_cpu ( cpu_access_lock , cpu ) ) ;
}
# else
static DEFINE_MUTEX ( access_lock ) ;
static inline void trace_access_lock ( int cpu )
{
( void ) cpu ;
mutex_lock ( & access_lock ) ;
}
static inline void trace_access_unlock ( int cpu )
{
( void ) cpu ;
mutex_unlock ( & access_lock ) ;
}
static inline void trace_access_lock_init ( void )
{
}
# endif
2008-05-12 21:21:00 +02:00
/* trace_wait is a waitqueue for tasks blocked on trace_poll */
2008-05-12 21:20:52 +02:00
static DECLARE_WAIT_QUEUE_HEAD ( trace_wait ) ;
2008-11-12 17:52:37 -05:00
/* trace_flags holds trace_options default values */
2008-11-12 17:52:38 -05:00
unsigned long trace_flags = TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK |
2009-03-24 23:17:58 -04:00
TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_SLEEP_TIME |
tracing: Add irq, preempt-count and need resched info to default trace output
People keep asking how to get the preempt count, irq, and need resched info
and we keep telling them to enable the latency format. Some developers think
that traces without this info is completely useless, and for a lot of tasks
it is useless.
The first option was to enable the latency trace as the default format, but
the header for the latency format is pretty useless for most tracers and
it also does the timestamp in straight microseconds from the time the trace
started. This is sometimes more difficult to read as the default trace is
seconds from the start of boot up.
Latency format:
# tracer: nop
#
# nop latency trace v1.1.5 on 3.2.0-rc1-test+
# --------------------------------------------------------------------
# latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule
migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch
migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch
migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule
default format:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule
migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule
migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule
The latency format header has latency information that is pretty meaningless
for most tracers. Although some of the header is useful, and we can add that
later to the default format as well.
What is really useful with the latency format is the irqs-off, need-resched
hard/softirq context and the preempt count.
This commit adds the option irq-info which is on by default that adds this
information:
# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<idle>-0 [000] d..2 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] d..2 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] d..2 49.309309: need_resched <-mwait_idle
<idle>-0 [000] d..2 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] d..2 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] d..2 49.309315: need_resched <-mwait_idle
If a user wants the old format, they can disable the 'irq-info' option:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<idle>-0 [000] 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] 49.309309: need_resched <-mwait_idle
<idle>-0 [000] 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] 49.309315: need_resched <-mwait_idle
Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-11-17 09:34:33 -05:00
TRACE_ITER_GRAPH_TIME | TRACE_ITER_RECORD_CMD | TRACE_ITER_OVERWRITE |
2012-09-07 18:12:19 -07:00
TRACE_ITER_IRQ_INFO | TRACE_ITER_MARKERS ;
2008-05-12 21:20:52 +02:00
2009-08-31 22:32:27 -04:00
static int trace_stop_count ;
2009-07-25 17:13:33 +02:00
static DEFINE_RAW_SPINLOCK ( tracing_start_lock ) ;
2009-08-31 22:32:27 -04:00
2012-11-01 20:54:21 -04:00
/**
* trace_wake_up - wake up tasks waiting for trace input
*
* Runs as an irq_work callback to wake up any task that is blocked on the
* trace_wait queue . This is used with trace_poll for tasks polling the
* trace .
*/
static void trace_wake_up ( struct irq_work * work )
2011-05-10 13:27:21 -07:00
{
2012-11-01 20:54:21 -04:00
wake_up_all ( & trace_wait ) ;
2011-05-10 13:27:21 -07:00
2012-11-01 20:54:21 -04:00
}
2011-05-10 13:27:21 -07:00
2012-02-22 15:50:28 -05:00
/**
* tracing_on - enable tracing buffers
*
* This function enables tracing buffers that may have been
* disabled with tracing_off .
*/
void tracing_on ( void )
{
if ( global_trace . buffer )
ring_buffer_record_on ( global_trace . buffer ) ;
/*
* This flag is only looked at when buffers haven ' t been
* allocated yet . We don ' t really care about the race
* between setting this flag and actually turning
* on the buffer .
*/
global_trace . buffer_disabled = 0 ;
}
EXPORT_SYMBOL_GPL ( tracing_on ) ;
/**
* tracing_off - turn off tracing buffers
*
* This function stops the tracing buffers from recording data .
* It does not disable any overhead the tracers themselves may
* be causing . This function simply causes all recording to
* the ring buffers to fail .
*/
void tracing_off ( void )
{
if ( global_trace . buffer )
2012-06-06 19:50:40 -04:00
ring_buffer_record_off ( global_trace . buffer ) ;
2012-02-22 15:50:28 -05:00
/*
* This flag is only looked at when buffers haven ' t been
* allocated yet . We don ' t really care about the race
* between setting this flag and actually turning
* off the buffer .
*/
global_trace . buffer_disabled = 1 ;
}
EXPORT_SYMBOL_GPL ( tracing_off ) ;
/**
* tracing_is_on - show state of ring buffers enabled
*/
int tracing_is_on ( void )
{
if ( global_trace . buffer )
return ring_buffer_record_is_on ( global_trace . buffer ) ;
return ! global_trace . buffer_disabled ;
}
EXPORT_SYMBOL_GPL ( tracing_is_on ) ;
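tracing_on(), tracing_off() and tracing_is_on() share one pattern: use the ring buffer's own state when the buffer exists, and fall back to the buffer_disabled flag before it is allocated. A simplified user-space sketch of that pattern follows; the struct and function names are hypothetical, and the ring buffer is reduced to a single boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the ring buffer: only its record switch. */
struct ring_sketch {
	bool record_on;
};

/* Hypothetical stand-in for the relevant fields of struct trace_array. */
struct trace_array_sketch {
	struct ring_sketch *buffer;	/* NULL until buffers are allocated */
	int buffer_disabled;		/* consulted only while buffer is NULL */
};

static void tracing_set_sketch(struct trace_array_sketch *tr, bool on)
{
	assert(tr);

	if (tr->buffer)
		tr->buffer->record_on = on;
	/* Remember the request in case the buffer does not exist yet. */
	tr->buffer_disabled = !on;
}

static int tracing_is_on_sketch(const struct trace_array_sketch *tr)
{
	assert(tr);

	if (tr->buffer)
		return tr->buffer->record_on;
	return !tr->buffer_disabled;
}
```

The flag lets early callers turn tracing off before allocation and still get a consistent answer from the query function.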
2008-09-29 23:02:41 -04:00
static int __init set_buf_size ( char * str )
2008-05-12 21:20:42 +02:00
{
2008-09-29 23:02:41 -04:00
unsigned long buf_size ;
2008-05-12 21:21:00 +02:00
2008-05-12 21:20:42 +02:00
if ( ! str )
return 0 ;
2009-06-24 17:33:15 +08:00
buf_size = memparse ( str , & str ) ;
2008-05-12 21:21:00 +02:00
/* buf_size can not be zero */
2009-06-24 17:33:15 +08:00
if ( buf_size = = 0 )
2008-05-12 21:21:00 +02:00
return 0 ;
2008-09-29 23:02:41 -04:00
trace_buf_size = buf_size ;
2008-05-12 21:20:42 +02:00
return 1 ;
}
2008-09-29 23:02:41 -04:00
__setup ( " trace_buf_size= " , set_buf_size ) ;
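set_buf_size() relies on memparse(), which reads a number with an optional size suffix. The sketch below approximates that behavior in user space with strtoull() and handles only the K, M and G suffixes; parse_size_sketch() and set_buf_size_sketch() are hypothetical names, and the real memparse() may accept more than is shown here.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a number with an optional K/M/G suffix, similar in spirit to memparse(). */
static unsigned long long parse_size_sketch(const char *s, char **end)
{
	unsigned long long val = strtoull(s, end, 0);
	unsigned int shift = 0;

	switch (**end) {
	case 'G': case 'g': shift = 30; break;
	case 'M': case 'm': shift = 20; break;
	case 'K': case 'k': shift = 10; break;
	default: break;
	}

	if (shift) {
		val <<= shift;
		(*end)++;	/* consume the suffix character */
	}
	return val;
}

/* Same control flow as set_buf_size(): reject NULL and zero sizes. */
static int set_buf_size_sketch(const char *str, unsigned long long *out)
{
	char *end;
	unsigned long long size;

	assert(out);
	if (!str)
		return 0;

	size = parse_size_sketch(str, &end);
	if (size == 0)
		return 0;

	*out = size;
	return 1;
}
```

Under these assumptions, a command line value such as "trace_buf_size=1M" would yield 1048576 bytes, while "0" or a missing value leaves the default in place.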
2008-05-12 21:20:42 +02:00
2010-02-25 15:36:43 -08:00
static int __init set_tracing_thresh ( char * str )
{
2012-08-02 14:02:00 +08:00
unsigned long threshold ;
2010-02-25 15:36:43 -08:00
int ret ;
if ( ! str )
return 0 ;
2012-09-26 22:08:38 +02:00
ret = kstrtoul ( str , 0 , & threshold ) ;
2010-02-25 15:36:43 -08:00
if ( ret < 0 )
return 0 ;
2012-08-02 14:02:00 +08:00
tracing_thresh = threshold * 1000 ;
2010-02-25 15:36:43 -08:00
return 1 ;
}
__setup ( " tracing_thresh= " , set_tracing_thresh ) ;
2008-05-12 21:20:44 +02:00
unsigned long nsecs_to_usecs ( unsigned long nsecs )
{
return nsecs / 1000 ;
}
2008-05-12 21:21:00 +02:00
/* These must match the bit positions in trace_iterator_flags */
2008-05-12 21:20:42 +02:00
static const char * trace_options [ ] = {
" print-parent " ,
" sym-offset " ,
" sym-addr " ,
" verbose " ,
2008-05-12 21:20:47 +02:00
" raw " ,
2008-05-12 21:20:49 +02:00
" hex " ,
2008-05-12 21:20:47 +02:00
" bin " ,
2008-05-12 21:20:49 +02:00
" block " ,
2008-05-12 21:20:51 +02:00
" stacktrace " ,
2009-03-05 10:24:48 +01:00
" trace_printk " ,
ftrace: function tracer with irqs disabled
Impact: disable interrupts during trace entry creation (as opposed to preempt)
To help with performance, I set the ftracer to not disable interrupts,
and only to disable preemption. If an interrupt occurred, it would not
be traced, because the function tracer protects itself from recursion.
This may be faster, but the trace output might miss some traces.
This patch makes the function trace disable interrupts, but it also
adds a runtime feature to disable preemption instead. It does this by
having two different tracer functions. When the function tracer is
enabled, it will check to see which version is requested (irqs disabled
or preemption disabled). Then it will use the corresponding function
as the tracer.
Irq disabling is the default behaviour, but if the user wants better
performance, with the chance of missing traces, then they can choose
the preempt disabled version.
Running hackbench 3 times with the irqs disabled and 3 times with
the preempt disabled function tracer yielded:
tracing type       times    entries recorded
------------       ------   ----------------
irq disabled       43.393   166433066
                   43.282   166172618
                   43.298   166256704
preempt disabled   38.969   159871710
                   38.943   159972935
                   39.325   161056510
Average:
  irqs disabled:     43.324   166287462
  preempt disabled:  39.079   160300385
preempt is 10.8 percent faster than irqs disabled.
I wrote a patch to count function trace recursion and reran hackbench.
With irq disabled: 1,150 times the function tracer did not trace due to
recursion.
With preempt disabled: 5,117,718 times.
The thousand times with irq disabled could be due to NMIs, or simply a case
where it called a function that was not protected by notrace.
But we also see that a large amount of the trace is lost with the
preempt version.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-03 23:15:57 -05:00
" ftrace_preempt " ,
2008-11-12 15:24:24 -05:00
" branch " ,
2008-11-12 17:52:38 -05:00
" annotate " ,
2008-11-22 13:28:47 +02:00
" userstacktrace " ,
2008-11-22 13:28:48 +02:00
" sym-userobj " ,
2008-12-13 20:18:13 +01:00
" printk-msg-only " ,
2009-02-02 20:29:21 -02:00
" context-info " ,
2009-03-04 20:34:24 -05:00
" latency-format " ,
2009-03-24 11:06:24 -04:00
" sleep-time " ,
2009-03-24 23:17:58 -04:00
" graph-time " ,
2010-07-02 11:07:32 +08:00
" record-cmd " ,
2010-12-08 13:46:47 -08:00
" overwrite " ,
2011-06-14 22:44:07 -04:00
" disable_on_free " ,
tracing: Add irq, preempt-count and need resched info to default trace output
People keep asking how to get the preempt count, irq, and need resched info
and we keep telling them to enable the latency format. Some developers think
that traces without this info are completely useless, and for a lot of tasks
they are useless.
The first option was to enable the latency trace as the default format, but
the header for the latency format is pretty useless for most tracers and
it also does the timestamp in straight microseconds from the time the trace
started. This is sometimes more difficult to read as the default trace is
seconds from the start of boot up.
Latency format:
# tracer: nop
#
# nop latency trace v1.1.5 on 3.2.0-rc1-test+
# --------------------------------------------------------------------
# latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule
migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch
migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch
migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule
default format:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule
migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule
migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule
The latency format header has latency information that is pretty meaningless
for most tracers. Although some of the header is useful, and we can add that
later to the default format as well.
What is really useful with the latency format is the irqs-off, need-resched
hard/softirq context and the preempt count.
This commit adds the option irq-info which is on by default that adds this
information:
# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<idle>-0 [000] d..2 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] d..2 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] d..2 49.309309: need_resched <-mwait_idle
<idle>-0 [000] d..2 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] d..2 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] d..2 49.309315: need_resched <-mwait_idle
If a user wants the old format, they can disable the 'irq-info' option:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<idle>-0 [000] 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] 49.309309: need_resched <-mwait_idle
<idle>-0 [000] 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] 49.309315: need_resched <-mwait_idle
Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-11-17 09:34:33 -05:00
" irq-info " ,
2012-09-07 18:12:19 -07:00
" markers " ,
2008-05-12 21:20:42 +02:00
NULL
} ;
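Because the table above must stay in step with the bit positions in trace_iterator_flags, the index of a name is also its bit number. A minimal sketch of that name-to-mask lookup follows; demo_options is a shortened, hypothetical copy of the table and option_mask() is an illustrative helper, not a function from this file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, shortened table: entry i corresponds to bit (1 << i). */
static const char *demo_options[] = {
	"print-parent",
	"sym-offset",
	"sym-addr",
	"verbose",
	NULL
};

/* Return the flag mask for @name, or 0 if the name is not in the table. */
static unsigned long option_mask(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; demo_options[i]; i++) {
		if (strcmp(demo_options[i], name) == 0)
			return 1UL << i;
	}
	return 0;
}
```

Inserting a name in the middle of such a table shifts every later bit, which is why new options are appended just before the NULL terminator.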
2009-08-25 16:12:56 +08:00
static struct {
u64 ( * func ) ( void ) ;
const char * name ;
2012-11-13 12:18:22 -08:00
int in_ns ; /* is this clock in nanoseconds? */
2009-08-25 16:12:56 +08:00
} trace_clocks [ ] = {
2012-11-13 12:18:22 -08:00
{ trace_clock_local , " local " , 1 } ,
{ trace_clock_global , " global " , 1 } ,
{ trace_clock_counter , " counter " , 0 } ,
2012-11-13 12:18:21 -08:00
ARCH_TRACE_CLOCKS
2009-08-25 16:12:56 +08:00
} ;
int trace_clock_id ;
2009-09-11 17:29:27 +02:00
/*
* trace_parser_get_init - gets the buffer for trace parser
*/
int trace_parser_get_init ( struct trace_parser * parser , int size )
{
memset ( parser , 0 , sizeof ( * parser ) ) ;
parser - > buffer = kmalloc ( size , GFP_KERNEL ) ;
if ( ! parser - > buffer )
return 1 ;
parser - > size = size ;
return 0 ;
}
/*
* trace_parser_put - frees the buffer for trace parser
*/
void trace_parser_put ( struct trace_parser * parser )
{
kfree ( parser - > buffer ) ;
}
/*
* trace_get_user - reads the user input string separated by space
* ( matched by isspace ( ch ) )
*
* For each string found the ' struct trace_parser ' is updated ,
* and the function returns .
*
* Returns number of bytes read .
*
* See kernel / trace / trace . h for ' struct trace_parser ' details .
*/
int trace_get_user ( struct trace_parser * parser , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
char ch ;
size_t read = 0 ;
ssize_t ret ;
if ( ! * ppos )
trace_parser_clear ( parser ) ;
ret = get_user ( ch , ubuf + + ) ;
if ( ret )
goto out ;
read + + ;
cnt - - ;
/*
* The parser is not finished with the last write ,
* continue reading the user input without skipping spaces .
*/
if ( ! parser - > cont ) {
/* skip white space */
while ( cnt & & isspace ( ch ) ) {
ret = get_user ( ch , ubuf + + ) ;
if ( ret )
goto out ;
read + + ;
cnt - - ;
}
/* only spaces were written */
if ( isspace ( ch ) ) {
* ppos + = read ;
ret = read ;
goto out ;
}
parser - > idx = 0 ;
}
/* read the non-space input */
while ( cnt & & ! isspace ( ch ) ) {
2009-09-22 13:51:54 +08:00
if ( parser - > idx < parser - > size - 1 )
2009-09-11 17:29:27 +02:00
parser - > buffer [ parser - > idx + + ] = ch ;
else {
ret = - EINVAL ;
goto out ;
}
ret = get_user ( ch , ubuf + + ) ;
if ( ret )
goto out ;
read + + ;
cnt - - ;
}
/* We either got finished input or we have to wait for another call. */
if ( isspace ( ch ) ) {
parser - > buffer [ parser - > idx ] = 0 ;
parser - > cont = false ;
} else if ( parser - > idx < parser - > size - 1 ) {
parser - > cont = true ;
parser - > buffer [ parser - > idx + + ] = ch ;
} else {
ret = - EINVAL ;
goto out ;
}
* ppos + = read ;
ret = read ;
out :
return ret ;
}
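The continuation logic in trace_get_user() is easier to follow outside the kernel. The sketch below is a userspace approximation under stated assumptions: it reads from an ordinary memory buffer instead of using get_user(), uses a fixed 16-byte token buffer, and returns -1 where the kernel returns -EINVAL. The demo_ names are hypothetical. The sketch also bounds-checks the carried-over character at the end of a chunk.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUF_SIZE 16	/* hypothetical fixed size */

struct demo_parser {
	char buffer[DEMO_BUF_SIZE];
	size_t idx;
	bool cont;	/* true when the previous chunk ended mid-token */
};

static void demo_parser_init(struct demo_parser *p)
{
	assert(p != NULL);
	memset(p, 0, sizeof(*p));
}

/* Consume one chunk; return bytes consumed, or -1 if the token is too long. */
static long demo_get_token(struct demo_parser *p, const char *buf, size_t cnt)
{
	size_t read = 0;
	char ch;

	assert(p != NULL && buf != NULL && cnt > 0);
	ch = buf[read++];
	cnt--;

	if (!p->cont) {
		/* Starting a new token: skip leading white space. */
		while (cnt && isspace((unsigned char)ch)) {
			ch = buf[read++];
			cnt--;
		}
		if (isspace((unsigned char)ch))
			return (long)read;	/* only white space was written */
		p->idx = 0;
	}

	/* Copy the non-space input. */
	while (cnt && !isspace((unsigned char)ch)) {
		if (p->idx >= DEMO_BUF_SIZE - 1)
			return -1;
		p->buffer[p->idx++] = ch;
		ch = buf[read++];
		cnt--;
	}

	if (isspace((unsigned char)ch)) {
		/* Token complete. */
		p->buffer[p->idx] = '\0';
		p->cont = false;
	} else if (p->idx < DEMO_BUF_SIZE - 1) {
		/* Chunk ended mid-token: keep the character and wait. */
		p->buffer[p->idx++] = ch;
		p->cont = true;
	} else {
		return -1;
	}
	return (long)read;
}
```

A token split across two writes, such as "ab" followed by "c ", is reassembled as "abc" because the second call skips the white-space stripping while cont is set.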
2008-05-12 21:21:02 +02:00
ssize_t trace_seq_to_user ( struct trace_seq * s , char __user * ubuf , size_t cnt )
{
int len ;
int ret ;
2009-03-04 19:10:05 -05:00
if ( ! cnt )
return 0 ;
2008-05-12 21:21:02 +02:00
if ( s - > len < = s - > readpos )
return - EBUSY ;
len = s - > len - s - > readpos ;
if ( cnt > len )
cnt = len ;
ret = copy_to_user ( ubuf , s - > buffer + s - > readpos , cnt ) ;
2009-03-04 19:10:05 -05:00
if ( ret = = cnt )
2008-05-12 21:21:02 +02:00
return - EFAULT ;
2009-03-04 19:10:05 -05:00
cnt - = ret ;
2009-03-04 20:31:11 -05:00
s - > readpos + = cnt ;
2008-05-12 21:21:02 +02:00
return cnt ;
2008-05-12 21:20:46 +02:00
}
2009-03-22 19:11:11 +02:00
static ssize_t trace_seq_to_buffer ( struct trace_seq * s , void * buf , size_t cnt )
2009-02-09 08:15:56 +02:00
{
int len ;
if ( s - > len < = s - > readpos )
return - EBUSY ;
len = s - > len - s - > readpos ;
if ( cnt > len )
cnt = len ;
2012-04-20 09:31:45 +03:00
memcpy ( buf , s - > buffer + s - > readpos , cnt ) ;
2009-02-09 08:15:56 +02:00
2009-03-04 20:31:11 -05:00
s - > readpos + = cnt ;
2009-02-09 08:15:56 +02:00
return cnt ;
}
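Both trace_seq_to_user() and trace_seq_to_buffer() drain a sequence through a read position that advances by the number of bytes handed out. A minimal userspace sketch of that pattern is shown here; struct demo_seq and demo_seq_read() are hypothetical, and -1 stands in for the kernel's -EBUSY.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_seq {
	char buffer[64];
	size_t len;	/* bytes written into buffer */
	size_t readpos;	/* bytes already handed to the reader */
};

/* Copy up to @cnt unread bytes into @buf; return the count, or -1 if drained. */
static long demo_seq_read(struct demo_seq *s, void *buf, size_t cnt)
{
	size_t avail;

	assert(s != NULL && buf != NULL);
	if (s->len <= s->readpos)
		return -1;

	avail = s->len - s->readpos;
	if (cnt > avail)
		cnt = avail;

	memcpy(buf, s->buffer + s->readpos, cnt);
	s->readpos += cnt;
	return (long)cnt;
}
```

Repeated calls therefore return successive slices of the sequence, and the first call after the last byte reports that nothing is left.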
2009-08-27 16:52:21 -04:00
/*
* ftrace_max_lock is used to protect the swapping of buffers
* when taking a max snapshot . The buffers themselves are
* protected by per_cpu spinlocks . But the action of the swap
* needs its own lock .
*
2009-12-02 19:49:50 +01:00
* This is defined as an arch_spinlock_t in order to help
2009-08-27 16:52:21 -04:00
* with performance when lockdep debugging is enabled .
*
* It is also used in other places outside the update_max_tr
* so it needs to be defined outside of the
* CONFIG_TRACER_MAX_TRACE .
*/
2009-12-02 19:49:50 +01:00
static arch_spinlock_t ftrace_max_lock =
2009-12-03 12:38:57 +01:00
( arch_spinlock_t ) __ARCH_SPIN_LOCK_UNLOCKED ;
2009-08-27 16:52:21 -04:00
2010-02-25 15:36:43 -08:00
unsigned long __read_mostly tracing_thresh ;
2009-08-27 16:52:21 -04:00
# ifdef CONFIG_TRACER_MAX_TRACE
unsigned long __read_mostly tracing_max_latency ;
/*
* Copy the new maximum trace into the separate maximum - trace
* structure . ( this way the maximum trace is permanently saved ,
* for later retrieval via / sys / kernel / debug / tracing / latency_trace )
*/
static void
__update_max_tr ( struct trace_array * tr , struct task_struct * tsk , int cpu )
{
struct trace_array_cpu * data = tr - > data [ cpu ] ;
2010-03-05 18:23:50 -03:00
struct trace_array_cpu * max_data ;
2009-08-27 16:52:21 -04:00
max_tr . cpu = cpu ;
max_tr . time_start = data - > preempt_timestamp ;
2009-09-02 12:27:41 -04:00
max_data = max_tr . data [ cpu ] ;
max_data - > saved_latency = tracing_max_latency ;
max_data - > critical_start = data - > critical_start ;
max_data - > critical_end = data - > critical_end ;
2009-08-27 16:52:21 -04:00
2010-03-05 18:23:50 -03:00
memcpy ( max_data - > comm , tsk - > comm , TASK_COMM_LEN ) ;
2009-09-02 12:27:41 -04:00
max_data - > pid = tsk - > pid ;
max_data - > uid = task_uid ( tsk ) ;
max_data - > nice = tsk - > static_prio - 20 - MAX_RT_PRIO ;
max_data - > policy = tsk - > policy ;
max_data - > rt_priority = tsk - > rt_priority ;
2009-08-27 16:52:21 -04:00
/* record this task ' s comm */
tracing_record_cmdline ( tsk ) ;
}
2008-05-12 21:21:00 +02:00
/**
* update_max_tr - snapshot all trace buffers from global_trace to max_tr
* @ tr : tracer
* @ tsk : the task with the latency
* @ cpu : The cpu that initiated the trace .
*
* Flip the buffers between the @ tr and the max_tr and record information
* about which task was the cause of this latency .
*/
2008-05-12 21:20:51 +02:00
void
2008-05-12 21:20:42 +02:00
update_max_tr ( struct trace_array * tr , struct task_struct * tsk , int cpu )
{
2008-09-29 23:02:41 -04:00
struct ring_buffer * buf = tr - > buffer ;
2008-05-12 21:20:42 +02:00
2009-08-31 22:32:27 -04:00
if ( trace_stop_count )
return ;
2008-05-12 21:20:43 +02:00
WARN_ON_ONCE ( ! irqs_disabled ( ) ) ;
2010-07-01 14:34:35 +09:00
if ( ! current_trace - > use_max_tr ) {
WARN_ON_ONCE ( 1 ) ;
return ;
}
2009-12-02 20:01:25 +01:00
arch_spin_lock ( & ftrace_max_lock ) ;
2008-09-29 23:02:41 -04:00
tr - > buffer = max_tr . buffer ;
max_tr . buffer = buf ;
2008-05-12 21:20:42 +02:00
__update_max_tr ( tr , tsk , cpu ) ;
2009-12-02 20:01:25 +01:00
arch_spin_unlock ( & ftrace_max_lock ) ;
2008-05-12 21:20:42 +02:00
}
/**
* update_max_tr_single - only copy one trace over , and reset the rest
* @ tr - tracer
* @ tsk - task with the latency
* @ cpu - the cpu of the buffer to copy .
2008-05-12 21:21:00 +02:00
*
* Flip the trace of a single CPU buffer between the @ tr and the max_tr .
2008-05-12 21:20:42 +02:00
*/
2008-05-12 21:20:51 +02:00
void
2008-05-12 21:20:42 +02:00
update_max_tr_single ( struct trace_array * tr , struct task_struct * tsk , int cpu )
{
2008-09-29 23:02:41 -04:00
int ret ;
2008-05-12 21:20:42 +02:00
2009-08-31 22:32:27 -04:00
if ( trace_stop_count )
return ;
2008-05-12 21:20:43 +02:00
WARN_ON_ONCE ( ! irqs_disabled ( ) ) ;
2010-07-01 14:34:35 +09:00
if ( ! current_trace - > use_max_tr ) {
WARN_ON_ONCE ( 1 ) ;
return ;
}
2009-12-02 20:01:25 +01:00
arch_spin_lock ( & ftrace_max_lock ) ;
2008-05-12 21:20:42 +02:00
2008-09-29 23:02:41 -04:00
ret = ring_buffer_swap_cpu ( max_tr . buffer , tr - > buffer , cpu ) ;
2009-09-03 19:13:05 -04:00
if ( ret = = - EBUSY ) {
/*
* We failed to swap the buffer due to a commit taking
* place on this CPU . We fail to record , but we reset
* the max trace buffer ( no one writes directly to it )
* and flag that it failed .
*/
trace_array_printk ( & max_tr , _THIS_IP_ ,
" Failed to swap buffers due to commit in progress \n " ) ;
}
WARN_ON_ONCE ( ret & & ret ! = - EAGAIN & & ret ! = - EBUSY ) ;
2008-05-12 21:20:42 +02:00
__update_max_tr ( tr , tsk , cpu ) ;
2009-12-02 20:01:25 +01:00
arch_spin_unlock ( & ftrace_max_lock ) ;
2008-05-12 21:20:42 +02:00
}
2009-08-27 16:52:21 -04:00
# endif /* CONFIG_TRACER_MAX_TRACE */
2008-05-12 21:20:42 +02:00
2012-11-01 20:54:21 -04:00
static void default_wait_pipe ( struct trace_iterator * iter )
{
DEFINE_WAIT ( wait ) ;
prepare_to_wait ( & trace_wait , & wait , TASK_INTERRUPTIBLE ) ;
/*
* The events can happen in critical sections where
* checking a work queue can cause deadlocks .
* After adding a task to the queue , this flag is set
* only to notify events to try to wake up the queue
* using irq_work .
*
* We don ' t clear it even if the buffer is no longer
* empty . The flag only causes the next event to run
* irq_work to do the work queue wake up . The worst
* that can happen if we race with ! trace_empty ( ) is that
* an event will cause an irq_work to try to wake up
* an empty queue .
*
* There ' s no reason to protect this flag either , as
* the work queue and irq_work logic will do the necessary
* synchronization for the wake ups . The only thing
* that is necessary is that the wake up happens after
* a task has been queued . It ' s OK for spurious wake ups .
*/
trace_wakeup_needed = true ;
if ( trace_empty ( iter ) )
schedule ( ) ;
finish_wait ( & trace_wait , & wait ) ;
}
2008-05-12 21:21:00 +02:00
/**
* register_tracer - register a tracer with the ftrace system .
* @ type - the plugin for the tracer
*
* Register a new plugin tracer .
*/
2008-05-12 21:20:42 +02:00
int register_tracer ( struct tracer * type )
{
struct tracer * t ;
int ret = 0 ;
if ( ! type - > name ) {
pr_info ( " Tracer must have a name \n " ) ;
return - 1 ;
}
2010-07-10 12:06:44 +02:00
if ( strlen ( type - > name ) > = MAX_TRACER_SIZE ) {
2009-09-18 14:06:47 +08:00
pr_info ( " Tracer has a name longer than %d \n " , MAX_TRACER_SIZE ) ;
return - 1 ;
}
2008-05-12 21:20:42 +02:00
mutex_lock ( & trace_types_lock ) ;
2008-11-19 10:00:15 +01:00
2008-12-06 03:41:33 +01:00
tracing_selftest_running = true ;
2008-05-12 21:20:42 +02:00
for ( t = trace_types ; t ; t = t - > next ) {
if ( strcmp ( type - > name , t - > name ) = = 0 ) {
/* already found */
2009-09-18 14:06:47 +08:00
pr_info ( " Tracer %s already registered \n " ,
2008-05-12 21:20:42 +02:00
type - > name ) ;
ret = - 1 ;
goto out ;
}
}
2008-11-17 19:23:42 +01:00
if ( ! type - > set_flag )
type - > set_flag = & dummy_set_flag ;
if ( ! type - > flags )
type - > flags = & dummy_tracer_flags ;
else
if ( ! type - > flags - > opts )
type - > flags - > opts = dummy_tracer_opt ;
2009-02-11 02:25:00 +01:00
if ( ! type - > wait_pipe )
type - > wait_pipe = default_wait_pipe ;
2008-11-17 19:23:42 +01:00
2008-05-12 21:20:44 +02:00
# ifdef CONFIG_FTRACE_STARTUP_TEST
2009-02-02 21:38:32 -05:00
if ( type - > selftest & & ! tracing_selftest_disabled ) {
2008-05-12 21:20:44 +02:00
struct tracer * saved_tracer = current_trace ;
struct trace_array * tr = & global_trace ;
2008-12-04 23:47:35 +01:00
2008-05-12 21:20:44 +02:00
/*
* Run a selftest on this tracer .
* Here we reset the trace buffer , and set the current
* tracer to be this tracer . The tracer can then run some
* internal tracing to verify that everything is in order .
* If we fail , we do not register this tracer .
*/
2009-09-04 12:12:39 -04:00
tracing_reset_online_cpus ( tr ) ;
2008-11-19 10:00:15 +01:00
2008-05-12 21:20:44 +02:00
current_trace = type ;
2011-03-09 20:09:26 -05:00
/* If we expanded the buffers, make sure the max is expanded too */
if ( ring_buffer_expanded & & type - > use_max_tr )
2012-02-02 12:00:41 -08:00
ring_buffer_resize ( max_tr . buffer , trace_buf_size ,
RING_BUFFER_ALL_CPUS ) ;
2011-03-09 20:09:26 -05:00
2008-05-12 21:20:44 +02:00
/* the test is responsible for initializing and enabling */
pr_info ( " Testing tracer %s: " , type - > name ) ;
ret = type - > selftest ( type , tr ) ;
/* the test is responsible for resetting too */
current_trace = saved_tracer ;
if ( ret ) {
printk ( KERN_CONT " FAILED! \n " ) ;
2012-06-18 09:28:16 -04:00
/* Add the warning after printing 'FAILED' */
WARN_ON ( 1 ) ;
2008-05-12 21:20:44 +02:00
goto out ;
}
2008-05-12 21:20:45 +02:00
/* Only reset on passing, to avoid touching corrupted buffers */
2009-09-04 12:12:39 -04:00
tracing_reset_online_cpus ( tr ) ;
2008-11-19 10:00:15 +01:00
2011-03-09 20:09:26 -05:00
/* Shrink the max buffer again */
if ( ring_buffer_expanded & & type - > use_max_tr )
2012-02-02 12:00:41 -08:00
ring_buffer_resize ( max_tr . buffer , 1 ,
RING_BUFFER_ALL_CPUS ) ;
2011-03-09 20:09:26 -05:00
2008-05-12 21:20:44 +02:00
printk ( KERN_CONT " PASSED \n " ) ;
}
# endif
2008-05-12 21:20:42 +02:00
type - > next = trace_types ;
trace_types = type ;
2008-05-12 21:20:44 +02:00
2008-05-12 21:20:42 +02:00
out :
2008-12-06 03:41:33 +01:00
tracing_selftest_running = false ;
2008-05-12 21:20:42 +02:00
mutex_unlock ( & trace_types_lock ) ;
2009-02-05 01:13:38 -05:00
if ( ret | | ! default_bootup_tracer )
goto out_unlock ;
2009-09-18 14:06:47 +08:00
if ( strncmp ( default_bootup_tracer , type - > name , MAX_TRACER_SIZE ) )
2009-02-05 01:13:38 -05:00
goto out_unlock ;
printk ( KERN_INFO " Starting tracer '%s' \n " , type - > name ) ;
/* Do we want this tracer to start on bootup? */
tracing_set_tracer ( type - > name ) ;
default_bootup_tracer = NULL ;
/* disable other selftests, since this will break them. */
tracing_selftest_disabled = 1 ;
2009-02-02 21:38:32 -05:00
# ifdef CONFIG_FTRACE_STARTUP_TEST
2009-02-05 01:13:38 -05:00
printk ( KERN_INFO " Disabling FTRACE selftests due to running tracer '%s' \n " ,
type - > name ) ;
2009-02-02 21:38:32 -05:00
# endif
2009-02-05 01:13:38 -05:00
out_unlock :
2008-05-12 21:20:42 +02:00
return ret ;
}
2009-09-04 12:35:16 -04:00
void tracing_reset ( struct trace_array * tr , int cpu )
{
struct ring_buffer * buffer = tr - > buffer ;
ring_buffer_record_disable ( buffer ) ;
/* Make sure all commits have finished */
synchronize_sched ( ) ;
2012-05-08 20:57:53 -04:00
ring_buffer_reset_cpu ( buffer , cpu ) ;
2009-09-04 12:35:16 -04:00
ring_buffer_record_enable ( buffer ) ;
}
2008-12-19 12:08:39 +02:00
void tracing_reset_online_cpus ( struct trace_array * tr )
{
2009-09-04 12:02:35 -04:00
struct ring_buffer * buffer = tr - > buffer ;
2008-12-19 12:08:39 +02:00
int cpu ;
2009-09-04 12:02:35 -04:00
ring_buffer_record_disable ( buffer ) ;
/* Make sure all commits have finished */
synchronize_sched ( ) ;
2008-12-19 12:08:39 +02:00
tr - > time_start = ftrace_now ( tr - > cpu ) ;
for_each_online_cpu ( cpu )
2012-05-08 20:57:53 -04:00
ring_buffer_reset_cpu ( buffer , cpu ) ;
2009-09-04 12:02:35 -04:00
ring_buffer_record_enable ( buffer ) ;
2008-12-19 12:08:39 +02:00
}
2009-05-06 21:54:09 -04:00
void tracing_reset_current ( int cpu )
{
tracing_reset ( & global_trace , cpu ) ;
}
void tracing_reset_current_online_cpus ( void )
{
tracing_reset_online_cpus ( & global_trace ) ;
}
2008-05-12 21:20:42 +02:00
# define SAVED_CMDLINES 128
2009-03-18 09:03:19 +01:00
# define NO_CMDLINE_MAP UINT_MAX
2008-05-12 21:20:42 +02:00
static unsigned map_pid_to_cmdline [ PID_MAX_DEFAULT + 1 ] ;
static unsigned map_cmdline_to_pid [ SAVED_CMDLINES ] ;
static char saved_cmdlines [ SAVED_CMDLINES ] [ TASK_COMM_LEN ] ;
static int cmdline_idx ;
2009-12-03 12:38:57 +01:00
static arch_spinlock_t trace_cmdline_lock = __ARCH_SPIN_LOCK_UNLOCKED ;
2008-05-12 21:21:00 +02:00
/* temporarily disable recording */
2009-02-10 19:44:12 +01:00
static atomic_t trace_record_cmdline_disabled __read_mostly ;
2008-05-12 21:20:42 +02:00
static void trace_init_cmdlines ( void )
{
2009-03-18 09:03:19 +01:00
memset ( & map_pid_to_cmdline , NO_CMDLINE_MAP , sizeof ( map_pid_to_cmdline ) ) ;
memset ( & map_cmdline_to_pid , NO_CMDLINE_MAP , sizeof ( map_cmdline_to_pid ) ) ;
2008-05-12 21:20:42 +02:00
cmdline_idx = 0 ;
}
2009-09-13 01:43:07 +02:00
int is_tracing_stopped ( void )
{
return trace_stop_count ;
}
2008-11-21 12:59:38 -05:00
/**
* ftrace_off_permanent - disable all ftrace code permanently
*
* This should only be called when a serious anomaly has
* been detected . This will turn off the function tracing ,
* ring buffers , and other tracing utilities . It takes no
* locks and can be called from any context .
*/
void ftrace_off_permanent ( void )
{
tracing_disabled = 1 ;
ftrace_stop ( ) ;
tracing_off_permanent ( ) ;
}
2008-11-05 16:05:44 -05:00
/**
* tracing_start - quick start of the tracer
*
* If tracing is enabled but was stopped by tracing_stop ,
* this will start the tracer back up .
*/
void tracing_start ( void )
{
struct ring_buffer * buffer ;
unsigned long flags ;
if ( tracing_disabled )
return ;
2009-07-25 17:13:33 +02:00
raw_spin_lock_irqsave ( & tracing_start_lock , flags ) ;
2009-01-22 14:26:15 -05:00
if ( - - trace_stop_count ) {
if ( trace_stop_count < 0 ) {
/* Someone screwed up their debugging */
WARN_ON_ONCE ( 1 ) ;
trace_stop_count = 0 ;
}
2008-11-05 16:05:44 -05:00
goto out ;
}
2010-03-12 19:56:00 -05:00
/* Prevent the buffers from switching */
arch_spin_lock ( & ftrace_max_lock ) ;
2008-11-05 16:05:44 -05:00
buffer = global_trace . buffer ;
if ( buffer )
ring_buffer_record_enable ( buffer ) ;
buffer = max_tr . buffer ;
if ( buffer )
ring_buffer_record_enable ( buffer ) ;
2010-03-12 19:56:00 -05:00
arch_spin_unlock ( & ftrace_max_lock ) ;
2008-11-05 16:05:44 -05:00
ftrace_start ( ) ;
out :
2009-07-25 17:13:33 +02:00
raw_spin_unlock_irqrestore ( & tracing_start_lock , flags ) ;
2008-11-05 16:05:44 -05:00
}
/**
* tracing_stop - quick stop of the tracer
*
* Lightweight way to stop tracing . Use in conjunction with
* tracing_start .
*/
void tracing_stop ( void )
{
struct ring_buffer * buffer ;
unsigned long flags ;
ftrace_stop ( ) ;
2009-07-25 17:13:33 +02:00
raw_spin_lock_irqsave ( & tracing_start_lock , flags ) ;
2008-11-05 16:05:44 -05:00
if ( trace_stop_count + + )
goto out ;
2010-03-12 19:56:00 -05:00
/* Prevent the buffers from switching */
arch_spin_lock ( & ftrace_max_lock ) ;
2008-11-05 16:05:44 -05:00
buffer = global_trace . buffer ;
if ( buffer )
ring_buffer_record_disable ( buffer ) ;
buffer = max_tr . buffer ;
if ( buffer )
ring_buffer_record_disable ( buffer ) ;
2010-03-12 19:56:00 -05:00
arch_spin_unlock ( & ftrace_max_lock ) ;
2008-11-05 16:05:44 -05:00
out :
2009-07-25 17:13:33 +02:00
raw_spin_unlock_irqrestore ( & tracing_start_lock , flags ) ;
2008-11-05 16:05:44 -05:00
}
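tracing_stop() and tracing_start() nest through trace_stop_count: only the first stop disables recording and only the matching last start re-enables it. The sketch below shows just that counting rule in a single-threaded userspace form. It omits the locking and the ring buffer calls, and the demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

static int demo_stop_count;
static bool demo_recording = true;

static void demo_stop(void)
{
	assert(demo_stop_count >= 0);
	/* Only the first of several nested stops disables recording. */
	if (demo_stop_count++)
		return;
	demo_recording = false;
}

static void demo_start(void)
{
	if (--demo_stop_count) {
		/* An unbalanced start is clamped rather than left negative. */
		if (demo_stop_count < 0)
			demo_stop_count = 0;
		return;
	}
	/* The count reached zero: the outermost stop has been undone. */
	demo_recording = true;
}
```

Two stops followed by one start leave recording disabled; the second start turns it back on, and a third, unbalanced start changes nothing.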
2008-05-12 21:20:51 +02:00
void trace_stop_cmdline_recording ( void ) ;
2008-05-12 21:20:42 +02:00
2008-05-12 21:20:51 +02:00
static void trace_save_cmdline ( struct task_struct * tsk )
2008-05-12 21:20:42 +02:00
{
2009-03-18 09:00:41 +01:00
unsigned pid , idx ;
2008-05-12 21:20:42 +02:00
if ( ! tsk - > pid | | unlikely ( tsk - > pid > PID_MAX_DEFAULT ) )
return ;
/*
* It ' s not the end of the world if we don ' t get
* the lock , but we also don ' t want to spin
* nor do we want to disable interrupts ,
* so if we miss here , then better luck next time .
*/
2009-12-02 20:01:25 +01:00
if ( ! arch_spin_trylock ( & trace_cmdline_lock ) )
2008-05-12 21:20:42 +02:00
return ;
idx = map_pid_to_cmdline [ tsk - > pid ] ;
2009-03-18 09:03:19 +01:00
if ( idx = = NO_CMDLINE_MAP ) {
2008-05-12 21:20:42 +02:00
idx = ( cmdline_idx + 1 ) % SAVED_CMDLINES ;
2009-03-18 09:00:41 +01:00
/*
* Check whether the cmdline buffer at idx has a pid
* mapped . We are going to overwrite that entry so we
* need to clear the map_pid_to_cmdline . Otherwise we
* would read the new comm for the old pid .
*/
pid = map_cmdline_to_pid [ idx ] ;
if ( pid ! = NO_CMDLINE_MAP )
map_pid_to_cmdline [ pid ] = NO_CMDLINE_MAP ;
2008-05-12 21:20:42 +02:00
2009-03-18 09:00:41 +01:00
map_cmdline_to_pid [ idx ] = tsk - > pid ;
2008-05-12 21:20:42 +02:00
map_pid_to_cmdline [ tsk - > pid ] = idx ;
cmdline_idx = idx ;
}
memcpy ( & saved_cmdlines [ idx ] , tsk - > comm , TASK_COMM_LEN ) ;
2009-12-02 20:01:25 +01:00
arch_spin_unlock ( & trace_cmdline_lock ) ;
2008-05-12 21:20:42 +02:00
}
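trace_save_cmdline() keeps two maps, pid to slot and slot to pid, so that reusing a slot can invalidate the pid that owned it before. The following userspace sketch shows that eviction step with deliberately tiny tables. The demo_ names and sizes are hypothetical, there is no locking, and strncpy() replaces the kernel's fixed-length memcpy() of tsk->comm.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define DEMO_PID_MAX	63	/* hypothetical; the kernel uses PID_MAX_DEFAULT */
#define DEMO_SLOTS	2	/* hypothetical; the kernel uses SAVED_CMDLINES */
#define DEMO_COMM_LEN	16
#define DEMO_NO_MAP	UINT_MAX

static unsigned pid_to_slot[DEMO_PID_MAX + 1];
static unsigned slot_to_pid[DEMO_SLOTS];
static char saved[DEMO_SLOTS][DEMO_COMM_LEN];
static unsigned slot_idx;

static void demo_init(void)
{
	/* Filling every byte with 0xff yields UINT_MAX in each entry. */
	memset(pid_to_slot, 0xff, sizeof(pid_to_slot));
	memset(slot_to_pid, 0xff, sizeof(slot_to_pid));
	slot_idx = 0;
}

static void demo_save(unsigned pid, const char *comm)
{
	unsigned idx, old;

	assert(comm != NULL);
	if (!pid || pid > DEMO_PID_MAX)
		return;

	idx = pid_to_slot[pid];
	if (idx == DEMO_NO_MAP) {
		/* Take the next slot and unmap whichever pid owned it. */
		idx = (slot_idx + 1) % DEMO_SLOTS;
		old = slot_to_pid[idx];
		if (old != DEMO_NO_MAP)
			pid_to_slot[old] = DEMO_NO_MAP;
		slot_to_pid[idx] = pid;
		pid_to_slot[pid] = idx;
		slot_idx = idx;
	}
	strncpy(saved[idx], comm, DEMO_COMM_LEN - 1);
	saved[idx][DEMO_COMM_LEN - 1] = '\0';
}

static const char *demo_find(unsigned pid)
{
	if (!pid)
		return "<idle>";
	if (pid > DEMO_PID_MAX || pid_to_slot[pid] == DEMO_NO_MAP)
		return "<...>";
	return saved[pid_to_slot[pid]];
}
```

Without the reverse map, a lookup for the evicted pid would return the name of the task that now occupies its old slot.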
2009-03-16 19:20:15 -04:00
void trace_find_cmdline ( int pid , char comm [ ] )
2008-05-12 21:20:42 +02:00
{
unsigned map ;
2009-03-16 19:20:15 -04:00
if ( ! pid ) {
strcpy ( comm , " <idle> " ) ;
return ;
}
2008-05-12 21:20:42 +02:00
2010-01-25 15:11:53 -05:00
if ( WARN_ON_ONCE ( pid < 0 ) ) {
strcpy ( comm , " <XXX> " ) ;
return ;
}
2009-03-16 19:20:15 -04:00
if ( pid > PID_MAX_DEFAULT ) {
strcpy ( comm , " <...> " ) ;
return ;
}
2008-05-12 21:20:42 +02:00
2009-05-26 17:28:02 +02:00
preempt_disable ( ) ;
2009-12-02 20:01:25 +01:00
arch_spin_lock ( & trace_cmdline_lock ) ;
2008-05-12 21:20:42 +02:00
map = map_pid_to_cmdline [ pid ] ;
2009-03-18 08:58:44 +01:00
if ( map ! = NO_CMDLINE_MAP )
strcpy ( comm , saved_cmdlines [ map ] ) ;
else
strcpy ( comm , " <...> " ) ;
2008-05-12 21:20:42 +02:00
2009-12-02 20:01:25 +01:00
arch_spin_unlock ( & trace_cmdline_lock ) ;
2009-05-26 17:28:02 +02:00
preempt_enable ( ) ;
2008-05-12 21:20:42 +02:00
}
2008-05-12 21:20:51 +02:00
void tracing_record_cmdline ( struct task_struct * tsk )
2008-05-12 21:20:42 +02:00
{
2012-05-11 14:25:30 -04:00
if ( atomic_read ( & trace_record_cmdline_disabled ) | | ! tracing_is_on ( ) )
2008-05-12 21:20:42 +02:00
return ;
2012-10-11 12:14:25 -04:00
if ( ! __this_cpu_read ( trace_cmdline_save ) )
return ;
__this_cpu_write ( trace_cmdline_save , false ) ;
2008-05-12 21:20:42 +02:00
trace_save_cmdline ( tsk ) ;
}
2008-09-16 21:56:41 +03:00
void
2008-10-01 13:14:09 -04:00
tracing_generic_entry_update ( struct trace_entry * entry , unsigned long flags ,
int pc )
2008-05-12 21:20:42 +02:00
{
struct task_struct * tsk = current ;
2008-09-29 23:02:42 -04:00
entry - > preempt_count = pc & 0xff ;
entry - > pid = ( tsk ) ? tsk - > pid : 0 ;
2011-05-05 23:55:18 -04:00
entry - > padding = 0 ;
2008-09-29 23:02:42 -04:00
entry - > flags =
2008-10-24 09:42:59 -04:00
# ifdef CONFIG_TRACE_IRQFLAGS_SUPPORT
2008-08-01 12:26:40 -04:00
( irqs_disabled_flags ( flags ) ? TRACE_FLAG_IRQS_OFF : 0 ) |
2008-10-24 09:42:59 -04:00
# else
TRACE_FLAG_IRQS_NOSUPPORT |
# endif
2008-05-12 21:20:42 +02:00
( ( pc & HARDIRQ_MASK ) ? TRACE_FLAG_HARDIRQ : 0 ) |
( ( pc & SOFTIRQ_MASK ) ? TRACE_FLAG_SOFTIRQ : 0 ) |
( need_resched ( ) ? TRACE_FLAG_NEED_RESCHED : 0 ) ;
}
2009-08-07 01:25:54 +02:00
EXPORT_SYMBOL_GPL ( tracing_generic_entry_update ) ;
2008-05-12 21:20:42 +02:00
2009-09-02 14:17:06 -04:00
struct ring_buffer_event *
trace_buffer_lock_reserve ( struct ring_buffer * buffer ,
int type ,
unsigned long len ,
unsigned long flags , int pc )
tracing: Introduce trace_buffer_{lock_reserve,unlock_commit}
Impact: new API
These new functions do what previously was being open coded, reducing
the number of details ftrace plugin writers have to worry about.
It also standardizes the handling of stacktrace, userstacktrace and
other trace options we may introduce in the future.
With this patch, for instance, the blk tracer (and some others already
in the tree) can use the "userstacktrace" /d/tracing/trace_options
facility.
$ codiff /tmp/vmlinux.before /tmp/vmlinux.after
linux-2.6-tip/kernel/trace/trace.c:
trace_vprintk | -5
trace_graph_return | -22
trace_graph_entry | -26
trace_function | -45
__ftrace_trace_stack | -27
ftrace_trace_userstack | -29
tracing_sched_switch_trace | -66
tracing_stop | +1
trace_seq_to_user | -1
ftrace_trace_special | -63
ftrace_special | +1
tracing_sched_wakeup_trace | -70
tracing_reset_online_cpus | -1
13 functions changed, 2 bytes added, 355 bytes removed, diff: -353
linux-2.6-tip/block/blktrace.c:
__blk_add_trace | -58
1 function changed, 58 bytes removed, diff: -58
linux-2.6-tip/kernel/trace/trace.c:
trace_buffer_lock_reserve | +88
trace_buffer_unlock_commit | +86
2 functions changed, 174 bytes added, diff: +174
/tmp/vmlinux.after:
16 functions changed, 176 bytes added, 413 bytes removed, diff: -237
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 16:14:13 -02:00
{
struct ring_buffer_event * event ;
2009-09-02 14:17:06 -04:00
event = ring_buffer_lock_reserve ( buffer , len ) ;
2009-02-05 16:14:13 -02:00
if ( event ! = NULL ) {
struct trace_entry * ent = ring_buffer_event_data ( event ) ;
tracing_generic_entry_update ( ent , flags , pc ) ;
ent - > type = type ;
}
return event ;
}
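The reserve/commit split used by trace_buffer_lock_reserve() can be sketched in user space. The sketch below is only an illustration under stated assumptions: every demo_* name, the fixed slot array, and the DEMO_* constants are hypothetical and are not part of the kernel ring buffer API. It shows the helper filling in the common header while the caller fills in only its own payload before committing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
#define DEMO_SLOTS   4
#define DEMO_TYPE_FN 1

struct demo_entry {
	int type;                 /* common header, set by the reserve helper */
	unsigned long flags;
	int pc;
	unsigned long ip;         /* payload, set by the caller */
	unsigned long parent_ip;
};

struct demo_ring {
	struct demo_entry slots[DEMO_SLOTS];
	size_t reserved;          /* slots handed out to writers */
	size_t committed;         /* slots made visible to readers */
};

/* Reserve one slot and fill in the common header; NULL when full. */
static struct demo_entry *demo_reserve(struct demo_ring *rb, int type,
				       unsigned long flags, int pc)
{
	struct demo_entry *ent;

	if (rb->reserved >= DEMO_SLOTS)
		return NULL;
	ent = &rb->slots[rb->reserved++];
	ent->type = type;
	ent->flags = flags;
	ent->pc = pc;
	return ent;
}

/* Publish the oldest reserved slot that has not been committed yet. */
static void demo_commit(struct demo_ring *rb)
{
	assert(rb->committed < rb->reserved);
	rb->committed++;
}

/* Caller side: reserve, bail out on failure, fill payload, commit. */
static int demo_trace_function(struct demo_ring *rb, unsigned long ip,
			       unsigned long parent_ip,
			       unsigned long flags, int pc)
{
	struct demo_entry *ent = demo_reserve(rb, DEMO_TYPE_FN, flags, pc);

	if (!ent)
		return -1;
	ent->ip = ip;
	ent->parent_ip = parent_ip;
	demo_commit(rb);
	return 0;
}
```

A caller never touches the header fields directly, which is the detail the helper is meant to hide from tracer plugin writers.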
2012-10-11 12:14:25 -04:00
void
__buffer_unlock_commit ( struct ring_buffer * buffer , struct ring_buffer_event * event )
{
__this_cpu_write ( trace_cmdline_save , true ) ;
2012-11-01 20:54:21 -04:00
if ( trace_wakeup_needed ) {
trace_wakeup_needed = false ;
/* irq_work_queue() supplies its own memory barriers */
irq_work_queue ( & trace_work_wakeup ) ;
}
2012-10-11 12:14:25 -04:00
ring_buffer_unlock_commit ( buffer , event ) ;
}
2009-09-02 14:17:06 -04:00
static inline void
__trace_buffer_unlock_commit ( struct ring_buffer * buffer ,
struct ring_buffer_event * event ,
2012-11-01 20:54:21 -04:00
unsigned long flags , int pc )
2009-02-05 16:14:13 -02:00
{
2012-10-11 12:14:25 -04:00
__buffer_unlock_commit ( buffer , event ) ;
2009-02-05 16:14:13 -02:00
2009-09-02 14:17:06 -04:00
ftrace_trace_stack ( buffer , flags , 6 , pc ) ;
ftrace_trace_userstack ( buffer , flags , pc ) ;
2009-03-22 23:10:46 +01:00
}
2009-09-02 14:17:06 -04:00
void trace_buffer_unlock_commit ( struct ring_buffer * buffer ,
struct ring_buffer_event * event ,
unsigned long flags , int pc )
2009-03-22 23:10:46 +01:00
{
2012-11-01 20:54:21 -04:00
__trace_buffer_unlock_commit ( buffer , event , flags , pc ) ;
2009-02-05 16:14:13 -02:00
}
2012-11-01 20:54:21 -04:00
EXPORT_SYMBOL_GPL ( trace_buffer_unlock_commit ) ;
2009-02-05 16:14:13 -02:00
2009-02-27 19:38:04 -05:00
struct ring_buffer_event *
2009-09-02 14:17:06 -04:00
trace_current_buffer_lock_reserve ( struct ring_buffer * * current_rb ,
int type , unsigned long len ,
2009-02-27 19:38:04 -05:00
unsigned long flags , int pc )
{
2009-09-02 14:17:06 -04:00
* current_rb = global_trace . buffer ;
return trace_buffer_lock_reserve ( * current_rb ,
2009-02-27 19:38:04 -05:00
type , len , flags , pc ) ;
}
2009-05-05 19:22:53 -04:00
EXPORT_SYMBOL_GPL ( trace_current_buffer_lock_reserve ) ;
2009-02-27 19:38:04 -05:00
2009-09-02 14:17:06 -04:00
void trace_current_buffer_unlock_commit ( struct ring_buffer * buffer ,
struct ring_buffer_event * event ,
2009-02-27 19:38:04 -05:00
unsigned long flags , int pc )
{
2012-11-01 20:54:21 -04:00
__trace_buffer_unlock_commit ( buffer , event , flags , pc ) ;
2009-03-22 23:10:46 +01:00
}
2009-05-05 19:22:53 -04:00
EXPORT_SYMBOL_GPL ( trace_current_buffer_unlock_commit ) ;
2009-03-22 23:10:46 +01:00
2012-11-01 20:54:21 -04:00
void trace_buffer_unlock_commit_regs ( struct ring_buffer * buffer ,
struct ring_buffer_event * event ,
unsigned long flags , int pc ,
struct pt_regs * regs )
2011-06-08 16:09:34 +09:00
{
2012-10-11 12:14:25 -04:00
__buffer_unlock_commit ( buffer , event ) ;
2011-06-08 16:09:34 +09:00
ftrace_trace_stack_regs ( buffer , flags , 0 , pc , regs ) ;
ftrace_trace_userstack ( buffer , flags , pc ) ;
}
2012-11-01 20:54:21 -04:00
EXPORT_SYMBOL_GPL ( trace_buffer_unlock_commit_regs ) ;
2011-06-08 16:09:34 +09:00
2009-09-02 14:17:06 -04:00
void trace_current_buffer_discard_commit ( struct ring_buffer * buffer ,
struct ring_buffer_event * event )
2009-04-02 01:16:59 -04:00
{
2009-09-02 14:17:06 -04:00
ring_buffer_discard_commit ( buffer , event ) ;
2009-02-27 19:38:04 -05:00
}
2009-04-17 16:01:56 -04:00
EXPORT_SYMBOL_GPL ( trace_current_buffer_discard_commit ) ;
2009-02-27 19:38:04 -05:00
2008-05-12 21:20:51 +02:00
void
2009-02-05 01:13:37 -05:00
trace_function ( struct trace_array * tr ,
2008-10-01 13:14:09 -04:00
unsigned long ip , unsigned long parent_ip , unsigned long flags ,
int pc )
2008-05-12 21:20:42 +02:00
{
2009-03-31 00:48:49 -05:00
struct ftrace_event_call * call = & event_function ;
2009-09-02 14:17:06 -04:00
struct ring_buffer * buffer = tr - > buffer ;
2008-09-29 23:02:41 -04:00
struct ring_buffer_event * event ;
2008-09-29 23:02:42 -04:00
struct ftrace_entry * entry ;
2008-05-12 21:20:42 +02:00
2008-10-01 00:29:53 -04:00
/* If we are reading the ring buffer, don't trace */
2009-10-29 22:34:15 +09:00
if ( unlikely ( __this_cpu_read ( ftrace_cpu_disabled ) ) )
2008-10-01 00:29:53 -04:00
return ;
2009-09-02 14:17:06 -04:00
event = trace_buffer_lock_reserve ( buffer , TRACE_FN , sizeof ( * entry ) ,
2009-02-05 16:14:13 -02:00
flags , pc ) ;
2008-09-29 23:02:41 -04:00
if ( ! event )
return ;
entry = ring_buffer_event_data ( event ) ;
2008-09-29 23:02:42 -04:00
entry - > ip = ip ;
entry - > parent_ip = parent_ip ;
2009-03-31 00:48:49 -05:00
2009-09-02 14:17:06 -04:00
if ( ! filter_check_discard ( call , entry , buffer , event ) )
2012-10-11 12:14:25 -04:00
__buffer_unlock_commit ( buffer , event ) ;
2008-05-12 21:20:42 +02:00
}
2008-05-12 21:20:51 +02:00
void
2008-05-12 21:20:49 +02:00
ftrace ( struct trace_array * tr , struct trace_array_cpu * data ,
2008-10-01 13:14:09 -04:00
unsigned long ip , unsigned long parent_ip , unsigned long flags ,
int pc )
2008-05-12 21:20:49 +02:00
{
if ( likely ( ! atomic_read ( & data - > disabled ) ) )
2009-02-05 01:13:37 -05:00
trace_function ( tr , ip , parent_ip , flags , pc ) ;
2008-05-12 21:20:49 +02:00
}
2009-07-29 17:51:13 +02:00
# ifdef CONFIG_STACKTRACE
2011-07-14 16:36:53 -04:00
# define FTRACE_STACK_MAX_ENTRIES (PAGE_SIZE / sizeof(unsigned long))
struct ftrace_stack {
unsigned long calls [ FTRACE_STACK_MAX_ENTRIES ] ;
} ;
static DEFINE_PER_CPU ( struct ftrace_stack , ftrace_stack ) ;
static DEFINE_PER_CPU ( int , ftrace_stack_reserve ) ;
2009-09-02 14:17:06 -04:00
static void __ftrace_trace_stack ( struct ring_buffer * buffer ,
2009-01-15 19:12:40 -05:00
unsigned long flags ,
2011-06-08 16:09:34 +09:00
int skip , int pc , struct pt_regs * regs )
2008-05-12 21:20:51 +02:00
{
2009-03-31 00:48:49 -05:00
struct ftrace_event_call * call = & event_kernel_stack ;
2008-09-29 23:02:41 -04:00
struct ring_buffer_event * event ;
2008-09-29 23:02:42 -04:00
struct stack_entry * entry ;
2008-05-12 21:20:51 +02:00
struct stack_trace trace ;
2011-07-14 16:36:53 -04:00
int use_stack ;
int size = FTRACE_STACK_ENTRIES ;
trace . nr_entries = 0 ;
trace . skip = skip ;
/*
* Since events can happen in NMIs there ' s no safe way to
* use the per cpu ftrace_stacks . We reserve it and if an interrupt
* or NMI comes in , it will just have to use the default
* FTRACE_STACK_ENTRIES .
*/
preempt_disable_notrace ( ) ;
use_stack = + + __get_cpu_var ( ftrace_stack_reserve ) ;
/*
* We don ' t need any atomic variables , just a barrier .
* If an interrupt comes in , we don ' t care , because it would
* have exited and put the counter back to what we want .
* We just need a barrier to keep gcc from moving things
* around .
*/
barrier ( ) ;
if ( use_stack = = 1 ) {
trace . entries = & __get_cpu_var ( ftrace_stack ) . calls [ 0 ] ;
trace . max_entries = FTRACE_STACK_MAX_ENTRIES ;
if ( regs )
save_stack_trace_regs ( regs , & trace ) ;
else
save_stack_trace ( & trace ) ;
if ( trace . nr_entries > size )
size = trace . nr_entries ;
} else
/* From now on, use_stack is a boolean */
use_stack = 0 ;
size * = sizeof ( unsigned long ) ;
2008-05-12 21:20:51 +02:00
2009-09-02 14:17:06 -04:00
event = trace_buffer_lock_reserve ( buffer , TRACE_STACK ,
2011-07-14 16:36:53 -04:00
sizeof ( * entry ) + size , flags , pc ) ;
2008-09-29 23:02:41 -04:00
if ( ! event )
2011-07-14 16:36:53 -04:00
goto out ;
entry = ring_buffer_event_data ( event ) ;
2008-05-12 21:20:51 +02:00
2011-07-14 16:36:53 -04:00
memset ( & entry - > caller , 0 , size ) ;
if ( use_stack )
memcpy ( & entry - > caller , trace . entries ,
trace . nr_entries * sizeof ( unsigned long ) ) ;
else {
trace . max_entries = FTRACE_STACK_ENTRIES ;
trace . entries = entry - > caller ;
if ( regs )
save_stack_trace_regs ( regs , & trace ) ;
else
save_stack_trace ( & trace ) ;
}
entry - > size = trace . nr_entries ;
2008-05-12 21:20:51 +02:00
2009-09-02 14:17:06 -04:00
if ( ! filter_check_discard ( call , entry , buffer , event ) )
2012-10-11 12:14:25 -04:00
__buffer_unlock_commit ( buffer , event ) ;
2011-07-14 16:36:53 -04:00
out :
/* Again, don't let gcc optimize things here */
barrier ( ) ;
__get_cpu_var ( ftrace_stack_reserve ) - - ;
preempt_enable_notrace ( ) ;
2008-05-12 21:20:47 +02:00
}
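The nesting counter that guards the per-CPU ftrace_stack can be illustrated with a small single-threaded sketch. It is an assumption-laden model, not kernel code: demo_stack_reserve stands in for the per-CPU ftrace_stack_reserve counter, and the recursive call stands in for an interrupt or NMI arriving while the outer caller still holds the large stack. The kernel version also disables preemption and uses barrier(), which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU reservation counter. */
static int demo_stack_reserve;

/*
 * Returns 1 if this call was allowed to use the large per-CPU stack,
 * 0 if it had to fall back to the small in-event storage.
 * When nest_once is non-zero, a nested call is made while the outer
 * reservation is still held, simulating an interrupt.
 */
static int demo_record(int nest_once, int *nested_used_big)
{
	/* Only the first (outermost) user sees the counter reach 1. */
	int use_big = (++demo_stack_reserve == 1);

	if (nest_once)
		*nested_used_big = demo_record(0, NULL);

	/* Release the reservation on the way out. */
	demo_stack_reserve--;
	assert(demo_stack_reserve >= 0);
	return use_big;
}
```

Because the nested user always restores the counter before returning, the outer user never observes a changed value, which is why a plain increment is enough in this model.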
2011-06-08 16:09:34 +09:00
void ftrace_trace_stack_regs ( struct ring_buffer * buffer , unsigned long flags ,
int skip , int pc , struct pt_regs * regs )
{
if ( ! ( trace_flags & TRACE_ITER_STACKTRACE ) )
return ;
__ftrace_trace_stack ( buffer , flags , skip , pc , regs ) ;
}
2009-09-02 14:17:06 -04:00
void ftrace_trace_stack ( struct ring_buffer * buffer , unsigned long flags ,
int skip , int pc )
2009-01-15 19:12:40 -05:00
{
if ( ! ( trace_flags & TRACE_ITER_STACKTRACE ) )
return ;
2011-06-08 16:09:34 +09:00
__ftrace_trace_stack ( buffer , flags , skip , pc , NULL ) ;
2009-01-15 19:12:40 -05:00
}
2009-07-29 17:51:13 +02:00
void __trace_stack ( struct trace_array * tr , unsigned long flags , int skip ,
int pc )
2008-10-01 13:14:09 -04:00
{
2011-06-08 16:09:34 +09:00
__ftrace_trace_stack ( tr - > buffer , flags , skip , pc , NULL ) ;
2008-10-01 13:14:09 -04:00
}
2009-12-11 09:48:22 -05:00
/**
* trace_dump_stack - record a stack back trace in the trace buffer
*/
void trace_dump_stack ( void )
{
unsigned long flags ;
if ( tracing_disabled | | tracing_selftest_running )
2009-12-14 15:58:33 -05:00
return ;
2009-12-11 09:48:22 -05:00
local_save_flags ( flags ) ;
/* skipping 3 traces, seems to get us at the caller of this function */
2011-06-08 16:09:34 +09:00
__ftrace_trace_stack ( global_trace . buffer , flags , 3 , preempt_count ( ) , NULL ) ;
2009-12-11 09:48:22 -05:00
}
2010-11-10 12:56:12 +01:00
static DEFINE_PER_CPU ( int , user_stack_count ) ;
2009-09-02 14:17:06 -04:00
void
ftrace_trace_userstack ( struct ring_buffer * buffer , unsigned long flags , int pc )
2008-11-22 13:28:47 +02:00
{
2009-03-31 00:48:49 -05:00
struct ftrace_event_call * call = & event_user_stack ;
2008-11-23 12:39:06 +02:00
struct ring_buffer_event * event ;
2008-11-22 13:28:47 +02:00
struct userstack_entry * entry ;
struct stack_trace trace ;
if ( ! ( trace_flags & TRACE_ITER_USERSTACKTRACE ) )
return ;
tracing: Do not record user stack trace from NMI context
A bug was found with Li Zefan's ftrace_stress_test that caused applications
to segfault during the test.
Placing a tracing_off() in the segfault code, and examining several
traces, I found that the following was always the case. The lock tracer
was enabled (lockdep being required) and userstack was enabled. Testing
this out, I just enabled the two, but that was not good enough. I needed
to run something else that could trigger it. Running a load like hackbench
did not work, but executing a new program would. The following would
trigger the segfault within seconds:
# echo 1 > /debug/tracing/options/userstacktrace
# echo 1 > /debug/tracing/events/lock/enable
# while :; do ls > /dev/null ; done
Enabling the function graph tracer and looking at what was happening
I finally noticed that all crashes happened just after an NMI.
1) | copy_user_handle_tail() {
1) | bad_area_nosemaphore() {
1) | __bad_area_nosemaphore() {
1) | no_context() {
1) | fixup_exception() {
1) 0.319 us | search_exception_tables();
1) 0.873 us | }
[...]
1) 0.314 us | __rcu_read_unlock();
1) 0.325 us | native_apic_mem_write();
1) 0.943 us | }
1) 0.304 us | rcu_nmi_exit();
[...]
1) 0.479 us | find_vma();
1) | bad_area() {
1) | __bad_area() {
After capturing several traces of failures, all of them happened
after an NMI. Curious about this, I added a trace_printk() to the NMI
handler to read regs->ip and see where the NMI happened. It turned out
to be here:
ffffffff8135b660 <page_fault>:
ffffffff8135b660: 48 83 ec 78 sub $0x78,%rsp
ffffffff8135b664: e8 97 01 00 00 callq ffffffff8135b800 <error_entry>
What was happening is that the NMI would happen at the place that a page
fault occurred. It would call rcu_read_lock() which was traced by
the lock events, and the user_stack_trace would run. This would trigger
a page fault inside the NMI. I do not see where the CR2 register is
saved or restored in NMI handling. This means that it would corrupt
the page fault handling that the NMI interrupted.
The reason the while loop of ls helped trigger the bug, was that
each execution of ls would cause lots of pages to be faulted in, and
increase the chances of the race happening.
The simple solution is to not allow user stack traces in NMI context.
After this patch, I ran the above "ls" test for a couple of hours
without any issues. Without this patch, the bug would trigger in less
than a minute.
Cc: stable@kernel.org
Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-03-12 20:03:30 -05:00
/*
* NMIs can not handle page faults , even with fix ups .
* The save user stack can ( and often does ) fault .
*/
if ( unlikely ( in_nmi ( ) ) )
return ;
2008-11-22 13:28:47 +02:00
2010-11-10 12:56:12 +01:00
/*
* prevent recursion , since the user stack tracing may
* trigger other kernel events .
*/
preempt_disable ( ) ;
if ( __this_cpu_read ( user_stack_count ) )
goto out ;
__this_cpu_inc ( user_stack_count ) ;
2009-09-02 14:17:06 -04:00
event = trace_buffer_lock_reserve ( buffer , TRACE_USER_STACK ,
2009-02-05 16:14:13 -02:00
sizeof ( * entry ) , flags , pc ) ;
2008-11-22 13:28:47 +02:00
if ( ! event )
2010-12-09 15:47:56 +08:00
goto out_drop_count ;
2008-11-22 13:28:47 +02:00
entry = ring_buffer_event_data ( event ) ;
2009-09-11 11:36:23 -04:00
entry - > tgid = current - > tgid ;
2008-11-22 13:28:47 +02:00
memset ( & entry - > caller , 0 , sizeof ( entry - > caller ) ) ;
trace . nr_entries = 0 ;
trace . max_entries = FTRACE_STACK_ENTRIES ;
trace . skip = 0 ;
trace . entries = entry - > caller ;
save_stack_trace_user ( & trace ) ;
2009-09-02 14:17:06 -04:00
if ( ! filter_check_discard ( call , entry , buffer , event ) )
2012-10-11 12:14:25 -04:00
__buffer_unlock_commit ( buffer , event ) ;
2010-11-10 12:56:12 +01:00
2010-12-09 15:47:56 +08:00
out_drop_count :
2010-11-10 12:56:12 +01:00
__this_cpu_dec ( user_stack_count ) ;
out :
preempt_enable ( ) ;
2008-11-22 13:28:47 +02:00
}
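The recursion guard in ftrace_trace_userstack() can be sketched as follows. This is a minimal single-threaded model: demo_user_stack_count stands in for the per-CPU user_stack_count, and demo_side_effect_event() is a hypothetical placeholder for a kernel event that fires while the user stack is being saved and tries to trace again. The preempt_disable()/preempt_enable() pair from the original is omitted.

```c
#include <assert.h>

/* Hypothetical stand-ins for the per-CPU counter and the trace output. */
static int demo_user_stack_count;
static int demo_events_recorded;

static void demo_side_effect_event(void);

static void demo_trace_userstack(void)
{
	/* Already recording on this CPU: refuse to recurse. */
	if (demo_user_stack_count)
		return;

	demo_user_stack_count++;

	demo_events_recorded++;
	/* Saving the stack may itself trigger another traced event. */
	demo_side_effect_event();

	demo_user_stack_count--;
	assert(demo_user_stack_count == 0);
}

/* A traced event that re-enters the user stack tracer. */
static void demo_side_effect_event(void)
{
	demo_trace_userstack();
}
```

Each outer call records exactly one event, and the nested attempt returns early without touching the counter.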
2009-02-10 19:44:12 +01:00
# ifdef UNUSED
static void __trace_userstack ( struct trace_array * tr , unsigned long flags )
2008-11-22 13:28:47 +02:00
{
2009-02-05 01:13:37 -05:00
ftrace_trace_userstack ( tr - > buffer , flags , preempt_count ( ) ) ;
2008-11-22 13:28:47 +02:00
}
2009-02-10 19:44:12 +01:00
# endif /* UNUSED */
2008-11-22 13:28:47 +02:00
2009-07-29 17:51:13 +02:00
# endif /* CONFIG_STACKTRACE */
tracing: Add percpu buffers for trace_printk()
Currently, trace_printk() uses a single buffer to write into
to calculate the size and format needed to save the trace. To
do this safely in an SMP environment, a spin_lock() is taken
to only allow one writer at a time to the buffer. But this could
also affect what is being traced, and add synchronization that
would not be there otherwise.
Ideally, using percpu buffers would be useful, but since trace_printk()
is only used in development, having per cpu buffers for something
never used is a waste of space. Thus, the use of the trace_bprintk()
format section is changed to be used for static fmts as well as dynamic ones.
Then at boot up, we can check if the section that holds the trace_printk
formats is non-empty, and if it does contain something, then we
know a trace_printk() has been added to the kernel. At this time
the trace_printk per cpu buffers are allocated. A check is also
done at module load time in case a module is added that contains a
trace_printk().
Once the buffers are allocated, they are never freed. If you use
a trace_printk() then you should know what you are doing.
A buffer is made for each type of context:
normal
softirq
irq
nmi
The context is checked and the appropriate buffer is used.
This allows for totally lockless usage of trace_printk(),
and they no longer even disable interrupts.
Requested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
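The per-context buffer selection described above can be modelled in user space. The sketch assumes a caller that already knows its context and CPU; the demo_ctx enum, the DEMO_* sizes, and the static array are hypothetical and replace the kernel's in_nmi()/in_irq()/in_softirq() checks and alloc_percpu() storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical context ids; the kernel derives these from preempt state. */
enum demo_ctx {
	DEMO_CTX_NORMAL,
	DEMO_CTX_SOFTIRQ,
	DEMO_CTX_IRQ,
	DEMO_CTX_NMI,
	DEMO_CTX_MAX
};

#define DEMO_NR_CPUS  2
#define DEMO_BUF_SIZE 64

/* One scratch buffer per (context, cpu) pair. */
static char demo_bufs[DEMO_CTX_MAX][DEMO_NR_CPUS][DEMO_BUF_SIZE];

/* Return the scratch buffer for this context and CPU, or NULL if invalid. */
static char *demo_get_trace_buf(int ctx, int cpu)
{
	if (ctx < 0 || ctx >= DEMO_CTX_MAX)
		return NULL;
	if (cpu < 0 || cpu >= DEMO_NR_CPUS)
		return NULL;
	return demo_bufs[ctx][cpu];
}
```

Since an interrupt that preempts normal context on the same CPU writes into a different buffer, neither writer needs a lock to protect its formatted text.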
2011-09-22 14:01:55 -04:00
/* created for use with alloc_percpu */
struct trace_buffer_struct {
char buffer [ TRACE_BUF_SIZE ] ;
} ;
static struct trace_buffer_struct * trace_percpu_buffer ;
static struct trace_buffer_struct * trace_percpu_sirq_buffer ;
static struct trace_buffer_struct * trace_percpu_irq_buffer ;
static struct trace_buffer_struct * trace_percpu_nmi_buffer ;
/*
* The buffer used is dependent on the context . There is a per cpu
* buffer for normal context , softirq context , hard irq context and
* for NMI context . This allows for lockless recording .
*
* Note , if the buffers failed to be allocated , then this returns NULL
*/
static char * get_trace_buf ( void )
{
struct trace_buffer_struct * percpu_buffer ;
struct trace_buffer_struct * buffer ;
/*
* If we have allocated per cpu buffers , then we do not
* need to do any locking .
*/
if ( in_nmi ( ) )
percpu_buffer = trace_percpu_nmi_buffer ;
else if ( in_irq ( ) )
percpu_buffer = trace_percpu_irq_buffer ;
else if ( in_softirq ( ) )
percpu_buffer = trace_percpu_sirq_buffer ;
else
percpu_buffer = trace_percpu_buffer ;
if ( ! percpu_buffer )
return NULL ;
buffer = per_cpu_ptr ( percpu_buffer , smp_processor_id ( ) ) ;
return buffer - > buffer ;
}
static int alloc_percpu_trace_buffer ( void )
{
struct trace_buffer_struct * buffers ;
struct trace_buffer_struct * sirq_buffers ;
struct trace_buffer_struct * irq_buffers ;
struct trace_buffer_struct * nmi_buffers ;
buffers = alloc_percpu ( struct trace_buffer_struct ) ;
if ( ! buffers )
goto err_warn ;
sirq_buffers = alloc_percpu ( struct trace_buffer_struct ) ;
if ( ! sirq_buffers )
goto err_sirq ;
irq_buffers = alloc_percpu ( struct trace_buffer_struct ) ;
if ( ! irq_buffers )
goto err_irq ;
nmi_buffers = alloc_percpu ( struct trace_buffer_struct ) ;
if ( ! nmi_buffers )
goto err_nmi ;
trace_percpu_buffer = buffers ;
trace_percpu_sirq_buffer = sirq_buffers ;
trace_percpu_irq_buffer = irq_buffers ;
trace_percpu_nmi_buffer = nmi_buffers ;
return 0 ;
err_nmi :
free_percpu ( irq_buffers ) ;
err_irq :
free_percpu ( sirq_buffers ) ;
err_sirq :
free_percpu ( buffers ) ;
err_warn :
WARN ( 1 , " Could not allocate percpu trace_printk buffer " ) ;
return - ENOMEM ;
}
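The goto-based unwinding in alloc_percpu_trace_buffer() follows a common pattern: each failure label frees only what was allocated before it. A user-space sketch of the same shape is given below. The demo_* helpers and the failure-injection counter are hypothetical and exist only so the error paths can be exercised; malloc() and free() stand in for alloc_percpu() and free_percpu().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical failure injection: fail the Nth allocation (0-based). */
static int demo_fail_at = -1;
static int demo_alloc_calls;
static int demo_live;            /* allocations currently outstanding */

static void *demo_alloc(void)
{
	void *p;

	if (demo_alloc_calls++ == demo_fail_at)
		return NULL;
	p = malloc(16);
	if (p)
		demo_live++;
	return p;
}

static void demo_free(void *p)
{
	free(p);
	demo_live--;
	assert(demo_live >= 0);
}

/* Allocate three buffers; on failure release the earlier ones in reverse. */
static int demo_alloc_all(void *out[3])
{
	void *a, *b, *c;

	a = demo_alloc();
	if (!a)
		goto err;
	b = demo_alloc();
	if (!b)
		goto err_a;
	c = demo_alloc();
	if (!c)
		goto err_b;

	/* Publish the pointers only once every allocation succeeded. */
	out[0] = a;
	out[1] = b;
	out[2] = c;
	return 0;

err_b:
	demo_free(b);
err_a:
	demo_free(a);
err:
	return -1;
}
```

Publishing the pointers only after the last allocation succeeds mirrors how the original assigns the trace_percpu_* globals at the end, so a partial failure never leaves a half-initialized set visible.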
2012-10-11 10:15:05 -04:00
static int buffers_allocated ;
2011-09-22 14:01:55 -04:00
void trace_printk_init_buffers ( void )
{
if ( buffers_allocated )
return ;
if ( alloc_percpu_trace_buffer ( ) )
return ;
pr_info ( " ftrace: Allocated trace_printk buffers \n " ) ;
2012-10-10 21:44:34 -04:00
/* Expand the buffers to set size */
tracing_update_buffers ( ) ;
2011-09-22 14:01:55 -04:00
buffers_allocated = 1 ;
2012-10-11 10:15:05 -04:00
/*
* trace_printk_init_buffers ( ) can be called by modules .
* If that happens , then we need to start cmdline recording
* directly here . If the global_trace . buffer is already
* allocated here , then this was called by module code .
*/
if ( global_trace . buffer )
tracing_start_cmdline_record ( ) ;
}
void trace_printk_start_comm ( void )
{
/* Start tracing comms if trace printk is set */
if ( ! buffers_allocated )
return ;
tracing_start_cmdline_record ( ) ;
}
static void trace_printk_start_stop_comm ( int enabled )
{
if ( ! buffers_allocated )
return ;
if ( enabled )
tracing_start_cmdline_record ( ) ;
else
tracing_stop_cmdline_record ( ) ;
2011-09-22 14:01:55 -04:00
}
2009-03-06 17:21:49 +01:00
/**
2009-03-12 18:24:49 +01:00
* trace_vbprintk - write binary msg to tracing buffer
2009-03-06 17:21:49 +01:00
*
*/
2009-03-19 14:03:53 -04:00
int trace_vbprintk ( unsigned long ip , const char * fmt , va_list args )
2009-03-06 17:21:49 +01:00
{
2009-03-31 00:48:49 -05:00
struct ftrace_event_call * call = & event_bprint ;
2009-03-06 17:21:49 +01:00
struct ring_buffer_event * event ;
2009-09-02 14:17:06 -04:00
struct ring_buffer * buffer ;
2009-03-06 17:21:49 +01:00
struct trace_array * tr = & global_trace ;
2009-03-12 18:24:49 +01:00
struct bprint_entry * entry ;
2009-03-06 17:21:49 +01:00
unsigned long flags ;
2011-09-22 14:01:55 -04:00
char * tbuffer ;
int len = 0 , size , pc ;
2009-03-06 17:21:49 +01:00
if ( unlikely ( tracing_selftest_running || tracing_disabled ) )
return 0 ;
/* Don't pollute graph traces with trace_vprintk internals */
pause_graph_tracing ( ) ;
pc = preempt_count ( ) ;
2010-06-03 09:36:50 -04:00
preempt_disable_notrace ( ) ;
2009-03-06 17:21:49 +01:00
2011-09-22 14:01:55 -04:00
tbuffer = get_trace_buf ( ) ;
if ( ! tbuffer ) {
len = 0 ;
2009-03-06 17:21:49 +01:00
goto out ;
2011-09-22 14:01:55 -04:00
}
2009-03-06 17:21:49 +01:00
2011-09-22 14:01:55 -04:00
len = vbin_printf ( ( u32 * ) tbuffer , TRACE_BUF_SIZE / sizeof ( int ) , fmt , args ) ;
2009-03-06 17:21:49 +01:00
2011-09-22 14:01:55 -04:00
if ( len > TRACE_BUF_SIZE / sizeof ( int ) || len < 0 )
goto out ;
2009-03-06 17:21:49 +01:00
2011-09-22 14:01:55 -04:00
local_save_flags ( flags ) ;
2009-03-06 17:21:49 +01:00
size = sizeof ( * entry ) + sizeof ( u32 ) * len ;
2009-09-02 14:17:06 -04:00
buffer = tr->buffer ;
event = trace_buffer_lock_reserve ( buffer , TRACE_BPRINT , size ,
flags , pc ) ;
2009-03-06 17:21:49 +01:00
if ( ! event )
2011-09-22 14:01:55 -04:00
goto out ;
2009-03-06 17:21:49 +01:00
entry = ring_buffer_event_data ( event ) ;
entry->ip = ip ;
entry->fmt = fmt ;
2011-09-22 14:01:55 -04:00
memcpy ( entry->buf , tbuffer , sizeof ( u32 ) * len ) ;
2010-01-06 17:27:11 -05:00
if ( ! filter_check_discard ( call , entry , buffer , event ) ) {
2012-10-11 12:14:25 -04:00
__buffer_unlock_commit ( buffer , event ) ;
2010-01-06 17:27:11 -05:00
ftrace_trace_stack ( buffer , flags , 6 , pc ) ;
}
2009-03-06 17:21:49 +01:00
out :
2010-06-03 09:36:50 -04:00
preempt_enable_notrace ( ) ;
2009-03-06 17:21:49 +01:00
unpause_graph_tracing ( ) ;
return len ;
}
2009-03-12 18:24:49 +01:00
EXPORT_SYMBOL_GPL ( trace_vbprintk ) ;
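The size computation in trace_vbprintk() above, `sizeof ( * entry ) + sizeof ( u32 ) * len`, reserves a fixed header followed by `len` 32-bit words. A minimal userspace sketch of that layout follows. The struct `sketch_bprint_entry` and the helper `sketch_make_entry` are hypothetical stand-ins: the real `struct bprint_entry` is defined elsewhere and may carry additional fields, and the real code reserves space in the ring buffer rather than calling malloc().

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: fixed header plus a variable number of u32 words. */
struct sketch_bprint_entry {
    unsigned long ip;    /* caller address */
    const char *fmt;     /* format string pointer, stored but not copied */
    uint32_t buf[];      /* flexible array member holding packed arguments */
};

/* Allocate one record sized for len words and copy the words into it. */
struct sketch_bprint_entry *sketch_make_entry(unsigned long ip, const char *fmt,
                                              const uint32_t *words, size_t len)
{
    /* Same shape as the source: header size plus len 32-bit words. */
    size_t size = sizeof(struct sketch_bprint_entry) + sizeof(uint32_t) * len;
    struct sketch_bprint_entry *entry = malloc(size);

    if (!entry)
        return NULL;

    entry->ip = ip;
    entry->fmt = fmt;
    if (len)
        memcpy(entry->buf, words, sizeof(uint32_t) * len);
    return entry;
}
```

Storing only the format pointer and the packed argument words is what keeps the binary record small; the caller owns the returned memory and releases it with free().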
2009-09-03 19:11:07 -04:00
int trace_array_printk ( struct trace_array * tr ,
unsigned long ip , const char * fmt , ... )
{
int ret ;
va_list ap ;
if ( ! ( trace_flags & TRACE_ITER_PRINTK ) )
return 0 ;
va_start ( ap , fmt ) ;
ret = trace_array_vprintk ( tr , ip , fmt , ap ) ;
va_end ( ap ) ;
return ret ;
}
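trace_array_printk() above follows the usual pattern of a variadic front end that forwards to a `va_list` worker. A minimal userspace sketch of that pattern, combined with a length check on the formatted result, is shown here. All names (`sketch_printk`, `sketch_vprintk`, `sketch_out`, `SKETCH_LINE_SIZE`) are hypothetical. Note one deliberate difference: this sketch rejects any output that vsnprintf() reports as truncated (`len >= SKETCH_LINE_SIZE`), which is stricter than the `len > TRACE_BUF_SIZE` test in the source.

```c
#include <stdarg.h>
#include <stdio.h>

#define SKETCH_LINE_SIZE 32

/* Single scratch line for the sketch; the kernel uses per-context buffers. */
static char sketch_out[SKETCH_LINE_SIZE];

/* Worker: format into the scratch line, return length or 0 if it did not fit. */
int sketch_vprintk(const char *fmt, va_list args)
{
    int len = vsnprintf(sketch_out, SKETCH_LINE_SIZE, fmt, args);

    /* vsnprintf returns the length it would have written without truncation. */
    if (len < 0 || len >= SKETCH_LINE_SIZE) {
        sketch_out[0] = '\0';
        return 0;
    }
    return len;
}

/* Variadic front end: build the va_list, forward it, then clean up. */
int sketch_printk(const char *fmt, ...)
{
    va_list ap;
    int ret;

    va_start(ap, fmt);
    ret = sketch_vprintk(fmt, ap);
    va_end(ap);
    return ret;
}
```

Keeping the `va_list` variant separate lets other wrappers reuse the same worker, which is how trace_array_printk() and trace_array_vprintk() are split in the source.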
int trace_array_vprintk ( struct trace_array * tr ,
unsigned long ip , const char * fmt , va_list args )
2009-03-12 18:24:49 +01:00
{
2009-03-31 00:48:49 -05:00
struct ftrace_event_call * call = & event_print ;
2009-03-12 18:24:49 +01:00
struct ring_buffer_event * event ;
2009-09-02 14:17:06 -04:00
struct ring_buffer * buffer ;
2011-09-22 14:01:55 -04:00
int len = 0 , size , pc ;
2009-03-12 18:24:49 +01:00
struct print_entry * entry ;
2011-09-22 14:01:55 -04:00
unsigned long flags ;
char * tbuffer ;
2009-03-12 18:24:49 +01:00
if ( tracing_disabled || tracing_selftest_running )
return 0 ;
2011-09-22 14:01:55 -04:00
/* Don't pollute graph traces with trace_vprintk internals */
pause_graph_tracing ( ) ;
2009-03-12 18:24:49 +01:00
pc = preempt_count ( ) ;
preempt_disable_notrace ( ) ;
2011-09-22 14:01:55 -04:00
tbuffer = get_trace_buf ( ) ;
if ( ! tbuffer ) {
len = 0 ;
2009-03-12 18:24:49 +01:00
goto out ;
2011-09-22 14:01:55 -04:00
}
2009-03-12 18:24:49 +01:00
2011-09-22 14:01:55 -04:00
len = vsnprintf ( tbuffer , TRACE_BUF_SIZE , fmt , args ) ;
if ( len > TRACE_BUF_SIZE )
goto out ;
2009-03-12 18:24:49 +01:00
2011-09-22 14:01:55 -04:00
local_save_flags ( flags ) ;
2009-03-12 18:24:49 +01:00
size = sizeof ( * entry ) + len + 1 ;
2009-09-02 14:17:06 -04:00
buffer = tr->buffer ;
event = trace_buffer_lock_reserve ( buffer , TRACE_PRINT , size ,
2011-09-22 14:01:55 -04:00
flags , pc ) ;
2009-03-12 18:24:49 +01:00
if ( ! event )
2011-09-22 14:01:55 -04:00
goto out ;
2009-03-12 18:24:49 +01:00
entry = ring_buffer_event_data ( event ) ;
2009-11-16 20:56:13 +01:00
entry->ip = ip ;
2009-03-12 18:24:49 +01:00
tracing: Add percpu buffers for trace_printk()
Currently, trace_printk() uses a single buffer to write into
to calculate the size and format needed to save the trace. To
do this safely in an SMP environment, a spin_lock() is taken
to only allow one writer at a time to the buffer. But this could
also affect what is being traced, and add synchronization that
would not be there otherwise.
Ideally, using percpu buffers would be useful, but since trace_printk()
is only used in development, having per cpu buffers for something
never used is a waste of space. Thus, the use of the trace_bprintk()
format section is changed to be used for static fmts as well as dynamic ones.
Then at boot up, we can check if the section that holds the trace_printk
formats is non-empty, and if it does contain something, then we
know a trace_printk() has been added to the kernel. At this time
the trace_printk per cpu buffers are allocated. A check is also
done at module load time in case a module is added that contains a
trace_printk().
Once the buffers are allocated, they are never freed. If you use
a trace_printk() then you should know what you are doing.
A buffer is made for each type of context:
normal
softirq
irq
nmi
The context is checked and the appropriate buffer is used.
This allows for totally lockless usage of trace_printk(),
and they no longer even disable interrupts.
Requested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
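The per-context buffer selection described in the message above can be sketched in plain C. This is only an illustration under stated assumptions: the names ctx_kind, classify(), get_scratch(), NR_CPUS_SKETCH and BUF_SIZE_SKETCH are hypothetical and do not appear in the kernel source, and the context flags are passed in as plain integers, where the kernel would presumably consult helpers such as in_nmi(), in_irq() and in_softirq().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context kinds, mirroring the four listed in the commit message. */
enum ctx_kind { CTX_NORMAL, CTX_SOFTIRQ, CTX_IRQ, CTX_NMI, CTX_MAX };

static_assert(CTX_MAX == 4, "one buffer per context: normal, softirq, irq, nmi");

#define NR_CPUS_SKETCH  4     /* assumption: fixed CPU count for the sketch */
#define BUF_SIZE_SKETCH 1024  /* assumption: arbitrary scratch size */

/*
 * One scratch buffer per (cpu, context) pair. A context that interrupts
 * another one on the same CPU gets a different buffer, so no lock is
 * needed to keep the two writers apart.
 */
static char scratch[NR_CPUS_SKETCH][CTX_MAX][BUF_SIZE_SKETCH];

/* Pick the most deeply nested context that is active. */
static enum ctx_kind classify(int in_nmi, int in_irq, int in_softirq)
{
	if (in_nmi)
		return CTX_NMI;
	if (in_irq)
		return CTX_IRQ;
	if (in_softirq)
		return CTX_SOFTIRQ;
	return CTX_NORMAL;
}

/* Return the scratch buffer for this cpu and context, or NULL if out of range. */
static char *get_scratch(int cpu, enum ctx_kind kind)
{
	if (cpu < 0 || cpu >= NR_CPUS_SKETCH || (int)kind < 0 || kind >= CTX_MAX)
		return NULL;
	return scratch[cpu][kind];
}
```

The design point is that the buffer index is derived entirely from state the writer already owns (its CPU and its context), which is why no shared lock is required in this scheme.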
2011-09-22 14:01:55 -04:00
memcpy ( & entry - > buf , tbuffer , len ) ;
2009-11-16 20:56:13 +01:00
entry - > buf [ len ] = '\0' ;
2010-01-06 17:27:11 -05:00
if ( ! filter_check_discard ( call , entry , buffer , event ) ) {
2012-10-11 12:14:25 -04:00
__buffer_unlock_commit ( buffer , event ) ;
2011-09-22 14:01:55 -04:00
ftrace_trace_stack ( buffer , flags , 6 , pc ) ;
2010-01-06 17:27:11 -05:00
}
2009-03-12 18:24:49 +01:00
out :
preempt_enable_notrace ( ) ;
2011-09-22 14:01:55 -04:00
unpause_graph_tracing ( ) ;
2009-03-12 18:24:49 +01:00
return len ;
}
2009-09-03 19:11:07 -04:00
int trace_vprintk ( unsigned long ip , const char * fmt , va_list args )
{
2009-10-09 01:41:35 -04:00
return trace_array_vprintk ( & global_trace , ip , fmt , args ) ;
2009-09-03 19:11:07 -04:00
}
2009-03-06 17:21:49 +01:00
EXPORT_SYMBOL_GPL ( trace_vprintk ) ;
2008-11-12 12:59:32 +01:00
static void trace_iterator_increment ( struct trace_iterator * iter )
2008-09-03 17:42:51 -04:00
{
2012-06-27 20:46:14 -04:00
struct ring_buffer_iter * buf_iter = trace_buffer_iter ( iter , iter - > cpu ) ;
2008-09-03 17:42:51 -04:00
iter - > idx + + ;
2012-06-27 20:46:14 -04:00
if ( buf_iter )
ring_buffer_read ( buf_iter , NULL ) ;
2008-09-03 17:42:51 -04:00
}
2008-05-12 21:20:51 +02:00
static struct trace_entry *
2010-03-31 19:49:26 -04:00
peek_next_entry ( struct trace_iterator * iter , int cpu , u64 * ts ,
unsigned long * lost_events )
2008-08-01 12:26:41 -04:00
{
2008-09-29 23:02:41 -04:00
struct ring_buffer_event * event ;
2012-06-27 20:46:14 -04:00
struct ring_buffer_iter * buf_iter = trace_buffer_iter ( iter , cpu ) ;
2008-08-01 12:26:41 -04:00
2008-10-01 00:29:53 -04:00
if ( buf_iter )
event = ring_buffer_iter_peek ( buf_iter , ts ) ;
else
2010-03-31 19:49:26 -04:00
event = ring_buffer_peek ( iter - > tr - > buffer , cpu , ts ,
lost_events ) ;
2008-10-01 00:29:53 -04:00
2011-07-14 16:36:53 -04:00
if ( event ) {
iter - > ent_size = ring_buffer_event_length ( event ) ;
return ring_buffer_event_data ( event ) ;
}
iter - > ent_size = 0 ;
return NULL ;
2008-08-01 12:26:41 -04:00
}
2008-10-01 00:29:53 -04:00
2008-08-01 12:26:41 -04:00
static struct trace_entry *
2010-03-31 19:49:26 -04:00
__find_next_entry ( struct trace_iterator * iter , int * ent_cpu ,
unsigned long * missing_events , u64 * ent_ts )
2008-05-12 21:20:42 +02:00
{
2008-09-29 23:02:41 -04:00
struct ring_buffer * buffer = iter - > tr - > buffer ;
2008-05-12 21:20:42 +02:00
struct trace_entry * ent , * next = NULL ;
2010-04-05 17:11:05 +08:00
unsigned long lost_events = 0 , next_lost = 0 ;
tracing/core: introduce per cpu tracing files
Impact: split up tracing output per cpu
Currently, in the tracing debugfs directory, three files are
available to let the user extract the trace output:
- trace is an iterator through the ring-buffer. It's a reader
but not a consumer. It doesn't block when no more traces are
available.
- trace is pretty similar to the former, except that it adds more
information such as preempt count, irq flag, ...
- trace_pipe is a reader and a consumer; it will also block
waiting for traces if necessary (heh, yes it's a pipe).
The traces coming from different cpus are currently mixed up
inside these files. Sometimes that garbles the information,
sometimes it's useful, depending on what the tracer
captures.
The tracing_cpumask file is useful to filter the output and
select only the traces captured on a custom-defined set of cpus.
But it is still not powerful enough to extract one trace buffer
per cpu at the same time.
So this patch creates a new directory: /debug/tracing/per_cpu/.
Inside this directory, you will now find one trace_pipe file and
one trace file per cpu.
Which means if you have two cpus, you will have:
trace0
trace1
trace_pipe0
trace_pipe1
And of course, reading these files will have the same effect
as with the usual tracing files, except that you will only see
the traces from the given cpu.
The original all-in-one cpu trace files are still available in
their original place.
Until now, only one consumer was allowed on trace_pipe to avoid
racy consuming on the ring-buffer. Now the approach has changed a
bit: you can have only one consumer per cpu.
Which means you are allowed to read trace_pipe0 and trace_pipe1
concurrently, but you can't have two readers on trace_pipe0 or
trace_pipe1.
Following the same logic, if there is one reader on the common
trace_pipe, you cannot at the same time have another reader on
trace_pipe0 or trace_pipe1, because trace_pipe is in essence
already a consumer of all cpu buffers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 03:22:28 +01:00
int cpu_file = iter - > cpu_file ;
2008-09-29 23:02:41 -04:00
u64 next_ts = 0 , ts ;
2008-05-12 21:20:42 +02:00
int next_cpu = - 1 ;
2012-03-27 10:43:28 -04:00
int next_size = 0 ;
2008-05-12 21:20:42 +02:00
int cpu ;
2009-02-25 03:22:28 +01:00
/*
* If we are in a per_cpu trace file, don't bother iterating over
* all cpus; peek directly.
*/
if ( cpu_file > TRACE_PIPE_ALL_CPU ) {
if ( ring_buffer_empty_cpu ( buffer , cpu_file ) )
return NULL ;
2010-03-31 19:49:26 -04:00
ent = peek_next_entry ( iter , cpu_file , ent_ts , missing_events ) ;
2009-02-25 03:22:28 +01:00
if ( ent_cpu )
* ent_cpu = cpu_file ;
return ent ;
}
2008-05-12 21:21:00 +02:00
for_each_tracing_cpu ( cpu ) {
2008-08-01 12:26:41 -04:00
2008-09-29 23:02:41 -04:00
if ( ring_buffer_empty_cpu ( buffer , cpu ) )
continue ;
2008-08-01 12:26:41 -04:00
2010-03-31 19:49:26 -04:00
ent = peek_next_entry ( iter , cpu , & ts , & lost_events ) ;
2008-08-01 12:26:41 -04:00
2008-05-12 21:20:46 +02:00
/*
* Pick the entry with the smallest timestamp:
*/
2008-09-29 23:02:41 -04:00
if ( ent & & ( ! next | | ts < next_ts ) ) {
2008-05-12 21:20:42 +02:00
next = ent ;
next_cpu = cpu ;
2008-09-29 23:02:41 -04:00
next_ts = ts ;
2010-03-31 19:49:26 -04:00
next_lost = lost_events ;
2012-03-27 10:43:28 -04:00
next_size = iter - > ent_size ;
2008-05-12 21:20:42 +02:00
}
}
2012-03-27 10:43:28 -04:00
iter - > ent_size = next_size ;
2008-05-12 21:20:42 +02:00
if ( ent_cpu )
* ent_cpu = next_cpu ;
2008-09-29 23:02:41 -04:00
if ( ent_ts )
* ent_ts = next_ts ;
2010-03-31 19:49:26 -04:00
if ( missing_events )
* missing_events = next_lost ;
2008-05-12 21:20:42 +02:00
return next ;
}
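The selection loop in __find_next_entry() is a merge across per-CPU streams: peek at the head of each non-empty CPU buffer and keep the entry with the smallest timestamp. A minimal userspace sketch of that idea follows. The types sketch_event and sketch_cpu_queue and the function sketch_find_next() are hypothetical stand-ins, not kernel APIs; each queue is assumed to be a plain array already ordered by timestamp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event: a timestamp plus an opaque payload. */
struct sketch_event {
	uint64_t ts;
	int payload;
};

/* Hypothetical per-CPU queue: events sorted by ts, head is the next unread one. */
struct sketch_cpu_queue {
	const struct sketch_event *events;
	size_t len;
	size_t head;
};

/*
 * Peek (without consuming) at the head of every non-empty queue and return
 * the event with the smallest timestamp. *out_cpu receives the owning queue
 * index, or -1 when every queue is empty.
 */
static const struct sketch_event *
sketch_find_next(const struct sketch_cpu_queue *queues, int nr_cpus, int *out_cpu)
{
	const struct sketch_event *next = NULL;
	int next_cpu = -1;
	int cpu;

	assert(nr_cpus >= 0);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		const struct sketch_cpu_queue *q = &queues[cpu];
		const struct sketch_event *ent;

		if (q->head >= q->len)
			continue;	/* empty, like the ring_buffer_empty_cpu() check */

		ent = &q->events[q->head];
		if (!next || ent->ts < next->ts) {
			next = ent;
			next_cpu = cpu;
		}
	}

	if (out_cpu)
		*out_cpu = next_cpu;
	return next;
}
```

Because the strict `<` comparison is used, ties go to the lowest-numbered queue, which matches the `ts < next_ts` test in the function above.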
2008-08-01 12:26:41 -04:00
/* Find the next real entry, without updating the iterator itself */
2009-02-02 20:29:21 -02:00
struct trace_entry * trace_find_next_entry ( struct trace_iterator * iter ,
int * ent_cpu , u64 * ent_ts )
2008-05-12 21:20:42 +02:00
{
2010-03-31 19:49:26 -04:00
return __find_next_entry ( iter , ent_cpu , NULL , ent_ts ) ;
2008-08-01 12:26:41 -04:00
}
/* Find the next real entry, and increment the iterator to the next entry */
2010-08-05 09:22:23 -05:00
void * trace_find_next_entry_inc ( struct trace_iterator * iter )
2008-08-01 12:26:41 -04:00
{
2010-03-31 19:49:26 -04:00
iter - > ent = __find_next_entry ( iter , & iter - > cpu ,
& iter - > lost_events , & iter - > ts ) ;
2008-08-01 12:26:41 -04:00
2008-09-29 23:02:41 -04:00
if ( iter - > ent )
2008-11-12 12:59:32 +01:00
trace_iterator_increment ( iter ) ;
2008-08-01 12:26:41 -04:00
2008-09-29 23:02:41 -04:00
return iter - > ent ? iter : NULL ;
2008-05-12 21:20:46 +02:00
}
2008-05-12 21:20:42 +02:00
2008-05-12 21:20:51 +02:00
static void trace_consume ( struct trace_iterator * iter )
2008-05-12 21:20:46 +02:00
{
2010-03-31 19:49:26 -04:00
ring_buffer_consume ( iter - > tr - > buffer , iter - > cpu , & iter - > ts ,
& iter - > lost_events ) ;
2008-05-12 21:20:42 +02:00
}
2008-05-12 21:20:51 +02:00
static void * s_next ( struct seq_file * m , void * v , loff_t * pos )
2008-05-12 21:20:42 +02:00
{
struct trace_iterator * iter = m - > private ;
int i = ( int ) * pos ;
2008-05-12 21:20:45 +02:00
void * ent ;
2008-05-12 21:20:42 +02:00
2009-12-07 09:11:39 -05:00
WARN_ON_ONCE ( iter - > leftover ) ;
2008-05-12 21:20:42 +02:00
( * pos ) + + ;
/* can't go backwards */
if ( iter - > idx > i )
return NULL ;
if ( iter - > idx < 0 )
2010-08-05 09:22:23 -05:00
ent = trace_find_next_entry_inc ( iter ) ;
2008-05-12 21:20:42 +02:00
else
ent = iter ;
while ( ent & & iter - > idx < i )
2010-08-05 09:22:23 -05:00
ent = trace_find_next_entry_inc ( iter ) ;
2008-05-12 21:20:42 +02:00
iter - > pos = * pos ;
return ent ;
}
2010-08-05 09:22:23 -05:00
void tracing_iter_reset ( struct trace_iterator * iter , int cpu )
2009-09-01 11:06:29 -04:00
{
struct trace_array * tr = iter - > tr ;
struct ring_buffer_event * event ;
struct ring_buffer_iter * buf_iter ;
unsigned long entries = 0 ;
u64 ts ;
tr - > data [ cpu ] - > skipped_entries = 0 ;
2012-06-27 20:46:14 -04:00
buf_iter = trace_buffer_iter ( iter , cpu ) ;
if ( ! buf_iter )
2009-09-01 11:06:29 -04:00
return ;
ring_buffer_iter_reset ( buf_iter ) ;
/*
* We could have the case with the max latency tracers
* that a reset never took place on a cpu. This is evident
* by the timestamp being before the start of the buffer.
*/
while ( ( event = ring_buffer_iter_peek ( buf_iter , & ts ) ) ) {
if ( ts > = iter - > tr - > time_start )
break ;
entries + + ;
ring_buffer_read ( buf_iter , NULL ) ;
}
tr - > data [ cpu ] - > skipped_entries = entries ;
}
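The loop in tracing_iter_reset() rewinds the per-CPU iterator and then discards every leading entry whose timestamp predates the start of the trace, counting how many were skipped. A hedged sketch of that logic over a plain array is shown below; sketch_skip_stale() and its parameters are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Rewind the read position to the start, then step past every leading
 * timestamp that is older than time_start. Returns the number of entries
 * skipped; *pos is left at the first entry that is new enough (or at len).
 */
static size_t sketch_skip_stale(const uint64_t *ts, size_t len,
				uint64_t time_start, size_t *pos)
{
	size_t skipped = 0;

	assert(pos != NULL);
	assert(ts != NULL || len == 0);

	*pos = 0;	/* analogous to resetting the buffer iterator */
	while (*pos < len && ts[*pos] < time_start) {
		skipped++;
		(*pos)++;
	}
	return skipped;
}
```

The loop stops at the first entry that is not stale, so later out-of-order timestamps are not examined; the function above has the same behaviour, since it breaks as soon as `ts >= time_start`.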
tracing/core: make the read callbacks reentrants
Now that several per-cpu files can be read or spliced at the
same time, we want the read/splice callbacks for tracing files to be
reentrant.
Until now, a single global mutex (trace_types_lock) serialized
the access to tracing_read_pipe(), tracing_splice_read_pipe(),
and the seq helpers.
I.e., if a user tries to read trace_pipe0 and
trace_pipe1 at the same time, the access to the function
tracing_read_pipe() is contended and one reader must wait for
the other to finish its read call.
The trace_types_lock mutex is mostly here to serialize the access
to the global current tracer (current_trace), which can be
changed concurrently. Although the iter struct keeps a private
pointer to this tracer, its callbacks can be changed by another
function.
The method used here is to no longer keep a private reference to
the tracer inside the iterator but to make a copy of it inside
the iterator. Subsequent read calls then check whether the
tracer has changed. This is not costly because the current
tracer is not expected to change often, so we use a branch
prediction hint for that.
Moreover, we add a private mutex to the iterator (there is one
iterator per file descriptor) to serialize accesses in case
of multiple consumers per file descriptor (which would be a
silly idea from the user). Note that this is not to protect the
ring buffer, since the ring buffer already serializes
reader accesses. It is to prevent odd-looking traces in
case of concurrent consumers. But these mutexes can be dropped
anyway; that would not result in any crash. Just tell me what
you think about it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
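The copy-and-compare scheme described in the message above can be illustrated with a small sketch. It is not the kernel's implementation: sketch_tracer, sketch_iter and sketch_refresh() are hypothetical names, the last-seen pointer is kept per iterator here for simplicity, and locking is left out entirely (the caller is assumed to hold whatever lock protects the global tracer pointer while calling).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tracer description. */
struct sketch_tracer {
	const char *name;
	int id;
};

/* Hypothetical iterator: owns a private copy, plus the pointer it was copied from. */
struct sketch_iter {
	struct sketch_tracer trace;		/* private copy, safe to use unlocked */
	const struct sketch_tracer *seen;	/* global tracer last copied from */
};

/*
 * Refresh the private copy only if the global tracer pointer changed since
 * the last call. Returns 1 when a copy was made, 0 otherwise. The common
 * case is "unchanged", so the copy is rarely paid for.
 */
static int sketch_refresh(struct sketch_iter *iter,
			  const struct sketch_tracer *current)
{
	assert(iter != NULL);

	if (current && iter->seen != current) {
		iter->seen = current;
		iter->trace = *current;	/* struct copy, not a pointer alias */
		return 1;
	}
	return 0;
}
```

After a refresh, the iterator works on its own copy, so a later change to the global tracer does not affect a read that is already in progress; the next call simply picks up the new tracer.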
2009-02-25 06:13:16 +01:00
/*
* The current tracer is copied to avoid taking a global lock
* all around.
*/
2008-05-12 21:20:42 +02:00
static void * s_start ( struct seq_file * m , loff_t * pos )
{
struct trace_iterator * iter = m - > private ;
2009-02-25 06:13:16 +01:00
static struct tracer * old_tracer ;
2009-02-25 03:22:28 +01:00
int cpu_file = iter - > cpu_file ;
2008-05-12 21:20:42 +02:00
void * p = NULL ;
loff_t l = 0 ;
2008-09-29 23:02:41 -04:00
int cpu ;
2008-05-12 21:20:42 +02:00
2009-02-25 06:13:16 +01:00
/* copy the tracer to avoid using a global lock all around */
2008-05-12 21:20:42 +02:00
mutex_lock ( & trace_types_lock ) ;
2009-02-25 06:13:16 +01:00
if ( unlikely ( old_tracer ! = current_trace & & current_trace ) ) {
old_tracer = current_trace ;
* iter - > trace = * current_trace ;
2008-05-12 21:20:56 +02:00
}
2009-02-25 06:13:16 +01:00
mutex_unlock ( & trace_types_lock ) ;
2008-05-12 21:20:42 +02:00
atomic_inc ( & trace_record_cmdline_disabled ) ;
if ( * pos ! = iter - > pos ) {
iter - > ent = NULL ;
iter - > cpu = 0 ;
iter - > idx = - 1 ;
tracing/core: introduce per cpu tracing files
Impact: split up tracing output per cpu
Currently, on the tracing debugfs directory, three files are
available to the user to let him extracting the trace output:
- trace is an iterator through the ring-buffer. It's a reader
but not a consumer It doesn't block when no more traces are
available.
- trace pretty similar to the former, except that it adds more
informations such as prempt count, irq flag, ...
- trace_pipe is a reader and a consumer; it will also block
waiting for traces if necessary (heh, yes it's a pipe).
The traces coming from different cpus are currently mixed up
inside these files. Sometimes this muddles the information,
sometimes it's useful, depending on what the tracer
captures.
The tracing_cpumask file is useful to filter the output and
select only the traces captured on a custom-defined set of cpus.
But it is still not powerful enough to extract one trace buffer
per cpu at the same time.
So this patch creates a new directory: /debug/tracing/per_cpu/.
Inside this directory, you will now find one trace_pipe file and
one trace file per cpu.
Which means if you have two cpus, you will have:
trace0
trace1
trace_pipe0
trace_pipe1
And of course, reading these files will have the same effect
as with the usual tracing files, except that you will only see
the traces from the given cpu.
The original all-in-one cpu trace files are still available in
their original place.
Until now, only one consumer was allowed on trace_pipe to avoid
racy consuming on the ring buffer. Now the approach has changed a
bit: you can have only one consumer per cpu.
Which means you are allowed to read trace_pipe0 and trace_pipe1
concurrently, but you can't have two readers on trace_pipe0 or
trace_pipe1.
Following the same logic, if there is one reader on the common
trace_pipe, you cannot at the same time have another reader on
trace_pipe0 or on trace_pipe1, because trace_pipe is in essence
already a consumer of all cpu buffers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
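The one-consumer-per-cpu rule described in this commit message can be modelled with a small bitmask. The following is a minimal userspace sketch, not the kernel code: `struct pipe_readers`, `pipe_open()`, `pipe_release()` and `PIPE_ALL_CPUS` are hypothetical names chosen for illustration, and the sketch is not thread-safe (a real implementation would need atomic test-and-set or a lock around the mask).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for "the common trace_pipe", which consumes every cpu buffer. */
#define PIPE_ALL_CPUS (-1)

struct pipe_readers {
	unsigned long busy;	/* bit n set: cpu n's buffer already has a consumer */
	int nr_cpus;		/* number of cpu buffers being modelled */
};

/* Bits that a reader of the given file would need to own. */
static unsigned long pipe_mask(const struct pipe_readers *r, int cpu)
{
	assert(r->nr_cpus > 0 &&
	       r->nr_cpus < (int)(sizeof(unsigned long) * CHAR_BIT));
	if (cpu == PIPE_ALL_CPUS)
		return (1UL << r->nr_cpus) - 1;
	assert(cpu >= 0 && cpu < r->nr_cpus);
	return 1UL << cpu;
}

/* Returns true if the reader may open the file, false if it would conflict. */
static bool pipe_open(struct pipe_readers *r, int cpu)
{
	unsigned long want = pipe_mask(r, cpu);

	if (r->busy & want)
		return false;	/* some needed cpu buffer already has a consumer */
	r->busy |= want;
	return true;
}

/* Gives back the bits taken by a successful pipe_open() with the same cpu. */
static void pipe_release(struct pipe_readers *r, int cpu)
{
	r->busy &= ~pipe_mask(r, cpu);
}
```

With two cpus, opening trace_pipe0 and trace_pipe1 together succeeds in this model, while a second reader on either one, or a reader on the common trace_pipe, is refused until the per-cpu readers are released.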
2009-02-25 03:22:28 +01:00
		if (cpu_file == TRACE_PIPE_ALL_CPU) {
			for_each_tracing_cpu(cpu)
2009-09-01 11:06:29 -04:00
				tracing_iter_reset(iter, cpu);
2009-02-25 03:22:28 +01:00
		} else
2009-09-01 11:06:29 -04:00
			tracing_iter_reset(iter, cpu_file);
2008-05-12 21:20:42 +02:00
2010-03-02 17:54:50 +08:00
		iter->leftover = 0;
2008-05-12 21:20:42 +02:00
		for (p = iter; p && l < *pos; p = s_next(m, p, &l))
			;
	} else {
2009-12-07 09:11:39 -05:00
		/*
		 * If we overflowed the seq_file before, then we want
		 * to just reuse the trace_seq buffer again.
		 */
		if (iter->leftover)
			p = iter;
		else {
			l = *pos - 1;
			p = s_next(m, p, &l);
		}
2008-05-12 21:20:42 +02:00
	}
2009-05-18 19:35:34 +08:00
	trace_event_read_lock();
tracing: Consolidate protection of reader access to the ring buffer
At the beginning, access to the ring buffer was fully serialized
by trace_types_lock. Patch d7350c3f4569 gives more freedom to readers,
and patch b04cc6b1f6 adds code to protect trace_pipe and cpu#/trace_pipe.
But that is actually not enough: ring buffer readers are not always
read-only, they may consume data.
This patch serializes accesses to trace, trace_pipe, trace_pipe_raw,
cpu#/trace, cpu#/trace_pipe and cpu#/trace_pipe_raw.
It also removes tracing_reader_cpumask, which was used to protect trace_pipe.
Details:
The ring buffer serializes readers, but that is low-level protection.
The validity of the events (returned by ring_buffer_peek() etc.)
is not protected by the ring buffer.
The content of events may become garbage if we allow another process to consume
these events concurrently:
A) The page of the consumed events may become a normal page
(not the reader page) in the ring buffer, and this page will be rewritten
by the events producer.
B) The page of the consumed events may become a page for splice_read,
and this page will be returned to the system.
This patch adds the trace_access_lock() and trace_access_unlock() primitives.
These primitives allow multiple processes to access different cpu ring buffers
concurrently.
These primitives don't distinguish read-only and read-consume access.
Multiple read-only accesses are also serialized.
And we don't use these primitives when we open files,
we only use them when we read files.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B447D52.1050602@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-01-06 20:08:50 +08:00
	trace_access_lock(cpu_file);
2008-05-12 21:20:42 +02:00
	return p;
}
static void s_stop(struct seq_file *m, void *p)
{
2010-01-06 20:08:50 +08:00
	struct trace_iterator *iter = m->private;
2008-05-12 21:20:42 +02:00
	atomic_dec(&trace_record_cmdline_disabled);
2010-01-06 20:08:50 +08:00
	trace_access_unlock(iter->cpu_file);
2009-05-18 19:35:34 +08:00
	trace_event_read_unlock();
2008-05-12 21:20:42 +02:00
}
2011-11-17 10:35:16 -05:00
static void
get_total_entries(struct trace_array *tr, unsigned long *total, unsigned long *entries)
{
	unsigned long count;
	int cpu;
	*total = 0;
	*entries = 0;
	for_each_tracing_cpu(cpu) {
		count = ring_buffer_entries_cpu(tr->buffer, cpu);
		/*
		 * If this buffer has skipped entries, then we hold all
		 * entries for the trace and we need to ignore the
		 * ones before the time stamp.
		 */
		if (tr->data[cpu]->skipped_entries) {
			count -= tr->data[cpu]->skipped_entries;
			/* total is the same as the entries */
			*total += count;
		} else
			*total += count +
				ring_buffer_overrun_cpu(tr->buffer, cpu);
		*entries += count;
	}
}
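The accounting in get_total_entries() is easier to follow with concrete numbers. The following standalone sketch repeats the same arithmetic over a plain array; `struct cpu_counts` and `sum_entries()` are hypothetical names, and the three fields stand in for the values the kernel code obtains from ring_buffer_entries_cpu(), ring_buffer_overrun_cpu() and skipped_entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-cpu snapshot of the three counters used above. */
struct cpu_counts {
	unsigned long in_buffer;	/* events currently held in the buffer */
	unsigned long overrun;		/* events lost to overwriting */
	unsigned long skipped;		/* events older than the trace start */
};

static void sum_entries(const struct cpu_counts *c, size_t n,
			unsigned long *total, unsigned long *entries)
{
	assert(total != NULL && entries != NULL);
	*total = 0;
	*entries = 0;
	for (size_t i = 0; i < n; i++) {
		unsigned long count = c[i].in_buffer;

		if (c[i].skipped) {
			/* Skipped events are ignored; written == readable. */
			count -= c[i].skipped;
			*total += count;
		} else {
			/* Written = still readable + overwritten. */
			*total += count + c[i].overrun;
		}
		*entries += count;
	}
}
```

For example, a cpu with 100 buffered events and 20 overruns contributes 100 to entries and 120 to total, while a cpu with 50 buffered events of which 10 are skipped contributes 40 to both, whatever its overrun count.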
2008-05-12 21:20:51 +02:00
static void print_lat_help_header(struct seq_file *m)
2008-05-12 21:20:42 +02:00
{
2008-08-20 16:36:11 -07:00
seq_puts ( m , " # _------=> CPU# \n " ) ;
seq_puts ( m , " # / _-----=> irqs-off \n " ) ;
seq_puts ( m , " # | / _----=> need-resched \n " ) ;
seq_puts ( m , " # || / _---=> hardirq/softirq \n " ) ;
seq_puts ( m , " # ||| / _--=> preempt-depth \n " ) ;
2011-03-09 10:41:56 -05:00
seq_puts ( m , " # |||| / delay \n " ) ;
seq_puts ( m , " # cmd pid ||||| time | caller \n " ) ;
seq_puts ( m , " # \\ / ||||| \\ | / \n " ) ;
2008-05-12 21:20:42 +02:00
}
2011-11-17 10:35:16 -05:00
static void print_event_info(struct trace_array *tr, struct seq_file *m)
2008-05-12 21:20:42 +02:00
{
2011-11-17 10:35:16 -05:00
	unsigned long total;
	unsigned long entries;
	get_total_entries(tr, &total, &entries);
	seq_printf(m, " # entries-in-buffer/entries-written: %lu/%lu #P:%d \n ",
		   entries, total, num_online_cpus());
	seq_puts(m, " # \n ");
}
static void print_func_help_header(struct trace_array *tr, struct seq_file *m)
{
	print_event_info(tr, m);
tracing: Add irq, preempt-count and need resched info to default trace output
People keep asking how to get the preempt count, irq, and need resched info
and we keep telling them to enable the latency format. Some developers think
that traces without this info are completely useless, and for a lot of tasks
they are useless.
The first option was to enable the latency trace as the default format, but
the header for the latency format is pretty useless for most tracers and
it also does the timestamp in straight microseconds from the time the trace
started. This is sometimes more difficult to read as the default trace is
seconds from the start of boot up.
Latency format:
# tracer: nop
#
# nop latency trace v1.1.5 on 3.2.0-rc1-test+
# --------------------------------------------------------------------
# latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule
migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch
migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch
migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule
default format:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule
migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule
migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule
The latency format header has latency information that is pretty meaningless
for most tracers. Although some of the header is useful, and we can add that
later to the default format as well.
What is really useful with the latency format is the irqs-off, need-resched,
hard/softirq context and the preempt count.
This commit adds the option irq-info, which is on by default and adds this
information:
# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<idle>-0 [000] d..2 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] d..2 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] d..2 49.309309: need_resched <-mwait_idle
<idle>-0 [000] d..2 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] d..2 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] d..2 49.309315: need_resched <-mwait_idle
If a user wants the old format, they can disable the 'irq-info' option:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<idle>-0 [000] 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] 49.309309: need_resched <-mwait_idle
<idle>-0 [000] 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] 49.309315: need_resched <-mwait_idle
Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
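The four-character field added by the irq-info option (for example "d..2" in the sample output above) can be decoded column by column. The sketch below is a hypothetical userspace helper, not part of the kernel. The header only names the columns; the letter meanings assumed here ('d' for irqs off, 'N' for need-resched, 'h', 's' and 'H' for hardirq, softirq and both) follow common ftrace usage and are not stated in this commit message. The preempt depth is assumed to be a single decimal digit, with '.' meaning zero.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical decoded form of the irq-info columns. */
struct irq_info {
	bool irqs_off;		/* column 1: 'd' */
	bool need_resched;	/* column 2: 'N' (assumed letter) */
	bool in_hardirq;	/* column 3: 'h' or 'H' (assumed letters) */
	bool in_softirq;	/* column 3: 's' or 'H' (assumed letters) */
	int preempt_depth;	/* column 4: digit, '.' means 0 */
};

/* Returns false if the field is too short or the depth column is not understood. */
static bool parse_irq_info(const char *field, struct irq_info *out)
{
	assert(out != NULL);
	if (field == NULL || strlen(field) < 4)
		return false;

	out->irqs_off = field[0] == 'd';
	out->need_resched = field[1] == 'N';
	out->in_hardirq = field[2] == 'h' || field[2] == 'H';
	out->in_softirq = field[2] == 's' || field[2] == 'H';

	if (field[3] == '.')
		out->preempt_depth = 0;
	else if (isdigit((unsigned char)field[3]))
		out->preempt_depth = field[3] - '0';
	else
		return false;
	return true;
}
```

Applied to "d..2", this reports irqs off, no pending reschedule, no interrupt context and a preempt depth of 2.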
2011-11-17 09:34:33 -05:00
seq_puts ( m , " # TASK-PID CPU# TIMESTAMP FUNCTION \n " ) ;
2008-08-20 16:36:11 -07:00
seq_puts ( m , " # | | | | | \n " ) ;
2008-05-12 21:20:42 +02:00
}
2011-11-17 10:35:16 -05:00
static void print_func_help_header_irq(struct trace_array *tr, struct seq_file *m)
2011-11-17 09:34:33 -05:00
{
2011-11-17 10:35:16 -05:00
	print_event_info(tr, m);
2011-11-17 09:34:33 -05:00
seq_puts ( m , " # _-----=> irqs-off \n " ) ;
seq_puts ( m , " # / _----=> need-resched \n " ) ;
seq_puts ( m , " # | / _---=> hardirq/softirq \n " ) ;
seq_puts ( m , " # || / _--=> preempt-depth \n " ) ;
seq_puts ( m , " # ||| / delay \n " ) ;
seq_puts ( m , " # TASK-PID CPU# |||| TIMESTAMP FUNCTION \n " ) ;
seq_puts ( m , " # | | | |||| | | \n " ) ;
}
2008-05-12 21:20:42 +02:00
2010-04-02 19:01:22 +02:00
void
2008-05-12 21:20:42 +02:00
print_trace_header(struct seq_file *m, struct trace_iterator *iter)
{
	unsigned long sym_flags = (trace_flags & TRACE_ITER_SYM_MASK);
	struct trace_array *tr = iter->tr;
	struct trace_array_cpu *data = tr->data[tr->cpu];
	struct tracer *type = current_trace;
2011-11-17 10:35:16 -05:00
	unsigned long entries;
	unsigned long total;
2008-05-12 21:20:42 +02:00
	const char *name = "preemption";
	if (type)
		name = type->name;
2011-11-17 10:35:16 -05:00
	get_total_entries(tr, &total, &entries);
2008-05-12 21:20:42 +02:00
ftrace: tracing header should put '#' at the beginning of a line
In a recent discussion, Andrew Morton pointed out that the tracing header
should put '#' at the beginning of each line.
Then we can easily filter out the header with the following grep usage:
cat trace | grep -v '^#'
The wakeup trace also has the same header problem.
Comparison of headers displayed:
before this patch:
# tracer: wakeup
#
wakeup latency trace v1.1.5 on 2.6.29-rc7-tip-tip
--------------------------------------------------------------------
latency: 19059 us, #21277/21277, CPU#1 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4)
-----------------
| task: kondemand/1-1644 (uid:0 nice:-5 policy:0 rt_prio:0)
-----------------
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| /
# ||||| delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
irqbalan-1887 1d.s. 0us : 1887:120:R + [001] 1644:115:S kondemand/1
irqbalan-1887 1d.s. 1us : default_wake_function <-autoremove_wake_function
irqbalan-1887 1d.s. 2us : check_preempt_wakeup <-try_to_wake_up
after this patch:
# tracer: wakeup
#
# wakeup latency trace v1.1.5 on 2.6.29-rc7-tip-tip
# --------------------------------------------------------------------
# latency: 529 us, #530/530, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: kondemand/0-1641 (uid:0 nice:-5 policy:0 rt_prio:0)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| /
# ||||| delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
sshd-2496 0d.s. 0us : 2496:120:R + [000] 1641:115:S kondemand/0
sshd-2496 0d.s. 1us : default_wake_function <-autoremove_wake_function
sshd-2496 0d.s. 1us : check_preempt_wakeup <-try_to_wake_up
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20090308124421.23C3.A69D9226@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
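Once every header line starts with '#', a consumer of the trace file can separate header from events with a single character test, which is what the grep command above relies on. A minimal sketch in C follows; `is_trace_header()` and `count_trace_events()` are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A line belongs to the header if and only if its first character is '#'. */
static bool is_trace_header(const char *line)
{
	assert(line != NULL);
	return line[0] == '#';
}

/* Counts the lines that would survive `grep -v '^#'`. */
static size_t count_trace_events(const char *const *lines, size_t n)
{
	size_t events = 0;

	for (size_t i = 0; i < n; i++)
		if (!is_trace_header(lines[i]))
			events++;
	return events;
}
```

Like the grep pattern, this keeps empty lines and lines that merely contain a '#' further in; only a leading '#' marks a header line.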
2009-03-08 13:12:43 +09:00
seq_printf ( m , " # %s latency trace v1.1.5 on %s \n " ,
2008-05-12 21:20:42 +02:00
name , UTS_RELEASE ) ;
2009-03-08 13:12:43 +09:00
seq_puts ( m , " # ----------------------------------- "
2008-05-12 21:20:42 +02:00
" --------------------------------- \n " ) ;
2009-03-08 13:12:43 +09:00
seq_printf ( m , " # latency: %lu us, #%lu/%lu, CPU#%d | "
2008-05-12 21:20:42 +02:00
" (M:%s VP:%d, KP:%d, SP:%d HP:%d " ,
2008-05-12 21:20:44 +02:00
		   nsecs_to_usecs(data->saved_latency),
2008-05-12 21:20:42 +02:00
		   entries,
2008-05-12 21:20:43 +02:00
		   total,
2008-05-12 21:20:42 +02:00
		   tr->cpu,
# if defined(CONFIG_PREEMPT_NONE)
" server " ,
# elif defined(CONFIG_PREEMPT_VOLUNTARY)
" desktop " ,
2008-07-10 20:58:12 -04:00
# elif defined(CONFIG_PREEMPT)
2008-05-12 21:20:42 +02:00
" preempt " ,
# else
" unknown " ,
# endif
		   /* These are reserved for later use */
		   0, 0, 0, 0);
# ifdef CONFIG_SMP
seq_printf ( m , " #P:%d) \n " , num_online_cpus ( ) ) ;
# else
seq_puts ( m , " ) \n " ) ;
# endif
2009-03-08 13:12:43 +09:00
seq_puts ( m , " # ----------------- \n " ) ;
seq_printf ( m , " # | task: %.16s-%d "
2008-05-12 21:20:42 +02:00
" (uid:%d nice:%ld policy:%ld rt_prio:%ld) \n " ,
2012-03-13 16:02:19 -07:00
data - > comm , data - > pid ,
from_kuid_munged ( seq_user_ns ( m ) , data - > uid ) , data - > nice ,
2008-05-12 21:20:42 +02:00
data - > policy , data - > rt_priority ) ;
2009-03-08 13:12:43 +09:00
seq_puts ( m , " # ----------------- \n " ) ;
2008-05-12 21:20:42 +02:00
if ( data - > critical_start ) {
2009-03-08 13:12:43 +09:00
seq_puts ( m , " # => started at: " ) ;
2008-05-12 21:20:46 +02:00
seq_print_ip_sym ( & iter - > seq , data - > critical_start , sym_flags ) ;
trace_print_seq ( m , & iter - > seq ) ;
2009-03-08 13:12:43 +09:00
seq_puts ( m , " \n # => ended at: " ) ;
2008-05-12 21:20:46 +02:00
seq_print_ip_sym ( & iter - > seq , data - > critical_end , sym_flags ) ;
trace_print_seq ( m , & iter - > seq ) ;
2009-09-02 12:27:41 -04:00
seq_puts ( m , " \n # \n " ) ;
2008-05-12 21:20:42 +02:00
}
2009-03-08 13:12:43 +09:00
seq_puts ( m , " # \n " ) ;
2008-05-12 21:20:42 +02:00
}
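The header printed by print_trace_header() starts every line with '#', so a reader can drop it the same way `grep -v '^#'` does. The following is a minimal user-space sketch of that filter, not kernel code; the names is_header_line() and strip_trace_header() are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A trace line belongs to the header when it starts with '#'. */
static bool is_header_line(const char *line)
{
	return line != NULL && line[0] == '#';
}

/*
 * Copy every non-header line of `in` into `out` (capacity `out_size`),
 * mimicking `grep -v '^#'`. Returns the number of lines kept, or -1 if
 * `out` is too small to hold the result.
 */
static int strip_trace_header(const char *in, char *out, size_t out_size)
{
	size_t used = 0;
	int kept = 0;

	if (out_size == 0)
		return -1;
	out[0] = '\0';

	while (*in != '\0') {
		const char *eol = strchr(in, '\n');
		/* Line length including the trailing newline, if any. */
		size_t len = eol ? (size_t)(eol - in) + 1 : strlen(in);

		if (!is_header_line(in)) {
			if (used + len + 1 > out_size)
				return -1;
			memcpy(out + used, in, len);
			used += len;
			out[used] = '\0';
			kept++;
		}
		in += len;
	}
	return kept;
}
```

Because the test is only "first character is '#'", it works equally for the latency header and for the column legend that follows it.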
2008-11-07 22:36:02 -05:00
static void test_cpu_buff_start ( struct trace_iterator * iter )
{
struct trace_seq * s = & iter - > seq ;
2008-11-12 17:52:38 -05:00
if ( ! ( trace_flags & TRACE_ITER_ANNOTATE ) )
return ;
if ( ! ( iter - > iter_flags & TRACE_FILE_ANNOTATE ) )
return ;
2009-01-01 10:12:23 +10:30
if ( cpumask_test_cpu ( iter - > cpu , iter - > started ) )
2008-11-07 22:36:02 -05:00
return ;
2009-09-01 11:06:29 -04:00
if ( iter - > tr - > data [ iter - > cpu ] - > skipped_entries )
return ;
2009-01-01 10:12:23 +10:30
cpumask_set_cpu ( iter - > cpu , iter - > started ) ;
2009-04-01 22:53:08 +02:00
/* Don't print started cpu buffer for the first entry of the trace */
if ( iter - > idx > 1 )
trace_seq_printf ( s , " ##### CPU %u buffer started #### \n " ,
iter - > cpu ) ;
2008-11-07 22:36:02 -05:00
}
2008-09-29 20:18:34 +02:00
static enum print_line_t print_trace_fmt ( struct trace_iterator * iter )
2008-05-12 21:20:42 +02:00
{
2008-05-12 21:20:46 +02:00
struct trace_seq * s = & iter - > seq ;
2008-05-12 21:20:42 +02:00
unsigned long sym_flags = ( trace_flags & TRACE_ITER_SYM_MASK ) ;
2008-05-12 21:20:45 +02:00
struct trace_entry * entry ;
2008-12-23 23:24:13 -05:00
struct trace_event * event ;
2008-05-12 21:20:42 +02:00
2008-05-12 21:20:45 +02:00
entry = iter - > ent ;
2008-08-01 12:26:41 -04:00
2008-11-07 22:36:02 -05:00
test_cpu_buff_start ( iter ) ;
2009-02-02 20:29:21 -02:00
event = ftrace_find_event ( entry - > type ) ;
2008-05-12 21:20:42 +02:00
2009-02-02 20:29:21 -02:00
if ( trace_flags & TRACE_ITER_CONTEXT_INFO ) {
2009-03-04 21:57:29 -05:00
if ( iter - > iter_flags & TRACE_FILE_LAT_FMT ) {
if ( ! trace_print_lat_context ( iter ) )
goto partial ;
} else {
if ( ! trace_print_context ( iter ) )
goto partial ;
}
2009-02-02 20:29:21 -02:00
}
2008-05-12 21:20:42 +02:00
2009-02-04 20:16:39 -02:00
if ( event )
2010-04-22 18:46:14 -04:00
return event - > funcs - > trace ( iter , sym_flags , event ) ;
2009-02-03 20:20:41 -02:00
if ( ! trace_seq_printf ( s , " Unknown type %d \n " , entry - > type ) )
goto partial ;
2008-11-22 13:28:47 +02:00
2008-09-29 20:18:34 +02:00
return TRACE_TYPE_HANDLED ;
2009-02-03 20:20:41 -02:00
partial :
return TRACE_TYPE_PARTIAL_LINE ;
2008-05-12 21:20:42 +02:00
}
2008-09-29 20:18:34 +02:00
static enum print_line_t print_raw_fmt ( struct trace_iterator * iter )
2008-05-12 21:20:47 +02:00
{
struct trace_seq * s = & iter - > seq ;
struct trace_entry * entry ;
2008-12-23 23:24:13 -05:00
struct trace_event * event ;
2008-05-12 21:20:47 +02:00
entry = iter - > ent ;
2008-08-01 12:26:41 -04:00
2009-02-02 20:29:21 -02:00
if ( trace_flags & TRACE_ITER_CONTEXT_INFO ) {
2009-02-03 20:20:41 -02:00
if ( ! trace_seq_printf ( s , " %d %d %llu " ,
entry - > pid , iter - > cpu , iter - > ts ) )
goto partial ;
2009-02-02 20:29:21 -02:00
}
2008-05-12 21:20:47 +02:00
2008-12-23 23:24:13 -05:00
event = ftrace_find_event ( entry - > type ) ;
2009-02-04 20:16:39 -02:00
if ( event )
2010-04-22 18:46:14 -04:00
return event - > funcs - > raw ( iter , 0 , event ) ;
2009-02-03 20:20:41 -02:00
if ( ! trace_seq_printf ( s , " %d ? \n " , entry - > type ) )
goto partial ;
2008-09-29 23:02:42 -04:00
2008-09-29 20:18:34 +02:00
return TRACE_TYPE_HANDLED ;
2009-02-03 20:20:41 -02:00
partial :
return TRACE_TYPE_PARTIAL_LINE ;
2008-05-12 21:20:47 +02:00
}
2008-09-29 20:18:34 +02:00
static enum print_line_t print_hex_fmt ( struct trace_iterator * iter )
2008-05-12 21:20:49 +02:00
{
struct trace_seq * s = & iter - > seq ;
unsigned char newline = ' \n ' ;
struct trace_entry * entry ;
2008-12-23 23:24:13 -05:00
struct trace_event * event ;
2008-05-12 21:20:49 +02:00
entry = iter - > ent ;
2008-08-01 12:26:41 -04:00
2009-02-02 20:29:21 -02:00
if ( trace_flags & TRACE_ITER_CONTEXT_INFO ) {
SEQ_PUT_HEX_FIELD_RET ( s , entry - > pid ) ;
SEQ_PUT_HEX_FIELD_RET ( s , iter - > cpu ) ;
SEQ_PUT_HEX_FIELD_RET ( s , iter - > ts ) ;
}
2008-05-12 21:20:49 +02:00
2008-12-23 23:24:13 -05:00
event = ftrace_find_event ( entry - > type ) ;
2009-02-04 20:16:39 -02:00
if ( event ) {
2010-04-22 18:46:14 -04:00
enum print_line_t ret = event - > funcs - > hex ( iter , 0 , event ) ;
2009-02-03 20:20:41 -02:00
if ( ret ! = TRACE_TYPE_HANDLED )
return ret ;
}
2008-10-01 10:52:51 -04:00
2008-05-12 21:20:49 +02:00
SEQ_PUT_FIELD_RET ( s , newline ) ;
2008-09-29 20:18:34 +02:00
return TRACE_TYPE_HANDLED ;
2008-05-12 21:20:49 +02:00
}
2008-09-29 20:18:34 +02:00
static enum print_line_t print_bin_fmt ( struct trace_iterator * iter )
2008-05-12 21:20:47 +02:00
{
struct trace_seq * s = & iter - > seq ;
struct trace_entry * entry ;
2008-12-23 23:24:13 -05:00
struct trace_event * event ;
2008-05-12 21:20:47 +02:00
entry = iter - > ent ;
2008-08-01 12:26:41 -04:00
2009-02-02 20:29:21 -02:00
if ( trace_flags & TRACE_ITER_CONTEXT_INFO ) {
SEQ_PUT_FIELD_RET ( s , entry - > pid ) ;
2009-02-07 19:38:43 -05:00
SEQ_PUT_FIELD_RET ( s , iter - > cpu ) ;
2009-02-02 20:29:21 -02:00
SEQ_PUT_FIELD_RET ( s , iter - > ts ) ;
}
2008-05-12 21:20:47 +02:00
2008-12-23 23:24:13 -05:00
event = ftrace_find_event ( entry - > type ) ;
2010-04-22 18:46:14 -04:00
return event ? event - > funcs - > binary ( iter , 0 , event ) :
TRACE_TYPE_HANDLED ;
2008-05-12 21:20:47 +02:00
}
2010-04-02 19:01:22 +02:00
int trace_empty ( struct trace_iterator * iter )
2008-05-12 21:20:42 +02:00
{
2012-06-27 20:46:14 -04:00
struct ring_buffer_iter * buf_iter ;
2008-05-12 21:20:42 +02:00
int cpu ;
2009-03-11 19:52:30 -04:00
/* If we are looking at one CPU buffer, only check that one */
if ( iter - > cpu_file ! = TRACE_PIPE_ALL_CPU ) {
cpu = iter - > cpu_file ;
2012-06-27 20:46:14 -04:00
buf_iter = trace_buffer_iter ( iter , cpu ) ;
if ( buf_iter ) {
if ( ! ring_buffer_iter_empty ( buf_iter ) )
2009-03-11 19:52:30 -04:00
return 0 ;
} else {
if ( ! ring_buffer_empty_cpu ( iter - > tr - > buffer , cpu ) )
return 0 ;
}
return 1 ;
}
2008-05-12 21:21:00 +02:00
for_each_tracing_cpu ( cpu ) {
2012-06-27 20:46:14 -04:00
buf_iter = trace_buffer_iter ( iter , cpu ) ;
if ( buf_iter ) {
if ( ! ring_buffer_iter_empty ( buf_iter ) )
2008-10-01 00:29:53 -04:00
return 0 ;
} else {
if ( ! ring_buffer_empty_cpu ( iter - > tr - > buffer , cpu ) )
return 0 ;
}
2008-05-12 21:20:42 +02:00
}
2008-10-01 00:29:53 -04:00
2008-09-30 18:13:45 +02:00
return 1 ;
2008-05-12 21:20:42 +02:00
}
2009-05-18 19:35:34 +08:00
/* Called with trace_event_read_lock() held. */
2010-08-05 09:22:23 -05:00
enum print_line_t print_trace_line ( struct trace_iterator * iter )
2008-05-12 21:20:47 +02:00
{
2008-09-29 20:18:34 +02:00
enum print_line_t ret ;
2011-03-25 12:05:18 +01:00
if ( iter - > lost_events & &
! trace_seq_printf ( & iter - > seq , " CPU:%d [LOST %lu EVENTS] \n " ,
iter - > cpu , iter - > lost_events ) )
return TRACE_TYPE_PARTIAL_LINE ;
2010-03-31 19:49:26 -04:00
2008-09-29 20:18:34 +02:00
if ( iter - > trace & & iter - > trace - > print_line ) {
ret = iter - > trace - > print_line ( iter ) ;
if ( ret ! = TRACE_TYPE_UNHANDLED )
return ret ;
}
2008-05-23 21:37:28 +02:00
2009-03-12 18:24:49 +01:00
if ( iter - > ent - > type = = TRACE_BPRINT & &
trace_flags & TRACE_ITER_PRINTK & &
trace_flags & TRACE_ITER_PRINTK_MSGONLY )
2009-03-19 12:20:38 -04:00
return trace_print_bprintk_msg_only ( iter ) ;
2009-03-12 18:24:49 +01:00
2008-12-13 20:18:13 +01:00
if ( iter - > ent - > type = = TRACE_PRINT & &
trace_flags & TRACE_ITER_PRINTK & &
trace_flags & TRACE_ITER_PRINTK_MSGONLY )
2009-03-19 12:20:38 -04:00
return trace_print_printk_msg_only ( iter ) ;
2008-12-13 20:18:13 +01:00
2008-05-12 21:20:47 +02:00
if ( trace_flags & TRACE_ITER_BIN )
return print_bin_fmt ( iter ) ;
2008-05-12 21:20:49 +02:00
if ( trace_flags & TRACE_ITER_HEX )
return print_hex_fmt ( iter ) ;
2008-05-12 21:20:47 +02:00
if ( trace_flags & TRACE_ITER_RAW )
return print_raw_fmt ( iter ) ;
return print_trace_fmt ( iter ) ;
}
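The tail of print_trace_line() picks an output format by testing the flags in a fixed order: binary first, then hex, then raw, and the default text format last. The sketch below isolates only that precedence. The SKETCH_* constants and pick_format() are hypothetical stand-ins for TRACE_ITER_BIN, TRACE_ITER_HEX and TRACE_ITER_RAW, and the sketch ignores the tracer's own print_line callback and the printk message-only paths that are checked earlier in the real function.

```c
/* Hypothetical flag bits; the real values live in the kernel's trace.h. */
enum {
	SKETCH_ITER_BIN = 1 << 0,
	SKETCH_ITER_HEX = 1 << 1,
	SKETCH_ITER_RAW = 1 << 2
};

enum sketch_format {
	SKETCH_FMT_BIN,
	SKETCH_FMT_HEX,
	SKETCH_FMT_RAW,
	SKETCH_FMT_DEFAULT
};

/* The first matching flag wins, in the same order as print_trace_line(). */
static enum sketch_format pick_format(unsigned int flags)
{
	if (flags & SKETCH_ITER_BIN)
		return SKETCH_FMT_BIN;
	if (flags & SKETCH_ITER_HEX)
		return SKETCH_FMT_HEX;
	if (flags & SKETCH_ITER_RAW)
		return SKETCH_FMT_RAW;
	return SKETCH_FMT_DEFAULT;
}
```

With this ordering, setting several format flags at once is not an error: the highest-priority one simply takes effect.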
tracing/latency: Fix header output for latency tracers
In case the graph tracer (CONFIG_FUNCTION_GRAPH_TRACER) or even the
function tracer (CONFIG_FUNCTION_TRACER) is not set, the latency tracers
do not display the proper latency header.
The involved/fixed latency tracers are:
wakeup_rt
wakeup
preemptirqsoff
preemptoff
irqsoff
The patch adds proper handling of tracer configuration options for latency
tracers and displays the correct header info accordingly.
* The current output (for wakeup tracer) with both graph and function
tracers disabled is:
# tracer: wakeup
#
<idle>-0 0d.h5 1us+: 0:120:R + [000] 7: 0:R watchdog/0
<idle>-0 0d.h5 3us+: ttwu_do_activate.clone.1 <-try_to_wake_up
...
* The fixed output is:
# tracer: wakeup
#
# wakeup latency trace v1.1.5 on 3.1.0-tip+
# --------------------------------------------------------------------
# latency: 55 us, #4/4, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
# -----------------
# | task: migration/0-6 (uid:0 nice:0 policy:1 rt_prio:99)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
cat-1129 0d..4 1us : 1129:120:R + [000] 6: 0:R migration/0
cat-1129 0d..4 2us+: ttwu_do_activate.clone.1 <-try_to_wake_up
* The current output (for wakeup tracer) with only function
tracer enabled is:
# tracer: wakeup
#
cat-1140 0d..4 1us+: 1140:120:R + [000] 6: 0:R migration/0
cat-1140 0d..4 2us : ttwu_do_activate.clone.1 <-try_to_wake_up
* The fixed output is:
# tracer: wakeup
#
# wakeup latency trace v1.1.5 on 3.1.0-tip+
# --------------------------------------------------------------------
# latency: 207 us, #109/109, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
# -----------------
# | task: watchdog/1-12 (uid:0 nice:0 policy:1 rt_prio:99)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
<idle>-0 1d.h5 1us+: 0:120:R + [001] 12: 0:R watchdog/1
<idle>-0 1d.h5 3us : ttwu_do_activate.clone.1 <-try_to_wake_up
Link: http://lkml.kernel.org/r/20111107150849.GE1807@m.brq.redhat.com
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-11-07 16:08:49 +01:00
void trace_latency_header ( struct seq_file * m )
{
struct trace_iterator * iter = m - > private ;
/* print nothing if the buffers are empty */
if ( trace_empty ( iter ) )
return ;
if ( iter - > iter_flags & TRACE_FILE_LAT_FMT )
print_trace_header ( m , iter ) ;
if ( ! ( trace_flags & TRACE_ITER_VERBOSE ) )
print_lat_help_header ( m ) ;
}
2010-04-02 19:01:22 +02:00
void trace_default_header ( struct seq_file * m )
{
struct trace_iterator * iter = m - > private ;
2011-06-03 16:58:49 +02:00
if ( ! ( trace_flags & TRACE_ITER_CONTEXT_INFO ) )
return ;
2010-04-02 19:01:22 +02:00
if ( iter - > iter_flags & TRACE_FILE_LAT_FMT ) {
/* print nothing if the buffers are empty */
if ( trace_empty ( iter ) )
return ;
print_trace_header ( m , iter ) ;
if ( ! ( trace_flags & TRACE_ITER_VERBOSE ) )
print_lat_help_header ( m ) ;
} else {
tracing: Add irq, preempt-count and need resched info to default trace output
People keep asking how to get the preempt count, irq, and need resched info
and we keep telling them to enable the latency format. Some developers think
that traces without this info are completely useless, and for a lot of tasks
they are.
The first option was to enable the latency trace as the default format, but
the header for the latency format is pretty useless for most tracers and
it also does the timestamp in straight microseconds from the time the trace
started. This is sometimes more difficult to read as the default trace is
seconds from the start of boot up.
Latency format:
# tracer: nop
#
# nop latency trace v1.1.5 on 3.2.0-rc1-test+
# --------------------------------------------------------------------
# latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule
migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch
migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch
migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule
default format:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule
migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule
migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule
The latency format header has latency information that is pretty meaningless
for most tracers. Some of the header is useful, though, and we can add that
later to the default format as well.
What is really useful with the latency format is the irqs-off, need-resched,
hard/softirq context and preempt count information.
This commit adds the option irq-info, which is on by default and adds this
information:
# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<idle>-0 [000] d..2 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] d..2 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] d..2 49.309309: need_resched <-mwait_idle
<idle>-0 [000] d..2 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] d..2 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] d..2 49.309315: need_resched <-mwait_idle
If a user wants the old format, they can disable the 'irq-info' option:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<idle>-0 [000] 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] 49.309309: need_resched <-mwait_idle
<idle>-0 [000] 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] 49.309315: need_resched <-mwait_idle
Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-11-17 09:34:33 -05:00
if ( ! ( trace_flags & TRACE_ITER_VERBOSE ) ) {
if ( trace_flags & TRACE_ITER_IRQ_INFO )
2011-11-17 10:35:16 -05:00
print_func_help_header_irq ( iter - > tr , m ) ;
2011-11-17 09:34:33 -05:00
else
2011-11-17 10:35:16 -05:00
print_func_help_header ( iter - > tr , m ) ;
tracing: Add irq, preempt-count and need resched info to default trace output
People keep asking how to get the preempt count, irq, and need resched info
and we keep telling them to enable the latency format. Some developers think
that traces without this info is completely useless, and for a lot of tasks
it is useless.
The first option was to enable the latency trace as the default format, but
the header for the latency format is pretty useless for most tracers and
it also does the timestamp in straight microseconds from the time the trace
started. This is sometimes more difficult to read as the default trace is
seconds from the start of boot up.
Latency format:
# tracer: nop
#
# nop latency trace v1.1.5 on 3.2.0-rc1-test+
# --------------------------------------------------------------------
# latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule
migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch
migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch
migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule
default format:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule
migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule
migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule
The latency format header has latency information that is pretty meaningless
for most tracers. Although some of the header is useful, and we can add that
later to the default format as well.
What is really useful with the latency format is the irqs-off, need-resched
hard/softirq context and the preempt count.
This commit adds the option irq-info which is on by default that adds this
information:
# tracer: nop
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
          <idle>-0     [000] d..2    49.309305: cpuidle_get_driver <-cpuidle_idle_call
          <idle>-0     [000] d..2    49.309307: mwait_idle <-cpu_idle
          <idle>-0     [000] d..2    49.309309: need_resched <-mwait_idle
          <idle>-0     [000] d..2    49.309310: test_ti_thread_flag <-need_resched
          <idle>-0     [000] d..2    49.309312: trace_power_start.constprop.13 <-mwait_idle
          <idle>-0     [000] d..2    49.309313: trace_cpu_idle <-mwait_idle
          <idle>-0     [000] d..2    49.309315: need_resched <-mwait_idle
If a user wants the old format, they can disable the 'irq-info' option:
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
          <idle>-0     [000]    49.309305: cpuidle_get_driver <-cpuidle_idle_call
          <idle>-0     [000]    49.309307: mwait_idle <-cpu_idle
          <idle>-0     [000]    49.309309: need_resched <-mwait_idle
          <idle>-0     [000]    49.309310: test_ti_thread_flag <-need_resched
          <idle>-0     [000]    49.309312: trace_power_start.constprop.13 <-mwait_idle
          <idle>-0     [000]    49.309313: trace_cpu_idle <-mwait_idle
          <idle>-0     [000]    49.309315: need_resched <-mwait_idle
Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
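The four-character context field described in the commit message above (for example "d..2") can be illustrated with a small userspace sketch. This is only an approximation under stated assumptions: the SK_* flag values and the sk_format_irq_info() helper are hypothetical names invented for this example, not the kernel's own definitions, and the preempt depth is reduced to a single hex digit for simplicity.

```c
#include <assert.h>

/* Hypothetical flag bits for this sketch; not the kernel's values. */
enum {
	SK_IRQS_OFF     = 1 << 0,
	SK_NEED_RESCHED = 1 << 1,
	SK_HARDIRQ      = 1 << 2,
	SK_SOFTIRQ      = 1 << 3,
};

/*
 * Fill out[] with the four context characters followed by a NUL:
 * irqs-off, need-resched, hardirq/softirq, preempt-depth.
 * out must have room for at least 5 bytes.
 */
static void sk_format_irq_info(unsigned int flags, unsigned int preempt_depth,
			       char *out)
{
	int hard = !!(flags & SK_HARDIRQ);
	int soft = !!(flags & SK_SOFTIRQ);

	assert(out != 0);

	out[0] = (flags & SK_IRQS_OFF) ? 'd' : '.';
	out[1] = (flags & SK_NEED_RESCHED) ? 'N' : '.';
	out[2] = (hard && soft) ? 'H' : hard ? 'h' : soft ? 's' : '.';
	/* Assumption: show only the low hex digit; a depth of zero is '.'. */
	out[3] = preempt_depth ? "0123456789abcdef"[preempt_depth & 0xf] : '.';
	out[4] = '\0';
}
```

With irqs disabled, no pending reschedule, no interrupt context and a preempt depth of 2, the helper produces the same "d..2" string seen in the sample output.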
2011-11-17 09:34:33 -05:00
}
2010-04-02 19:01:22 +02:00
}
}
2011-09-29 21:26:16 -04:00
static void test_ftrace_alive(struct seq_file *m)
{
if (!ftrace_is_dead())
return;
seq_printf(m, "# WARNING: FUNCTION TRACING IS CORRUPTED\n");
seq_printf(m, "# MAY BE MISSING FUNCTION EVENTS\n");
}
2008-05-12 21:20:42 +02:00
static int s_show(struct seq_file *m, void *v)
{
struct trace_iterator *iter = v;
2009-12-07 09:11:39 -05:00
int ret;
2008-05-12 21:20:42 +02:00
if (iter->ent == NULL) {
if (iter->tr) {
seq_printf(m, "# tracer: %s\n", iter->trace->name);
seq_puts(m, "#\n");
2011-09-29 21:26:16 -04:00
test_ftrace_alive(m);
2008-05-12 21:20:42 +02:00
}
2008-11-25 09:12:31 +01:00
if (iter->trace && iter->trace->print_header)
iter->trace->print_header(m);
2010-04-02 19:01:22 +02:00
else
trace_default_header(m);
2009-12-07 09:11:39 -05:00
} else if (iter->leftover) {
/*
* If we filled the seq_file buffer earlier, we
* want to just show it now.
*/
ret = trace_print_seq(m, &iter->seq);
/* ret should this time be zero, but you never know */
iter->leftover = ret;
2008-05-12 21:20:42 +02:00
} else {
2008-05-12 21:20:47 +02:00
print_trace_line(iter);
2009-12-07 09:11:39 -05:00
ret = trace_print_seq(m, &iter->seq);
/*
* If we overflow the seq_file buffer, then it will
* ask us for this data again at start up.
* Use that instead.
* ret is 0 if seq_file write succeeded.
* -1 otherwise.
*/
iter->leftover = ret;
2008-05-12 21:20:42 +02:00
}
return 0;
}
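The leftover handling in s_show() can be modelled in userspace roughly as follows. This is a simplified sketch, not the kernel code: struct sk_sink stands in for the seq_file buffer, struct sk_iter for the iterator with its staging buffer, and every sk_* name and the 16-byte size are assumptions made for the example. The point it shows is that a line which did not fit stays staged and is flushed on the next call instead of being formatted again.

```c
#include <assert.h>
#include <string.h>

#define SK_SINK_SIZE 16

struct sk_sink {			/* stands in for the seq_file buffer */
	char buf[SK_SINK_SIZE];
	size_t len;
};

struct sk_iter {			/* stands in for the trace iterator */
	char staged[SK_SINK_SIZE];	/* formatted line waiting to be copied */
	size_t staged_len;
	int leftover;			/* nonzero: staged data still pending */
};

/* Copy the staged text into the sink; 0 on success, -1 if it does not fit. */
static int sk_print_seq(struct sk_sink *s, struct sk_iter *it)
{
	if (it->staged_len > sizeof(s->buf) - s->len)
		return -1;		/* keep the staged data for a retry */
	memcpy(s->buf + s->len, it->staged, it->staged_len);
	s->len += it->staged_len;
	it->staged_len = 0;
	return 0;
}

static int sk_show(struct sk_sink *s, struct sk_iter *it, const char *line)
{
	if (!it->leftover) {
		/* Fresh entry: format it into the staging area first. */
		size_t n = strlen(line);

		assert(n <= sizeof(it->staged));
		memcpy(it->staged, line, n);
		it->staged_len = n;
	}
	/* Either way, try to flush and remember whether that failed. */
	it->leftover = sk_print_seq(s, it);
	return 0;
}
```

As in s_show(), the function itself always returns 0; the overflow is recorded in the iterator so that the caller can come back with an emptier buffer.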
2009-09-22 16:43:43 -07:00
static const struct seq_operations tracer_seq_ops = {
2008-05-12 21:20:46 +02:00
.start = s_start,
.next = s_next,
.stop = s_stop,
.show = s_show,
2008-05-12 21:20:42 +02:00
};
2008-05-12 21:20:51 +02:00
static struct trace_iterator *
2009-02-27 00:12:38 -05:00
__tracing_open(struct inode *inode, struct file *file)
2008-05-12 21:20:42 +02:00
{
tracing/core: introduce per cpu tracing files
Impact: split up tracing output per cpu
Currently, in the tracing debugfs directory, three files are
available to the user for extracting the trace output:
- trace is an iterator through the ring buffer. It is a reader
but not a consumer. It does not block when no more traces are
available.
- latency_trace is pretty similar to the former, except that it adds
more information such as preempt count, irq flag, ...
- trace_pipe is a reader and a consumer. It will also block
waiting for traces if necessary (heh, yes it's a pipe).
The traces coming from different cpus are currently mixed up
inside these files. Sometimes that muddles the information,
sometimes it is useful, depending on what the tracer
captures.
The tracing_cpumask file is useful to filter the output and
select only the traces captured on a custom-defined set of cpus.
But it is still not powerful enough to extract one trace buffer
per cpu at the same time.
So this patch creates a new directory: /debug/tracing/per_cpu/.
Inside this directory, you will now find one trace_pipe file and
one trace file per cpu.
Which means if you have two cpus, you will have:
trace0
trace1
trace_pipe0
trace_pipe1
And of course, reading these files has the same effect
as reading the usual tracing files, except that you only see
the traces from the given cpu.
The original all-in-one cpu trace files are still available in
their original place.
Until now, only one consumer was allowed on trace_pipe to avoid
racy consumption of the ring buffer. Now the approach has changed a
bit: you can have only one consumer per cpu.
Which means you are allowed to read trace_pipe0 and trace_pipe1
concurrently, but you can't have two readers on trace_pipe0 or
on trace_pipe1.
Following the same logic, if there is one reader on the common
trace_pipe, you cannot at the same time have another reader on
trace_pipe0 or on trace_pipe1, because trace_pipe is in essence
already a consumer of all cpu buffers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
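The "one consumer per cpu" rule described above can be sketched with a simple bitmask. This is a hypothetical userspace illustration, not the patch's implementation: sk_busy_mask, sk_pipe_open(), sk_pipe_release(), SK_NR_CPUS and SK_ALL_CPUS are invented names, and a real implementation would need to update the mask under a lock.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_NR_CPUS	4
#define SK_ALL_CPUS	(-1)	/* hypothetical marker for the common trace_pipe */

static unsigned int sk_busy_mask;	/* bit n set: cpu n already has a consumer */

/* Translate a reader target into the set of cpu buffers it would consume. */
static unsigned int sk_mask_for(int cpu)
{
	assert(cpu == SK_ALL_CPUS || (cpu >= 0 && cpu < SK_NR_CPUS));
	return cpu == SK_ALL_CPUS ? (1u << SK_NR_CPUS) - 1 : 1u << cpu;
}

/* Claim the buffers for a reader; false if any of them is already claimed. */
static bool sk_pipe_open(int cpu)
{
	unsigned int want = sk_mask_for(cpu);

	if (sk_busy_mask & want)
		return false;
	sk_busy_mask |= want;
	return true;
}

/* Give the buffers back when the reader closes the file. */
static void sk_pipe_release(int cpu)
{
	sk_busy_mask &= ~sk_mask_for(cpu);
}
```

Readers of trace_pipe0 and trace_pipe1 claim disjoint bits and can coexist, while the common trace_pipe claims every bit and therefore conflicts with any per-cpu reader, and vice versa.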
2009-02-25 03:22:28 +01:00
long cpu_file = (long) inode->i_private;
2008-05-12 21:20:42 +02:00
struct trace_iterator *iter;
2012-04-25 10:23:39 +02:00
int cpu;
2008-05-12 21:20:42 +02:00
2009-02-27 00:12:38 -05:00
if (tracing_disabled)
return ERR_PTR(-ENODEV);
2008-05-12 21:20:44 +02:00
2012-04-25 10:23:39 +02:00
iter = __seq_open_private(file, &tracer_seq_ops, sizeof(*iter));
2009-02-27 00:12:38 -05:00
if (!iter)
return ERR_PTR(-ENOMEM);
2008-05-12 21:20:42 +02:00
2012-06-27 20:46:14 -04:00
iter->buffer_iter = kzalloc(sizeof(*iter->buffer_iter) * num_possible_cpus(),
GFP_KERNEL);
2012-07-11 09:35:08 +03:00
if (!iter->buffer_iter)
goto release;
tracing/core: make the read callbacks reentrants
Now that several per-cpu files can be read or spliced at the
same time, we want the read/splice callbacks for tracing files to be
reentrant.
Until now, a single global mutex (trace_types_lock) serialized
the access to tracing_read_pipe(), tracing_splice_read_pipe(),
and the seq helpers.
I.e., if a user tries to read trace_pipe0 and
trace_pipe1 at the same time, access to
tracing_read_pipe() is contended and one reader must wait for
the other to finish its read call.
The trace_types_lock mutex is mostly here to serialize access
to the global current tracer (current_trace), which can be
changed concurrently. Although the iter struct keeps a private
pointer to this tracer, its callbacks can be changed by another
function.
The method used here is to no longer keep a private reference to
the tracer inside the iterator but to store a copy of it inside
the iterator. Subsequent read calls then check whether the
tracer has changed. This is not costly because the current
tracer is not expected to change often, so we use a branch
prediction hint for that.
Moreover, we add a private mutex to the iterator (there is one
iterator per file descriptor) to serialize accesses in case
of multiple consumers per file descriptor (which would be a
silly idea from the user). Note that this is not to protect the
ring buffer, since the ring buffer already serializes
reader accesses. It is to prevent weird traces in
case of concurrent consumers. These mutexes could be dropped
anyway; that would not result in any crash. Just tell me what
you think about it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
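The copy-then-recheck idea from the commit message above might look roughly like this in a userspace sketch. Everything here is an assumption made for illustration: the sk_* names are hypothetical, the change check compares tracer names rather than whatever the kernel actually compares, locking is omitted, and __builtin_expect is a GCC/Clang builtin used here as the branch prediction hint.

```c
#include <assert.h>
#include <string.h>

struct sk_tracer {
	const char *name;
	void (*print_header)(void);
};

struct sk_reader {
	struct sk_tracer trace;	/* private copy, not a pointer to the global */
};

static struct sk_tracer sk_nop = { .name = "nop" };
static struct sk_tracer sk_function = { .name = "function" };
static struct sk_tracer *sk_current = &sk_nop;	/* global current tracer */

/* On open, snapshot the current tracer into the reader. */
static void sk_open(struct sk_reader *r)
{
	assert(sk_current != 0);
	r->trace = *sk_current;
}

/*
 * On each read, refresh the snapshot if the global tracer changed.
 * Returns 1 if the copy had to be refreshed, 0 otherwise.
 */
static int sk_read(struct sk_reader *r)
{
	/* Tracer switches are rare, so hint that the branch is unlikely. */
	if (__builtin_expect(strcmp(r->trace.name, sk_current->name) != 0, 0)) {
		r->trace = *sk_current;
		return 1;
	}
	return 0;
}
```

Because each reader works on its own copy, another thread changing the global tracer cannot swap callbacks out from under a read that is already in progress; the reader only picks up the new tracer at its next check.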
2009-02-25 06:13:16 +01:00
/*
* We make a copy of the current tracer to avoid concurrent
* changes on it while we are reading.
*/
2008-05-12 21:20:42 +02:00
mutex_lock(&trace_types_lock);
2009-02-25 06:13:16 +01:00
iter->trace = kzalloc(sizeof(*iter->trace), GFP_KERNEL);
2009-02-27 00:12:38 -05:00
if (!iter->trace)
2009-02-25 06:13:16 +01:00
goto fail;
2009-02-27 00:12:38 -05:00
2009-02-25 06:13:16 +01:00
if (current_trace)
*iter->trace = *current_trace;
2009-06-15 14:58:26 +08:00
if (!zalloc_cpumask_var(&iter->started, GFP_KERNEL))
2009-04-01 22:53:08 +02:00
goto fail;
2008-05-12 21:20:42 +02:00
if (current_trace && current_trace->print_max)
iter->tr = &max_tr;
else
2009-02-25 03:22:28 +01:00
iter->tr = &global_trace;
2008-05-12 21:20:42 +02:00
iter->pos = -1;
2009-02-25 06:13:16 +01:00
mutex_init(&iter->mutex);
2009-02-25 03:22:28 +01:00
iter->cpu_file = cpu_file;
2008-05-12 21:20:42 +02:00
2008-11-25 09:12:31 +01:00
/* Notify the tracer early; before we stop tracing. */
if (iter->trace && iter->trace->open)
2008-12-11 13:53:26 +01:00
iter->trace->open(iter);
2008-11-25 09:12:31 +01:00
2008-11-12 17:52:38 -05:00
/* Annotate start of buffers if we had overruns */
if (ring_buffer_overruns(iter->tr->buffer))
iter->iter_flags |= TRACE_FILE_ANNOTATE;
2012-11-13 12:18:22 -08:00
/* Output in nanoseconds only if we are using a clock in nanoseconds. */
if (trace_clocks[trace_clock_id].in_ns)
iter->iter_flags |= TRACE_FILE_TIME_IN_NS;
2009-09-01 11:06:29 -04:00
/* stop the trace while dumping */
tracing_stop();
2009-02-25 03:22:28 +01:00
if (iter->cpu_file == TRACE_PIPE_ALL_CPU) {
for_each_tracing_cpu(cpu) {
iter->buffer_iter[cpu] =
2010-04-20 15:47:11 -07:00
ring_buffer_read_prepare(iter->tr->buffer, cpu);
}
ring_buffer_read_prepare_sync();
for_each_tracing_cpu(cpu) {
ring_buffer_read_start(iter->buffer_iter[cpu]);
2009-09-01 11:06:29 -04:00
tracing_iter_reset(iter, cpu);
2009-02-25 03:22:28 +01:00
}
} else {
cpu = iter->cpu_file;
2008-09-29 23:02:41 -04:00
iter->buffer_iter[cpu] =
2010-04-20 15:47:11 -07:00
ring_buffer_read_prepare(iter->tr->buffer, cpu);
ring_buffer_read_prepare_sync();
ring_buffer_read_start(iter->buffer_iter[cpu]);
2009-09-01 11:06:29 -04:00
tracing_iter_reset(iter, cpu);
2008-09-29 23:02:41 -04:00
}
2008-05-12 21:20:42 +02:00
mutex_unlock(&trace_types_lock);
return iter;
2008-09-29 23:02:41 -04:00
2009-02-25 06:13:16 +01:00
fail:
2008-09-29 23:02:41 -04:00
mutex_unlock(&trace_types_lock);
2009-02-25 06:13:16 +01:00
kfree(iter->trace);
2012-06-27 20:46:14 -04:00
kfree(iter->buffer_iter);
2012-07-11 09:35:08 +03:00
release:
2012-04-25 10:23:39 +02:00
seq_release_private(inode, file);
return ERR_PTR(-ENOMEM);
}
int tracing_open_generic(struct inode *inode, struct file *filp)
{
2008-05-12 21:20:44 +02:00
if (tracing_disabled)
return -ENODEV;
2008-05-12 21:20:42 +02:00
filp->private_data = inode->i_private;
return 0;
}
2009-02-10 19:44:12 +01:00
static int tracing_release(struct inode *inode, struct file *file)
2008-05-12 21:20:42 +02:00
{
2010-09-27 19:04:53 -07:00
struct seq_file *m = file->private_data;
2009-03-18 10:40:24 -04:00
struct trace_iterator *iter;
2008-09-29 23:02:41 -04:00
int cpu;
2008-05-12 21:20:42 +02:00
2009-03-18 10:40:24 -04:00
if (!(file->f_mode & FMODE_READ))
return 0;
iter = m->private;
2008-05-12 21:20:42 +02:00
mutex_lock(&trace_types_lock);
2008-09-29 23:02:41 -04:00
for_each_tracing_cpu(cpu) {
if (iter->buffer_iter[cpu])
ring_buffer_read_finish(iter->buffer_iter[cpu]);
}
2008-05-12 21:20:42 +02:00
if (iter->trace && iter->trace->close)
iter->trace->close(iter);
/* reenable tracing if it was previously enabled */
ftrace: restructure tracing start/stop infrastructure
Impact: change where tracing is started up and stopped
Currently, when a new tracer is selected via echo'ing a tracer name into
the current_tracer file, the startup is only done if tracing_enabled is
set to one. If tracing_enabled is changed to zero (by echo'ing 0 into
the tracing_enabled file) a full shutdown is performed.
The full startup and shutdown of a tracer can be expensive and the
user can lose traces when echo'ing 0 into the tracing_enabled file,
because the process takes too long. There can also be places where
the user would like to start and stop the tracer several times, and
doing the full startup and shutdown of a tracer might be too expensive.
This patch performs the full startup and shutdown when a tracer is
selected. It also adds a way to do a quick start or stop of a tracer.
The quick version is just a flag that prevents the tracing from
taking place, but the overhead of the code is still there.
For example, the startup of a tracer may enable tracepoints, or enable
the function tracer. The stop and start will just set a flag to
have the tracer ignore the calls when the tracepoint or function trace
is called. The overhead of the tracer may still be present when
the tracer is stopped, but no tracing will occur. Setting the tracer
to the 'nop' tracer (or any other tracer) will perform the shutdown
of the tracer which will disable the tracepoint or disable the
function tracer.
The tracing_enabled file will simply start or stop tracing.
This change is all internal. The end result for the user should be the same
as before. If tracing_enabled is not set, no trace will happen.
If tracing_enabled is set, then the trace will happen. The tracing_enabled
variable is static between tracers. Enabling tracing_enabled and
going to another tracer will keep tracing_enabled enabled. Same
is true with disabling tracing_enabled.
This patch will now provide a fast start/stop method to the users
for enabling or disabling tracing.
Note: There were two methods to the struct tracer that were never
used: The methods start and stop. These were to be used as a hook
to the reading of the trace output, but ended up not being
necessary. These two methods are now used to enable the start
and stop of each tracer, in case the tracer needs to do more than
just not write into the buffer. For example, the irqsoff tracer
must stop recording max latencies when tracing is stopped.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-05 16:05:44 -05:00
tracing_start ( ) ;
2008-05-12 21:20:42 +02:00
mutex_unlock ( & trace_types_lock ) ;
tracing/core: make the read callbacks reentrants
Now that several per-cpu files can be read or spliced at the
same time, we want the read/splice callbacks for tracing files to be
reentrant.
Until now, a single global mutex (trace_types_lock) serialized
the access to tracing_read_pipe(), tracing_splice_read_pipe(),
and the seq helpers.
I.e., if a user tries to read trace_pipe0 and
trace_pipe1 at the same time, the access to the function
tracing_read_pipe() is contended and one reader must wait for
the other to finish its read call.
The trace_types_lock mutex is mostly here to serialize the access
to the global current tracer (current_trace), which can be
changed concurrently. Although the iter struct keeps a private
pointer to this tracer, its callbacks can be changed by another
function.
The method used here is to no longer keep a private reference to
the tracer inside the iterator but to make a copy of it inside
the iterator. Then it checks on subsequent read calls whether the
tracer has changed. This is not costly because the current
tracer is not expected to be changed often, so we use a branch
prediction hint for that.
Moreover, we add a private mutex to the iterator (there is one
iterator per file descriptor) to serialize the accesses in case
of multiple consumers per file descriptor (which would be a
silly idea from the user). Note that this is not to protect the
ring buffer, since the ring buffer already serializes the
readers' accesses. This is to prevent weird traces in
case of concurrent consumers. But these mutexes can be dropped
anyway, that would not result in any crash. Just tell me what
you think about it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 06:13:16 +01:00
mutex_destroy ( & iter - > mutex ) ;
2009-04-01 22:53:08 +02:00
free_cpumask_var ( iter - > started ) ;
2009-02-25 06:13:16 +01:00
kfree ( iter - > trace ) ;
2012-06-27 20:46:14 -04:00
kfree ( iter - > buffer_iter ) ;
2012-04-25 10:23:39 +02:00
seq_release_private ( inode , file ) ;
2008-05-12 21:20:42 +02:00
return 0 ;
}
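tracing_release() above tears down two per-iterator resources: the private mutex and the private copy of the tracer. The following is a minimal userspace sketch of how such a copy might be refreshed at the start of a read, assuming POSIX threads in place of kernel mutexes. Every `sketch_*` name is hypothetical and not part of the kernel; the real read path is not shown in this section.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical stand-in for struct tracer: only a name is modelled. */
struct sketch_tracer {
	char name[32];
};

/* Hypothetical stand-in for struct trace_iterator. */
struct sketch_iter {
	struct sketch_tracer copy;        /* private copy, like iter->trace */
	const struct sketch_tracer *seen; /* tracer the copy was taken from */
	pthread_mutex_t mutex;            /* serializes readers of this iterator */
};

/* These play the role of trace_types_lock and current_trace. */
static pthread_mutex_t sketch_types_lock = PTHREAD_MUTEX_INITIALIZER;
static const struct sketch_tracer *sketch_current;

static void sketch_set_current(const struct sketch_tracer *t)
{
	pthread_mutex_lock(&sketch_types_lock);
	sketch_current = t;
	pthread_mutex_unlock(&sketch_types_lock);
}

static void sketch_iter_init(struct sketch_iter *it)
{
	memset(it, 0, sizeof(*it));
	pthread_mutex_init(&it->mutex, NULL);
}

/*
 * Called at the start of each read: hold the global lock only long enough
 * to notice a tracer switch and refresh the private copy.
 * Returns 1 if the copy was refreshed, 0 otherwise.
 */
static int sketch_iter_refresh(struct sketch_iter *it)
{
	int refreshed = 0;

	assert(it != NULL);
	pthread_mutex_lock(&it->mutex);
	pthread_mutex_lock(&sketch_types_lock);
	if (sketch_current && sketch_current != it->seen) {
		it->seen = sketch_current;
		it->copy = *sketch_current;
		refreshed = 1;
	}
	pthread_mutex_unlock(&sketch_types_lock);
	pthread_mutex_unlock(&it->mutex);
	return refreshed;
}

/* Mirror of the teardown: the mutex goes away with the iterator. */
static void sketch_iter_release(struct sketch_iter *it)
{
	pthread_mutex_destroy(&it->mutex);
}
```

The global lock is held only for the comparison and the copy, so two readers on different iterators contend briefly instead of for the whole read.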
static int tracing_open ( struct inode * inode , struct file * file )
{
2009-02-27 00:12:38 -05:00
struct trace_iterator * iter ;
int ret = 0 ;
2008-05-12 21:20:42 +02:00
2009-03-18 10:40:24 -04:00
/* If this file was open for write, then erase contents */
if ( ( file - > f_mode & FMODE_WRITE ) & &
2009-07-22 23:29:30 -04:00
( file - > f_flags & O_TRUNC ) ) {
2009-03-18 10:40:24 -04:00
long cpu = ( long ) inode - > i_private ;
2008-05-12 21:20:42 +02:00
2009-03-18 10:40:24 -04:00
if ( cpu = = TRACE_PIPE_ALL_CPU )
tracing_reset_online_cpus ( & global_trace ) ;
else
tracing_reset ( & global_trace , cpu ) ;
}
2008-05-12 21:20:42 +02:00
2009-03-18 10:40:24 -04:00
if ( file - > f_mode & FMODE_READ ) {
iter = __tracing_open ( inode , file ) ;
if ( IS_ERR ( iter ) )
ret = PTR_ERR ( iter ) ;
else if ( trace_flags & TRACE_ITER_LATENCY_FMT )
iter - > iter_flags | = TRACE_FILE_LAT_FMT ;
}
2008-05-12 21:20:42 +02:00
return ret ;
}
2008-05-12 21:20:51 +02:00
static void *
2008-05-12 21:20:42 +02:00
t_next ( struct seq_file * m , void * v , loff_t * pos )
{
2009-06-24 09:53:44 +08:00
struct tracer * t = v ;
2008-05-12 21:20:42 +02:00
( * pos ) + + ;
if ( t )
t = t - > next ;
return t ;
}
static void * t_start ( struct seq_file * m , loff_t * pos )
{
2009-06-24 09:53:44 +08:00
struct tracer * t ;
2008-05-12 21:20:42 +02:00
loff_t l = 0 ;
mutex_lock ( & trace_types_lock ) ;
2009-06-24 09:53:44 +08:00
for ( t = trace_types ; t & & l < * pos ; t = t_next ( m , t , & l ) )
2008-05-12 21:20:42 +02:00
;
return t ;
}
static void t_stop ( struct seq_file * m , void * p )
{
mutex_unlock ( & trace_types_lock ) ;
}
static int t_show ( struct seq_file * m , void * v )
{
struct tracer * t = v ;
if ( ! t )
return 0 ;
seq_printf ( m , " %s " , t - > name ) ;
if ( t - > next )
seq_putc ( m , ' ' ) ;
else
seq_putc ( m , ' \n ' ) ;
return 0 ;
}
2009-09-22 16:43:43 -07:00
static const struct seq_operations show_traces_seq_ops = {
2008-05-12 21:20:46 +02:00
. start = t_start ,
. next = t_next ,
. stop = t_stop ,
. show = t_show ,
2008-05-12 21:20:42 +02:00
} ;
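The four callbacks registered in show_traces_seq_ops follow the seq_file iteration protocol: start positions the cursor, show emits one record, next advances, and stop ends the pass. The sketch below imitates that call order in plain userspace C so the control flow can be followed in isolation. It assumes a simplified driver loop, omits the locking done by t_start() and t_stop(), and uses hypothetical `sketch_*` names throughout.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct tracer: a singly linked list of names. */
struct sketch_tracer {
	const char *name;
	struct sketch_tracer *next;
};

/* Advance to the next list element and bump the position. */
static struct sketch_tracer *sketch_next(struct sketch_tracer *t, long *pos)
{
	(*pos)++;
	return t ? t->next : NULL;
}

/* Walk from the head to the element at index *pos (NULL if past the end). */
static struct sketch_tracer *sketch_start(struct sketch_tracer *head, long *pos)
{
	struct sketch_tracer *t = head;
	long l = 0;

	while (t && l < *pos)
		t = sketch_next(t, &l);
	return t;
}

/* Append one name, followed by a space or, for the last entry, a newline. */
static void sketch_show(const struct sketch_tracer *t, char *out, size_t len)
{
	size_t used = strlen(out);

	assert(used < len);
	snprintf(out + used, len - used, "%s%c", t->name, t->next ? ' ' : '\n');
}

/* Drive the callbacks in order: start, then show/next until NULL. */
static void sketch_read_all(struct sketch_tracer *head, char *out, size_t len)
{
	long pos = 0;
	struct sketch_tracer *t;

	assert(len > 0);
	out[0] = '\0';
	for (t = sketch_start(head, &pos); t; t = sketch_next(t, &pos))
		sketch_show(t, out, len);
}
```

Because start re-walks the list from the head up to the saved position, a read can resume after a partial pass without keeping a pointer into the list between calls.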
static int show_traces_open ( struct inode * inode , struct file * file )
{
2008-05-12 21:20:44 +02:00
if ( tracing_disabled )
return - ENODEV ;
2009-06-24 09:53:44 +08:00
return seq_open ( file , & show_traces_seq_ops ) ;
2008-05-12 21:20:42 +02:00
}
2009-03-18 10:40:24 -04:00
static ssize_t
tracing_write_stub ( struct file * filp , const char __user * ubuf ,
size_t count , loff_t * ppos )
{
return count ;
}
2010-11-24 15:13:16 -08:00
static loff_t tracing_seek ( struct file * file , loff_t offset , int origin )
{
if ( file - > f_mode & FMODE_READ )
return seq_lseek ( file , offset , origin ) ;
else
return 0 ;
}
2009-03-05 21:44:55 -05:00
static const struct file_operations tracing_fops = {
2008-05-12 21:20:46 +02:00
. open = tracing_open ,
. read = seq_read ,
2009-03-18 10:40:24 -04:00
. write = tracing_write_stub ,
2010-11-24 15:13:16 -08:00
. llseek = tracing_seek ,
2008-05-12 21:20:46 +02:00
. release = tracing_release ,
2008-05-12 21:20:42 +02:00
} ;
2009-03-05 21:44:55 -05:00
static const struct file_operations show_traces_fops = {
2008-05-12 21:20:52 +02:00
. open = show_traces_open ,
. read = seq_read ,
. release = seq_release ,
2010-07-07 23:40:11 +02:00
. llseek = seq_lseek ,
2008-05-12 21:20:52 +02:00
} ;
2008-05-12 21:20:52 +02:00
/*
* Only trace on a CPU if the bitmask is set :
*/
2009-01-01 10:12:22 +10:30
static cpumask_var_t tracing_cpumask ;
2008-05-12 21:20:52 +02:00
/*
* The tracer itself will not take this lock , but still we want
* to provide a consistent cpumask to user - space :
*/
static DEFINE_MUTEX ( tracing_cpumask_update_lock ) ;
/*
* Temporary storage for the character representation of the
* CPU bitmask ( and one more byte for the newline ) :
*/
static char mask_str [ NR_CPUS + 1 ] ;
2008-05-12 21:20:52 +02:00
static ssize_t
tracing_cpumask_read ( struct file * filp , char __user * ubuf ,
size_t count , loff_t * ppos )
{
2008-05-12 21:20:52 +02:00
int len ;
2008-05-12 21:20:52 +02:00
mutex_lock ( & tracing_cpumask_update_lock ) ;
2008-05-12 21:20:52 +02:00
2009-01-01 10:12:22 +10:30
len = cpumask_scnprintf ( mask_str , count , tracing_cpumask ) ;
2008-05-12 21:20:52 +02:00
if ( count - len < 2 ) {
count = - EINVAL ;
goto out_err ;
}
len + = sprintf ( mask_str + len , " \n " ) ;
count = simple_read_from_buffer ( ubuf , count , ppos , mask_str , NR_CPUS + 1 ) ;
out_err :
2008-05-12 21:20:52 +02:00
mutex_unlock ( & tracing_cpumask_update_lock ) ;
return count ;
}
static ssize_t
tracing_cpumask_write ( struct file * filp , const char __user * ubuf ,
size_t count , loff_t * ppos )
{
2008-05-12 21:20:52 +02:00
int err , cpu ;
2009-01-01 10:12:22 +10:30
cpumask_var_t tracing_cpumask_new ;
if ( ! alloc_cpumask_var ( & tracing_cpumask_new , GFP_KERNEL ) )
return - ENOMEM ;
2008-05-12 21:20:52 +02:00
2009-01-01 10:12:22 +10:30
err = cpumask_parse_user ( ubuf , count , tracing_cpumask_new ) ;
2008-05-12 21:20:52 +02:00
if ( err )
2008-05-12 21:20:52 +02:00
goto err_unlock ;
2009-06-15 10:56:42 +08:00
mutex_lock ( & tracing_cpumask_update_lock ) ;
2008-12-02 15:34:05 -05:00
local_irq_disable ( ) ;
2009-12-02 20:01:25 +01:00
arch_spin_lock ( & ftrace_max_lock ) ;
2008-05-12 21:21:00 +02:00
for_each_tracing_cpu ( cpu ) {
2008-05-12 21:20:52 +02:00
/*
* Increase / decrease the disabled counter if we are
* about to flip a bit in the cpumask :
*/
2009-01-01 10:12:22 +10:30
if ( cpumask_test_cpu ( cpu , tracing_cpumask ) & &
! cpumask_test_cpu ( cpu , tracing_cpumask_new ) ) {
2008-05-12 21:20:52 +02:00
atomic_inc ( & global_trace . data [ cpu ] - > disabled ) ;
2012-05-03 18:59:52 -07:00
ring_buffer_record_disable_cpu ( global_trace . buffer , cpu ) ;
2008-05-12 21:20:52 +02:00
}
2009-01-01 10:12:22 +10:30
if ( ! cpumask_test_cpu ( cpu , tracing_cpumask ) & &
cpumask_test_cpu ( cpu , tracing_cpumask_new ) ) {
2008-05-12 21:20:52 +02:00
atomic_dec ( & global_trace . data [ cpu ] - > disabled ) ;
2012-05-03 18:59:52 -07:00
ring_buffer_record_enable_cpu ( global_trace . buffer , cpu ) ;
2008-05-12 21:20:52 +02:00
}
}
2009-12-02 20:01:25 +01:00
arch_spin_unlock ( & ftrace_max_lock ) ;
2008-12-02 15:34:05 -05:00
local_irq_enable ( ) ;
2008-05-12 21:20:52 +02:00
2009-01-01 10:12:22 +10:30
cpumask_copy ( tracing_cpumask , tracing_cpumask_new ) ;
2008-05-12 21:20:52 +02:00
mutex_unlock ( & tracing_cpumask_update_lock ) ;
2009-01-01 10:12:22 +10:30
free_cpumask_var ( tracing_cpumask_new ) ;
2008-05-12 21:20:52 +02:00
return count ;
2008-05-12 21:20:52 +02:00
err_unlock :
2009-06-15 10:56:42 +08:00
free_cpumask_var ( tracing_cpumask_new ) ;
2008-05-12 21:20:52 +02:00
return err ;
2008-05-12 21:20:52 +02:00
}
2009-03-05 21:44:55 -05:00
static const struct file_operations tracing_cpumask_fops = {
2008-05-12 21:20:52 +02:00
. open = tracing_open_generic ,
. read = tracing_cpumask_read ,
. write = tracing_cpumask_write ,
2010-07-07 23:40:11 +02:00
. llseek = generic_file_llseek ,
2008-05-12 21:20:42 +02:00
} ;
2009-12-08 11:15:59 +08:00
static int tracing_trace_options_show ( struct seq_file * m , void * v )
2008-05-12 21:20:42 +02:00
{
2009-02-26 23:55:58 -05:00
struct tracer_opt * trace_opts ;
u32 tracer_flags ;
int i ;
2008-11-17 19:23:42 +01:00
2009-02-26 23:55:58 -05:00
mutex_lock ( & trace_types_lock ) ;
tracer_flags = current_trace - > flags - > val ;
trace_opts = current_trace - > flags - > opts ;
2008-05-12 21:20:42 +02:00
for ( i = 0 ; trace_options [ i ] ; i + + ) {
if ( trace_flags & ( 1 < < i ) )
2009-12-08 11:15:59 +08:00
seq_printf ( m , " %s \n " , trace_options [ i ] ) ;
2008-05-12 21:20:42 +02:00
else
2009-12-08 11:15:59 +08:00
seq_printf ( m , " no%s \n " , trace_options [ i ] ) ;
2008-05-12 21:20:42 +02:00
}
2008-11-17 19:23:42 +01:00
for ( i = 0 ; trace_opts [ i ] . name ; i + + ) {
if ( tracer_flags & trace_opts [ i ] . bit )
2009-12-08 11:15:59 +08:00
seq_printf ( m , " %s \n " , trace_opts [ i ] . name ) ;
2008-11-17 19:23:42 +01:00
else
2009-12-08 11:15:59 +08:00
seq_printf ( m , " no%s \n " , trace_opts [ i ] . name ) ;
2008-11-17 19:23:42 +01:00
}
2009-02-26 23:55:58 -05:00
mutex_unlock ( & trace_types_lock ) ;
2008-11-17 19:23:42 +01:00
2009-12-08 11:15:59 +08:00
return 0 ;
2008-05-12 21:20:42 +02:00
}
2009-12-08 11:17:06 +08:00
static int __set_tracer_option ( struct tracer * trace ,
struct tracer_flags * tracer_flags ,
struct tracer_opt * opts , int neg )
{
int ret ;
2008-05-12 21:20:42 +02:00
2009-12-08 11:17:06 +08:00
ret = trace - > set_flag ( tracer_flags - > val , opts - > bit , ! neg ) ;
if ( ret )
return ret ;
if ( neg )
tracer_flags - > val & = ~ opts - > bit ;
else
tracer_flags - > val | = opts - > bit ;
return 0 ;
2008-05-12 21:20:42 +02:00
}
2008-11-17 19:23:42 +01:00
/* Try to assign a tracer specific option */
static int set_tracer_option ( struct tracer * trace , char * cmp , int neg )
{
2009-08-07 18:53:21 +08:00
struct tracer_flags * tracer_flags = trace - > flags ;
2008-11-17 19:23:42 +01:00
struct tracer_opt * opts = NULL ;
2009-12-08 11:17:06 +08:00
int i ;
2008-11-17 19:23:42 +01:00
2009-08-07 18:53:21 +08:00
for ( i = 0 ; tracer_flags - > opts [ i ] . name ; i + + ) {
opts = & tracer_flags - > opts [ i ] ;
2008-11-17 19:23:42 +01:00
2009-12-08 11:17:06 +08:00
if ( strcmp ( cmp , opts - > name ) = = 0 )
return __set_tracer_option ( trace , trace - > flags ,
opts , neg ) ;
2008-11-17 19:23:42 +01:00
}
2009-12-08 11:17:06 +08:00
return - EINVAL ;
2008-11-17 19:23:42 +01:00
}
2009-03-17 18:09:55 -04:00
static void set_tracer_flags ( unsigned int mask , int enabled )
{
/* do nothing if flag is already set */
if ( ! ! ( trace_flags & mask ) = = ! ! enabled )
return ;
if ( enabled )
trace_flags | = mask ;
else
trace_flags & = ~ mask ;
2010-07-02 11:07:32 +08:00
if ( mask = = TRACE_ITER_RECORD_CMD )
trace_event_enable_cmd_record ( enabled ) ;
2010-12-08 13:46:47 -08:00
if ( mask = = TRACE_ITER_OVERWRITE )
ring_buffer_change_overwrite ( global_trace . buffer , enabled ) ;
2012-10-11 10:15:05 -04:00
if ( mask = = TRACE_ITER_PRINTK )
trace_printk_start_stop_comm ( enabled ) ;
2009-03-17 18:09:55 -04:00
}
2012-11-01 22:56:07 -04:00
static int trace_set_options ( char * option )
2008-05-12 21:20:42 +02:00
{
2009-12-08 11:17:06 +08:00
char * cmp ;
2008-05-12 21:20:42 +02:00
int neg = 0 ;
2012-11-01 22:56:07 -04:00
int ret = 0 ;
2008-05-12 21:20:42 +02:00
int i ;
2012-11-01 22:56:07 -04:00
cmp = strstrip ( option ) ;
2008-05-12 21:20:42 +02:00
2009-12-08 11:17:06 +08:00
if ( strncmp ( cmp , " no " , 2 ) = = 0 ) {
2008-05-12 21:20:42 +02:00
neg = 1 ;
cmp + = 2 ;
}
for ( i = 0 ; trace_options [ i ] ; i + + ) {
2009-12-08 11:17:06 +08:00
if ( strcmp ( cmp , trace_options [ i ] ) = = 0 ) {
2009-03-17 18:09:55 -04:00
set_tracer_flags ( 1 < < i , ! neg ) ;
2008-05-12 21:20:42 +02:00
break ;
}
}
2008-11-17 19:23:42 +01:00
/* If no option could be set, test the specific tracer options */
if ( ! trace_options [ i ] ) {
2009-02-26 23:55:58 -05:00
mutex_lock ( & trace_types_lock ) ;
2008-11-17 19:23:42 +01:00
ret = set_tracer_option ( current_trace , cmp , neg ) ;
2009-02-26 23:55:58 -05:00
mutex_unlock ( & trace_types_lock ) ;
2008-11-17 19:23:42 +01:00
}
2008-05-12 21:20:42 +02:00
2012-11-01 22:56:07 -04:00
return ret ;
}
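trace_set_options() treats a leading "no" as a request to clear the named flag, then looks the remaining name up in a NULL-terminated table where entry i maps to bit i. The sketch below reproduces just that lookup in userspace C. It leaves out the whitespace stripping and the fallback to tracer-specific options, and the table contents and `sketch_*` names are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical option table; bit i of the flags word matches entry i. */
static const char *const sketch_options[] = {
	"print-parent", "sym-offset", "verbose", NULL
};

/*
 * Set or clear the flag named by opt. A leading "no" clears the flag.
 * Returns 0 on success, -1 if the name is unknown.
 */
static int sketch_set_option(unsigned int *flags, const char *opt)
{
	int neg = 0;
	int i;

	assert(flags != NULL && opt != NULL);
	if (strncmp(opt, "no", 2) == 0) {
		neg = 1;
		opt += 2;
	}
	for (i = 0; sketch_options[i]; i++) {
		if (strcmp(opt, sketch_options[i]) == 0) {
			if (neg)
				*flags &= ~(1u << i);
			else
				*flags |= 1u << i;
			return 0;
		}
	}
	return -1;
}
```

One consequence of stripping the prefix unconditionally is that, in this sketch, an option whose own name begins with "no" could never be matched, so the table must avoid such names.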
static ssize_t
tracing_trace_options_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
char buf [ 64 ] ;
if ( cnt > = sizeof ( buf ) )
return - EINVAL ;
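if ( copy_from_user ( & buf , ubuf , cnt ) )
return - EFAULT ;
/* copy_from_user() does not NUL-terminate; trace_set_options() expects a C string */
buf [ cnt ] = 0 ;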
trace_set_options ( buf ) ;
2009-10-23 19:36:16 -04:00
* ppos + = cnt ;
2008-05-12 21:20:42 +02:00
return cnt ;
}
2009-12-08 11:15:59 +08:00
static int tracing_trace_options_open ( struct inode * inode , struct file * file )
{
if ( tracing_disabled )
return - ENODEV ;
return single_open ( file , tracing_trace_options_show , NULL ) ;
}
2009-03-05 21:44:55 -05:00
static const struct file_operations tracing_iter_fops = {
2009-12-08 11:15:59 +08:00
. open = tracing_trace_options_open ,
. read = seq_read ,
. llseek = seq_lseek ,
. release = single_release ,
2008-11-12 17:52:37 -05:00
. write = tracing_trace_options_write ,
2008-05-12 21:20:42 +02:00
} ;
2008-05-12 21:20:45 +02:00
static const char readme_msg [ ] =
" tracing mini-HOWTO: \n \n "
2009-06-02 15:01:37 +09:00
" # mount -t debugfs nodev /sys/kernel/debug \n \n "
" # cat /sys/kernel/debug/tracing/available_tracers \n "
2012-02-08 19:05:36 +09:00
" wakeup wakeup_rt preemptirqsoff preemptoff irqsoff function nop \n \n "
2009-06-02 15:01:37 +09:00
" # cat /sys/kernel/debug/tracing/current_tracer \n "
2009-03-23 11:58:31 +05:30
" nop \n "
2012-02-08 19:05:36 +09:00
" # echo wakeup > /sys/kernel/debug/tracing/current_tracer \n "
2009-06-02 15:01:37 +09:00
" # cat /sys/kernel/debug/tracing/current_tracer \n "
2012-02-08 19:05:36 +09:00
" wakeup \n "
2009-06-02 15:01:37 +09:00
" # cat /sys/kernel/debug/tracing/trace_options \n "
2008-05-12 21:20:45 +02:00
" noprint-parent nosym-offset nosym-addr noverbose \n "
2009-06-02 15:01:37 +09:00
" # echo print-parent > /sys/kernel/debug/tracing/trace_options \n "
2011-08-12 14:30:22 +09:00
" # echo 1 > /sys/kernel/debug/tracing/tracing_on \n "
2009-06-02 15:01:37 +09:00
" # cat /sys/kernel/debug/tracing/trace > /tmp/trace.txt \n "
2011-08-12 14:30:22 +09:00
" # echo 0 > /sys/kernel/debug/tracing/tracing_on \n "
2008-05-12 21:20:45 +02:00
;
static ssize_t
tracing_readme_read ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
return simple_read_from_buffer ( ubuf , cnt , ppos ,
readme_msg , strlen ( readme_msg ) ) ;
}
2009-03-05 21:44:55 -05:00
static const struct file_operations tracing_readme_fops = {
2008-05-12 21:20:52 +02:00
. open = tracing_open_generic ,
. read = tracing_readme_read ,
2010-07-07 23:40:11 +02:00
. llseek = generic_file_llseek ,
2008-05-12 21:20:45 +02:00
} ;
2009-04-10 16:04:48 -04:00
static ssize_t
tracing_saved_cmdlines_read ( struct file * file , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
char * buf_comm ;
char * file_buf ;
char * buf ;
int len = 0 ;
int pid ;
int i ;
file_buf = kmalloc ( SAVED_CMDLINES * ( 16 + TASK_COMM_LEN ) , GFP_KERNEL ) ;
if ( ! file_buf )
return - ENOMEM ;
buf_comm = kmalloc ( TASK_COMM_LEN , GFP_KERNEL ) ;
if ( ! buf_comm ) {
kfree ( file_buf ) ;
return - ENOMEM ;
}
buf = file_buf ;
for ( i = 0 ; i < SAVED_CMDLINES ; i + + ) {
int r ;
pid = map_cmdline_to_pid [ i ] ;
if ( pid = = - 1 | | pid = = NO_CMDLINE_MAP )
continue ;
trace_find_cmdline ( pid , buf_comm ) ;
r = sprintf ( buf , " %d %s \n " , pid , buf_comm ) ;
buf + = r ;
len + = r ;
}
len = simple_read_from_buffer ( ubuf , cnt , ppos ,
file_buf , len ) ;
kfree ( file_buf ) ;
kfree ( buf_comm ) ;
return len ;
}
static const struct file_operations tracing_saved_cmdlines_fops = {
. open = tracing_open_generic ,
. read = tracing_saved_cmdlines_read ,
2010-07-07 23:40:11 +02:00
. llseek = generic_file_llseek ,
2009-04-10 16:04:48 -04:00
} ;
2008-05-12 21:20:42 +02:00
static ssize_t
tracing_set_trace_read ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
2009-09-18 14:06:47 +08:00
char buf [ MAX_TRACER_SIZE + 2 ] ;
2008-05-12 21:20:42 +02:00
int r ;
mutex_lock ( & trace_types_lock ) ;
if ( current_trace )
r = sprintf ( buf , " %s \n " , current_trace - > name ) ;
else
r = sprintf ( buf , " \n " ) ;
mutex_unlock ( & trace_types_lock ) ;
2008-05-12 21:20:46 +02:00
return simple_read_from_buffer ( ubuf , cnt , ppos , buf , r ) ;
2008-05-12 21:20:42 +02:00
}
2009-02-05 18:02:00 -02:00
int tracer_init ( struct tracer * t , struct trace_array * tr )
{
tracing_reset_online_cpus ( tr ) ;
return t - > init ( tr ) ;
}
2012-02-02 12:00:41 -08:00
static void set_buffer_entries ( struct trace_array * tr , unsigned long val )
{
int cpu ;
for_each_tracing_cpu ( cpu )
tr - > data [ cpu ] - > entries = val ;
}
2012-10-17 11:56:16 +09:00
/* resize @tr's buffer to the size of @size_tr's entries */
static int resize_buffer_duplicate_size ( struct trace_array * tr ,
struct trace_array * size_tr , int cpu_id )
{
int cpu , ret = 0 ;
if ( cpu_id = = RING_BUFFER_ALL_CPUS ) {
for_each_tracing_cpu ( cpu ) {
ret = ring_buffer_resize ( tr - > buffer ,
size_tr - > data [ cpu ] - > entries , cpu ) ;
if ( ret < 0 )
break ;
tr - > data [ cpu ] - > entries = size_tr - > data [ cpu ] - > entries ;
}
} else {
ret = ring_buffer_resize ( tr - > buffer ,
size_tr - > data [ cpu_id ] - > entries , cpu_id ) ;
if ( ret = = 0 )
tr - > data [ cpu_id ] - > entries =
size_tr - > data [ cpu_id ] - > entries ;
}
return ret ;
}
2012-02-02 12:00:41 -08:00
static int __tracing_resize_ring_buffer ( unsigned long size , int cpu )
2009-03-11 13:42:01 -04:00
{
int ret ;
/*
* If kernel or user changes the size of the ring buffer
2009-03-12 11:21:08 -04:00
* we use the size that was given , and we can forget about
* expanding it later .
2009-03-11 13:42:01 -04:00
*/
ring_buffer_expanded = 1 ;
2012-10-10 21:44:34 -04:00
/* May be called before buffers are initialized */
if ( ! global_trace . buffer )
return 0 ;
2012-02-02 12:00:41 -08:00
ret = ring_buffer_resize ( global_trace . buffer , size , cpu ) ;
2009-03-11 13:42:01 -04:00
if ( ret < 0 )
return ret ;
2010-07-01 14:34:35 +09:00
if ( ! current_trace - > use_max_tr )
goto out ;
2012-02-02 12:00:41 -08:00
ret = ring_buffer_resize ( max_tr . buffer , size , cpu ) ;
2009-03-11 13:42:01 -04:00
if ( ret < 0 ) {
2012-10-17 11:56:16 +09:00
int r = resize_buffer_duplicate_size ( & global_trace ,
& global_trace , cpu ) ;
2009-03-11 13:42:01 -04:00
if ( r < 0 ) {
2009-03-12 11:21:08 -04:00
/*
* AARGH ! We are left with different
* size max buffer ! ! ! !
* The max buffer is our " snapshot " buffer .
* When a tracer needs a snapshot ( one of the
* latency tracers ) , it swaps the max buffer
* with the saved snapshot . We succeeded in
* updating the size of the main buffer , but failed to
* update the size of the max buffer . But when we tried
* to reset the main buffer to the original size , we
* failed there too . This is very unlikely to
* happen , but if it does , warn and kill all
* tracing .
*/
2009-03-11 13:42:01 -04:00
WARN_ON ( 1 ) ;
tracing_disabled = 1 ;
}
return ret ;
}
2012-02-02 12:00:41 -08:00
if ( cpu = = RING_BUFFER_ALL_CPUS )
set_buffer_entries ( & max_tr , size ) ;
else
max_tr . data [ cpu ] - > entries = size ;
2010-07-01 14:34:35 +09:00
out :
2012-02-02 12:00:41 -08:00
if ( cpu = = RING_BUFFER_ALL_CPUS )
set_buffer_entries ( & global_trace , size ) ;
else
global_trace . data [ cpu ] - > entries = size ;
2009-03-11 13:42:01 -04:00
return ret ;
}
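__tracing_resize_ring_buffer() resizes the main buffer first and the max (snapshot) buffer second; if the second step fails it tries to put the main buffer back, and only gives up when that rollback fails as well. The following userspace sketch shows the same two-step resize with rollback under simplified assumptions: a buffer is modelled as a size plus an artificial limit, and the distinct -2 return code is an invention of this sketch, whereas the kernel function warns and sets tracing_disabled. All `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer: only its size and an allocation limit are modelled. */
struct sketch_buf {
	size_t size;
	size_t limit; /* resizes above this fail, standing in for -ENOMEM */
};

static int sketch_resize(struct sketch_buf *b, size_t size)
{
	if (size > b->limit)
		return -1;
	b->size = size;
	return 0;
}

/*
 * Resize the main buffer, then the snapshot buffer. If the second step
 * fails, restore the main buffer so both stay the same size.
 * Returns 0 on success, -1 if the resize failed and was rolled back,
 * -2 if the rollback failed too (the caller should stop using the buffers).
 */
static int sketch_resize_pair(struct sketch_buf *main_buf,
			      struct sketch_buf *snap_buf, size_t size)
{
	size_t old;

	assert(main_buf != NULL && snap_buf != NULL);
	old = main_buf->size;

	if (sketch_resize(main_buf, size) < 0)
		return -1;
	if (sketch_resize(snap_buf, size) < 0) {
		if (sketch_resize(main_buf, old) < 0)
			return -2;
		return -1;
	}
	return 0;
}
```

Keeping both buffers the same size matters because the latency tracers swap them; a swap between buffers of different sizes would silently change how much trace data is retained.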
2012-02-02 12:00:41 -08:00
static ssize_t tracing_resize_ring_buffer ( unsigned long size , int cpu_id )
2011-06-13 17:51:57 -07:00
{
2012-05-03 18:59:50 -07:00
int ret = size ;
2011-06-13 17:51:57 -07:00
mutex_lock ( & trace_types_lock ) ;
2012-02-02 12:00:41 -08:00
if ( cpu_id ! = RING_BUFFER_ALL_CPUS ) {
/* make sure this cpu is enabled in the mask */
if ( ! cpumask_test_cpu ( cpu_id , tracing_buffer_mask ) ) {
ret = - EINVAL ;
goto out ;
}
}
2011-06-13 17:51:57 -07:00
2012-02-02 12:00:41 -08:00
ret = __tracing_resize_ring_buffer ( size , cpu_id ) ;
2011-06-13 17:51:57 -07:00
if ( ret < 0 )
ret = - ENOMEM ;
2012-02-02 12:00:41 -08:00
out :
2011-06-13 17:51:57 -07:00
mutex_unlock ( & trace_types_lock ) ;
return ret ;
}
2010-07-01 14:34:35 +09:00
2009-03-11 14:33:00 -04:00
/**
* tracing_update_buffers - used by tracing facility to expand ring buffers
*
* To save memory when tracing is never used on a system that has it
* configured in , the ring buffers are set to a minimum size . Once
* a user starts to use the tracing facility , they need to grow
* to their default size .
*
* This function is to be called when a tracer is about to be used .
*/
int tracing_update_buffers ( void )
{
int ret = 0 ;
2009-03-12 11:33:20 -04:00
mutex_lock ( & trace_types_lock ) ;
2009-03-11 14:33:00 -04:00
if ( ! ring_buffer_expanded )
2012-02-02 12:00:41 -08:00
ret = __tracing_resize_ring_buffer ( trace_buf_size ,
RING_BUFFER_ALL_CPUS ) ;
2009-03-12 11:33:20 -04:00
mutex_unlock ( & trace_types_lock ) ;
2009-03-11 14:33:00 -04:00
return ret ;
}
2009-02-26 23:43:05 -05:00
struct trace_option_dentry ;
static struct trace_option_dentry *
create_trace_option_files ( struct tracer * tracer ) ;
static void
destroy_trace_option_files ( struct trace_option_dentry * topts ) ;
2009-02-02 21:38:32 -05:00
static int tracing_set_tracer ( const char * buf )
2008-05-12 21:20:42 +02:00
{
2009-02-26 23:43:05 -05:00
static struct trace_option_dentry * topts ;
2008-05-12 21:20:42 +02:00
struct trace_array * tr = & global_trace ;
struct tracer * t ;
2008-11-01 19:57:37 +01:00
int ret = 0 ;
2008-05-12 21:20:42 +02:00
2009-03-12 11:33:20 -04:00
mutex_lock ( & trace_types_lock ) ;
2009-03-11 13:42:01 -04:00
if ( ! ring_buffer_expanded ) {
2012-02-02 12:00:41 -08:00
ret = __tracing_resize_ring_buffer ( trace_buf_size ,
RING_BUFFER_ALL_CPUS ) ;
2009-03-11 13:42:01 -04:00
if ( ret < 0 )
2009-03-15 22:10:39 +01:00
goto out ;
2009-03-11 13:42:01 -04:00
ret = 0 ;
}
2008-05-12 21:20:42 +02:00
for ( t = trace_types ; t ; t = t - > next ) {
if ( strcmp ( t - > name , buf ) = = 0 )
break ;
}
2008-10-04 22:04:44 +02:00
if ( ! t ) {
ret = - EINVAL ;
goto out ;
}
if ( t = = current_trace )
2008-05-12 21:20:42 +02:00
goto out ;
2008-11-12 15:24:24 -05:00
trace_branch_disable ( ) ;
2008-05-12 21:20:42 +02:00
if ( current_trace & & current_trace - > reset )
current_trace - > reset ( tr ) ;
2010-07-01 14:34:35 +09:00
if ( current_trace & & current_trace - > use_max_tr ) {
/*
* We don ' t free the ring buffer ; instead , resize it , because
* the max_tr ring buffer has some state ( e . g . ring - > clock ) that
* we want to preserve .
*/
2012-02-02 12:00:41 -08:00
ring_buffer_resize ( max_tr . buffer , 1 , RING_BUFFER_ALL_CPUS ) ;
set_buffer_entries ( & max_tr , 1 ) ;
2010-07-01 14:34:35 +09:00
}
2009-02-26 23:43:05 -05:00
destroy_trace_option_files ( topts ) ;
2012-07-09 17:10:39 -07:00
current_trace = & nop_trace ;
2009-02-26 23:43:05 -05:00
2012-07-09 17:10:39 -07:00
topts = create_trace_option_files ( t ) ;
if ( t - > use_max_tr ) {
2012-02-02 12:00:41 -08:00
/* we need to make per cpu buffer sizes equivalent */
2012-10-17 11:56:16 +09:00
ret = resize_buffer_duplicate_size ( & max_tr , & global_trace ,
RING_BUFFER_ALL_CPUS ) ;
if ( ret < 0 )
goto out ;
2010-07-01 14:34:35 +09:00
}
2009-02-26 23:43:05 -05:00
2008-11-16 05:57:26 +01:00
if ( t - > init ) {
2009-02-05 18:02:00 -02:00
ret = tracer_init ( t , tr ) ;
2008-11-16 05:57:26 +01:00
if ( ret )
goto out ;
}
2008-05-12 21:20:42 +02:00
2012-07-09 17:10:39 -07:00
current_trace = t ;
2008-11-12 15:24:24 -05:00
trace_branch_enable ( tr ) ;
2008-05-12 21:20:42 +02:00
out :
mutex_unlock ( & trace_types_lock ) ;
2008-11-01 19:57:37 +01:00
return ret ;
}
static ssize_t
tracing_set_trace_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
2009-09-18 14:06:47 +08:00
char buf [ MAX_TRACER_SIZE + 1 ] ;
2008-11-01 19:57:37 +01:00
int i ;
size_t ret ;
2008-11-16 05:53:19 +01:00
int err ;
ret = cnt ;
2008-11-01 19:57:37 +01:00
2009-09-18 14:06:47 +08:00
if ( cnt > MAX_TRACER_SIZE )
cnt = MAX_TRACER_SIZE ;
2008-11-01 19:57:37 +01:00
if ( copy_from_user ( & buf , ubuf , cnt ) )
return - EFAULT ;
buf [ cnt ] = 0 ;
/* strip ending whitespace. */
for ( i = cnt - 1 ; i > 0 & & isspace ( buf [ i ] ) ; i - - )
buf [ i ] = 0 ;
2008-11-16 05:53:19 +01:00
err = tracing_set_tracer ( buf ) ;
if ( err )
return err ;
2008-11-01 19:57:37 +01:00
2009-10-23 19:36:16 -04:00
* ppos + = ret ;
2008-05-12 21:20:42 +02:00
2008-10-04 22:04:44 +02:00
return ret ;
2008-05-12 21:20:42 +02:00
}
static ssize_t
tracing_max_lat_read ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
unsigned long * ptr = filp - > private_data ;
char buf [ 64 ] ;
int r ;
2008-05-12 21:21:00 +02:00
r = snprintf ( buf , sizeof ( buf ) , " %ld \n " ,
2008-05-12 21:20:42 +02:00
* ptr = = ( unsigned long ) - 1 ? - 1 : nsecs_to_usecs ( * ptr ) ) ;
2008-05-12 21:21:00 +02:00
if ( r > sizeof ( buf ) )
r = sizeof ( buf ) ;
2008-05-12 21:20:46 +02:00
return simple_read_from_buffer ( ubuf , cnt , ppos , buf , r ) ;
2008-05-12 21:20:42 +02:00
}
static ssize_t
tracing_max_lat_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
2009-02-10 19:44:34 +01:00
unsigned long * ptr = filp - > private_data ;
unsigned long val ;
2008-05-12 21:21:00 +02:00
int ret ;
2008-05-12 21:20:42 +02:00
2011-06-07 21:58:27 +02:00
ret = kstrtoul_from_user ( ubuf , cnt , 10 , & val ) ;
if ( ret )
2008-05-12 21:21:00 +02:00
return ret ;
2008-05-12 21:20:42 +02:00
* ptr = val * 1000 ;
return cnt ;
}
2008-05-12 21:20:46 +02:00
static int tracing_open_pipe ( struct inode * inode , struct file * filp )
{
tracing/core: introduce per cpu tracing files
Impact: split up tracing output per cpu
Currently, in the tracing debugfs directory, three files are
available to the user for extracting the trace output:
- trace is an iterator through the ring-buffer. It's a reader
but not a consumer. It doesn't block when no more traces are
available.
- trace pretty similar to the former, except that it adds more
information such as preempt count, irq flag, ...
- trace_pipe is a reader and a consumer, it will also block
waiting for traces if necessary (heh, yes it's a pipe).
The traces coming from different cpus are currently mixed up
inside these files. Sometimes it messes up the information,
sometimes it's useful, depending on what the tracer
captures.
The tracing_cpumask file is useful to filter the output and
select only the traces captured on a custom-defined set of cpus.
But it is still not powerful enough to extract
one trace buffer per cpu at the same time.
So this patch creates a new directory: /debug/tracing/per_cpu/.
Inside this directory, you will now find one trace_pipe file and
one trace file per cpu.
Which means if you have two cpus, you will have:
trace0
trace1
trace_pipe0
trace_pipe1
And of course, reading these files will have the same effect
as with the usual tracing files, except that you will only see
the traces from the given cpu.
The original all-in-one cpu trace files are still available in
their original place.
Until now, only one consumer was allowed on trace_pipe to avoid
racy consuming on the ring-buffer. Now the approach has changed a
bit: you can have only one consumer per cpu.
Which means you are allowed to read trace_pipe0 and
trace_pipe1 concurrently, but you can't have two readers on trace_pipe0 or
trace_pipe1.
Following the same logic, if there is one reader on the common
trace_pipe, you cannot have another reader on
trace_pipe0 or trace_pipe1 at the same time, because trace_pipe is already
a consumer of all cpu buffers in essence.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 03:22:28 +01:00
	long cpu_file = (long)inode->i_private;
2008-05-12 21:20:46 +02:00
	struct trace_iterator *iter;
2009-02-25 03:22:28 +01:00
	int ret = 0;
2008-05-12 21:20:46 +02:00
	if (tracing_disabled)
		return -ENODEV;
2009-02-25 03:22:28 +01:00
	mutex_lock(&trace_types_lock);
2008-05-12 21:20:46 +02:00
	/* create a buffer to store the information to pass to userspace */
	iter = kzalloc(sizeof(*iter), GFP_KERNEL);
2009-02-25 03:22:28 +01:00
	if (!iter) {
		ret = -ENOMEM;
		goto out;
	}
2008-05-12 21:20:46 +02:00
tracing/core: make the read callbacks reentrants
Now that several per-cpu files can be read or spliced at the
same time, we want the read/splice callbacks for tracing files
to be reentrant.
Until now, a single global mutex (trace_types_lock) serialized
access to tracing_read_pipe(), tracing_splice_read_pipe(),
and the seq helpers.
That is, if a user tries to read trace_pipe0 and
trace_pipe1 at the same time, access to the function
tracing_read_pipe() is contended and one reader must wait for
the other to finish its read call.
The trace_types_lock mutex is mostly here to serialize access
to the global current tracer (current_trace), which can be
changed concurrently. Although the iter struct keeps a private
pointer to this tracer, its callbacks can be changed by another
function.
The method used here is to no longer keep a private reference to
the tracer inside the iterator but to make a copy of it inside
the iterator. Subsequent read calls then check whether the
tracer has changed. This is not costly because the current
tracer is not expected to change often, so we use a branch
prediction hint for that.
Moreover, we add a private mutex to the iterator (there is one
iterator per file descriptor) to serialize accesses in case
of multiple consumers per file descriptor (which would be a
silly idea on the user's part). Note that this is not to protect
the ring buffer, since the ring buffer already serializes
reader accesses. This is to prevent weird traces in
case of concurrent consumers. But these mutexes can be dropped
anyway; that would not result in any crash. Just tell me what
you think about it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 06:13:16 +01:00
	/*
	 * We make a copy of the current tracer to avoid concurrent
	 * changes on it while we are reading.
	 */
	iter->trace = kmalloc(sizeof(*iter->trace), GFP_KERNEL);
	if (!iter->trace) {
		ret = -ENOMEM;
		goto fail;
	}
	if (current_trace)
		*iter->trace = *current_trace;
2009-01-01 10:12:23 +10:30
	if (!alloc_cpumask_var(&iter->started, GFP_KERNEL)) {
2009-02-25 03:22:28 +01:00
		ret = -ENOMEM;
2009-02-25 06:13:16 +01:00
		goto fail;
2009-01-01 10:12:23 +10:30
	}
2008-11-07 22:36:02 -05:00
	/* trace pipe does not show start of buffer */
2009-01-01 10:12:23 +10:30
	cpumask_setall(iter->started);
2008-11-07 22:36:02 -05:00
2009-06-01 15:16:05 -04:00
	if (trace_flags & TRACE_ITER_LATENCY_FMT)
		iter->iter_flags |= TRACE_FILE_LAT_FMT;
2012-11-13 12:18:22 -08:00
	/* Output in nanoseconds only if we are using a clock in nanoseconds. */
	if (trace_clocks[trace_clock_id].in_ns)
		iter->iter_flags |= TRACE_FILE_TIME_IN_NS;
2009-02-25 03:22:28 +01:00
	iter->cpu_file = cpu_file;
2008-05-12 21:20:46 +02:00
	iter->tr = &global_trace;
2009-02-25 06:13:16 +01:00
	mutex_init(&iter->mutex);
2008-05-12 21:20:46 +02:00
	filp->private_data = iter;
2008-05-12 21:21:01 +02:00
	if (iter->trace->pipe_open)
		iter->trace->pipe_open(iter);
2010-07-07 23:40:11 +02:00
	nonseekable_open(inode, filp);
2009-02-25 03:22:28 +01:00
out:
	mutex_unlock(&trace_types_lock);
	return ret;
2009-02-25 06:13:16 +01:00
fail:
	kfree(iter->trace);
	kfree(iter);
	mutex_unlock(&trace_types_lock);
	return ret;
2008-05-12 21:20:46 +02:00
}
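The open path above copies the current tracer into the iterator by value and unwinds through a single fail label when a later allocation fails. A minimal userspace sketch of that pattern follows. All names in it (tracer_sketch, iter_sketch, current_tracer_sketch, iter_open, iter_release) are hypothetical stand-ins, not kernel symbols, and the sketch deliberately uses calloc() so that the private copy is zeroed when no tracer is installed.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct tracer and struct trace_iterator. */
struct tracer_sketch {
	const char *name;
	void (*pipe_open)(void *iter);
};

struct iter_sketch {
	struct tracer_sketch *trace;	/* private copy, owned by the iterator */
	long cpu_file;
};

/* Plays the role of the global current_trace pointer. */
static struct tracer_sketch *current_tracer_sketch;

static int iter_open(struct iter_sketch **out, long cpu_file)
{
	struct iter_sketch *iter;
	int ret;

	iter = calloc(1, sizeof(*iter));
	if (!iter)
		return -ENOMEM;

	iter->trace = calloc(1, sizeof(*iter->trace));
	if (!iter->trace) {
		ret = -ENOMEM;
		goto fail;
	}

	/* Copy by value: later changes to the global are not seen here. */
	if (current_tracer_sketch)
		*iter->trace = *current_tracer_sketch;

	iter->cpu_file = cpu_file;
	if (iter->trace->pipe_open)
		iter->trace->pipe_open(iter);

	*out = iter;
	return 0;

fail:
	free(iter->trace);	/* free(NULL) is a no-op */
	free(iter);
	return ret;
}

/* Teardown mirrors the open path: free the copy, then the iterator. */
static void iter_release(struct iter_sketch *iter)
{
	free(iter->trace);
	free(iter);
}
```

The point of the copy is that the iterator keeps calling the callbacks it saw at open time even if the global tracer is switched afterwards; the cost is that each reader must detect a tracer change itself on later reads.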
static int tracing_release_pipe(struct inode *inode, struct file *file)
{
	struct trace_iterator *iter = file->private_data;
2009-02-25 03:22:28 +01:00
	mutex_lock(&trace_types_lock);
2009-12-09 12:37:43 -05:00
	if (iter->trace->pipe_close)
2009-12-07 09:06:24 -05:00
		iter->trace->pipe_close(iter);
2009-02-25 03:22:28 +01:00
	mutex_unlock(&trace_types_lock);
2009-01-01 10:12:23 +10:30
	free_cpumask_var(iter->started);
2009-02-25 06:13:16 +01:00
	mutex_destroy(&iter->mutex);
	kfree(iter->trace);
2008-05-12 21:20:46 +02:00
	kfree(iter);
	return 0;
}
2008-05-12 21:20:49 +02:00
static unsigned int
tracing_poll_pipe(struct file *filp, poll_table *poll_table)
{
	struct trace_iterator *iter = filp->private_data;
	if (trace_flags & TRACE_ITER_BLOCK) {
		/*
		 * Always select as readable when in blocking mode
		 */
		return POLLIN | POLLRDNORM;
2008-05-12 21:21:00 +02:00
	} else {
2008-05-12 21:20:49 +02:00
		if (!trace_empty(iter))
			return POLLIN | POLLRDNORM;
		poll_wait(filp, &trace_wait, poll_table);
		if (!trace_empty(iter))
			return POLLIN | POLLRDNORM;
		return 0;
	}
}
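From userspace, the handler above is what answers poll(2) on a pipe-style trace file: it reports POLLIN | POLLRDNORM when data is available and 0 otherwise. A reader might wrap that as sketched below. The helper name wait_readable() is hypothetical, and the sketch works on any file descriptor; it makes no assumption about where the trace files are mounted.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Wait until fd has data to read.
 * Returns 1 if readable, 0 if the timeout expired, -1 on error.
 * A timeout of 0 makes this a non-blocking check.
 */
static int wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int n = poll(&pfd, 1, timeout_ms);

	if (n < 0)
		return -1;
	if (n == 0)
		return 0;
	return (pfd.revents & POLLIN) ? 1 : 0;
}
```

The same helper can be exercised against an ordinary pipe(2): it returns 0 while the pipe is empty and 1 once a byte has been written to the other end.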
2009-02-11 02:25:00 +01:00
/*
 * This is a makeshift waitqueue.
 * A tracer might use this callback in some rare cases:
 *
 *  1) the current tracer might hold the runqueue lock when it wakes up
 *     a reader, hence a deadlock (sched, function, and function graph tracers)
 *  2) the function tracers trace all functions; we don't want
 *     the overhead of calling wake_up and friends
 *     (and tracing them too)
 *
 * Anyway, this is a really primitive wakeup.
 */
void poll_wait_pipe(struct trace_iterator *iter)
{
	set_current_state(TASK_INTERRUPTIBLE);
	/* sleep for 100 msecs, and try again. */
	schedule_timeout(HZ / 10);
}
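The sleep-and-retry idea behind poll_wait_pipe() can be modelled in user space as follows. This is a sketch under assumptions, not kernel code: nanosleep() stands in for schedule_timeout(), and demo_sleep_ms(), demo_countdown_ready() and demo_wait_for_data() are hypothetical names introduced only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Sleep for about ms milliseconds; an interrupted sleep simply returns early. */
static void demo_sleep_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
}

/* Hypothetical predicate: reports data once the countdown in *ctx hits zero. */
static bool demo_countdown_ready(void *ctx)
{
	int *left = ctx;

	if (*left > 0) {
		(*left)--;
		return false;
	}
	return true;
}

/*
 * Check has_data(ctx), sleeping ms milliseconds after each miss, for at
 * most max_tries sleeps. Returns true as soon as data is seen. Nobody
 * wakes the waiter up; it just looks again after every sleep.
 */
static bool demo_wait_for_data(bool (*has_data)(void *), void *ctx,
			       int max_tries, long ms)
{
	assert(has_data != NULL);
	for (int i = 0; i < max_tries; i++) {
		if (has_data(ctx))
			return true;
		demo_sleep_ms(ms);
	}
	return has_data(ctx);
}
```

The trade-off is the one the comment names: the producer never has to call a wakeup function, at the price of up to one sleep interval of latency for the reader.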
2009-02-09 08:15:55 +02:00
/* Must be called with iter->mutex held. */
static int tracing_wait_pipe(struct file *filp)
2008-05-12 21:20:46 +02:00
{
	struct trace_iterator *iter = filp->private_data;
	while (trace_empty(iter)) {
2008-05-12 21:20:58 +02:00
2008-05-12 21:21:01 +02:00
		if ((filp->f_flags & O_NONBLOCK)) {
2009-02-09 08:15:55 +02:00
			return -EAGAIN;
2008-05-12 21:21:01 +02:00
		}
2008-05-12 21:20:58 +02:00
2009-02-25 06:13:16 +01:00
		mutex_unlock(&iter->mutex);
2008-05-12 21:21:01 +02:00
2009-02-11 02:25:00 +01:00
		iter->trace->wait_pipe(iter);
2008-05-12 21:20:46 +02:00
2009-02-25 06:13:16 +01:00
		mutex_lock(&iter->mutex);
2008-05-12 21:21:01 +02:00
2009-02-11 02:25:00 +01:00
		if (signal_pending(current))
2009-02-09 08:15:55 +02:00
			return -EINTR;
2008-05-12 21:20:46 +02:00
		/*
2012-05-11 14:25:30 -04:00
		 * We block until we read something and tracing is disabled.
2008-05-12 21:20:46 +02:00
		 * We still block if tracing is disabled, but we have never
		 * read anything. This allows a user to cat this file, and
		 * then enable tracing. But after we have read something,
		 * we give an EOF when tracing is again disabled.
		 *
		 * iter->pos will be 0 if we haven't read anything.
		 */
2012-05-11 14:25:30 -04:00
		if (!tracing_is_enabled() && iter->pos)
2008-05-12 21:20:46 +02:00
			break;
	}
2009-02-09 08:15:55 +02:00
	return 1;
}
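The blocking policy spelled out in the comment inside the loop (keep blocking while nothing has been read yet, report EOF once something has been read and tracing is off) can be written as a small pure function. This is a simplified model, not the kernel code: all demo_* names are hypothetical, and the real loop sleeps before it checks for signals, which this single-step model ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_wait_action {
	DEMO_PROCEED,	/* data is available: go format it */
	DEMO_EAGAIN,	/* non-blocking reader with nothing to read */
	DEMO_EINTR,	/* a signal is pending */
	DEMO_EOF,	/* something was read earlier and tracing is now off */
	DEMO_SLEEP,	/* keep waiting */
};

/* Hypothetical per-reader state; pos is 0 until something has been read. */
struct demo_reader {
	bool nonblock;
	bool signal_pending;
	long pos;
};

/* Decide what one pass of the wait loop should do. */
static enum demo_wait_action
demo_wait_step(const struct demo_reader *r, bool empty, bool tracing_on)
{
	assert(r != NULL);
	if (!empty)
		return DEMO_PROCEED;
	if (r->nonblock)
		return DEMO_EAGAIN;
	if (r->signal_pending)
		return DEMO_EINTR;
	if (!tracing_on && r->pos != 0)
		return DEMO_EOF;
	return DEMO_SLEEP;
}
```

With pos still 0 and tracing disabled the model returns DEMO_SLEEP, which is what lets a user start reading the pipe first and enable tracing afterwards.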
/*
 * Consumer reader.
 */
static ssize_t
tracing_read_pipe(struct file *filp, char __user *ubuf,
		  size_t cnt, loff_t *ppos)
{
	struct trace_iterator *iter = filp->private_data;
2009-02-25 06:13:16 +01:00
	static struct tracer *old_tracer;
2009-02-09 08:15:55 +02:00
	ssize_t sret;
	/* return any leftover data */
	sret = trace_seq_to_user(&iter->seq, ubuf, cnt);
	if (sret != -EBUSY)
		return sret;
2009-03-02 14:04:40 -05:00
	trace_seq_init(&iter->seq);
2009-02-09 08:15:55 +02:00
2009-02-25 06:13:16 +01:00
	/* copy the tracer to avoid using a global lock all around */
2009-02-09 08:15:55 +02:00
	mutex_lock(&trace_types_lock);
2009-02-25 06:13:16 +01:00
	if (unlikely(old_tracer != current_trace && current_trace)) {
		old_tracer = current_trace;
		*iter->trace = *current_trace;
	}
	mutex_unlock(&trace_types_lock);
	/*
	 * Avoid more than one consumer on a single file descriptor.
	 * This is just a matter of trace coherency; the ring buffer itself
	 * is protected.
	 */
	mutex_lock(&iter->mutex);
2009-02-09 08:15:55 +02:00
	if (iter->trace->read) {
		sret = iter->trace->read(iter, filp, ubuf, cnt, ppos);
		if (sret)
			goto out;
	}
waitagain:
	sret = tracing_wait_pipe(filp);
	if (sret <= 0)
		goto out;
2008-05-12 21:20:46 +02:00
	/* stop when tracing is finished */
2009-02-09 08:15:55 +02:00
	if (trace_empty(iter)) {
		sret = 0;
2008-05-12 21:21:01 +02:00
		goto out;
2009-02-09 08:15:55 +02:00
	}
2008-05-12 21:20:46 +02:00
	if (cnt >= PAGE_SIZE)
		cnt = PAGE_SIZE - 1;
2008-05-12 21:21:01 +02:00
	/* reset all but tr, trace, and overruns */
	memset(&iter->seq, 0,
	       sizeof(struct trace_iterator) -
	       offsetof(struct trace_iterator, seq));
2008-05-12 21:21:01 +02:00
	iter->pos = -1;
2008-05-12 21:20:46 +02:00
2009-05-18 19:35:34 +08:00
	trace_event_read_lock();
2010-01-06 20:08:50 +08:00
	trace_access_lock(iter->cpu_file);
2010-08-05 09:22:23 -05:00
	while (trace_find_next_entry_inc(iter) != NULL) {
2008-09-29 20:18:34 +02:00
		enum print_line_t ret;
2008-05-12 21:20:48 +02:00
		int len = iter->seq.len;
2008-05-12 21:20:47 +02:00
		ret = print_trace_line(iter);
2008-09-29 20:18:34 +02:00
		if (ret == TRACE_TYPE_PARTIAL_LINE) {
2008-05-12 21:20:48 +02:00
			/* don't print partial lines */
			iter->seq.len = len;
2008-05-12 21:20:46 +02:00
			break;
2008-05-12 21:20:48 +02:00
		}
2009-02-06 18:30:44 +01:00
		if (ret != TRACE_TYPE_NO_CONSUME)
			trace_consume(iter);
2008-05-12 21:20:46 +02:00
		if (iter->seq.len >= cnt)
			break;
2011-03-25 12:05:18 +01:00
		/*
		 * Setting the full flag means we reached the trace_seq buffer
		 * size and we should leave by partial output condition above.
		 * One of the trace_seq_* functions is not used properly.
		 */
		WARN_ONCE(iter->seq.full, "full flag set for trace type %d",
			  iter->ent->type);
2008-05-12 21:20:46 +02:00
	}
2010-01-06 20:08:50 +08:00
	trace_access_unlock(iter->cpu_file);
2009-05-18 19:35:34 +08:00
	trace_event_read_unlock();
2008-05-12 21:20:46 +02:00
	/* Now copy what we have to the user */
2008-05-12 21:21:02 +02:00
	sret = trace_seq_to_user(&iter->seq, ubuf, cnt);
	if (iter->seq.readpos >= iter->seq.len)
2009-03-02 14:04:40 -05:00
		trace_seq_init(&iter->seq);
2008-09-29 20:23:48 +02:00
	/*
2011-03-30 22:57:33 -03:00
	 * If there was nothing to send to user, in spite of consuming trace
2008-09-29 20:23:48 +02:00
	 * entries, go back to wait for more entries.
	 */
2008-05-12 21:21:02 +02:00
	if (sret == -EBUSY)
2008-09-29 20:23:48 +02:00
		goto waitagain;
2008-05-12 21:20:46 +02:00
2008-05-12 21:21:01 +02:00
out:
2009-02-25 06:13:16 +01:00
	mutex_unlock(&iter->mutex);
2008-05-12 21:21:01 +02:00
2008-05-12 21:21:02 +02:00
	return sret;
2008-05-12 21:20:46 +02:00
}
2009-02-09 08:15:56 +02:00
static void tracing_pipe_buf_release(struct pipe_inode_info *pipe,
				     struct pipe_buffer *buf)
{
	__free_page(buf->page);
}
static void tracing_spd_release_pipe(struct splice_pipe_desc *spd,
				     unsigned int idx)
{
	__free_page(spd->pages[idx]);
}
2009-12-15 16:46:48 -08:00
static const struct pipe_buf_operations tracing_pipe_buf_ops = {
2009-02-09 12:06:29 -05:00
	.can_merge		= 0,
	.map			= generic_pipe_buf_map,
	.unmap			= generic_pipe_buf_unmap,
	.confirm		= generic_pipe_buf_confirm,
	.release		= tracing_pipe_buf_release,
	.steal			= generic_pipe_buf_steal,
	.get			= generic_pipe_buf_get,
2009-02-09 08:15:56 +02:00
};
2009-02-09 12:06:29 -05:00
static size_t
2009-02-11 02:51:30 +01:00
tracing_fill_pipe_page(size_t rem, struct trace_iterator *iter)
2009-02-09 12:06:29 -05:00
{
	size_t count;
	int ret;
	/* Seq buffer is page-sized, exactly what we need. */
	for (;;) {
		count = iter->seq.len;
		ret = print_trace_line(iter);
		count = iter->seq.len - count;
		if (rem < count) {
			rem = 0;
			iter->seq.len -= count;
			break;
		}
		if (ret == TRACE_TYPE_PARTIAL_LINE) {
			iter->seq.len -= count;
			break;
		}
2009-07-28 20:17:22 +08:00
		if (ret != TRACE_TYPE_NO_CONSUME)
			trace_consume(iter);
2009-02-09 12:06:29 -05:00
		rem -= count;
2010-08-05 09:22:23 -05:00
		if (!trace_find_next_entry_inc(iter)) {
2009-02-09 12:06:29 -05:00
			rem = 0;
			iter->ent = NULL;
			break;
		}
	}
	return rem;
}
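The budget handling in tracing_fill_pipe_page() can be illustrated with a user-space sketch that packs whole lines into a fixed-size "page". It is a simplification under assumptions: the kernel function formats a line first and rolls seq.len back if it does not fit, while this sketch checks the size before copying. DEMO_PAGE, struct demo_seq and demo_fill_page() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE 64	/* tiny "page" so the limits are easy to reach */

struct demo_seq {
	char buf[DEMO_PAGE];
	size_t len;
};

/*
 * Append whole lines, starting at lines[*next], while they fit both in
 * the byte budget rem and in the page. Returns the remaining budget:
 * 0 when the budget is exhausted or the input has run out, and the
 * unchanged budget when only the page is full, so that the caller can
 * continue with a fresh page.
 */
static size_t demo_fill_page(struct demo_seq *seq, const char **lines,
			     size_t n, size_t *next, size_t rem)
{
	assert(seq->len <= sizeof(seq->buf));
	while (*next < n) {
		size_t count = strlen(lines[*next]);

		if (rem < count)
			return 0;	/* over budget: stop for good */
		if (count > sizeof(seq->buf) - seq->len)
			return rem;	/* partial line: page is full */
		memcpy(seq->buf + seq->len, lines[*next], count);
		seq->len += count;
		rem -= count;
		(*next)++;
	}
	return 0;			/* no more entries */
}
```

A caller would invoke this once per page and stop as soon as it returns 0, which is the role the `rem` term plays in the page-filling loop condition of tracing_splice_read_pipe().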
2009-02-09 08:15:56 +02:00
static ssize_t tracing_splice_read_pipe(struct file *filp,
					loff_t *ppos,
					struct pipe_inode_info *pipe,
					size_t len,
					unsigned int flags)
{
2010-05-20 10:43:18 +02:00
	struct page *pages_def[PIPE_DEF_BUFFERS];
	struct partial_page partial_def[PIPE_DEF_BUFFERS];
2009-02-09 08:15:56 +02:00
	struct trace_iterator *iter = filp->private_data;
	struct splice_pipe_desc spd = {
2010-05-20 10:43:18 +02:00
		.pages		= pages_def,
		.partial	= partial_def,
2009-02-09 12:06:29 -05:00
		.nr_pages	= 0, /* This gets updated below. */
2012-06-12 15:24:40 +02:00
		.nr_pages_max	= PIPE_DEF_BUFFERS,
2009-02-09 12:06:29 -05:00
		.flags		= flags,
		.ops		= &tracing_pipe_buf_ops,
		.spd_release	= tracing_spd_release_pipe,
2009-02-09 08:15:56 +02:00
	};
2009-02-25 06:13:16 +01:00
	static struct tracer *old_tracer;
2009-02-09 08:15:56 +02:00
	ssize_t ret;
2009-02-09 12:06:29 -05:00
	size_t rem;
2009-02-09 08:15:56 +02:00
	unsigned int i;
2010-05-20 10:43:18 +02:00
	if (splice_grow_spd(pipe, &spd))
		return -ENOMEM;
2009-02-25 06:13:16 +01:00
	/* copy the tracer to avoid using a global lock all around */
2009-02-09 08:15:56 +02:00
	mutex_lock(&trace_types_lock);
2009-02-25 06:13:16 +01:00
	if (unlikely(old_tracer != current_trace && current_trace)) {
		old_tracer = current_trace;
		*iter->trace = *current_trace;
	}
	mutex_unlock(&trace_types_lock);
	mutex_lock(&iter->mutex);
2009-02-09 08:15:56 +02:00
	if (iter->trace->splice_read) {
		ret = iter->trace->splice_read(iter, filp,
					       ppos, pipe, len, flags);
		if (ret)
2009-02-09 12:06:29 -05:00
			goto out_err;
2009-02-09 08:15:56 +02:00
	}
	ret = tracing_wait_pipe(filp);
	if (ret <= 0)
2009-02-09 12:06:29 -05:00
		goto out_err;
2009-02-09 08:15:56 +02:00
2010-08-05 09:22:23 -05:00
	if (!iter->ent && !trace_find_next_entry_inc(iter)) {
2009-02-09 08:15:56 +02:00
		ret = -EFAULT;
2009-02-09 12:06:29 -05:00
		goto out_err;
2009-02-09 08:15:56 +02:00
	}
2009-05-18 19:35:34 +08:00
	trace_event_read_lock();
2010-01-06 20:08:50 +08:00
	trace_access_lock(iter->cpu_file);
2009-05-18 19:35:34 +08:00
2009-02-09 08:15:56 +02:00
	/* Fill as many pages as possible. */
2010-05-20 10:43:18 +02:00
	for (i = 0, rem = len; i < pipe->buffers && rem; i++) {
		spd.pages[i] = alloc_page(GFP_KERNEL);
		if (!spd.pages[i])
2009-02-09 12:06:29 -05:00
			break;
2009-02-09 08:15:56 +02:00
2009-02-11 02:51:30 +01:00
		rem = tracing_fill_pipe_page(rem, iter);
2009-02-09 08:15:56 +02:00
		/* Copy the data into the page, so we can start over. */
		ret = trace_seq_to_buffer(&iter->seq,
2010-05-20 10:43:18 +02:00
					  page_address(spd.pages[i]),
2009-02-09 08:15:56 +02:00
					  iter->seq.len);
		if (ret < 0) {
2010-05-20 10:43:18 +02:00
			__free_page(spd.pages[i]);
2009-02-09 08:15:56 +02:00
			break;
		}
2010-05-20 10:43:18 +02:00
		spd.partial[i].offset = 0;
		spd.partial[i].len = iter->seq.len;
2009-02-09 08:15:56 +02:00
2009-03-02 14:04:40 -05:00
		trace_seq_init(&iter->seq);
2009-02-09 08:15:56 +02:00
	}
2010-01-06 20:08:50 +08:00
trace_access_unlock ( iter - > cpu_file ) ;
2009-05-18 19:35:34 +08:00
trace_event_read_unlock ( ) ;
tracing/core: make the read callbacks reentrants
Now that several per-cpu files can be read or spliced at the
same time, we want the read/splice callbacks for tracing files to be
reentrant.
Until now, a single global mutex (trace_types_lock) serialized
the access to tracing_read_pipe(), tracing_splice_read_pipe(),
and the seq helpers.
That is, if a user tries to read trace_pipe0 and
trace_pipe1 at the same time, access to
tracing_read_pipe() is contended and one reader must wait for
the other to finish its read call.
The trace_types_lock mutex is mostly here to serialize access
to the global current tracer (current_trace), which can be
changed concurrently. Although the iter struct keeps a private
pointer to this tracer, its callbacks can be changed by another
function.
The method used here is to no longer keep a private reference to
the tracer inside the iterator but to store a copy of it there
instead. Subsequent read calls then check whether the
tracer has changed. This is not costly because the current
tracer is not expected to change often, so we use a branch
prediction hint for that.
Moreover, we add a private mutex to the iterator (there is one
iterator per file descriptor) to serialize accesses in case
of multiple consumers per file descriptor (which would be a
silly idea on the user's part). Note that this is not to protect the
ring buffer, since the ring buffer already serializes
reader accesses. It is to prevent odd-looking traces in
case of concurrent consumers. These mutexes could be dropped
anyway without causing any crash. Just tell me what
you think about it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 06:13:16 +01:00
mutex_unlock ( & iter - > mutex ) ;
2009-02-09 08:15:56 +02:00
spd . nr_pages = i ;
2010-05-20 10:43:18 +02:00
ret = splice_to_pipe ( pipe , & spd ) ;
out :
2012-06-12 15:24:40 +02:00
splice_shrink_spd ( & spd ) ;
2010-05-20 10:43:18 +02:00
return ret ;
2009-02-09 08:15:56 +02:00
2009-02-09 12:06:29 -05:00
out_err :
2009-02-25 06:13:16 +01:00
mutex_unlock ( & iter - > mutex ) ;
2010-05-20 10:43:18 +02:00
goto out ;
2009-02-09 08:15:56 +02:00
}
2012-02-02 12:00:41 -08:00
struct ftrace_entries_info {
struct trace_array * tr ;
int cpu ;
} ;
static int tracing_entries_open ( struct inode * inode , struct file * filp )
{
struct ftrace_entries_info * info ;
if ( tracing_disabled )
return - ENODEV ;
info = kzalloc ( sizeof ( * info ) , GFP_KERNEL ) ;
if ( ! info )
return - ENOMEM ;
info - > tr = & global_trace ;
info - > cpu = ( unsigned long ) inode - > i_private ;
filp - > private_data = info ;
return 0 ;
}
2008-05-12 21:20:59 +02:00
static ssize_t
tracing_entries_read ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
2012-02-02 12:00:41 -08:00
struct ftrace_entries_info * info = filp - > private_data ;
struct trace_array * tr = info - > tr ;
char buf [ 64 ] ;
int r = 0 ;
ssize_t ret ;
2008-05-12 21:20:59 +02:00
2009-03-12 13:53:25 -04:00
mutex_lock ( & trace_types_lock ) ;
2012-02-02 12:00:41 -08:00
if ( info - > cpu = = RING_BUFFER_ALL_CPUS ) {
int cpu , buf_size_same ;
unsigned long size ;
size = 0 ;
buf_size_same = 1 ;
/* check if all cpu sizes are same */
for_each_tracing_cpu ( cpu ) {
/* fill in the size from first enabled cpu */
if ( size = = 0 )
size = tr - > data [ cpu ] - > entries ;
if ( size ! = tr - > data [ cpu ] - > entries ) {
buf_size_same = 0 ;
break ;
}
}
if ( buf_size_same ) {
if ( ! ring_buffer_expanded )
r = sprintf ( buf , " %lu (expanded: %lu) \n " ,
size > > 10 ,
trace_buf_size > > 10 ) ;
else
r = sprintf ( buf , " %lu \n " , size > > 10 ) ;
} else
r = sprintf ( buf , " X \n " ) ;
} else
r = sprintf ( buf , " %lu \n " , tr - > data [ info - > cpu ] - > entries > > 10 ) ;
2009-03-12 13:53:25 -04:00
mutex_unlock ( & trace_types_lock ) ;
2012-02-02 12:00:41 -08:00
ret = simple_read_from_buffer ( ubuf , cnt , ppos , buf , r ) ;
return ret ;
2008-05-12 21:20:59 +02:00
}
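The "all CPUs" branch of tracing_entries_read() above reports one size in KB when every per-CPU buffer has the same size and prints "X" otherwise. A small userspace sketch of just that decision, assuming the sizes arrive as a plain array of byte counts (format_entries_kb and its parameters are hypothetical names, and the "expanded" case is left out):

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write either "<size in KB>\n" or "X\n" into buf, depending on whether
 * all per-CPU sizes (given in bytes) are equal. Returns what snprintf()
 * returns, i.e. the length of the formatted string.
 */
static int format_entries_kb(const unsigned long *bytes, size_t ncpus,
			     char *buf, size_t buflen)
{
	unsigned long size = 0;
	size_t cpu;

	for (cpu = 0; cpu < ncpus; cpu++) {
		/* Take the reference size from the first CPU seen. */
		if (size == 0)
			size = bytes[cpu];
		if (size != bytes[cpu])
			return snprintf(buf, buflen, "X\n");
	}
	/* Bytes to KB, as in the original's "size >> 10". */
	return snprintf(buf, buflen, "%lu\n", size >> 10);
}
```

The sketch uses snprintf() with an explicit length instead of sprintf(); that is a choice made for the example, not a claim about the original code.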
static ssize_t
tracing_entries_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
2012-02-02 12:00:41 -08:00
struct ftrace_entries_info * info = filp - > private_data ;
2008-05-12 21:20:59 +02:00
unsigned long val ;
2011-06-13 17:51:57 -07:00
int ret ;
2008-05-12 21:20:59 +02:00
2011-06-07 21:58:27 +02:00
ret = kstrtoul_from_user ( ubuf , cnt , 10 , & val ) ;
if ( ret )
2008-05-12 21:21:00 +02:00
return ret ;
2008-05-12 21:20:59 +02:00
/* must have at least 1 entry */
if ( ! val )
return - EINVAL ;
2008-11-13 00:09:35 -05:00
/* value is in KB */
val < < = 10 ;
2012-02-02 12:00:41 -08:00
ret = tracing_resize_ring_buffer ( val , info - > cpu ) ;
2011-06-13 17:51:57 -07:00
if ( ret < 0 )
return ret ;
2008-05-12 21:20:59 +02:00
2009-10-23 19:36:16 -04:00
* ppos + = cnt ;
2008-05-12 21:20:59 +02:00
2011-06-13 17:51:57 -07:00
return cnt ;
}
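tracing_entries_write() above parses a decimal value, rejects zero, and converts KB to bytes with a left shift before resizing. The same validation steps can be sketched in userspace as follows. strtoul() stands in for kstrtoul_from_user() here and is not an exact equivalent, and parse_size_kb is a hypothetical name.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a decimal size in KB from s and store the size in bytes.
 * Returns 0 on success or -EINVAL on malformed or zero input.
 */
static int parse_size_kb(const char *s, unsigned long *bytes)
{
	char *end;
	unsigned long val;

	/* Sketch-only strictness: require a leading digit (no sign, no spaces). */
	if (*s < '0' || *s > '9')
		return -EINVAL;

	errno = 0;
	val = strtoul(s, &end, 10);
	if (errno)
		return -EINVAL;

	/* Tolerate a single trailing newline, as produced by "echo". */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	/* must have at least 1 entry */
	if (!val)
		return -EINVAL;

	/* value is in KB */
	*bytes = val << 10;
	return 0;
}
```

For example, the input "1408\n" yields 1441792 bytes, while "0" and "abc" are both rejected with -EINVAL.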
2008-11-10 21:46:00 -05:00
2012-02-02 12:00:41 -08:00
static int
tracing_entries_release ( struct inode * inode , struct file * filp )
{
struct ftrace_entries_info * info = filp - > private_data ;
kfree ( info ) ;
return 0 ;
}
2011-08-16 14:46:15 -07:00
static ssize_t
tracing_total_entries_read ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
struct trace_array * tr = filp - > private_data ;
char buf [ 64 ] ;
int r , cpu ;
unsigned long size = 0 , expanded_size = 0 ;
mutex_lock ( & trace_types_lock ) ;
for_each_tracing_cpu ( cpu ) {
2012-02-02 12:00:41 -08:00
size + = tr - > data [ cpu ] - > entries > > 10 ;
2011-08-16 14:46:15 -07:00
if ( ! ring_buffer_expanded )
expanded_size + = trace_buf_size > > 10 ;
}
if ( ring_buffer_expanded )
r = sprintf ( buf , " %lu \n " , size ) ;
else
r = sprintf ( buf , " %lu (expanded: %lu) \n " , size , expanded_size ) ;
mutex_unlock ( & trace_types_lock ) ;
return simple_read_from_buffer ( ubuf , cnt , ppos , buf , r ) ;
}
2011-06-13 17:51:57 -07:00
static ssize_t
tracing_free_buffer_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
/*
* There is no need to read what the user has written , this function
* is just to make sure that there is no error when " echo " is used
*/
* ppos + = cnt ;
2008-05-12 21:20:59 +02:00
return cnt ;
}
2011-06-13 17:51:57 -07:00
static int
tracing_free_buffer_release ( struct inode * inode , struct file * filp )
{
2011-06-14 22:44:07 -04:00
/* disable tracing ? */
if ( trace_flags & TRACE_ITER_STOP_ON_FREE )
tracing_off ( ) ;
2011-06-13 17:51:57 -07:00
/* resize the ring buffer to 0 */
2012-02-02 12:00:41 -08:00
tracing_resize_ring_buffer ( 0 , RING_BUFFER_ALL_CPUS ) ;
2011-06-13 17:51:57 -07:00
return 0 ;
}
2008-09-16 22:06:42 +03:00
static ssize_t
tracing_mark_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * fpos )
{
2011-09-22 11:50:27 -04:00
unsigned long addr = ( unsigned long ) ubuf ;
struct ring_buffer_event * event ;
struct ring_buffer * buffer ;
struct print_entry * entry ;
unsigned long irq_flags ;
struct page * pages [ 2 ] ;
2012-05-11 23:28:49 -04:00
void * map_page [ 2 ] ;
2011-09-22 11:50:27 -04:00
int nr_pages = 1 ;
ssize_t written ;
int offset ;
int size ;
int len ;
int ret ;
2012-05-11 23:28:49 -04:00
int i ;
2008-09-16 22:06:42 +03:00
2008-11-07 22:36:02 -05:00
if ( tracing_disabled )
2008-09-16 22:06:42 +03:00
return - EINVAL ;
2012-09-07 18:12:19 -07:00
if ( ! ( trace_flags & TRACE_ITER_MARKERS ) )
return - EINVAL ;
2008-09-16 22:06:42 +03:00
if ( cnt > TRACE_BUF_SIZE )
cnt = TRACE_BUF_SIZE ;
2011-09-22 11:50:27 -04:00
/*
* Userspace is injecting traces into the kernel trace buffer .
* We want to be as non intrusive as possible .
* To do so , we do not want to allocate any special buffers
* or take any locks , but instead write the userspace data
* straight into the ring buffer .
*
* First we need to pin the userspace buffer into memory ,
* which , most likely it is , because it just referenced it .
* But there ' s no guarantee that it is . By using get_user_pages_fast ( )
* and kmap_atomic / kunmap_atomic ( ) we can get access to the
* pages directly . We then write the data directly into the
* ring buffer .
*/
BUILD_BUG_ON ( TRACE_BUF_SIZE > = PAGE_SIZE ) ;
2008-09-16 22:06:42 +03:00
2011-09-22 11:50:27 -04:00
/* check if we cross pages */
if ( ( addr & PAGE_MASK ) ! = ( ( addr + cnt ) & PAGE_MASK ) )
nr_pages = 2 ;
offset = addr & ( PAGE_SIZE - 1 ) ;
addr & = PAGE_MASK ;
ret = get_user_pages_fast ( addr , nr_pages , 0 , pages ) ;
if ( ret < nr_pages ) {
while ( - - ret > = 0 )
put_page ( pages [ ret ] ) ;
written = - EFAULT ;
goto out ;
2008-09-16 22:06:42 +03:00
}
2011-09-22 11:50:27 -04:00
2012-05-11 23:28:49 -04:00
for ( i = 0 ; i < nr_pages ; i + + )
map_page [ i ] = kmap_atomic ( pages [ i ] ) ;
2011-09-22 11:50:27 -04:00
local_save_flags ( irq_flags ) ;
size = sizeof ( * entry ) + cnt + 2 ; /* possible \n added */
buffer = global_trace . buffer ;
event = trace_buffer_lock_reserve ( buffer , TRACE_PRINT , size ,
irq_flags , preempt_count ( ) ) ;
if ( ! event ) {
/* Ring buffer disabled, return as if not open for write */
written = - EBADF ;
goto out_unlock ;
2008-09-16 22:06:42 +03:00
}
2011-09-22 11:50:27 -04:00
entry = ring_buffer_event_data ( event ) ;
entry - > ip = _THIS_IP_ ;
if ( nr_pages = = 2 ) {
len = PAGE_SIZE - offset ;
2012-05-11 23:28:49 -04:00
memcpy ( & entry - > buf , map_page [ 0 ] + offset , len ) ;
memcpy ( & entry - > buf [ len ] , map_page [ 1 ] , cnt - len ) ;
2009-11-16 20:56:13 +01:00
} else
2012-05-11 23:28:49 -04:00
memcpy ( & entry - > buf , map_page [ 0 ] + offset , cnt ) ;
2008-09-16 22:06:42 +03:00
2011-09-22 11:50:27 -04:00
if ( entry - > buf [ cnt - 1 ] ! = ' \n ' ) {
entry - > buf [ cnt ] = ' \n ' ;
entry - > buf [ cnt + 1 ] = ' \0 ' ;
} else
entry - > buf [ cnt ] = ' \0 ' ;
2012-10-11 12:14:25 -04:00
__buffer_unlock_commit ( buffer , event ) ;
2008-09-16 22:06:42 +03:00
2011-09-22 11:50:27 -04:00
written = cnt ;
2008-09-16 22:06:42 +03:00
2011-09-22 11:50:27 -04:00
* fpos + = written ;
tracing: Sanitize value returned from write(trace_marker, "...", len)
When userspace code writes a string that is not newline-terminated to the trace_marker
file, the write handler appends a newline and returns the number of bytes written
to the trace buffer, so
write(fd, "abc", 3) will return 4
That is unexpected and, unfortunately, it confuses glibc's fprintf function.
Example:
#include <stdio.h>
int main(void) {
	fprintf(stderr, "abc");
	return 0;
}
$ gcc test.c -o test
$ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
$ ./test 2>/sys/kernel/debug/tracing/trace_marker
results in infinite loop:
write(fd, "abc", 3) = 4
write(fd, "", 1) = 0
write(fd, "", 1) = 0
write(fd, "", 1) = 0
write(fd, "", 1) = 0
write(fd, "", 1) = 0
write(fd, "", 1) = 0
write(fd, "", 1) = 0
(...)
...and a kernel trace buffer full of empty markers.
Fix it by sanitizing the write return value.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
LKML-Reference: <20100727231801.GB2826@joi.lan>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-07-28 01:18:01 +02:00
2011-09-22 11:50:27 -04:00
out_unlock :
2012-05-11 23:28:49 -04:00
for ( i = 0 ; i < nr_pages ; i + + ) {
kunmap_atomic ( map_page [ i ] ) ;
put_page ( pages [ i ] ) ;
}
2011-09-22 11:50:27 -04:00
out :
2010-07-28 01:18:01 +02:00
return written ;
2008-09-16 22:06:42 +03:00
}
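Two pieces of arithmetic in tracing_mark_write() above are easy to get wrong: deciding whether the user buffer spans two pages, and appending the newline without inflating the value returned to write(). The userspace sketch below isolates both, assuming a 4096-byte page. split_span, terminate_marker and the SKETCH_* macros are hypothetical names for illustration.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL                     /* assumed page size */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

struct span {
	int nr_pages;     /* 1 or 2 pages to pin */
	size_t offset;    /* offset of the data inside the first page */
	size_t first_len; /* bytes to copy from the first page */
};

/* Mirror the page-crossing test used in the listing above. */
static struct span split_span(unsigned long addr, size_t cnt)
{
	struct span s;

	s.nr_pages = 1;
	if ((addr & SKETCH_PAGE_MASK) != ((addr + cnt) & SKETCH_PAGE_MASK))
		s.nr_pages = 2;
	s.offset = addr & (SKETCH_PAGE_SIZE - 1);
	s.first_len = (s.nr_pages == 2) ? SKETCH_PAGE_SIZE - s.offset : cnt;
	return s;
}

/*
 * Terminate a marker of cnt bytes (cnt > 0) stored in buf, which must
 * have room for cnt + 2 bytes. Returns cnt, not the stored length, so
 * the caller never reports more bytes than it was asked to write.
 */
static size_t terminate_marker(char *buf, size_t cnt)
{
	if (buf[cnt - 1] != '\n') {
		buf[cnt] = '\n';
		buf[cnt + 1] = '\0';
	} else {
		buf[cnt] = '\0';
	}
	return cnt;
}
```

Because the test compares addr against addr + cnt rather than addr + cnt - 1, a buffer that ends exactly on a page boundary is treated as spanning two pages, and the second copy then has length zero. Returning cnt from terminate_marker() corresponds to the fix described in the "Sanitize value returned from write" commit message: writing "abc" stores "abc\n" but reports 3.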
2009-12-08 11:16:11 +08:00
static int tracing_clock_show ( struct seq_file * m , void * v )
2009-08-25 16:12:56 +08:00
{
int i ;
for ( i = 0 ; i < ARRAY_SIZE ( trace_clocks ) ; i + + )
2009-12-08 11:16:11 +08:00
seq_printf ( m ,
2009-08-25 16:12:56 +08:00
" %s%s%s%s " , i ? " " : " " ,
i = = trace_clock_id ? " [ " : " " , trace_clocks [ i ] . name ,
i = = trace_clock_id ? " ] " : " " ) ;
2009-12-08 11:16:11 +08:00
seq_putc ( m , ' \n ' ) ;
2009-08-25 16:12:56 +08:00
2009-12-08 11:16:11 +08:00
return 0 ;
2009-08-25 16:12:56 +08:00
}
static ssize_t tracing_clock_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * fpos )
{
char buf [ 64 ] ;
const char * clockstr ;
int i ;
if ( cnt > = sizeof ( buf ) )
return - EINVAL ;
if ( copy_from_user ( & buf , ubuf , cnt ) )
return - EFAULT ;
buf [ cnt ] = 0 ;
clockstr = strstrip ( buf ) ;
for ( i = 0 ; i < ARRAY_SIZE ( trace_clocks ) ; i + + ) {
if ( strcmp ( trace_clocks [ i ] . name , clockstr ) = = 0 )
break ;
}
if ( i = = ARRAY_SIZE ( trace_clocks ) )
return - EINVAL ;
trace_clock_id = i ;
mutex_lock ( & trace_types_lock ) ;
ring_buffer_set_clock ( global_trace . buffer , trace_clocks [ i ] . func ) ;
if ( max_tr . buffer )
ring_buffer_set_clock ( max_tr . buffer , trace_clocks [ i ] . func ) ;
2012-10-11 16:27:52 -07:00
/*
* New clock may not be consistent with the previous clock .
* Reset the buffer so that it doesn ' t have incomparable timestamps .
*/
tracing_reset_online_cpus ( & global_trace ) ;
if ( max_tr . buffer )
tracing_reset_online_cpus ( & max_tr ) ;
2009-08-25 16:12:56 +08:00
mutex_unlock ( & trace_types_lock ) ;
* fpos + = cnt ;
return cnt ;
}
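tracing_clock_write() above copies the user string into a fixed buffer, strips surrounding whitespace, and looks the result up in the trace_clocks table. A userspace sketch of that lookup follows. The clock names in the table are illustrative assumptions, find_clock is a hypothetical name, and the manual trimming stands in for the kernel's strstrip().

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Illustrative table; the real trace_clocks[] entries are not shown here. */
static const char *const clock_names[] = { "local", "global", "counter" };

/*
 * Return the index of the clock named by the first cnt bytes of input,
 * ignoring leading and trailing whitespace, or -1 if there is no match
 * or the input does not fit the local buffer.
 */
static int find_clock(const char *input, size_t cnt)
{
	char buf[64];
	char *s, *end;
	size_t i;

	if (cnt >= sizeof(buf))
		return -1;
	memcpy(buf, input, cnt);
	buf[cnt] = '\0';

	/* Trim both ends in place. */
	s = buf;
	while (isspace((unsigned char)*s))
		s++;
	end = s + strlen(s);
	while (end > s && isspace((unsigned char)end[-1]))
		end--;
	*end = '\0';

	for (i = 0; i < sizeof(clock_names) / sizeof(clock_names[0]); i++) {
		if (strcmp(clock_names[i], s) == 0)
			return (int)i;
	}
	return -1;
}
```

The length check before the copy matters: it leaves room for the terminating NUL, just as the "cnt >= sizeof(buf)" test does in the listing.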
2009-12-08 11:16:11 +08:00
static int tracing_clock_open ( struct inode * inode , struct file * file )
{
if ( tracing_disabled )
return - ENODEV ;
return single_open ( file , tracing_clock_show , NULL ) ;
}
2009-03-05 21:44:55 -05:00
static const struct file_operations tracing_max_lat_fops = {
2008-05-12 21:20:46 +02:00
. open = tracing_open_generic ,
. read = tracing_max_lat_read ,
. write = tracing_max_lat_write ,
2010-07-07 23:40:11 +02:00
. llseek = generic_file_llseek ,
2008-05-12 21:20:42 +02:00
} ;
2009-03-05 21:44:55 -05:00
static const struct file_operations set_tracer_fops = {
2008-05-12 21:20:46 +02:00
. open = tracing_open_generic ,
. read = tracing_set_trace_read ,
. write = tracing_set_trace_write ,
2010-07-07 23:40:11 +02:00
. llseek = generic_file_llseek ,
2008-05-12 21:20:42 +02:00
} ;
2009-03-05 21:44:55 -05:00
static const struct file_operations tracing_pipe_fops = {
2008-05-12 21:20:46 +02:00
. open = tracing_open_pipe ,
2008-05-12 21:20:49 +02:00
. poll = tracing_poll_pipe ,
2008-05-12 21:20:46 +02:00
. read = tracing_read_pipe ,
2009-02-09 08:15:56 +02:00
. splice_read = tracing_splice_read_pipe ,
2008-05-12 21:20:46 +02:00
. release = tracing_release_pipe ,
2010-07-07 23:40:11 +02:00
. llseek = no_llseek ,
2008-05-12 21:20:46 +02:00
} ;
2009-03-05 21:44:55 -05:00
static const struct file_operations tracing_entries_fops = {
2012-02-02 12:00:41 -08:00
. open = tracing_entries_open ,
2008-05-12 21:20:59 +02:00
. read = tracing_entries_read ,
. write = tracing_entries_write ,
2012-02-02 12:00:41 -08:00
. release = tracing_entries_release ,
2010-07-07 23:40:11 +02:00
. llseek = generic_file_llseek ,
2008-05-12 21:20:59 +02:00
} ;
2011-08-16 14:46:15 -07:00
static const struct file_operations tracing_total_entries_fops = {
. open = tracing_open_generic ,
. read = tracing_total_entries_read ,
. llseek = generic_file_llseek ,
} ;
2011-06-13 17:51:57 -07:00
static const struct file_operations tracing_free_buffer_fops = {
. write = tracing_free_buffer_write ,
. release = tracing_free_buffer_release ,
} ;
2009-03-05 21:44:55 -05:00
static const struct file_operations tracing_mark_fops = {
2008-09-21 20:16:30 +02:00
. open = tracing_open_generic ,
2008-09-16 22:06:42 +03:00
. write = tracing_mark_write ,
2010-07-07 23:40:11 +02:00
. llseek = generic_file_llseek ,
2008-09-16 22:06:42 +03:00
} ;
2009-08-25 16:12:56 +08:00
static const struct file_operations trace_clock_fops = {
2009-12-08 11:16:11 +08:00
. open = tracing_clock_open ,
. read = seq_read ,
. llseek = seq_lseek ,
. release = single_release ,
2009-08-25 16:12:56 +08:00
. write = tracing_clock_write ,
} ;
2008-12-01 22:20:19 -05:00
struct ftrace_buffer_info {
struct trace_array * tr ;
void * spare ;
int cpu ;
unsigned int read ;
} ;
static int tracing_buffers_open ( struct inode * inode , struct file * filp )
{
int cpu = ( int ) ( long ) inode - > i_private ;
struct ftrace_buffer_info * info ;
if ( tracing_disabled )
return - ENODEV ;
info = kzalloc ( sizeof ( * info ) , GFP_KERNEL ) ;
if ( ! info )
return - ENOMEM ;
info - > tr = & global_trace ;
info - > cpu = cpu ;
2009-04-02 15:16:59 +08:00
info - > spare = NULL ;
2008-12-01 22:20:19 -05:00
/* Force reading ring buffer for first read */
info - > read = ( unsigned int ) - 1 ;
filp - > private_data = info ;
2009-04-02 15:16:56 +08:00
return nonseekable_open ( inode , filp ) ;
2008-12-01 22:20:19 -05:00
}
static ssize_t
tracing_buffers_read ( struct file * filp , char __user * ubuf ,
size_t count , loff_t * ppos )
{
struct ftrace_buffer_info * info = filp - > private_data ;
ssize_t ret ;
size_t size ;
2009-03-04 19:10:05 -05:00
if ( ! count )
return 0 ;
2009-04-02 15:16:59 +08:00
if ( ! info - > spare )
2011-05-03 17:56:42 -07:00
info - > spare = ring_buffer_alloc_read_page ( info - > tr - > buffer , info - > cpu ) ;
2009-04-02 15:16:59 +08:00
if ( ! info - > spare )
return - ENOMEM ;
2008-12-01 22:20:19 -05:00
/* Do we have previous read data to read? */
if ( info - > read < PAGE_SIZE )
goto read ;
2010-01-06 20:08:50 +08:00
trace_access_lock ( info - > cpu ) ;
2008-12-01 22:20:19 -05:00
ret = ring_buffer_read_page ( info - > tr - > buffer ,
& info - > spare ,
count ,
info - > cpu , 0 ) ;
2010-01-06 20:08:50 +08:00
trace_access_unlock ( info - > cpu ) ;
2008-12-01 22:20:19 -05:00
if ( ret < 0 )
return 0 ;
2011-10-14 10:44:25 -04:00
info - > read = 0 ;
2008-12-01 22:20:19 -05:00
read :
size = PAGE_SIZE - info - > read ;
if ( size > count )
size = count ;
ret = copy_to_user ( ubuf , info - > spare + info - > read , size ) ;
2009-03-04 19:10:05 -05:00
if ( ret = = size )
2008-12-01 22:20:19 -05:00
return - EFAULT ;
2009-03-04 19:10:05 -05:00
size - = ret ;
2008-12-01 22:20:19 -05:00
* ppos + = size ;
info - > read + = size ;
return size ;
}
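tracing_buffers_read() above keeps a spare page per open file and an offset (info->read) into it, refilling the page only once it has been fully handed out. The pattern can be sketched in userspace like this, with a dummy refill() standing in for ring_buffer_read_page(). All names are hypothetical, and error handling and the copy to user space are omitted.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096   /* assumed page size */

struct page_reader {
	unsigned char spare[SKETCH_PAGE_SIZE]; /* cached page of data */
	size_t read;                           /* bytes of spare already returned */
	unsigned char next_fill;               /* state of the dummy data source */
};

static void reader_init(struct page_reader *r)
{
	/* Force a refill on the first read, like "info->read = (unsigned int)-1". */
	r->read = (size_t)-1;
	r->next_fill = 0;
}

/* Dummy producer: fill the page with a byte value that changes per page. */
static void refill(struct page_reader *r)
{
	memset(r->spare, r->next_fill++, sizeof(r->spare));
	r->read = 0;
}

/* Copy up to count bytes from the cached page; refill when it is used up. */
static size_t reader_read(struct page_reader *r, unsigned char *out, size_t count)
{
	size_t size;

	if (!count)
		return 0;
	if (r->read >= SKETCH_PAGE_SIZE)
		refill(r);

	size = SKETCH_PAGE_SIZE - r->read;
	if (size > count)
		size = count;
	memcpy(out, r->spare + r->read, size);
	r->read += size;
	return size;
}
```

A caller asking for 3000 bytes three times gets 3000, then the remaining 1096 bytes of the first page, then 3000 bytes from a fresh page. A single call never crosses a page boundary, which matches the short reads the listing produces.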
static int tracing_buffers_release ( struct inode * inode , struct file * file )
{
struct ftrace_buffer_info * info = file - > private_data ;
2009-04-02 15:16:59 +08:00
if ( info - > spare )
ring_buffer_free_read_page ( info - > tr - > buffer , info - > spare ) ;
2008-12-01 22:20:19 -05:00
kfree ( info ) ;
return 0 ;
}
struct buffer_ref {
struct ring_buffer * buffer ;
void * page ;
int ref ;
} ;
static void buffer_pipe_buf_release ( struct pipe_inode_info * pipe ,
struct pipe_buffer * buf )
{
struct buffer_ref * ref = ( struct buffer_ref * ) buf - > private ;
if ( - - ref - > ref )
return ;
ring_buffer_free_read_page ( ref - > buffer , ref - > page ) ;
kfree ( ref ) ;
buf - > private = 0 ;
}
static void buffer_pipe_buf_get ( struct pipe_inode_info * pipe ,
struct pipe_buffer * buf )
{
struct buffer_ref * ref = ( struct buffer_ref * ) buf - > private ;
ref - > ref + + ;
}
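buffer_pipe_buf_get() and buffer_pipe_buf_release() above implement a plain reference count on a page handed to the pipe: each holder takes a reference, and the page is freed when the last one is dropped. A userspace sketch of the same pattern, using malloc() and free() in place of the ring buffer page helpers (page_ref and its functions are hypothetical names):

```c
#include <stdlib.h>

struct page_ref {
	void *page; /* the shared payload */
	int ref;    /* number of current holders */
};

static int pages_freed; /* counts final releases, for illustration only */

/* Allocate a payload of the given size with one initial reference. */
static struct page_ref *page_ref_new(size_t size)
{
	struct page_ref *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;
	r->page = malloc(size);
	if (!r->page) {
		free(r);
		return NULL;
	}
	r->ref = 1;
	return r;
}

/* Take an additional reference. */
static void page_ref_get(struct page_ref *r)
{
	r->ref++;
}

/* Drop a reference; free the payload when the last holder lets go. */
static void page_ref_put(struct page_ref *r)
{
	if (--r->ref)
		return;
	free(r->page);
	free(r);
	pages_freed++;
}
```

Like the listing, the sketch uses a plain int counter, so it is only safe when something else already serializes callers. A version shared between threads would need an atomic counter or a lock.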
/* Pipe buffer operations for a buffer. */
2009-12-15 16:46:48 -08:00
static const struct pipe_buf_operations buffer_pipe_buf_ops = {
2008-12-01 22:20:19 -05:00
. can_merge = 0 ,
. map = generic_pipe_buf_map ,
. unmap = generic_pipe_buf_unmap ,
. confirm = generic_pipe_buf_confirm ,
. release = buffer_pipe_buf_release ,
2012-08-09 21:31:10 +09:00
. steal = generic_pipe_buf_steal ,
2008-12-01 22:20:19 -05:00
. get = buffer_pipe_buf_get ,
} ;
/*
* Callback from splice_to_pipe ( ) , if we need to release some pages
* at the end of the spd in case we error ' ed out in filling the pipe .
*/
static void buffer_spd_release ( struct splice_pipe_desc * spd , unsigned int i )
{
struct buffer_ref * ref =
( struct buffer_ref * ) spd - > partial [ i ] . private ;
if ( - - ref - > ref )
return ;
ring_buffer_free_read_page ( ref - > buffer , ref - > page ) ;
kfree ( ref ) ;
spd - > partial [ i ] . private = 0 ;
}
static ssize_t
tracing_buffers_splice_read ( struct file * file , loff_t * ppos ,
struct pipe_inode_info * pipe , size_t len ,
unsigned int flags )
{
struct ftrace_buffer_info * info = file - > private_data ;
2010-05-20 10:43:18 +02:00
struct partial_page partial_def [ PIPE_DEF_BUFFERS ] ;
struct page * pages_def [ PIPE_DEF_BUFFERS ] ;
2008-12-01 22:20:19 -05:00
struct splice_pipe_desc spd = {
2010-05-20 10:43:18 +02:00
. pages = pages_def ,
. partial = partial_def ,
2012-06-12 15:24:40 +02:00
. nr_pages_max = PIPE_DEF_BUFFERS ,
2008-12-01 22:20:19 -05:00
. flags = flags ,
. ops = & buffer_pipe_buf_ops ,
. spd_release = buffer_spd_release ,
} ;
struct buffer_ref * ref ;
2009-04-29 00:23:13 -04:00
int entries , size , i ;
2008-12-01 22:20:19 -05:00
size_t ret ;
2010-05-20 10:43:18 +02:00
if ( splice_grow_spd ( pipe , & spd ) )
return - ENOMEM ;
tracing: fix splice return too large
I got these from strace:
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 16384
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
I wanted to splice_read 4096 bytes, but it returned 8192 or more.
This is because the return value of tracing_buffers_splice_read()
does not include the "zero out any left over data" bytes,
whereas tracing_buffers_read() does include them. This patch makes
the two consistent.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49D46674.9030804@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-02 15:17:08 +08:00
if ( * ppos & ( PAGE_SIZE - 1 ) ) {
2010-05-20 10:43:18 +02:00
ret = - EINVAL ;
goto out ;
2009-04-02 15:17:08 +08:00
}
if ( len & ( PAGE_SIZE - 1 ) ) {
2010-05-20 10:43:18 +02:00
if ( len < PAGE_SIZE ) {
ret = - EINVAL ;
goto out ;
}
2009-04-02 15:17:08 +08:00
len & = PAGE_MASK ;
}
2010-01-06 20:08:50 +08:00
	trace_access_lock(info->cpu);
2009-04-29 00:23:13 -04:00
	entries = ring_buffer_entries_cpu(info->tr->buffer, info->cpu);
2010-05-20 10:43:18 +02:00
	for (i = 0; i < pipe->buffers && len && entries; i++, len -= PAGE_SIZE) {
2008-12-01 22:20:19 -05:00
		struct page *page;
		int r;
		ref = kzalloc(sizeof(*ref), GFP_KERNEL);
		if (!ref)
			break;
2009-04-29 00:16:21 -04:00
		ref->ref = 1;
2008-12-01 22:20:19 -05:00
		ref->buffer = info->tr->buffer;
2011-05-03 17:56:42 -07:00
		ref->page = ring_buffer_alloc_read_page(ref->buffer, info->cpu);
2008-12-01 22:20:19 -05:00
		if (!ref->page) {
			kfree(ref);
			break;
		}
		r = ring_buffer_read_page(ref->buffer, &ref->page,
2009-04-29 00:26:30 -04:00
					  len, info->cpu, 1);
2008-12-01 22:20:19 -05:00
		if (r < 0) {
2011-05-03 17:56:42 -07:00
			ring_buffer_free_read_page(ref->buffer, ref->page);
2008-12-01 22:20:19 -05:00
			kfree(ref);
			break;
		}
		/*
		 * zero out any left over data, this is going to
		 * user land.
		 */
		size = ring_buffer_page_len(ref->page);
		if (size < PAGE_SIZE)
			memset(ref->page + size, 0, PAGE_SIZE - size);
		page = virt_to_page(ref->page);
		spd.pages[i] = page;
		spd.partial[i].len = PAGE_SIZE;
		spd.partial[i].offset = 0;
		spd.partial[i].private = (unsigned long)ref;
		spd.nr_pages++;
2009-04-02 15:17:08 +08:00
		*ppos += PAGE_SIZE;
2009-04-29 00:23:13 -04:00
		entries = ring_buffer_entries_cpu(info->tr->buffer, info->cpu);
2008-12-01 22:20:19 -05:00
	}
2010-01-06 20:08:50 +08:00
	trace_access_unlock(info->cpu);
2008-12-01 22:20:19 -05:00
	spd.nr_pages = i;
	/* did we read anything? */
	if (!spd.nr_pages) {
		if (flags & SPLICE_F_NONBLOCK)
			ret = -EAGAIN;
		else
			ret = 0;
		/* TODO: block */
2010-05-20 10:43:18 +02:00
		goto out;
2008-12-01 22:20:19 -05:00
	}
	ret = splice_to_pipe(pipe, &spd);
2012-06-12 15:24:40 +02:00
	splice_shrink_spd(&spd);
2010-05-20 10:43:18 +02:00
out:
2008-12-01 22:20:19 -05:00
	return ret;
}
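The length check near the top of tracing_buffers_splice_read() can be tried out in user space. The following is a minimal sketch, not the kernel code: `SKETCH_PAGE_SIZE`, `SKETCH_PAGE_MASK` and `round_splice_len()` are hypothetical names, and a 4096-byte page is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumption: 4 KiB pages; the kernel uses PAGE_SIZE and PAGE_MASK. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/*
 * Hypothetical helper mirroring the check above: an unaligned request
 * shorter than one page is rejected, any other unaligned request is
 * rounded down to a whole number of pages.
 * Returns 0 on success or -EINVAL.
 */
static int round_splice_len(size_t len, size_t *out)
{
	assert(out != NULL);
	if (len & (SKETCH_PAGE_SIZE - 1)) {
		if (len < SKETCH_PAGE_SIZE)
			return -EINVAL;
		len &= SKETCH_PAGE_MASK;
	}
	*out = len;
	return 0;
}
```

With these assumptions a request for 10000 bytes is trimmed to 8192, while a request for 100 bytes fails with -EINVAL.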
static const struct file_operations tracing_buffers_fops = {
	.open		= tracing_buffers_open,
	.read		= tracing_buffers_read,
	.release	= tracing_buffers_release,
	.splice_read	= tracing_buffers_splice_read,
	.llseek		= no_llseek,
};
2009-04-29 18:03:45 -04:00
static ssize_t
tracing_stats_read(struct file *filp, char __user *ubuf,
		   size_t count, loff_t *ppos)
{
	unsigned long cpu = (unsigned long)filp->private_data;
	struct trace_array *tr = &global_trace;
	struct trace_seq *s;
	unsigned long cnt;
2011-08-16 14:46:16 -07:00
	unsigned long long t;
	unsigned long usec_rem;
2009-04-29 18:03:45 -04:00
2009-06-15 10:57:28 +08:00
	s = kmalloc(sizeof(*s), GFP_KERNEL);
2009-04-29 18:03:45 -04:00
	if (!s)
2009-11-11 22:26:35 +01:00
		return -ENOMEM;
2009-04-29 18:03:45 -04:00
	trace_seq_init(s);
	cnt = ring_buffer_entries_cpu(tr->buffer, cpu);
	trace_seq_printf(s, "entries: %ld\n", cnt);
	cnt = ring_buffer_overrun_cpu(tr->buffer, cpu);
	trace_seq_printf(s, "overrun: %ld\n", cnt);
	cnt = ring_buffer_commit_overrun_cpu(tr->buffer, cpu);
	trace_seq_printf(s, "commit overrun: %ld\n", cnt);
2011-08-16 14:46:16 -07:00
	cnt = ring_buffer_bytes_cpu(tr->buffer, cpu);
	trace_seq_printf(s, "bytes: %ld\n", cnt);
2012-11-13 12:18:23 -08:00
	if (trace_clocks[trace_clock_id].in_ns) {
		/* local or global for trace_clock */
		t = ns2usecs(ring_buffer_oldest_event_ts(tr->buffer, cpu));
		usec_rem = do_div(t, USEC_PER_SEC);
		trace_seq_printf(s, "oldest event ts: %5llu.%06lu\n",
				 t, usec_rem);
		t = ns2usecs(ring_buffer_time_stamp(tr->buffer, cpu));
		usec_rem = do_div(t, USEC_PER_SEC);
		trace_seq_printf(s, "now ts: %5llu.%06lu\n", t, usec_rem);
	} else {
		/* counter or tsc mode for trace_clock */
		trace_seq_printf(s, "oldest event ts: %llu\n",
				 ring_buffer_oldest_event_ts(tr->buffer, cpu));
2011-08-16 14:46:16 -07:00
2012-11-13 12:18:23 -08:00
		trace_seq_printf(s, "now ts: %llu\n",
				 ring_buffer_time_stamp(tr->buffer, cpu));
	}
2011-08-16 14:46:16 -07:00
2011-07-15 14:23:58 -07:00
	cnt = ring_buffer_dropped_events_cpu(tr->buffer, cpu);
	trace_seq_printf(s, "dropped events: %ld\n", cnt);
2009-04-29 18:03:45 -04:00
	count = simple_read_from_buffer(ubuf, count, ppos, s->buffer, s->len);
	kfree(s);
	return count;
}
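The in_ns branch of tracing_stats_read() converts a nanosecond timestamp to microseconds and then splits it into whole seconds and a microsecond remainder. The arithmetic can be sketched in user space as follows. `format_ts()` and `SKETCH_USEC_PER_SEC` are hypothetical names, and the round-to-nearest nanosecond conversion is an assumption about what ns2usecs() does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_USEC_PER_SEC 1000000ULL

/*
 * Hypothetical helper: format a nanosecond timestamp as
 * "seconds.microseconds" using the same "%5llu.%06lu" layout as above.
 * Assumption: the ns -> us conversion rounds to the nearest microsecond.
 * Returns what snprintf() returns.
 */
static int format_ts(uint64_t ns, char *buf, size_t len)
{
	uint64_t usecs = (ns + 500) / 1000;
	unsigned long long secs = usecs / SKETCH_USEC_PER_SEC;
	unsigned long rem = (unsigned long)(usecs % SKETCH_USEC_PER_SEC);

	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%5llu.%06lu", secs, rem);
}
```

For example, 1234567890 ns becomes 1234568 us, which is printed as 1 second and 234568 microseconds, right-aligned in a five-character seconds field.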
static const struct file_operations tracing_stats_fops = {
	.open		= tracing_open_generic,
	.read		= tracing_stats_read,
2010-07-07 23:40:11 +02:00
	.llseek		= generic_file_llseek,
2009-04-29 18:03:45 -04:00
};
2008-05-12 21:20:42 +02:00
#ifdef CONFIG_DYNAMIC_FTRACE
2008-10-30 16:08:33 -04:00
int __weak ftrace_arch_read_dyn_info(char *buf, int size)
{
	return 0;
}
2008-05-12 21:20:42 +02:00
static ssize_t
2008-10-30 16:08:33 -04:00
tracing_read_dyn_info(struct file *filp, char __user *ubuf,
2008-05-12 21:20:42 +02:00
		      size_t cnt, loff_t *ppos)
{
2008-10-31 00:03:22 -04:00
	static char ftrace_dyn_info_buffer[1024];
	static DEFINE_MUTEX(dyn_info_mutex);
2008-05-12 21:20:42 +02:00
	unsigned long *p = filp->private_data;
2008-10-30 16:08:33 -04:00
	char *buf = ftrace_dyn_info_buffer;
2008-10-31 00:03:22 -04:00
	int size = ARRAY_SIZE(ftrace_dyn_info_buffer);
2008-05-12 21:20:42 +02:00
	int r;
2008-10-30 16:08:33 -04:00
	mutex_lock(&dyn_info_mutex);
	r = sprintf(buf, "%ld ", *p);
2008-05-12 21:20:46 +02:00
2008-10-31 00:03:22 -04:00
	r += ftrace_arch_read_dyn_info(buf + r, (size - 1) - r);
2008-10-30 16:08:33 -04:00
	buf[r++] = '\n';
	r = simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
	mutex_unlock(&dyn_info_mutex);
	return r;
2008-05-12 21:20:42 +02:00
}
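tracing_read_dyn_info() passes (size - 1) - r to the architecture hook so that one byte always remains for the trailing newline. A minimal user-space sketch of that bookkeeping follows. `build_dyn_info()` and `sketch_arch_info()` are hypothetical names, and the stand-in hook simply reports nothing, like the __weak default above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for an architecture hook; it appends nothing. */
static int sketch_arch_info(char *buf, int size)
{
	(void)buf;
	(void)size;
	return 0;
}

/*
 * Build "<count> <arch info>\n" in buf. The hook is offered one byte
 * less than what is left, so the newline always fits.
 * Returns the number of bytes written; the result is NOT NUL-terminated,
 * because the newline overwrites the terminator.
 */
static int build_dyn_info(char *buf, int size, unsigned long count)
{
	int r;

	/* 32 bytes comfortably hold a 64-bit decimal count plus a space. */
	assert(buf != NULL && size >= 32);
	r = snprintf(buf, (size_t)size, "%lu ", count);
	r += sketch_arch_info(buf + r, (size - 1) - r);
	buf[r++] = '\n';
	return r;
}
```

The caller is expected to use the returned length rather than treat the buffer as a C string, which matches how the length r is handed to simple_read_from_buffer() above.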
2009-03-05 21:44:55 -05:00
static const struct file_operations tracing_dyn_info_fops = {
2008-05-12 21:20:46 +02:00
	.open		= tracing_open_generic,
2008-10-30 16:08:33 -04:00
	.read		= tracing_read_dyn_info,
2010-07-07 23:40:11 +02:00
	.llseek		= generic_file_llseek,
2008-05-12 21:20:42 +02:00
};
#endif
static struct dentry *d_tracer;
struct dentry *tracing_init_dentry(void)
{
	static int once;
	if (d_tracer)
		return d_tracer;
2009-03-22 23:10:45 +01:00
	if (!debugfs_initialized())
		return NULL;
2008-05-12 21:20:42 +02:00
	d_tracer = debugfs_create_dir("tracing", NULL);
	if (!d_tracer && !once) {
		once = 1;
		pr_warning("Could not create debugfs directory 'tracing'\n");
		return NULL;
	}
	return d_tracer;
}
tracing/core: introduce per cpu tracing files
Impact: split up tracing output per cpu
Currently, in the tracing debugfs directory, three files are
available to let the user extract the trace output:
- trace is an iterator through the ring-buffer. It's a reader
but not a consumer. It doesn't block when no more traces are
available.
- trace is pretty similar to the former, except that it adds more
information such as preempt count, irq flag, ...
- trace_pipe is a reader and a consumer, it will also block
waiting for traces if necessary (heh, yes it's a pipe).
The traces coming from different cpus are currently mixed up
inside these files. Sometimes that garbles the information,
sometimes it's useful, depending on what the tracer
captures.
The tracing_cpumask file is useful to filter the output and
select only the traces captured on a custom-defined set of cpus.
But it is still not powerful enough to extract one trace buffer
per cpu at the same time.
So this patch creates a new directory: /debug/tracing/per_cpu/.
Inside this directory, you will now find one trace_pipe file and
one trace file per cpu.
Which means if you have two cpus, you will have:
trace0
trace1
trace_pipe0
trace_pipe1
And of course, reading these files will have the same effect
as with the usual tracing files, except that you will only see
the traces from the given cpu.
The original all-in-one cpu trace files are still available in
their original place.
Until now, only one consumer was allowed on trace_pipe to avoid
racy consuming on the ring-buffer. Now the approach has changed a
bit: you can have only one consumer per cpu.
Which means you are allowed to read trace_pipe0 and trace_pipe1
concurrently, but you can't have two readers on trace_pipe0 or
trace_pipe1.
Following the same logic, if there is one reader on the common
trace_pipe, you cannot at the same time have another reader on
trace_pipe0 or on trace_pipe1, because trace_pipe is in essence
already a consumer of all cpu buffers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 03:22:28 +01:00
static struct dentry *d_percpu;
struct dentry *tracing_dentry_percpu(void)
{
	static int once;
	struct dentry *d_tracer;
	if (d_percpu)
		return d_percpu;
	d_tracer = tracing_init_dentry();
	if (!d_tracer)
		return NULL;
	d_percpu = debugfs_create_dir("per_cpu", d_tracer);
	if (!d_percpu && !once) {
		once = 1;
		pr_warning("Could not create debugfs directory 'per_cpu'\n");
		return NULL;
	}
	return d_percpu;
}
static void tracing_init_debugfs_percpu(long cpu)
{
	struct dentry *d_percpu = tracing_dentry_percpu();
2009-03-27 00:25:38 +01:00
	struct dentry *d_cpu;
2010-10-20 21:51:26 -04:00
	char cpu_dir[30]; /* 30 characters should be more than enough */
2009-02-25 03:22:28 +01:00
2012-04-23 10:11:57 +09:00
	if (!d_percpu)
		return;
2010-10-20 21:51:26 -04:00
	snprintf(cpu_dir, 30, "cpu%ld", cpu);
2009-02-26 00:41:38 +01:00
	d_cpu = debugfs_create_dir(cpu_dir, d_percpu);
	if (!d_cpu) {
		pr_warning("Could not create debugfs '%s' entry\n", cpu_dir);
		return;
	}
2009-02-25 03:22:28 +01:00
2009-02-26 00:41:38 +01:00
	/* per cpu trace_pipe */
2009-03-27 00:25:38 +01:00
	trace_create_file("trace_pipe", 0444, d_cpu,
			  (void *)cpu, &tracing_pipe_fops);
2009-02-25 03:22:28 +01:00
	/* per cpu trace */
2009-03-27 00:25:38 +01:00
	trace_create_file("trace", 0644, d_cpu,
			  (void *)cpu, &tracing_fops);
2009-03-13 00:37:42 -04:00
2009-03-27 00:25:38 +01:00
	trace_create_file("trace_pipe_raw", 0444, d_cpu,
			  (void *)cpu, &tracing_buffers_fops);
2009-03-13 00:37:42 -04:00
2009-04-29 18:03:45 -04:00
	trace_create_file("stats", 0444, d_cpu,
			  (void *)cpu, &tracing_stats_fops);
2012-02-02 12:00:41 -08:00
	trace_create_file("buffer_size_kb", 0444, d_cpu,
			  (void *)cpu, &tracing_entries_fops);
2009-02-25 03:22:28 +01:00
}
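tracing_init_debugfs_percpu() builds a "cpu<N>" directory name with snprintf() and then creates five files inside it. The resulting layout, relative to the tracing debugfs directory, can be enumerated with a small user-space sketch. `percpu_path()` and `percpu_files` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* File names created for each CPU by the function above. */
static const char *const percpu_files[] = {
	"trace_pipe", "trace", "trace_pipe_raw", "stats", "buffer_size_kb",
};
#define SKETCH_NR_FILES (sizeof(percpu_files) / sizeof(percpu_files[0]))

/*
 * Hypothetical helper: write "per_cpu/cpu<N>/<file>" into buf.
 * Returns the length snprintf() reports, or -1 if idx is out of range.
 */
static int percpu_path(char *buf, size_t len, long cpu, size_t idx)
{
	assert(buf != NULL && len > 0);
	if (idx >= SKETCH_NR_FILES)
		return -1;
	return snprintf(buf, len, "per_cpu/cpu%ld/%s", cpu, percpu_files[idx]);
}
```

For cpu 1 and index 0 this yields per_cpu/cpu1/trace_pipe. A caller that may pass a short buffer should compare the return value against len to detect truncation.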
2008-05-12 21:20:44 +02:00
#ifdef CONFIG_FTRACE_SELFTEST
/* Let selftest have access to static functions in this file */
#include "trace_selftest.c"
#endif
2009-02-26 23:43:05 -05:00
struct trace_option_dentry {
	struct tracer_opt		*opt;
	struct tracer_flags		*flags;
	struct dentry			*entry;
};
static ssize_t
trace_options_read(struct file *filp, char __user *ubuf, size_t cnt,
		   loff_t *ppos)
{
	struct trace_option_dentry *topt = filp->private_data;
	char *buf;
	if (topt->flags->val & topt->opt->bit)
		buf = "1\n";
	else
		buf = "0\n";
	return simple_read_from_buffer(ubuf, cnt, ppos, buf, 2);
}
static ssize_t
trace_options_write(struct file *filp, const char __user *ubuf, size_t cnt,
		    loff_t *ppos)
{
	struct trace_option_dentry *topt = filp->private_data;
	unsigned long val;
	int ret;
2011-06-07 21:58:27 +02:00
	ret = kstrtoul_from_user(ubuf, cnt, 10, &val);
	if (ret)
2009-02-26 23:43:05 -05:00
		return ret;
2009-12-08 11:17:06 +08:00
	if (val != 0 && val != 1)
		return -EINVAL;
2009-02-26 23:43:05 -05:00
2009-12-08 11:17:06 +08:00
	if (!!(topt->flags->val & topt->opt->bit) != val) {
2009-02-26 23:43:05 -05:00
		mutex_lock(&trace_types_lock);
2009-12-08 11:17:06 +08:00
		ret = __set_tracer_option(current_trace, topt->flags,
2009-12-21 22:35:16 -05:00
					  topt->opt, !val);
2009-02-26 23:43:05 -05:00
		mutex_unlock(&trace_types_lock);
		if (ret)
			return ret;
	}
	*ppos += cnt;
	return cnt;
}
static const struct file_operations trace_options_fops = {
	.open = tracing_open_generic,
	.read = trace_options_read,
	.write = trace_options_write,
2010-07-07 23:40:11 +02:00
	.llseek	= generic_file_llseek,
2009-02-26 23:43:05 -05:00
};
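trace_options_write() parses a decimal number from the user buffer and rejects anything other than 0 or 1. The validation step can be sketched in user space as follows. `parse_bool_option()` is a hypothetical name, and strtoul() is only a stand-in for kstrtoul_from_user(): the two do not accept exactly the same inputs (strtoul() tolerates leading whitespace and a sign, for instance).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical user-space analogue of the write handler's validation:
 * parse a decimal number, tolerate one trailing newline, and accept
 * only 0 or 1. Returns 0 on success or -EINVAL.
 */
static int parse_bool_option(const char *s, unsigned long *val)
{
	char *end;
	unsigned long v;

	assert(s != NULL && val != NULL);
	errno = 0;
	v = strtoul(s, &end, 10);
	if (end == s || errno)
		return -EINVAL;	/* no digits, or out of range */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;	/* trailing garbage */
	if (v != 0 && v != 1)
		return -EINVAL;
	*val = v;
	return 0;
}
```

The trailing-newline allowance reflects the usual `echo 1 > file` usage, where the shell appends a newline to the value.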
2009-02-26 22:19:12 -05:00
static ssize_t
trace_options_core_read(struct file *filp, char __user *ubuf, size_t cnt,
			loff_t *ppos)
{
	long index = (long)filp->private_data;
	char *buf;
	if (trace_flags & (1 << index))
		buf = "1\n";
	else
		buf = "0\n";
	return simple_read_from_buffer(ubuf, cnt, ppos, buf, 2);
}
static ssize_t
trace_options_core_write(struct file *filp, const char __user *ubuf, size_t cnt,
			 loff_t *ppos)
{
	long index = (long)filp->private_data;
	unsigned long val;
	int ret;
2011-06-07 21:58:27 +02:00
	ret = kstrtoul_from_user(ubuf, cnt, 10, &val);
	if (ret)
2009-02-26 22:19:12 -05:00
		return ret;
2009-08-07 18:55:48 +08:00
	if (val != 0 && val != 1)
2009-02-26 22:19:12 -05:00
		return -EINVAL;
2009-08-07 18:55:48 +08:00
	set_tracer_flags(1 << index, val);
2009-02-26 22:19:12 -05:00
	*ppos += cnt;
	return cnt;
}
static const struct file_operations trace_options_core_fops = {
	.open = tracing_open_generic,
	.read = trace_options_core_read,
	.write = trace_options_core_write,
2010-07-07 23:40:11 +02:00
	.llseek = generic_file_llseek,
2009-02-26 22:19:12 -05:00
};
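The core option files store a bit index in private_data and turn it into a mask with 1 << index. A minimal sketch of setting, clearing and testing such a bit is given below. `sketch_flags`, `sketch_set_flag()` and `sketch_flag_is_set()` are hypothetical names. The sketch shifts 1UL rather than the int constant used above, which is a deliberate difference that keeps high indexes well defined.

```c
#include <assert.h>

/* Hypothetical flag word standing in for trace_flags. */
static unsigned long sketch_flags;

#define SKETCH_FLAG_BITS ((long)(sizeof(sketch_flags) * 8))

/* Set the bit at index when enabled is non-zero, clear it otherwise. */
static void sketch_set_flag(long index, int enabled)
{
	assert(index >= 0 && index < SKETCH_FLAG_BITS);
	if (enabled)
		sketch_flags |= 1UL << index;
	else
		sketch_flags &= ~(1UL << index);
}

/* Report the bit as 0 or 1, matching what the read handler prints. */
static int sketch_flag_is_set(long index)
{
	assert(index >= 0 && index < SKETCH_FLAG_BITS);
	return !!(sketch_flags & (1UL << index));
}
```

Each option file then needs only its index to read or toggle its own bit, without knowing about the other options.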
2009-03-27 00:25:38 +01:00
struct dentry *trace_create_file(const char *name,
2011-07-24 04:33:43 -04:00
				 umode_t mode,
2009-03-27 00:25:38 +01:00
				 struct dentry *parent,
				 void *data,
				 const struct file_operations *fops)
{
	struct dentry *ret;
	ret = debugfs_create_file(name, mode, parent, data, fops);
	if (!ret)
		pr_warning("Could not create debugfs '%s' entry\n", name);
	return ret;
}
2009-02-26 22:19:12 -05:00
static struct dentry *trace_options_init_dentry(void)
{
	struct dentry *d_tracer;
	static struct dentry *t_options;
	if (t_options)
		return t_options;
	d_tracer = tracing_init_dentry();
	if (!d_tracer)
		return NULL;
	t_options = debugfs_create_dir("options", d_tracer);
	if (!t_options) {
		pr_warning("Could not create debugfs directory 'options'\n");
		return NULL;
	}
	return t_options;
}
2009-02-26 23:43:05 -05:00
static void
create_trace_option_file(struct trace_option_dentry *topt,
			 struct tracer_flags *flags,
			 struct tracer_opt *opt)
{
	struct dentry *t_options;
	t_options = trace_options_init_dentry();
	if (!t_options)
		return;
	topt->flags = flags;
	topt->opt = opt;
2009-03-27 00:25:38 +01:00
	topt->entry = trace_create_file(opt->name, 0644, t_options, topt,
2009-02-26 23:43:05 -05:00
					&trace_options_fops);
}
static struct trace_option_dentry *
create_trace_option_files(struct tracer *tracer)
{
	struct trace_option_dentry *topts;
	struct tracer_flags *flags;
	struct tracer_opt *opts;
	int cnt;
	if (!tracer)
		return NULL;
	flags = tracer->flags;
	if (!flags || !flags->opts)
		return NULL;
	opts = flags->opts;
	for (cnt = 0; opts[cnt].name; cnt++)
		;
2009-02-27 10:51:10 -05:00
	topts = kcalloc(cnt + 1, sizeof(*topts), GFP_KERNEL);
2009-02-26 23:43:05 -05:00
	if (!topts)
		return NULL;
	for (cnt = 0; opts[cnt].name; cnt++)
		create_trace_option_file(&topts[cnt], flags,
					 &opts[cnt]);
	return topts;
}
static void
destroy_trace_option_files(struct trace_option_dentry *topts)
{
	int cnt;
	if (!topts)
		return;
	for (cnt = 0; topts[cnt].opt; cnt++) {
		if (topts[cnt].entry)
			debugfs_remove(topts[cnt].entry);
	}
	kfree(topts);
}
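create_trace_option_files() allocates cnt + 1 zeroed entries so that destroy_trace_option_files() can stop at the first entry whose opt pointer is NULL, without a separate length field. The same sentinel-terminated pattern is sketched below in user space. The `sketch_opt` and `sketch_entry` types and the three helpers are hypothetical, and calloc()/free() stand in for kcalloc()/kfree().

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_opt { const char *name; };		/* hypothetical option */
struct sketch_entry { const struct sketch_opt *opt; };	/* hypothetical entry */

/* Count options up to, but not including, the NULL-name terminator. */
static size_t count_opts(const struct sketch_opt *opts)
{
	size_t cnt = 0;

	assert(opts != NULL);
	while (opts[cnt].name)
		cnt++;
	return cnt;
}

/* Allocate cnt + 1 zeroed entries; the extra one is the terminator. */
static struct sketch_entry *create_entries(const struct sketch_opt *opts)
{
	size_t cnt = count_opts(opts);
	struct sketch_entry *e = calloc(cnt + 1, sizeof(*e));
	size_t i;

	if (!e)
		return NULL;
	for (i = 0; i < cnt; i++)
		e[i].opt = &opts[i];
	return e;
}

/* Walk to the zeroed terminator, free the array, return entries seen. */
static size_t destroy_entries(struct sketch_entry *e)
{
	size_t n = 0;

	if (!e)
		return 0;
	while (e[n].opt)
		n++;
	free(e);
	return n;
}
```

The design relies on the allocator zeroing the memory: with plain malloc() the terminator entry would have to be cleared explicitly.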
2009-02-26 22:19:12 -05:00
static struct dentry *
create_trace_option_core_file(const char *option, long index)
{
	struct dentry *t_options;
	t_options = trace_options_init_dentry();
	if (!t_options)
		return NULL;
2009-03-27 00:25:38 +01:00
	return trace_create_file(option, 0644, t_options, (void *)index,
2009-02-26 22:19:12 -05:00
				 &trace_options_core_fops);
}
static __init void create_trace_options_dir(void)
{
	struct dentry *t_options;
	int i;
	t_options = trace_options_init_dentry();
	if (!t_options)
		return;
2009-03-27 00:25:38 +01:00
	for (i = 0; trace_options[i]; i++)
		create_trace_option_core_file(trace_options[i], i);
2009-02-26 22:19:12 -05:00
}
2012-02-22 15:50:28 -05:00
static ssize_t
rb_simple_read(struct file *filp, char __user *ubuf,
	       size_t cnt, loff_t *ppos)
{
2012-04-16 15:41:28 -04:00
	struct trace_array *tr = filp->private_data;
	struct ring_buffer *buffer = tr->buffer;
2012-02-22 15:50:28 -05:00
	char buf[64];
	int r;
	if (buffer)
		r = ring_buffer_record_is_on(buffer);
	else
		r = 0;
	r = sprintf(buf, "%d\n", r);
	return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
}
static ssize_t
rb_simple_write(struct file *filp, const char __user *ubuf,
		size_t cnt, loff_t *ppos)
{
2012-04-16 15:41:28 -04:00
	struct trace_array *tr = filp->private_data;
	struct ring_buffer *buffer = tr->buffer;
2012-02-22 15:50:28 -05:00
	unsigned long val;
	int ret;
	ret = kstrtoul_from_user(ubuf, cnt, 10, &val);
	if (ret)
		return ret;
	if (buffer) {
		if (val)
			ring_buffer_record_on(buffer);
		else
			ring_buffer_record_off(buffer);
	}
	(*ppos)++;
	return cnt;
}
static const struct file_operations rb_simple_fops = {
	.open		= tracing_open_generic,
	.read		= rb_simple_read,
	.write		= rb_simple_write,
	.llseek		= default_llseek,
};
2008-09-23 11:34:32 +01:00
static __init int tracer_init_debugfs(void)
2008-05-12 21:20:42 +02:00
{
	struct dentry *d_tracer;
2009-02-25 03:22:28 +01:00
int cpu ;
2008-05-12 21:20:42 +02:00
tracing: Consolidate protection of reader access to the ring buffer
At the beginning, access to the ring buffer was fully serialized
by trace_types_lock. Patch d7350c3f4569 gives more freedom to readers,
and patch b04cc6b1f6 adds code to protect trace_pipe and cpu#/trace_pipe.
But actually it is not enough, ring buffer readers are not always
read-only, they may consume data.
This patch makes accesses to trace, trace_pipe, trace_pipe_raw
cpu#/trace, cpu#/trace_pipe and cpu#/trace_pipe_raw serialized.
And removes tracing_reader_cpumask which is used to protect trace_pipe.
Details:
The ring buffer serializes readers, but that is only low-level protection.
The validity of the events (as returned by ring_buffer_peek(), etc.)
is not protected by the ring buffer.
The content of events may become garbage if we allow another process to consume
these events concurrently:
A) The page of the consumed events may become a normal page
(not a reader page) in the ring buffer, and this page will be rewritten
by the event producer.
B) The page of the consumed events may become a page for splice_read,
and this page will be returned to the system.
This patch adds the trace_access_lock() and trace_access_unlock() primitives.
These primitives allow multiple processes to access different cpu ring buffers
concurrently.
These primitives don't distinguish between read-only and read-consume access.
Multiple read-only accesses are also serialized.
We don't use these primitives when we open files;
we only use them when we read files.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B447D52.1050602@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-01-06 20:08:50 +08:00
trace_access_lock_init ( ) ;
2008-05-12 21:20:42 +02:00
d_tracer = tracing_init_dentry ( ) ;
2009-03-27 00:25:38 +01:00
	trace_create_file("trace_options", 0644, d_tracer,
			NULL, &tracing_iter_fops);
2008-05-12 21:20:42 +02:00
2009-03-27 00:25:38 +01:00
	trace_create_file("tracing_cpumask", 0644, d_tracer,
			NULL, &tracing_cpumask_fops);

	trace_create_file("trace", 0644, d_tracer,
			(void *) TRACE_PIPE_ALL_CPU, &tracing_fops);
2009-02-26 22:19:12 -05:00
2009-03-27 00:25:38 +01:00
	trace_create_file("available_tracers", 0444, d_tracer,
			&global_trace, &show_traces_fops);
2009-04-17 10:34:30 +08:00
	trace_create_file("current_tracer", 0644, d_tracer,
2009-03-27 00:25:38 +01:00
& global_trace , & set_tracer_fops ) ;
2009-08-27 16:52:21 -04:00
# ifdef CONFIG_TRACER_MAX_TRACE
2009-03-27 00:25:38 +01:00
	trace_create_file("tracing_max_latency", 0644, d_tracer,
			&tracing_max_latency, &tracing_max_lat_fops);
2010-02-25 15:36:43 -08:00
# endif
2009-03-27 00:25:38 +01:00
	trace_create_file("tracing_thresh", 0644, d_tracer,
			&tracing_thresh, &tracing_max_lat_fops);
2009-02-26 22:19:12 -05:00
2009-04-17 10:34:30 +08:00
	trace_create_file("README", 0444, d_tracer,
2009-03-27 00:25:38 +01:00
NULL , & tracing_readme_fops ) ;
	trace_create_file("trace_pipe", 0444, d_tracer,
2009-02-25 03:22:28 +01:00
( void * ) TRACE_PIPE_ALL_CPU , & tracing_pipe_fops ) ;
2009-03-27 00:25:38 +01:00
	trace_create_file("buffer_size_kb", 0644, d_tracer,
2012-02-02 12:00:41 -08:00
( void * ) RING_BUFFER_ALL_CPUS , & tracing_entries_fops ) ;
2009-03-27 00:25:38 +01:00
2011-08-16 14:46:15 -07:00
	trace_create_file("buffer_total_size_kb", 0444, d_tracer,
			&global_trace, &tracing_total_entries_fops);
2011-06-13 17:51:57 -07:00
	trace_create_file("free_buffer", 0644, d_tracer,
			&global_trace, &tracing_free_buffer_fops);
2009-03-27 00:25:38 +01:00
	trace_create_file("trace_marker", 0220, d_tracer,
			NULL, &tracing_mark_fops);
2008-09-16 22:06:42 +03:00
2009-04-10 16:04:48 -04:00
	trace_create_file("saved_cmdlines", 0444, d_tracer,
			NULL, &tracing_saved_cmdlines_fops);
2008-09-16 22:06:42 +03:00
2009-08-25 16:12:56 +08:00
	trace_create_file("trace_clock", 0644, d_tracer, NULL,
			&trace_clock_fops);
2012-02-22 15:50:28 -05:00
	trace_create_file("tracing_on", 0644, d_tracer,
2012-04-16 15:41:28 -04:00
& global_trace , & rb_simple_fops ) ;
2012-02-22 15:50:28 -05:00
2008-05-12 21:20:42 +02:00
# ifdef CONFIG_DYNAMIC_FTRACE
2009-03-27 00:25:38 +01:00
	trace_create_file("dyn_ftrace_total_info", 0444, d_tracer,
			&ftrace_update_tot_cnt, &tracing_dyn_info_fops);
2008-05-12 21:20:42 +02:00
# endif
2009-02-25 03:22:28 +01:00
2009-03-27 00:25:38 +01:00
create_trace_options_dir ( ) ;
2009-02-25 03:22:28 +01:00
for_each_tracing_cpu ( cpu )
tracing_init_debugfs_percpu ( cpu ) ;
2008-09-23 11:34:32 +01:00
return 0 ;
2008-05-12 21:20:42 +02:00
}
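The per-cpu commit message above describes a consumer rule: at most one consumer per cpu buffer, and a consumer on the common trace_pipe counts as a consumer of every cpu. The sketch below models only that bookkeeping with a bitmask. It is single-threaded, and all `toy_*` names and `TOY_NR_CPUS` are hypothetical. The later trace_access_lock() commit quoted above removes tracing_reader_cpumask and protects reads with its own primitives instead.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS	4	/* assumed cpu count for the example */
#define TOY_ALL_CPUS	(-1)	/* hypothetical marker for the common trace_pipe */

/* Bit n set means some consumer currently owns cpu n's buffer. */
static unsigned int toy_reader_mask;

/* Try to become the consumer for one cpu, or for all cpus at once. */
bool toy_pipe_open(int cpu)
{
	unsigned int want;

	if (cpu == TOY_ALL_CPUS)
		want = (1u << TOY_NR_CPUS) - 1;
	else if (cpu >= 0 && cpu < TOY_NR_CPUS)
		want = 1u << cpu;
	else
		return false;

	/* Refuse if any requested buffer already has a consumer. */
	if (toy_reader_mask & want)
		return false;

	toy_reader_mask |= want;
	return true;
}

/* Give up the buffers claimed by a matching toy_pipe_open() call. */
void toy_pipe_release(int cpu)
{
	assert(cpu == TOY_ALL_CPUS || (cpu >= 0 && cpu < TOY_NR_CPUS));

	if (cpu == TOY_ALL_CPUS)
		toy_reader_mask = 0;
	else
		toy_reader_mask &= ~(1u << cpu);
}
```

Treating the all-cpu reader as a claim on every bit is what makes it conflict with any per-cpu reader, in both directions.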
2008-07-30 22:36:46 -04:00
static int trace_panic_handler ( struct notifier_block * this ,
unsigned long event , void * unused )
{
2008-10-23 19:26:08 -04:00
if ( ftrace_dump_on_oops )
2010-04-18 19:08:41 +02:00
ftrace_dump ( ftrace_dump_on_oops ) ;
2008-07-30 22:36:46 -04:00
return NOTIFY_OK ;
}
static struct notifier_block trace_panic_notifier = {
. notifier_call = trace_panic_handler ,
. next = NULL ,
. priority = 150 /* priority: INT_MAX >= x >= 0 */
} ;
static int trace_die_handler ( struct notifier_block * self ,
unsigned long val ,
void * data )
{
switch ( val ) {
case DIE_OOPS :
2008-10-23 19:26:08 -04:00
if ( ftrace_dump_on_oops )
2010-04-18 19:08:41 +02:00
ftrace_dump ( ftrace_dump_on_oops ) ;
2008-07-30 22:36:46 -04:00
break ;
default :
break ;
}
return NOTIFY_OK ;
}
static struct notifier_block trace_die_notifier = {
. notifier_call = trace_die_handler ,
. priority = 200
} ;
/*
 * printk is set to max of 1024, we really don't need it that big.
 * Nothing should be printing 1000 characters anyway.
 */
#define TRACE_MAX_PRINT		1000

/*
 * Define here KERN_TRACE so that we have one place to modify
 * it if we decide to change what log level the ftrace dump
 * should be at.
 */
2009-01-14 12:24:42 -05:00
# define KERN_TRACE KERN_EMERG
2008-07-30 22:36:46 -04:00
2010-08-05 09:22:23 -05:00
void
2008-07-30 22:36:46 -04:00
trace_printk_seq ( struct trace_seq * s )
{
	/* Probably should print a warning here. */
	if (s->len >= TRACE_MAX_PRINT)
		s->len = TRACE_MAX_PRINT;

	/* should be zero ended, but we are paranoid. */
	s->buffer[s->len] = 0;

	printk(KERN_TRACE "%s", s->buffer);
2009-03-02 14:04:40 -05:00
trace_seq_init ( s ) ;
2008-07-30 22:36:46 -04:00
}
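trace_printk_seq() clamps the sequence length, forces a terminating NUL, prints, and resets the sequence. A small user-space sketch of that order of operations is shown here. `toy_seq`, `toy_seq_terminate`, `toy_print_seq`, and the 4096-byte buffer size are assumptions made for illustration, and `fputs` stands in for `printk`.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_MAX_PRINT	1000	/* mirrors TRACE_MAX_PRINT */
#define TOY_SEQ_SIZE	4096	/* assumed buffer size, larger than the cap */

struct toy_seq {
	char		buffer[TOY_SEQ_SIZE];
	unsigned int	len;
};

/* Clamp the length to the print cap and NUL-terminate; returns the new length. */
unsigned int toy_seq_terminate(struct toy_seq *s)
{
	assert(s != NULL);

	if (s->len >= TOY_MAX_PRINT)
		s->len = TOY_MAX_PRINT;

	/* The buffer should already be terminated, but do not rely on it. */
	s->buffer[s->len] = '\0';

	return s->len;
}

/* Print the (clamped) sequence, then reset it for reuse. */
void toy_print_seq(struct toy_seq *s, FILE *out)
{
	toy_seq_terminate(s);
	fputs(s->buffer, out);
	s->len = 0;
}
```

Clamping before terminating matters: the index used for the NUL byte is then always inside the buffer, whatever length the caller left behind.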
2010-08-05 09:22:23 -05:00
void trace_init_global_iter(struct trace_iterator *iter)
{
	iter->tr = &global_trace;
	iter->trace = current_trace;
	iter->cpu_file = TRACE_PIPE_ALL_CPU;
}
2010-04-18 19:08:41 +02:00
static void
__ftrace_dump ( bool disable_tracing , enum ftrace_dump_mode oops_dump_mode )
2008-07-30 22:36:46 -04:00
{
2009-12-02 19:49:50 +01:00
static arch_spinlock_t ftrace_dump_lock =
2009-12-03 12:38:57 +01:00
( arch_spinlock_t ) __ARCH_SPIN_LOCK_UNLOCKED ;
2008-07-30 22:36:46 -04:00
/* use static because iter can be a bit big for the stack */
static struct trace_iterator iter ;
2009-03-22 05:04:35 +01:00
unsigned int old_userobj ;
2008-07-30 22:36:46 -04:00
static int dump_ran ;
2008-10-01 00:29:53 -04:00
unsigned long flags ;
int cnt = 0 , cpu ;
2008-07-30 22:36:46 -04:00
/* only one dump */
2009-04-28 11:39:34 -04:00
local_irq_save ( flags ) ;
2009-12-02 20:01:25 +01:00
arch_spin_lock ( & ftrace_dump_lock ) ;
2008-07-30 22:36:46 -04:00
if ( dump_ran )
goto out ;
dump_ran = 1 ;
2009-01-14 14:50:19 -05:00
tracing_off ( ) ;
2009-03-22 05:04:35 +01:00
2011-09-29 21:26:16 -04:00
	/* Did function tracer already get disabled? */
	if (ftrace_is_dead()) {
		printk("# WARNING: FUNCTION TRACING IS CORRUPTED\n");
		printk("# MAY BE MISSING FUNCTION EVENTS\n");
	}
2009-03-22 05:04:35 +01:00
if ( disable_tracing )
ftrace_kill ( ) ;
2008-07-30 22:36:46 -04:00
2010-08-05 09:22:23 -05:00
trace_init_global_iter ( & iter ) ;
2008-10-01 00:29:53 -04:00
for_each_tracing_cpu ( cpu ) {
2010-08-05 09:22:23 -05:00
		atomic_inc(&iter.tr->data[cpu]->disabled);
2008-10-01 00:29:53 -04:00
}
2009-03-22 05:04:35 +01:00
old_userobj = trace_flags & TRACE_ITER_SYM_USEROBJ ;
2008-11-22 13:28:48 +02:00
	/* don't look at user memory in panic mode */
	trace_flags &= ~TRACE_ITER_SYM_USEROBJ;
2009-03-04 18:20:36 -05:00
/* Simulate the iterator */
2008-07-30 22:36:46 -04:00
iter . tr = & global_trace ;
iter . trace = current_trace ;
2010-04-18 19:08:41 +02:00
	switch (oops_dump_mode) {
	case DUMP_ALL:
		iter.cpu_file = TRACE_PIPE_ALL_CPU;
		break;
	case DUMP_ORIG:
		iter.cpu_file = raw_smp_processor_id();
		break;
	case DUMP_NONE:
		goto out_enable;
	default:
		printk(KERN_TRACE "Bad dumping mode, switching to all CPUs dump\n");
		iter.cpu_file = TRACE_PIPE_ALL_CPU;
	}

	printk(KERN_TRACE "Dumping ftrace buffer:\n");
2008-07-30 22:36:46 -04:00
	/*
	 * We need to stop all tracing on all CPUS to read
	 * the next buffer. This is a bit expensive, but is
	 * not done often. We fill all that we can read,
	 * and then release the locks again.
	 */

	while (!trace_empty(&iter)) {

		if (!cnt)
			printk(KERN_TRACE "---------------------------------\n");

		cnt++;

		/* reset all but tr, trace, and overruns */
		memset(&iter.seq, 0,
		       sizeof(struct trace_iterator) -
		       offsetof(struct trace_iterator, seq));
		iter.iter_flags |= TRACE_FILE_LAT_FMT;
		iter.pos = -1;
2010-08-05 09:22:23 -05:00
		if (trace_find_next_entry_inc(&iter) != NULL) {
2009-07-28 20:17:22 +08:00
			int ret;

			ret = print_trace_line(&iter);
			if (ret != TRACE_TYPE_NO_CONSUME)
				trace_consume(&iter);
2008-07-30 22:36:46 -04:00
}
2012-03-01 22:06:48 -05:00
touch_nmi_watchdog ( ) ;
2008-07-30 22:36:46 -04:00
trace_printk_seq ( & iter . seq ) ;
}
	if (!cnt)
		printk(KERN_TRACE "(ftrace buffer empty)\n");
	else
		printk(KERN_TRACE "---------------------------------\n");
2010-04-18 19:08:41 +02:00
out_enable :
2009-03-22 05:04:35 +01:00
	/* Re-enable tracing if requested */
	if (!disable_tracing) {
		trace_flags |= old_userobj;

		for_each_tracing_cpu(cpu) {
2010-08-05 09:22:23 -05:00
			atomic_dec(&iter.tr->data[cpu]->disabled);
2009-03-22 05:04:35 +01:00
}
tracing_on ( ) ;
}
2008-07-30 22:36:46 -04:00
out :
2009-12-02 20:01:25 +01:00
arch_spin_unlock ( & ftrace_dump_lock ) ;
2009-04-28 11:39:34 -04:00
local_irq_restore ( flags ) ;
2008-07-30 22:36:46 -04:00
}
2009-03-22 05:04:35 +01:00
/* By default: disable tracing after the dump */
2010-04-18 19:08:41 +02:00
void ftrace_dump ( enum ftrace_dump_mode oops_dump_mode )
2009-03-22 05:04:35 +01:00
{
2010-04-18 19:08:41 +02:00
__ftrace_dump ( true , oops_dump_mode ) ;
2009-03-22 05:04:35 +01:00
}
2011-10-02 11:01:15 -07:00
EXPORT_SYMBOL_GPL ( ftrace_dump ) ;
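Two decisions inside __ftrace_dump() are easy to isolate: the one-shot guard ("only one dump") and the mapping from dump mode to the cpu selection. The sketch below shows both in plain C. The `toy_*` names and the two sentinel values are hypothetical, and the locking that protects the guard in the kernel is deliberately left out.

```c
#include <assert.h>

/* Hypothetical mirror of the dump modes handled by the switch above. */
enum toy_dump_mode {
	TOY_DUMP_NONE,
	TOY_DUMP_ALL,
	TOY_DUMP_ORIG,
};

#define TOY_CPU_ALL	(-1)	/* dump every cpu buffer */
#define TOY_CPU_SKIP	(-2)	/* do not dump at all */

/*
 * Map a dump mode to the cpu whose buffer should be dumped.
 * Unknown modes fall back to dumping all cpus, like the default case.
 */
int toy_select_cpu(enum toy_dump_mode mode, int this_cpu)
{
	assert(this_cpu >= 0);

	switch (mode) {
	case TOY_DUMP_ALL:
		return TOY_CPU_ALL;
	case TOY_DUMP_ORIG:
		return this_cpu;
	case TOY_DUMP_NONE:
		return TOY_CPU_SKIP;
	default:
		return TOY_CPU_ALL;
	}
}

/* One-shot guard: returns 1 for the first caller and 0 for every later one. */
int toy_dump_once(void)
{
	static int dump_ran;

	if (dump_ran)
		return 0;

	dump_ran = 1;
	return 1;
}
```

In the kernel the guard is checked while holding ftrace_dump_lock with interrupts disabled; the sketch has no such protection and is only meaningful single-threaded.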
2009-03-22 05:04:35 +01:00
2008-09-29 23:02:41 -04:00
__init static int tracer_alloc_buffers ( void )
2008-05-12 21:20:42 +02:00
{
2009-03-11 13:42:01 -04:00
int ring_buf_size ;
2010-12-08 13:46:47 -08:00
enum ring_buffer_flags rb_flags ;
2008-05-12 21:20:43 +02:00
int i ;
2009-01-01 10:12:22 +10:30
int ret = - ENOMEM ;
2008-05-12 21:20:43 +02:00
2010-12-08 13:46:47 -08:00
2009-01-01 10:12:22 +10:30
if ( ! alloc_cpumask_var ( & tracing_buffer_mask , GFP_KERNEL ) )
goto out ;
if ( ! alloc_cpumask_var ( & tracing_cpumask , GFP_KERNEL ) )
goto out_free_buffer_mask ;
2008-05-12 21:20:43 +02:00
tracing: Add percpu buffers for trace_printk()
Currently, trace_printk() uses a single buffer to write into
to calculate the size and format needed to save the trace. To
do this safely in an SMP environment, a spin_lock() is taken
to only allow one writer at a time to the buffer. But this could
also affect what is being traced, and add synchronization that
would not be there otherwise.
Ideally, using percpu buffers would be useful, but since trace_printk()
is only used in development, having per cpu buffers for something
never used is a waste of space. Thus, the use of the trace_bprintk()
format section is changed to be used for static fmts as well as dynamic ones.
Then at boot up, we can check if the section that holds the trace_printk
formats is non-empty, and if it does contain something, then we
know a trace_printk() has been added to the kernel. At this time
the trace_printk per cpu buffers are allocated. A check is also
done at module load time in case a module is added that contains a
trace_printk().
Once the buffers are allocated, they are never freed. If you use
a trace_printk() then you should know what you are doing.
A buffer is made for each type of context:
normal
softirq
irq
nmi
The context is checked and the appropriate buffer is used.
This allows for totally lockless usage of trace_printk(),
and they no longer even disable interrupts.
Requested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-09-22 14:01:55 -04:00
/* Only allocate trace_printk buffers if a trace_printk exists */
if ( __stop___trace_bprintk_fmt ! = __start___trace_bprintk_fmt )
2012-10-11 10:15:05 -04:00
/* Must be called before global_trace.buffer is allocated */
2011-09-22 14:01:55 -04:00
trace_printk_init_buffers ( ) ;
2009-03-11 13:42:01 -04:00
/* To save memory, keep the ring buffer size to its minimum */
if ( ring_buffer_expanded )
ring_buf_size = trace_buf_size ;
else
ring_buf_size = 1 ;
2010-12-08 13:46:47 -08:00
rb_flags = trace_flags & TRACE_ITER_OVERWRITE ? RB_FL_OVERWRITE : 0 ;
2009-01-01 10:12:22 +10:30
cpumask_copy ( tracing_buffer_mask , cpu_possible_mask ) ;
cpumask_copy ( tracing_cpumask , cpu_all_mask ) ;
/* TODO: make the number of buffers hot pluggable with CPUS */
2010-12-08 13:46:47 -08:00
global_trace . buffer = ring_buffer_alloc ( ring_buf_size , rb_flags ) ;
2008-09-29 23:02:41 -04:00
	if (!global_trace.buffer) {
		printk(KERN_ERR "tracer: failed to allocate ring buffer!\n");
		WARN_ON(1);
2009-01-01 10:12:22 +10:30
goto out_free_cpumask ;
2008-05-12 21:20:43 +02:00
}
2012-02-22 15:50:28 -05:00
if ( global_trace . buffer_disabled )
tracing_off ( ) ;
2008-05-12 21:20:43 +02:00
2009-01-01 10:12:22 +10:30
2008-05-12 21:20:43 +02:00
# ifdef CONFIG_TRACER_MAX_TRACE
2010-12-08 13:46:47 -08:00
max_tr . buffer = ring_buffer_alloc ( 1 , rb_flags ) ;
2008-09-29 23:02:41 -04:00
	if (!max_tr.buffer) {
		printk(KERN_ERR "tracer: failed to allocate max ring buffer!\n");
		WARN_ON(1);
		ring_buffer_free(global_trace.buffer);
2009-01-01 10:12:22 +10:30
goto out_free_cpumask ;
2008-05-12 21:20:43 +02:00
}
2008-05-12 21:20:59 +02:00
# endif
2008-05-12 21:21:00 +02:00
2008-05-12 21:20:43 +02:00
/* Allocate the first page for all buffers */
2008-05-12 21:21:00 +02:00
for_each_tracing_cpu ( i ) {
2009-07-16 21:44:26 +02:00
global_trace . data [ i ] = & per_cpu ( global_trace_cpu , i ) ;
2009-10-29 22:34:13 +09:00
max_tr . data [ i ] = & per_cpu ( max_tr_data , i ) ;
2008-05-12 21:20:43 +02:00
}
2012-02-02 12:00:41 -08:00
2012-05-03 10:40:34 -07:00
set_buffer_entries ( & global_trace ,
ring_buffer_size ( global_trace . buffer , 0 ) ) ;
2012-02-02 12:00:41 -08:00
# ifdef CONFIG_TRACER_MAX_TRACE
set_buffer_entries ( & max_tr , 1 ) ;
# endif
2008-05-12 21:20:42 +02:00
trace_init_cmdlines ( ) ;
2012-11-01 20:54:21 -04:00
init_irq_work ( & trace_work_wakeup , trace_wake_up ) ;
2008-05-12 21:20:42 +02:00
2008-09-21 20:16:30 +02:00
register_tracer ( & nop_trace ) ;
2009-02-02 21:38:33 -05:00
current_trace = & nop_trace ;
2008-05-12 21:20:44 +02:00
/* All seems OK, enable tracing */
tracing_disabled = 0 ;
2008-09-29 23:02:41 -04:00
2008-07-30 22:36:46 -04:00
atomic_notifier_chain_register ( & panic_notifier_list ,
& trace_panic_notifier ) ;
register_die_notifier ( & trace_die_notifier ) ;
2009-03-16 01:45:03 +01:00
2012-11-01 22:56:07 -04:00
	while (trace_boot_options) {
		char *option;

		option = strsep(&trace_boot_options, ",");
		trace_set_options(option);
	}
2009-03-16 01:45:03 +01:00
return 0 ;
2008-07-30 22:36:46 -04:00
2009-01-01 10:12:22 +10:30
out_free_cpumask :
free_cpumask_var ( tracing_cpumask ) ;
out_free_buffer_mask :
free_cpumask_var ( tracing_buffer_mask ) ;
out :
return ret ;
2008-05-12 21:20:42 +02:00
}
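tracer_alloc_buffers() uses the usual goto-based unwind: each allocation that fails jumps to a label that frees only what was already allocated, in reverse order. The following user-space sketch shows the same shape with `malloc`. `toy_tracer`, `toy_alloc`, and the `fail_at` test hook are hypothetical and exist only to make the failure paths easy to exercise.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the two cpumasks and the ring buffer. */
struct toy_tracer {
	void *buffer_mask;
	void *cpumask;
	void *ring;
};

/* fail_at: 0 means no injected failure, N means the Nth allocation fails. */
static void *toy_alloc(int step, int fail_at)
{
	return step == fail_at ? NULL : malloc(16);
}

/* Returns 0 on success, -ENOMEM with nothing left allocated on failure. */
int toy_tracer_setup(struct toy_tracer *t, int fail_at)
{
	int ret = -ENOMEM;

	assert(t != NULL);
	t->buffer_mask = t->cpumask = t->ring = NULL;

	t->buffer_mask = toy_alloc(1, fail_at);
	if (!t->buffer_mask)
		goto out;

	t->cpumask = toy_alloc(2, fail_at);
	if (!t->cpumask)
		goto out_free_buffer_mask;

	t->ring = toy_alloc(3, fail_at);
	if (!t->ring)
		goto out_free_cpumask;

	return 0;

	/* Labels fall through, so later failures free more resources. */
out_free_cpumask:
	free(t->cpumask);
	t->cpumask = NULL;
out_free_buffer_mask:
	free(t->buffer_mask);
	t->buffer_mask = NULL;
out:
	return ret;
}

/* Release everything a successful toy_tracer_setup() allocated. */
void toy_tracer_teardown(struct toy_tracer *t)
{
	free(t->ring);
	free(t->cpumask);
	free(t->buffer_mask);
	t->ring = t->cpumask = t->buffer_mask = NULL;
}
```

Ordering the labels in reverse allocation order lets each failure site name exactly one label while still releasing everything acquired before it.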
2009-02-02 21:38:32 -05:00
__init static int clear_boot_tracer ( void )
{
	/*
	 * The default tracer at boot buffer is an init section.
	 * This function is called in lateinit. If we did not
	 * find the boot tracer, then clear it out, to prevent
	 * later registration from accessing the buffer that is
	 * about to be freed.
	 */
	if (!default_bootup_tracer)
		return 0;

	printk(KERN_INFO "ftrace bootup tracer '%s' not registered.\n",
	       default_bootup_tracer);
	default_bootup_tracer = NULL;

	return 0;
}
2008-09-23 11:34:32 +01:00
early_initcall ( tracer_alloc_buffers ) ;
fs_initcall ( tracer_init_debugfs ) ;
2009-02-02 21:38:32 -05:00
late_initcall ( clear_boot_tracer ) ;