2008-05-12 23:20:42 +04:00
/*
* ring buffer based function tracer
*
2012-05-11 21:29:49 +04:00
* Copyright (C) 2007-2012 Steven Rostedt <srostedt@redhat.com>
2008-05-12 23:20:42 +04:00
* Copyright (C) 2008 Ingo Molnar <mingo@redhat.com>
*
* Originally taken from the RT patch by:
* Arnaldo Carvalho de Melo <acme@redhat.com>
*
* Based on code from the latency_tracer, that is:
* Copyright (C) 2004-2006 Ingo Molnar
2012-12-06 13:39:54 +04:00
* Copyright (C) 2004 Nadia Yvette Chambers
2008-05-12 23:20:42 +04:00
*/
2008-12-02 06:20:19 +03:00
# include <linux/ring_buffer.h>
2009-10-18 02:52:28 +04:00
# include <generated/utsrelease.h>
2008-12-02 06:20:19 +03:00
# include <linux/stacktrace.h>
# include <linux/writeback.h>
2008-05-12 23:20:42 +04:00
# include <linux/kallsyms.h>
# include <linux/seq_file.h>
2008-07-31 06:36:46 +04:00
# include <linux/notifier.h>
2008-12-02 06:20:19 +03:00
# include <linux/irqflags.h>
2008-05-12 23:20:42 +04:00
# include <linux/debugfs.h>
2015-01-20 20:13:40 +03:00
# include <linux/tracefs.h>
2008-05-12 23:20:43 +04:00
# include <linux/pagemap.h>
2008-05-12 23:20:42 +04:00
# include <linux/hardirq.h>
# include <linux/linkage.h>
# include <linux/uaccess.h>
2008-12-02 06:20:19 +03:00
# include <linux/kprobes.h>
2008-05-12 23:20:42 +04:00
# include <linux/ftrace.h>
# include <linux/module.h>
# include <linux/percpu.h>
2008-12-02 06:20:19 +03:00
# include <linux/splice.h>
2008-07-31 06:36:46 +04:00
# include <linux/kdebug.h>
2009-03-27 16:22:10 +03:00
# include <linux/string.h>
2015-01-20 23:48:46 +03:00
# include <linux/mount.h>
tracing: Consolidate protection of reader access to the ring buffer
At the beginning, access to the ring buffer was fully serialized
by trace_types_lock. Patch d7350c3f4569 gives more freedom to readers,
and patch b04cc6b1f6 adds code to protect trace_pipe and cpu#/trace_pipe.
But actually it is not enough, ring buffer readers are not always
read-only, they may consume data.
This patch makes accesses to trace, trace_pipe, trace_pipe_raw
cpu#/trace, cpu#/trace_pipe and cpu#/trace_pipe_raw serialized.
And removes tracing_reader_cpumask which is used to protect trace_pipe.
Details:
The ring buffer serializes readers, but that is only low-level protection.
The validity of the events (which are returned by ring_buffer_peek(), etc.)
is not protected by the ring buffer.
The content of events may become garbage if we allow another process to consume
these events concurrently:
A) the page of the consumed events may become a normal page
(not reader page) in ring buffer, and this page will be rewritten
by the events producer.
B) The page of the consumed events may become a page for splice_read,
and this page will be returned to system.
This patch adds trace_access_lock() and trace_access_unlock() primitives.
These primitives allow multi process access to different cpu ring buffers
concurrently.
These primitives don't distinguish read-only and read-consume access.
Multi read-only access is also serialized.
And we don't use these primitives when we open files,
we only use them when we read files.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B447D52.1050602@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-01-06 15:08:50 +03:00
# include <linux/rwsem.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 11:04:11 +03:00
# include <linux/slab.h>
2008-05-12 23:20:42 +04:00
# include <linux/ctype.h>
# include <linux/init.h>
2008-05-12 23:20:49 +04:00
# include <linux/poll.h>
2012-03-02 07:06:48 +04:00
# include <linux/nmi.h>
2008-05-12 23:20:42 +04:00
# include <linux/fs.h>
2013-02-07 19:47:07 +04:00
# include <linux/sched/rt.h>
2008-05-12 23:20:51 +04:00
2008-05-12 23:20:42 +04:00
# include "trace.h"
2008-12-24 07:24:12 +03:00
# include "trace_output.h"
2008-05-12 23:20:42 +04:00
2009-03-11 20:42:01 +03:00
/*
* On boot up , the ring buffer is set to the minimum size , so that
* we do not waste memory on systems that are not using tracing .
*/
2013-03-08 07:48:09 +04:00
bool ring_buffer_expanded ;
2009-03-11 20:42:01 +03:00
2008-12-06 05:41:33 +03:00
/*
* We need to change this state when a selftest is running .
2008-12-05 01:47:35 +03:00
* A selftest will look into the ring buffer to count the
* entries inserted during the selftest, although concurrent
2009-03-05 12:24:48 +03:00
* insertions into the ring buffer, such as trace_printk, could occur
2008-12-05 01:47:35 +03:00
* at the same time, giving false positive or negative results.
*/
2008-12-06 05:41:33 +03:00
static bool __read_mostly tracing_selftest_running ;
2008-12-05 01:47:35 +03:00
2009-02-03 05:38:32 +03:00
/*
* If a tracer is running , we do not want to run SELFTEST .
*/
2009-07-01 06:47:05 +04:00
bool __read_mostly tracing_selftest_disabled ;
2009-02-03 05:38:32 +03:00
2014-12-13 06:27:10 +03:00
/* Pipe tracepoints to printk */
struct trace_iterator * tracepoint_print_iter ;
int tracepoint_printk ;
2008-11-17 21:23:42 +03:00
/* For tracers that don't implement custom flags */
static struct tracer_opt dummy_tracer_opt[] = {
	{ }
};
static struct tracer_flags dummy_tracer_flags = {
	.val = 0,
	.opts = dummy_tracer_opt
};
2014-01-10 20:13:54 +04:00
static int
dummy_set_flag ( struct trace_array * tr , u32 old_flags , u32 bit , int set )
2008-11-17 21:23:42 +03:00
{
return 0 ;
}
2008-11-06 00:05:44 +03:00
2012-10-11 20:14:25 +04:00
/*
* To prevent the comm cache from being overwritten when no
* tracing is active , only save the comm when a trace event
* occurred .
*/
static DEFINE_PER_CPU ( bool , trace_cmdline_save ) ;
2008-11-06 00:05:44 +03:00
/*
* Kill all tracing for good ( never come back ) .
* It is initialized to 1 but will turn to zero if the initialization
* of the tracer is successful . But that is the only place that sets
* this back to zero .
*/
2009-02-10 21:44:12 +03:00
static int tracing_disabled = 1 ;
2008-11-06 00:05:44 +03:00
2009-10-08 03:17:45 +04:00
DEFINE_PER_CPU ( int , ftrace_cpu_disabled ) ;
2008-10-01 08:29:53 +04:00
2010-08-05 18:22:23 +04:00
cpumask_var_t __read_mostly tracing_buffer_mask ;
2008-05-12 23:21:00 +04:00
2008-10-24 03:26:08 +04:00
/*
* ftrace_dump_on_oops - variable to dump ftrace buffer on oops
*
* If there is an oops (or kernel panic) and ftrace_dump_on_oops
* is set, then ftrace_dump is called. This will output the contents
* of the ftrace buffers to the console. This is very useful for
* capturing traces that lead to crashes and outputting them to a
* serial console.
*
* It is off by default, but you can enable it either by specifying
* "ftrace_dump_on_oops" on the kernel command line, or by setting
2010-04-18 21:08:41 +04:00
* /proc/sys/kernel/ftrace_dump_on_oops
* Set 1 if you want to dump buffers of all CPUs
* Set 2 if you want to dump the buffer of the CPU that triggered oops
2008-10-24 03:26:08 +04:00
*/
2010-04-18 21:08:41 +04:00
enum ftrace_dump_mode ftrace_dump_on_oops ;
2008-10-24 03:26:08 +04:00
2013-06-15 00:21:43 +04:00
/* When set, tracing will stop when a WARN*() is hit */
int __disable_trace_on_warning ;
2015-04-01 00:23:45 +03:00
# ifdef CONFIG_TRACE_ENUM_MAP_FILE
/* Map of enums to their values, for "enum_map" file */
struct trace_enum_map_head {
struct module * mod ;
unsigned long length ;
} ;
union trace_enum_map_item ;
struct trace_enum_map_tail {
/*
* " end " is first and points to NULL as it must be different
* than " mod " or " enum_string "
*/
union trace_enum_map_item * next ;
const char * end ; /* points to NULL */
} ;
static DEFINE_MUTEX ( trace_enum_mutex ) ;
/*
* The trace_enum_maps are saved in an array with two extra elements ,
* one at the beginning , and one at the end . The beginning item contains
* the count of the saved maps ( head . length ) , and the module they
* belong to if not built in ( head . mod ) . The ending item contains a
* pointer to the next array of saved enum_map items .
*/
union trace_enum_map_item {
struct trace_enum_map map ;
struct trace_enum_map_head head ;
struct trace_enum_map_tail tail ;
} ;
static union trace_enum_map_item * trace_enum_maps ;
# endif /* CONFIG_TRACE_ENUM_MAP_FILE */
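The head/maps/tail layout described in the comment above can be illustrated in user space. The following is only a sketch under stated assumptions: every demo_* name is hypothetical, the field names of demo_map are assumed rather than taken from this file, and demo_count_maps() is not a function that exists in the kernel. It walks a chain of arrays by using head.length to jump to each tail and then following tail.next.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space mirrors of the three item shapes. */
struct demo_map  { const char *system; const char *enum_string; unsigned long enum_value; };
struct demo_head { void *mod; unsigned long length; };
union demo_item;
struct demo_tail { union demo_item *next; const char *end; };

union demo_item {
	struct demo_map  map;
	struct demo_head head;
	struct demo_tail tail;
};

/*
 * Count the saved maps across all chained arrays.
 * Each array is: [0] head, [1..length] maps, [length + 1] tail.
 */
static unsigned long demo_count_maps(union demo_item *first)
{
	unsigned long total = 0;
	union demo_item *ptr = first;

	while (ptr) {
		unsigned long len = ptr->head.length;

		assert(ptr[len + 1].tail.end == NULL); /* the tail marker is NULL */
		total += len;
		ptr = ptr[len + 1].tail.next;          /* hop to the next array, if any */
	}
	return total;
}
```

Because the count lives in the first element and the link lives in the last, a walker never has to scan the maps themselves to find the next array.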
2013-11-07 07:42:48 +04:00
static int tracing_set_tracer ( struct trace_array * tr , const char * buf ) ;
2009-02-03 05:38:32 +03:00
2009-09-18 10:06:47 +04:00
# define MAX_TRACER_SIZE 100
static char bootup_tracer_buf [ MAX_TRACER_SIZE ] __initdata ;
2009-02-03 05:38:32 +03:00
static char * default_bootup_tracer ;
2008-11-01 21:57:37 +03:00
2013-03-08 07:48:09 +04:00
static bool allocate_snapshot ;
2009-10-14 22:50:32 +04:00
static int __init set_cmdline_ftrace ( char * str )
2008-11-01 21:57:37 +03:00
{
2013-04-08 08:06:44 +04:00
strlcpy ( bootup_tracer_buf , str , MAX_TRACER_SIZE ) ;
2009-02-03 05:38:32 +03:00
default_bootup_tracer = bootup_tracer_buf ;
2009-03-11 20:42:01 +03:00
/* We are using ftrace early, expand it */
2013-03-08 07:48:09 +04:00
ring_buffer_expanded = true ;
2008-11-01 21:57:37 +03:00
return 1 ;
}
2009-10-14 22:50:32 +04:00
__setup("ftrace=", set_cmdline_ftrace);
2008-11-01 21:57:37 +03:00
2008-10-24 03:26:08 +04:00
static int __init set_ftrace_dump_on_oops ( char * str )
{
2010-04-18 21:08:41 +04:00
	if (*str++ != '=' || !*str) {
		ftrace_dump_on_oops = DUMP_ALL;
		return 1;
	}
	if (!strcmp("orig_cpu", str)) {
		ftrace_dump_on_oops = DUMP_ORIG;
		return 1;
	}
	return 0;
2008-10-24 03:26:08 +04:00
}
__setup("ftrace_dump_on_oops", set_ftrace_dump_on_oops);
2008-05-12 23:20:44 +04:00
2013-06-15 00:21:43 +04:00
static int __init stop_trace_on_warning ( char * str )
{
2014-11-13 02:14:00 +03:00
	if ((strcmp(str, "=0") != 0 && strcmp(str, "=off") != 0))
		__disable_trace_on_warning = 1;
2013-06-15 00:21:43 +04:00
	return 1;
}
2014-11-13 02:14:00 +03:00
__setup("traceoff_on_warning", stop_trace_on_warning);
2013-06-15 00:21:43 +04:00
2013-03-12 19:17:54 +04:00
static int __init boot_alloc_snapshot ( char * str )
2013-03-08 07:48:09 +04:00
{
allocate_snapshot = true ;
/* We also need the main ring buffer expanded */
ring_buffer_expanded = true ;
return 1 ;
}
2013-03-12 19:17:54 +04:00
__setup("alloc_snapshot", boot_alloc_snapshot);
2013-03-08 07:48:09 +04:00
2012-11-02 06:56:07 +04:00
static char trace_boot_options_buf [ MAX_TRACER_SIZE ] __initdata ;
static char * trace_boot_options __initdata ;
static int __init set_trace_boot_options ( char * str )
{
2013-04-08 08:06:44 +04:00
strlcpy ( trace_boot_options_buf , str , MAX_TRACER_SIZE ) ;
2012-11-02 06:56:07 +04:00
trace_boot_options = trace_boot_options_buf ;
return 0 ;
}
__setup("trace_options=", set_trace_boot_options);
2014-02-11 08:38:46 +04:00
static char trace_boot_clock_buf [ MAX_TRACER_SIZE ] __initdata ;
static char * trace_boot_clock __initdata ;
static int __init set_trace_boot_clock ( char * str )
{
strlcpy ( trace_boot_clock_buf , str , MAX_TRACER_SIZE ) ;
trace_boot_clock = trace_boot_clock_buf ;
return 0 ;
}
__setup("trace_clock=", set_trace_boot_clock);
2014-12-13 06:27:10 +03:00
static int __init set_tracepoint_printk ( char * str )
{
	if ((strcmp(str, "=0") != 0 && strcmp(str, "=off") != 0))
		tracepoint_printk = 1;
	return 1;
}
__setup("tp_printk", set_tracepoint_printk);
2013-06-15 00:21:43 +04:00
2009-03-30 09:48:00 +04:00
unsigned long long ns2usecs ( cycle_t nsec )
2008-05-12 23:20:42 +04:00
{
	nsec += 500;
	do_div(nsec, 1000);
	return nsec;
}
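The function above converts nanoseconds to microseconds with round-to-nearest: adding 500 before the division by 1000 turns truncation into rounding. A minimal user-space sketch of the same arithmetic follows. The name ns_to_usecs_rounded is hypothetical, and plain 64-bit division stands in for the kernel's do_div() helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space stand-in for ns2usecs(): plain 64-bit
 * division replaces the kernel's do_div() helper.
 */
static uint64_t ns_to_usecs_rounded(uint64_t nsec)
{
	assert(nsec <= UINT64_MAX - 500); /* guard the +500 against wraparound */
	nsec += 500;                      /* half a microsecond, so truncation rounds to nearest */
	return nsec / 1000;
}
```

With this scheme 1499 ns maps to 1 us and 1500 ns maps to 2 us, so values exactly halfway round up.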
2008-05-12 23:21:00 +04:00
/*
* The global_trace is the descriptor that holds the tracing
* buffers for the live tracing. For each CPU, it contains
* a linked list of pages that will store trace entries. The
* page descriptor of the pages in the memory is used to hold
* the linked list by linking the lru item in the page descriptor
* to each of the pages in the buffer per CPU.
*
* For each active CPU there is a data field that holds the
* pages for the buffer for that CPU . Each CPU has the same number
* of pages allocated for its buffer .
*/
2008-05-12 23:20:42 +04:00
static struct trace_array global_trace ;
2012-05-04 07:09:03 +04:00
LIST_HEAD ( ftrace_trace_arrays ) ;
2008-05-12 23:20:42 +04:00
2013-07-02 06:50:29 +04:00
int trace_array_get ( struct trace_array * this_tr )
{
	struct trace_array *tr;
	int ret = -ENODEV;
	mutex_lock(&trace_types_lock);
	list_for_each_entry(tr, &ftrace_trace_arrays, list) {
		if (tr == this_tr) {
			tr->ref++;
			ret = 0;
			break;
		}
	}
	mutex_unlock(&trace_types_lock);
	return ret;
}
static void __trace_array_put ( struct trace_array * this_tr )
{
	WARN_ON(!this_tr->ref);
	this_tr->ref--;
}
void trace_array_put ( struct trace_array * this_tr )
{
mutex_lock ( & trace_types_lock ) ;
__trace_array_put ( this_tr ) ;
mutex_unlock ( & trace_types_lock ) ;
}
2015-05-05 17:09:53 +03:00
int filter_check_discard ( struct trace_event_file * file , void * rec ,
2013-10-24 17:34:17 +04:00
struct ring_buffer * buffer ,
struct ring_buffer_event * event )
2009-04-08 12:15:54 +04:00
{
2015-05-13 22:12:33 +03:00
	if (unlikely(file->flags & EVENT_FILE_FL_FILTERED) &&
2013-10-24 17:34:17 +04:00
	    !filter_match_preds(file->filter, rec)) {
		ring_buffer_discard_commit(buffer, event);
		return 1;
	}
	return 0;
}
EXPORT_SYMBOL_GPL ( filter_check_discard ) ;
2015-05-05 18:45:27 +03:00
int call_filter_check_discard ( struct trace_event_call * call , void * rec ,
2013-10-24 17:34:17 +04:00
struct ring_buffer * buffer ,
struct ring_buffer_event * event )
{
	if (unlikely(call->flags & TRACE_EVENT_FL_FILTERED) &&
	    !filter_match_preds(call->filter, rec)) {
		ring_buffer_discard_commit(buffer, event);
		return 1;
	}
	return 0;
2009-04-08 12:15:54 +04:00
}
2013-10-24 17:34:17 +04:00
EXPORT_SYMBOL_GPL ( call_filter_check_discard ) ;
2009-04-08 12:15:54 +04:00
2014-04-17 23:44:42 +04:00
static cycle_t buffer_ftrace_now ( struct trace_buffer * buf , int cpu )
2009-03-18 00:22:06 +03:00
{
	u64 ts;
	/* Early boot up does not have a buffer yet */
2013-08-03 05:36:16 +04:00
	if (!buf->buffer)
2009-03-18 00:22:06 +03:00
		return trace_clock_local();
2013-08-03 05:36:16 +04:00
	ts = ring_buffer_time_stamp(buf->buffer, cpu);
	ring_buffer_normalize_time_stamp(buf->buffer, cpu, &ts);
2009-03-18 00:22:06 +03:00
	return ts;
}
2008-05-12 23:20:42 +04:00
2013-08-03 05:36:16 +04:00
cycle_t ftrace_now ( int cpu )
{
	return buffer_ftrace_now(&global_trace.trace_buffer, cpu);
}
2013-07-01 23:58:24 +04:00
/**
* tracing_is_enabled - Show if global_trace is enabled
*
* Shows if the global trace has been enabled or not. It uses the
* mirror flag "buffer_disabled" so that it can be used in fast paths such as
* the irqsoff tracer. But it may be inaccurate due to races. If you
* need to know the accurate state, use tracing_is_on(), which is a little
* slower, but accurate.
*/
ftrace: restructure tracing start/stop infrastructure
Impact: change where tracing is started up and stopped
Currently, when a new tracer is selected via echo'ing a tracer name into
the current_tracer file, the startup is only done if tracing_enabled is
set to one. If tracing_enabled is changed to zero (by echo'ing 0 into
the tracing_enabled file) a full shutdown is performed.
The full startup and shutdown of a tracer can be expensive and the
user can lose out traces when echo'ing in 0 to the tracing_enabled file,
because the process takes too long. There can also be places that
the user would like to start and stop the tracer several times and
doing the full startup and shutdown of a tracer might be too expensive.
This patch performs the full startup and shutdown when a tracer is
selected. It also adds a way to do a quick start or stop of a tracer.
The quick version is just a flag that prevents the tracing from
taking place, but the overhead of the code is still there.
For example, the startup of a tracer may enable tracepoints, or enable
the function tracer. The stop and start will just set a flag to
have the tracer ignore the calls when the tracepoint or function trace
is called. The overhead of the tracer may still be present when
the tracer is stopped, but no tracing will occur. Setting the tracer
to the 'nop' tracer (or any other tracer) will perform the shutdown
of the tracer which will disable the tracepoint or disable the
function tracer.
The tracing_enabled file will simply start or stop tracing.
This change is all internal. The end result for the user should be the same
as before. If tracing_enabled is not set, no trace will happen.
If tracing_enabled is set, then the trace will happen. The tracing_enabled
variable is static between tracers. Enabling tracing_enabled and
going to another tracer will keep tracing_enabled enabled. Same
is true with disabling tracing_enabled.
This patch will now provide a fast start/stop method to the users
for enabling or disabling tracing.
Note: There were two methods to the struct tracer that were never
used: The methods start and stop. These were to be used as a hook
to the reading of the trace output, but ended up not being
necessary. These two methods are now used to enable the start
and stop of each tracer, in case the tracer needs to do more than
just not write into the buffer. For example, the irqsoff tracer
must stop recording max latencies when tracing is stopped.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 00:05:44 +03:00
int tracing_is_enabled ( void )
{
2013-07-01 23:58:24 +04:00
/*
* For quick access (irqsoff uses this in fast path), just
* return the mirror variable of the state of the ring buffer.
* It's a little racy, but we don't really care.
*/
smp_rmb ( ) ;
return ! global_trace . buffer_disabled ;
2008-11-06 00:05:44 +03:00
}
2008-05-12 23:21:00 +04:00
/*
2008-09-30 07:02:41 +04:00
* trace_buf_size is the size in bytes that is allocated
* for a buffer . Note , the number of bytes is always rounded
* to page size .
2008-07-31 06:36:46 +04:00
*
* This number is purposely set to a low number of 16384.
* If the dump on oops happens, it will be much appreciated
* to not have to wait for all that output. In any case, this is
* configurable at both boot time and run time.
2008-05-12 23:21:00 +04:00
*/
2008-09-30 07:02:41 +04:00
# define TRACE_BUF_SIZE_DEFAULT 1441792UL /* 16384 * 88 (sizeof(entry)) */
2008-07-31 06:36:46 +04:00
2008-09-30 07:02:41 +04:00
static unsigned long trace_buf_size = TRACE_BUF_SIZE_DEFAULT ;
2008-05-12 23:20:42 +04:00
2008-05-12 23:21:00 +04:00
/* trace_types holds a linked list of available tracers. */
2008-05-12 23:20:42 +04:00
static struct tracer * trace_types __read_mostly ;
2008-05-12 23:21:00 +04:00
/*
* trace_types_lock is used to protect the trace_types list .
*/
2013-07-02 06:37:54 +04:00
DEFINE_MUTEX ( trace_types_lock ) ;
2008-05-12 23:21:00 +04:00
2010-01-06 15:08:50 +03:00
/*
* Serialize access to the ring buffer.
*
* The ring buffer serializes readers, but that is only low-level protection.
* The validity of the events (which are returned by ring_buffer_peek(), etc.)
* is not protected by the ring buffer.
*
* The content of events may become garbage if we allow another process to
* consume these events concurrently:
* A) the page of the consumed events may become a normal page
* (not a reader page) in the ring buffer, and this page will be rewritten
* by the events producer.
* B) the page of the consumed events may become a page for splice_read,
* and this page will be returned to the system.
*
* These primitives allow multiple processes to access different cpu ring
* buffers concurrently.
*
* These primitives don't distinguish read-only and read-consume access.
* Multiple read-only accesses are also serialized.
*/
# ifdef CONFIG_SMP
static DECLARE_RWSEM ( all_cpu_access_lock ) ;
static DEFINE_PER_CPU ( struct mutex , cpu_access_lock ) ;
static inline void trace_access_lock ( int cpu )
{
2013-01-24 00:22:59 +04:00
if (cpu == RING_BUFFER_ALL_CPUS) {
2010-01-06 15:08:50 +03:00
/* gain it for accessing the whole ring buffer. */
down_write ( & all_cpu_access_lock ) ;
} else {
/* gain it for accessing a cpu ring buffer. */
2013-01-24 00:22:59 +04:00
/* Firstly block other trace_access_lock(RING_BUFFER_ALL_CPUS). */
2010-01-06 15:08:50 +03:00
down_read ( & all_cpu_access_lock ) ;
/* Secondly block other access to this @cpu ring buffer. */
mutex_lock ( & per_cpu ( cpu_access_lock , cpu ) ) ;
}
}
static inline void trace_access_unlock ( int cpu )
{
2013-01-24 00:22:59 +04:00
if (cpu == RING_BUFFER_ALL_CPUS) {
2010-01-06 15:08:50 +03:00
up_write ( & all_cpu_access_lock ) ;
} else {
mutex_unlock ( & per_cpu ( cpu_access_lock , cpu ) ) ;
up_read ( & all_cpu_access_lock ) ;
}
}
static inline void trace_access_lock_init ( void )
{
int cpu ;
for_each_possible_cpu ( cpu )
mutex_init ( & per_cpu ( cpu_access_lock , cpu ) ) ;
}
# else
static DEFINE_MUTEX ( access_lock ) ;
static inline void trace_access_lock ( int cpu )
{
( void ) cpu ;
mutex_lock ( & access_lock ) ;
}
static inline void trace_access_unlock ( int cpu )
{
( void ) cpu ;
mutex_unlock ( & access_lock ) ;
}
static inline void trace_access_lock_init ( void )
{
}
# endif
2008-11-13 01:52:37 +03:00
/* trace_flags holds trace_options default values */
2008-11-13 01:52:38 +03:00
unsigned long trace_flags = TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK |
2009-03-25 06:17:58 +03:00
TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_SLEEP_TIME |
tracing: Add irq, preempt-count and need resched info to default trace output
People keep asking how to get the preempt count, irq, and need resched info
and we keep telling them to enable the latency format. Some developers think
that traces without this info is completely useless, and for a lot of tasks
it is useless.
The first option was to enable the latency trace as the default format, but
the header for the latency format is pretty useless for most tracers and
it also does the timestamp in straight microseconds from the time the trace
started. This is sometimes more difficult to read as the default trace is
seconds from the start of boot up.
Latency format:
# tracer: nop
#
# nop latency trace v1.1.5 on 3.2.0-rc1-test+
# --------------------------------------------------------------------
# latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule
migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch
migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch
migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule
default format:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule
migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule
migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule
The latency format header has latency information that is pretty meaningless
for most tracers. Although some of the header is useful, and we can add that
later to the default format as well.
What is really useful with the latency format is the irqs-off, need-resched
hard/softirq context and the preempt count.
This commit adds the option irq-info which is on by default that adds this
information:
# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<idle>-0 [000] d..2 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] d..2 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] d..2 49.309309: need_resched <-mwait_idle
<idle>-0 [000] d..2 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] d..2 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] d..2 49.309315: need_resched <-mwait_idle
If a user wants the old format, they can disable the 'irq-info' option:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<idle>-0 [000] 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] 49.309309: need_resched <-mwait_idle
<idle>-0 [000] 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] 49.309315: need_resched <-mwait_idle
Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
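
The four-character column shown in the message above (for example "d..2") can be modelled with a small formatter. This is a sketch only: the SK_* flag bits and format_irq_info() are hypothetical names, and the letters other than 'd' and '.' (here 'N', 'h', 's' and 'H') are assumptions about the legend rather than something the message itself states.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bits for this sketch; not the kernel's definitions. */
enum {
	SK_IRQS_OFF     = 1 << 0,
	SK_NEED_RESCHED = 1 << 1,
	SK_HARDIRQ      = 1 << 2,
	SK_SOFTIRQ      = 1 << 3,
};

/* Write the 4-character context column plus a NUL into out. */
static void format_irq_info(unsigned int flags, unsigned int preempt_depth,
			    char out[5])
{
	int hard = (flags & SK_HARDIRQ) != 0;
	int soft = (flags & SK_SOFTIRQ) != 0;

	/* Column 1: irqs-off */
	out[0] = (flags & SK_IRQS_OFF) ? 'd' : '.';
	/* Column 2: need-resched (letter is an assumption) */
	out[1] = (flags & SK_NEED_RESCHED) ? 'N' : '.';
	/* Column 3: hardirq/softirq context (letters are assumptions) */
	out[2] = (hard && soft) ? 'H' : hard ? 'h' : soft ? 's' : '.';
	/* Column 4: preempt-depth as one hex digit, '.' when zero */
	out[3] = preempt_depth ? "0123456789abcdef"[preempt_depth & 0xf] : '.';
	out[4] = '\0';
}
```

With interrupts disabled and a preempt depth of 2, this yields the same "d..2" string that appears in the sample output above.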
2011-11-17 18:34:33 +04:00
TRACE_ITER_GRAPH_TIME | TRACE_ITER_RECORD_CMD | TRACE_ITER_OVERWRITE |
2013-03-14 20:10:40 +04:00
TRACE_ITER_IRQ_INFO | TRACE_ITER_MARKERS | TRACE_ITER_FUNCTION ;
2011-05-11 00:27:21 +04:00
2013-07-03 03:59:57 +04:00
static void tracer_tracing_on ( struct trace_array * tr )
2013-07-01 23:58:24 +04:00
{
if (tr->trace_buffer.buffer)
ring_buffer_record_on(tr->trace_buffer.buffer);
/*
* This flag is looked at when buffers haven't been allocated
* yet, or by some tracers (like irqsoff) that just want to
* know if the ring buffer has been disabled. Those tracers can
* handle the race where it gets disabled but we still do a record.
* As the check is in the fast path of the tracers, it is more
* important to be fast than accurate.
*/
tr->buffer_disabled = 0;
/* Make the flag seen by readers */
smp_wmb ( ) ;
}
2012-02-23 00:50:28 +04:00
/**
* tracing_on - enable tracing buffers
*
* This function enables tracing buffers that may have been
* disabled with tracing_off .
*/
void tracing_on ( void )
{
2013-07-01 23:58:24 +04:00
tracer_tracing_on ( & global_trace ) ;
2012-02-23 00:50:28 +04:00
}
EXPORT_SYMBOL_GPL ( tracing_on ) ;
2013-03-09 06:02:34 +04:00
/**
* __trace_puts - write a constant string into the trace buffer.
* @ip: The address of the caller
* @str: The constant string to write
* @size: The size of the string.
*/
int __trace_puts ( unsigned long ip , const char * str , int size )
{
struct ring_buffer_event * event ;
struct ring_buffer * buffer ;
struct print_entry * entry ;
unsigned long irq_flags ;
int alloc ;
2013-07-18 12:31:05 +04:00
int pc ;
2013-07-18 12:31:18 +04:00
if ( ! ( trace_flags & TRACE_ITER_PRINTK ) )
return 0 ;
2013-07-18 12:31:05 +04:00
pc = preempt_count ( ) ;
2013-03-09 06:02:34 +04:00
2014-01-23 21:27:59 +04:00
if (unlikely(tracing_selftest_running || tracing_disabled))
return 0 ;
2013-03-09 06:02:34 +04:00
alloc = sizeof ( * entry ) + size + 2 ; /* possible \n added */
local_save_flags ( irq_flags ) ;
buffer = global_trace . trace_buffer . buffer ;
event = trace_buffer_lock_reserve ( buffer , TRACE_PRINT , alloc ,
2013-07-18 12:31:05 +04:00
irq_flags , pc ) ;
2013-03-09 06:02:34 +04:00
if ( ! event )
return 0 ;
entry = ring_buffer_event_data ( event ) ;
entry->ip = ip;
memcpy(&entry->buf, str, size);
/* Add a newline if necessary */
if (entry->buf[size - 1] != '\n') {
entry->buf[size] = '\n';
entry->buf[size + 1] = '\0';
} else
entry->buf[size] = '\0';
__buffer_unlock_commit ( buffer , event ) ;
2013-07-18 12:31:05 +04:00
ftrace_trace_stack ( buffer , irq_flags , 4 , pc ) ;
2013-03-09 06:02:34 +04:00
return size ;
}
EXPORT_SYMBOL_GPL ( __trace_puts ) ;
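
The buffer sizing and newline handling in __trace_puts() can be tried out in user space. The sketch below is an illustration under stated assumptions: copy_with_newline() is a hypothetical helper, malloc() stands in for reserving a ring buffer event, and the size == 0 guard is an addition that the kernel function does not have.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy size bytes of str into a fresh buffer and make sure the result
 * ends in "\n\0". Returns NULL if size is 0 or the allocation fails.
 * The caller frees the returned buffer.
 */
static char *copy_with_newline(const char *str, size_t size)
{
	char *buf;

	if (size == 0)
		return NULL;

	/* Two spare bytes: a possible added '\n' and the terminating '\0'. */
	buf = malloc(size + 2);
	if (!buf)
		return NULL;

	memcpy(buf, str, size);
	if (buf[size - 1] != '\n') {
		/* No trailing newline in the input: append one. */
		buf[size] = '\n';
		buf[size + 1] = '\0';
	} else {
		/* Input already ends in a newline: just terminate. */
		buf[size] = '\0';
	}
	return buf;
}
```

The "+ 2" mirrors the alloc computation above: one byte covers the newline that may be appended and one covers the terminator, so both branches stay within the allocation.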
/**
* __trace_bputs - write the pointer to a constant string into the trace buffer
* @ip: The address of the caller
* @str: The constant string whose pointer is written to the buffer
*/
int __trace_bputs ( unsigned long ip , const char * str )
{
struct ring_buffer_event * event ;
struct ring_buffer * buffer ;
struct bputs_entry * entry ;
unsigned long irq_flags ;
int size = sizeof ( struct bputs_entry ) ;
2013-07-18 12:31:05 +04:00
int pc ;
2013-07-18 12:31:18 +04:00
if ( ! ( trace_flags & TRACE_ITER_PRINTK ) )
return 0 ;
2013-07-18 12:31:05 +04:00
pc = preempt_count ( ) ;
2013-03-09 06:02:34 +04:00
2014-01-23 21:27:59 +04:00
if (unlikely(tracing_selftest_running || tracing_disabled))
return 0 ;
2013-03-09 06:02:34 +04:00
local_save_flags ( irq_flags ) ;
buffer = global_trace . trace_buffer . buffer ;
event = trace_buffer_lock_reserve ( buffer , TRACE_BPUTS , size ,
2013-07-18 12:31:05 +04:00
irq_flags , pc ) ;
2013-03-09 06:02:34 +04:00
if ( ! event )
return 0 ;
entry = ring_buffer_event_data ( event ) ;
entry->ip = ip;
entry->str = str;
__buffer_unlock_commit ( buffer , event ) ;
2013-07-18 12:31:05 +04:00
ftrace_trace_stack ( buffer , irq_flags , 4 , pc ) ;
2013-03-09 06:02:34 +04:00
return 1 ;
}
EXPORT_SYMBOL_GPL ( __trace_bputs ) ;
tracing: Add internal tracing_snapshot() functions
The new snapshot feature is quite handy. It's a way for the user
to take advantage of the spare buffer that, until then, only
the latency tracers used to "snapshot" the buffer when it hit
a max latency. Now users can trigger a "snapshot" manually when
some condition is hit in a program. But a snapshot currently can
not be triggered by a condition inside the kernel.
With the addition of tracing_snapshot() and tracing_snapshot_alloc(),
snapshots can now be taken when a condition is hit, and the
developer wants to snapshot the case without stopping the trace.
Note, any snapshot will overwrite the old one, so take care
in how this is done.
These new functions are to be used like tracing_on(), tracing_off()
and trace_printk() are. That is, they should never be called
in the mainline Linux kernel. They are solely for the purpose
of debugging.
The tracing_snapshot() will not allocate a buffer, but it is
safe to be called from any context (except NMIs). But if a
snapshot buffer isn't allocated when it is called, it will write
to the live buffer, complaining about the lack of a snapshot
buffer, and then stop tracing (giving you the "permanent snapshot").
tracing_snapshot_alloc() will allocate the snapshot buffer if
it was not already allocated and then take the snapshot. This routine
*may sleep*, and must be called from context that can sleep.
The allocation is done with GFP_KERNEL and not atomic.
If you need a snapshot in an atomic context, say in early boot,
then it is best to call the tracing_snapshot_alloc() before then,
where it will allocate the buffer, and then you can use the
tracing_snapshot() anywhere you want and still get snapshots.
Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
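
The behaviour described in the message above can be modelled with two small buffers that trade places. This is a user-space sketch under stated assumptions: the sk_* names and fixed-size integer buffers are hypothetical, and clearing the new live buffer after the swap is a simplification, not a claim about what the kernel's swap does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_ENTRIES 8   /* hypothetical buffer capacity */

struct sk_buffer {
	int entries[SK_ENTRIES];
	size_t count;
};

struct sk_trace {
	struct sk_buffer bufs[2];
	struct sk_buffer *live;    /* buffer currently being written */
	struct sk_buffer *snap;    /* spare buffer holding the snapshot */
	bool allocated_snapshot;
	bool tracing_on;
};

static void sk_init(struct sk_trace *t)
{
	t->bufs[0].count = 0;
	t->bufs[1].count = 0;
	t->live = &t->bufs[0];
	t->snap = &t->bufs[1];
	t->allocated_snapshot = false;
	t->tracing_on = true;
}

static void sk_record(struct sk_trace *t, int value)
{
	if (t->tracing_on && t->live->count < SK_ENTRIES)
		t->live->entries[t->live->count++] = value;
}

/* Returns true if a snapshot was taken. */
static bool sk_snapshot(struct sk_trace *t)
{
	struct sk_buffer *tmp;

	if (!t->allocated_snapshot) {
		/* No spare buffer: stop tracing (the "permanent snapshot"). */
		t->tracing_on = false;
		return false;
	}
	/* Swap live and spare; the previous snapshot is overwritten. */
	tmp = t->live;
	t->live = t->snap;
	t->snap = tmp;
	t->live->count = 0;   /* simplification for this sketch */
	return true;
}

static void sk_snapshot_alloc(struct sk_trace *t)
{
	/* The real allocation may sleep; here it is only a flag. */
	t->allocated_snapshot = true;
	sk_snapshot(t);
}
```

Calling sk_snapshot() before any allocation stops recording, while calling sk_snapshot_alloc() once up front lets later sk_snapshot() calls swap buffers and keep tracing, which is the ordering the message recommends for atomic contexts.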
2013-03-07 06:45:37 +04:00
# ifdef CONFIG_TRACER_SNAPSHOT
/**
* tracing_snapshot - take a snapshot of the current buffer.
*
* This causes a swap between the snapshot buffer and the current live
* tracing buffer. You can use this to take snapshots of the live
* trace when some condition is triggered, but continue to trace.
*
* Note, make sure to allocate the snapshot with either
* a tracing_snapshot_alloc(), or by doing it manually
* with: echo 1 > /sys/kernel/debug/tracing/snapshot
*
* If the snapshot buffer is not allocated, it will stop tracing.
* Basically making a permanent snapshot.
*/
void tracing_snapshot ( void )
{
struct trace_array * tr = & global_trace ;
struct tracer *tracer = tr->current_trace;
unsigned long flags ;
2013-03-09 09:56:08 +04:00
if ( in_nmi ( ) ) {
internal_trace_puts("*** SNAPSHOT CALLED FROM NMI CONTEXT ***\n");
internal_trace_puts("*** snapshot is being ignored        ***\n");
return ;
}
2013-03-07 06:45:37 +04:00
if (!tr->allocated_snapshot) {
2013-03-09 09:40:58 +04:00
internal_trace_puts("*** SNAPSHOT NOT ALLOCATED ***\n");
internal_trace_puts("*** stopping trace here!   ***\n");
2013-03-07 06:45:37 +04:00
tracing_off ( ) ;
return ;
}
/* Note, snapshot can not be used when the tracer uses it */
if (tracer->use_max_tr) {
2013-03-09 09:40:58 +04:00
internal_trace_puts("*** LATENCY TRACER ACTIVE ***\n");
internal_trace_puts("*** Can not use snapshot (sorry) ***\n");
2013-03-07 06:45:37 +04:00
return ;
}
local_irq_save ( flags ) ;
update_max_tr ( tr , current , smp_processor_id ( ) ) ;
local_irq_restore ( flags ) ;
}
2013-03-09 09:56:08 +04:00
EXPORT_SYMBOL_GPL ( tracing_snapshot ) ;
2013-03-07 06:45:37 +04:00
static int resize_buffer_duplicate_size ( struct trace_buffer * trace_buf ,
struct trace_buffer * size_buf , int cpu_id ) ;
2013-03-12 19:17:54 +04:00
static void set_buffer_entries ( struct trace_buffer * buf , unsigned long val ) ;
static int alloc_snapshot ( struct trace_array * tr )
{
int ret ;
if (!tr->allocated_snapshot) {
/* allocate spare buffer */
ret = resize_buffer_duplicate_size(&tr->max_buffer,
&tr->trace_buffer, RING_BUFFER_ALL_CPUS);
if (ret < 0)
return ret;
tr->allocated_snapshot = true;
}
return 0 ;
}
2014-04-17 23:44:42 +04:00
static void free_snapshot ( struct trace_array * tr )
2013-03-12 19:17:54 +04:00
{
/*
* We don't free the ring buffer; instead, we resize it, because
* the max_tr ring buffer has some state (e.g. ring->clock) that
* we want to preserve.
*/
ring_buffer_resize(tr->max_buffer.buffer, 1, RING_BUFFER_ALL_CPUS);
set_buffer_entries(&tr->max_buffer, 1);
tracing_reset_online_cpus(&tr->max_buffer);
tr->allocated_snapshot = false;
}
2013-03-07 06:45:37 +04:00
2013-10-24 17:59:26 +04:00
/**
* tracing_alloc_snapshot - allocate snapshot buffer.
*
* This only allocates the snapshot buffer if it isn't already
* allocated - it doesn't also take a snapshot.
*
* This is meant to be used in cases where the snapshot buffer needs
* to be set up for events that can't sleep but need to be able to
* trigger a snapshot.
*/
int tracing_alloc_snapshot ( void )
{
struct trace_array * tr = & global_trace ;
int ret ;
ret = alloc_snapshot ( tr ) ;
WARN_ON ( ret < 0 ) ;
return ret ;
}
EXPORT_SYMBOL_GPL ( tracing_alloc_snapshot ) ;
2013-03-07 06:45:37 +04:00
/**
* tracing_snapshot_alloc - allocate and take a snapshot of the current buffer.
*
* This is similar to tracing_snapshot(), but it will allocate the
* snapshot buffer if it isn't already allocated. Use this only
* where it is safe to sleep, as the allocation may sleep.
*
* This causes a swap between the snapshot buffer and the current live
* tracing buffer. You can use this to take snapshots of the live
* trace when some condition is triggered, but continue to trace.
*/
void tracing_snapshot_alloc ( void )
{
int ret ;
2013-10-24 17:59:26 +04:00
ret = tracing_alloc_snapshot ( ) ;
if ( ret < 0 )
2013-03-12 19:17:54 +04:00
return ;
2013-03-07 06:45:37 +04:00
tracing_snapshot ( ) ;
}
2013-03-09 09:56:08 +04:00
EXPORT_SYMBOL_GPL ( tracing_snapshot_alloc ) ;
2013-03-07 06:45:37 +04:00
# else
void tracing_snapshot ( void )
{
WARN_ONCE(1, "Snapshot feature not enabled, but internal snapshot used");
}
2013-03-09 09:56:08 +04:00
EXPORT_SYMBOL_GPL ( tracing_snapshot ) ;
2013-10-24 17:59:26 +04:00
int tracing_alloc_snapshot ( void )
{
WARN_ONCE(1, "Snapshot feature not enabled, but snapshot allocation used");
return -ENODEV;
}
EXPORT_SYMBOL_GPL ( tracing_alloc_snapshot ) ;
tracing: Add internal tracing_snapshot() functions
The new snapshot feature is quite handy. It's a way for the user
to take advantage of the spare buffer that, until then, only
the latency tracers used to "snapshot" the buffer when it hit
a max latency. Now users can trigger a "snapshot" manually when
some condition is hit in a program. But a snapshot currently can
not be triggered by a condition inside the kernel.
With the addition of tracing_snapshot() and tracing_snapshot_alloc(),
snapshots can now be taken when a condition is hit and the
developer wants to capture that case without stopping the trace.
Note, any snapshot will overwrite the old one, so take care
in how this is done.
These new functions are to be used like tracing_on(), tracing_off()
and trace_printk() are. That is, they should never be called
in the mainline Linux kernel. They are solely for the purpose
of debugging.
The tracing_snapshot() will not allocate a buffer, but it is
safe to be called from any context (except NMIs). But if a
snapshot buffer isn't allocated when it is called, it will write
to the live buffer, complaining about the lack of a snapshot
buffer, and then stop tracing (giving you the "permanent snapshot").
tracing_snapshot_alloc() will allocate the snapshot buffer if
it was not already allocated and then take the snapshot. This routine
*may sleep*, and must be called from a context that can sleep.
The allocation is done with GFP_KERNEL and not atomic.
If you need a snapshot in an atomic context, say in early boot,
then it is best to call the tracing_snapshot_alloc() before then,
where it will allocate the buffer, and then you can use the
tracing_snapshot() anywhere you want and still get snapshots.
Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-07 06:45:37 +04:00
void tracing_snapshot_alloc(void)
{
	/* Give warning */
	tracing_snapshot();
}
2013-03-09 09:56:08 +04:00
EXPORT_SYMBOL_GPL(tracing_snapshot_alloc);
2013-03-07 06:45:37 +04:00
# endif /* CONFIG_TRACER_SNAPSHOT */
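The commit message above describes two entry points: one that never allocates and stops tracing if no spare buffer exists, and one that allocates on demand and may therefore sleep. A minimal userspace sketch of those semantics follows. It is not kernel code: `struct toy_trace`, `toy_snapshot()` and `toy_snapshot_alloc()` are hypothetical names, `malloc()` stands in for a GFP_KERNEL allocation, and the snapshot is modelled as a plain copy rather than whatever the ring buffer does internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BUF_SIZE 64

/* Hypothetical stand-in for a trace instance with a live and a spare buffer. */
struct toy_trace {
	char live[TOY_BUF_SIZE];
	char *snapshot;		/* NULL until allocated */
	bool tracing_on;
};

/* Models tracing_snapshot(): never allocates; without a spare buffer it stops tracing. */
bool toy_snapshot(struct toy_trace *t)
{
	assert(t != NULL);
	if (!t->snapshot) {
		t->tracing_on = false;	/* the "permanent snapshot" case */
		return false;
	}
	/* any earlier snapshot is overwritten */
	memcpy(t->snapshot, t->live, TOY_BUF_SIZE);
	return true;
}

/* Models the allocating variant: may block in malloc(), so only call it where sleeping is fine. */
bool toy_snapshot_alloc(struct toy_trace *t)
{
	assert(t != NULL);
	if (!t->snapshot) {
		t->snapshot = malloc(TOY_BUF_SIZE);
		if (!t->snapshot)
			return false;
	}
	return toy_snapshot(t);
}
```

The pattern the message recommends maps onto this sketch as: call the allocating variant once from a sleepable context, then use the non-allocating one from atomic context.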
2013-07-03 03:59:57 +04:00
static void tracer_tracing_off(struct trace_array *tr)
2013-07-01 23:58:24 +04:00
{
	if (tr->trace_buffer.buffer)
		ring_buffer_record_off(tr->trace_buffer.buffer);
	/*
	 * This flag is looked at when buffers haven't been allocated
	 * yet, or by some tracers (like irqsoff), that just want to
	 * know if the ring buffer has been disabled, but it can handle
	 * races of where it gets disabled but we still do a record.
	 * As the check is in the fast path of the tracers, it is more
	 * important to be fast than accurate.
	 */
	tr->buffer_disabled = 1;
	/* Make the flag seen by readers */
	smp_wmb();
}
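The comment in tracer_tracing_off() trades accuracy for speed: the flag is published once, and fast-path readers check it cheaply, tolerating a racing record. A rough userspace analogue can be sketched with C11 atomics. This swaps in `atomic_store_explicit()` with release ordering for the kernel's plain store plus smp_wmb(), which is an assumption made for illustration; `struct toy_tracer`, `toy_tracing_off()` and `toy_record()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace analogue of tr->buffer_disabled. */
struct toy_tracer {
	atomic_int buffer_disabled;
	unsigned long recorded;
};

/* Writer side: publish the "disabled" flag so later readers can observe it. */
void toy_tracing_off(struct toy_tracer *t)
{
	assert(t != NULL);
	atomic_store_explicit(&t->buffer_disabled, 1, memory_order_release);
}

/* Fast path: a cheap relaxed check; a record racing with the store may still get through. */
bool toy_record(struct toy_tracer *t)
{
	assert(t != NULL);
	if (atomic_load_explicit(&t->buffer_disabled, memory_order_relaxed))
		return false;
	t->recorded++;
	return true;
}
```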
2012-02-23 00:50:28 +04:00
/**
 * tracing_off - turn off tracing buffers
 *
 * This function stops the tracing buffers from recording data.
 * It does not disable any overhead the tracers themselves may
 * be causing. This function simply causes all recording to
 * the ring buffers to fail.
 */
void tracing_off(void)
{
2013-07-01 23:58:24 +04:00
	tracer_tracing_off(&global_trace);
2012-02-23 00:50:28 +04:00
}
EXPORT_SYMBOL_GPL(tracing_off);
2013-06-15 00:21:43 +04:00
void disable_trace_on_warning(void)
{
	if (__disable_trace_on_warning)
		tracing_off();
}
2013-07-01 23:58:24 +04:00
/**
 * tracer_tracing_is_on - show real state of ring buffer enabled
 * @tr: the trace array to know if ring buffer is enabled
 *
 * Shows real state of the ring buffer if it is enabled or not.
 */
2013-07-03 03:59:57 +04:00
static int tracer_tracing_is_on(struct trace_array *tr)
2013-07-01 23:58:24 +04:00
{
	if (tr->trace_buffer.buffer)
		return ring_buffer_record_is_on(tr->trace_buffer.buffer);
	return !tr->buffer_disabled;
}
2012-02-23 00:50:28 +04:00
/**
 * tracing_is_on - show state of ring buffers enabled
 */
int tracing_is_on(void)
{
2013-07-01 23:58:24 +04:00
	return tracer_tracing_is_on(&global_trace);
2012-02-23 00:50:28 +04:00
}
EXPORT_SYMBOL_GPL(tracing_is_on);
2008-09-30 07:02:41 +04:00
static int __init set_buf_size(char *str)
2008-05-12 23:20:42 +04:00
{
2008-09-30 07:02:41 +04:00
	unsigned long buf_size;
2008-05-12 23:21:00 +04:00
2008-05-12 23:20:42 +04:00
	if (!str)
		return 0;
2009-06-24 13:33:15 +04:00
	buf_size = memparse(str, &str);
2008-05-12 23:21:00 +04:00
	/* nr_entries can not be zero */
2009-06-24 13:33:15 +04:00
	if (buf_size == 0)
2008-05-12 23:21:00 +04:00
		return 0;
2008-09-30 07:02:41 +04:00
	trace_buf_size = buf_size;
2008-05-12 23:20:42 +04:00
	return 1;
}
2008-09-30 07:02:41 +04:00
__setup("trace_buf_size=", set_buf_size);
2008-05-12 23:20:42 +04:00
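set_buf_size() relies on memparse() to turn a string such as "4k" into a byte count and rejects a result of zero. The userspace sketch below approximates that parsing with strtoull(). `toy_memparse()` is a hypothetical name, only the K/M/G suffixes are handled, and it does not report where parsing stopped, so treat it as an illustration of the idea and not as the kernel helper.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Userspace approximation of memparse(): parse a number (base detected
 * from the prefix) and an optional K/M/G suffix, each a power of 1024.
 * Returns 0 when no digits were found, which a caller like
 * set_buf_size() would treat as "reject".
 */
unsigned long long toy_memparse(const char *str)
{
	char *end;
	unsigned long long val;

	assert(str != NULL);
	val = strtoull(str, &end, 0);

	switch (*end) {
	case 'G': case 'g':
		val <<= 10;
		/* fall through */
	case 'M': case 'm':
		val <<= 10;
		/* fall through */
	case 'K': case 'k':
		val <<= 10;
		break;
	default:
		break;
	}
	return val;
}
```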
2010-02-26 02:36:43 +03:00
static int __init set_tracing_thresh(char *str)
{
2012-08-02 10:02:00 +04:00
	unsigned long threshold;
2010-02-26 02:36:43 +03:00
	int ret;
	if (!str)
		return 0;
2012-09-27 00:08:38 +04:00
	ret = kstrtoul(str, 0, &threshold);
2010-02-26 02:36:43 +03:00
	if (ret < 0)
		return 0;
2012-08-02 10:02:00 +04:00
	tracing_thresh = threshold * 1000;
2010-02-26 02:36:43 +03:00
	return 1;
}
__setup("tracing_thresh=", set_tracing_thresh);
2008-05-12 23:20:44 +04:00
unsigned long nsecs_to_usecs(unsigned long nsecs)
{
	return nsecs / 1000;
}
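set_tracing_thresh() multiplies the parsed value by 1000 and nsecs_to_usecs() divides by 1000, which suggests the boot parameter is given in microseconds and stored in nanoseconds. The sketch below shows that round trip in userspace under that assumption. `toy_set_thresh()` and `toy_nsecs_to_usecs()` are hypothetical names, and strtoul() with a full-consumption check only approximates the strictness of kstrtoul() (for example, strtoul() accepts a leading minus sign).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Strict parse in the spirit of kstrtoul(): the whole string must be a
 * number. On success stores the threshold in nanoseconds and returns 1,
 * mirroring the handler above, which returns 1 when the option is accepted.
 */
int toy_set_thresh(const char *str, unsigned long *thresh_ns)
{
	char *end;
	unsigned long usecs;

	assert(thresh_ns != NULL);
	if (!str || !*str)
		return 0;

	errno = 0;
	usecs = strtoul(str, &end, 0);
	if (errno || end == str || *end != '\0')
		return 0;

	*thresh_ns = usecs * 1000;	/* assumed: input is in microseconds */
	return 1;
}

/* Inverse conversion, as in nsecs_to_usecs(). */
unsigned long toy_nsecs_to_usecs(unsigned long nsecs)
{
	return nsecs / 1000;
}
```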
2008-05-12 23:21:00 +04:00
/* These must match the bit positions in trace_iterator_flags */
2008-05-12 23:20:42 +04:00
static const char *trace_options[] = {
	"print-parent",
	"sym-offset",
	"sym-addr",
	"verbose",
2008-05-12 23:20:47 +04:00
	"raw",
2008-05-12 23:20:49 +04:00
	"hex",
2008-05-12 23:20:47 +04:00
	"bin",
2008-05-12 23:20:49 +04:00
	"block",
2008-05-12 23:20:51 +04:00
	"stacktrace",
2009-03-05 12:24:48 +03:00
	"trace_printk",
ftrace: function tracer with irqs disabled
Impact: disable interrupts during trace entry creation (as opposed to preempt)
To help with performance, I set the ftracer to not disable interrupts,
and only to disable preemption. If an interrupt occurred, it would not
be traced, because the function tracer protects itself from recursion.
This may be faster, but the trace output might miss some traces.
This patch makes the function tracer disable interrupts, but it also
adds a runtime feature to disable preemption instead. It does this by
having two different tracer functions. When the function tracer is
enabled, it will check to see which version is requested (irqs disabled
or preemption disabled). Then it will use the corresponding function
as the tracer.
Irq disabling is the default behaviour, but if the user wants better
performance, with the chance of missing traces, then they can choose
the preempt disabled version.
Running hackbench 3 times with the irqs disabled and 3 times with
the preempt disabled function tracer yielded:
tracing type      times    entries recorded
----------------  ------   ----------------
irq disabled      43.393   166433066
                  43.282   166172618
                  43.298   166256704
preempt disabled  38.969   159871710
                  38.943   159972935
                  39.325   161056510
Average:
irqs disabled:    43.324   166287462
preempt disabled: 39.079   160300385
preempt is 10.8 percent faster than irqs disabled.
I wrote a patch to count function trace recursion and reran hackbench.
With irq disabled: 1,150 times the function tracer did not trace due to
recursion.
With preempt disabled: 5,117,718 times.
The thousand times with irq disabled could be due to NMIs, or simply a case
where it called a function that was not protected by notrace.
But we also see that a large amount of the trace is lost with the
preempt version.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04 07:15:57 +03:00
	"ftrace_preempt",
2008-11-12 23:24:24 +03:00
	"branch",
2008-11-13 01:52:38 +03:00
	"annotate",
2008-11-22 14:28:47 +03:00
	"userstacktrace",
2008-11-22 14:28:48 +03:00
	"sym-userobj",
2008-12-13 22:18:13 +03:00
	"printk-msg-only",
2009-02-03 01:29:21 +03:00
	"context-info",
2009-03-05 04:34:24 +03:00
	"latency-format",
2009-03-24 18:06:24 +03:00
	"sleep-time",
2009-03-25 06:17:58 +03:00
	"graph-time",
2010-07-02 07:07:32 +04:00
	"record-cmd",
2010-12-09 00:46:47 +03:00
	"overwrite",
2011-06-15 06:44:07 +04:00
	"disable_on_free",
tracing: Add irq, preempt-count and need resched info to default trace output
People keep asking how to get the preempt count, irq, and need resched info
and we keep telling them to enable the latency format. Some developers think
that traces without this info are completely useless, and for a lot of tasks
they are.
The first option was to enable the latency trace as the default format, but
the header for the latency format is pretty useless for most tracers and
it also does the timestamp in straight microseconds from the time the trace
started. This is sometimes more difficult to read as the default trace is
seconds from the start of boot up.
Latency format:
# tracer: nop
#
# nop latency trace v1.1.5 on 3.2.0-rc1-test+
# --------------------------------------------------------------------
# latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule
migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch
migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch
migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule
default format:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule
migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule
migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule
The latency format header has latency information that is pretty meaningless
for most tracers. Although some of the header is useful, and we can add that
later to the default format as well.
What is really useful with the latency format is the irqs-off, need-resched
hard/softirq context and the preempt count.
This commit adds the option irq-info which is on by default that adds this
information:
# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<idle>-0 [000] d..2 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] d..2 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] d..2 49.309309: need_resched <-mwait_idle
<idle>-0 [000] d..2 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] d..2 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] d..2 49.309315: need_resched <-mwait_idle
If a user wants the old format, they can disable the 'irq-info' option:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<idle>-0 [000] 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] 49.309309: need_resched <-mwait_idle
<idle>-0 [000] 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] 49.309315: need_resched <-mwait_idle
Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-11-17 18:34:33 +04:00
	"irq-info",
2012-09-08 05:12:19 +04:00
	"markers",
2013-03-14 20:10:40 +04:00
	"function-trace",
2008-05-12 23:20:42 +04:00
	NULL
} ;
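The comment on trace_options says the strings must match the bit positions in trace_iterator_flags, so the entry at index i names bit i. A small sketch of that convention follows, using a shortened, hypothetical copy of the table (`toy_options`) and a hypothetical lookup helper (`toy_option_mask()`). How the kernel actually parses option writes is not shown in this section, so this only illustrates the index-to-bit correspondence.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shortened, hypothetical copy of the name table; index i names bit i. */
static const char *const toy_options[] = {
	"print-parent",
	"sym-offset",
	"sym-addr",
	"verbose",
	NULL
};

/* Return the flag mask for an option name, or 0 if the name is unknown. */
unsigned long toy_option_mask(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; toy_options[i]; i++) {
		if (strcmp(toy_options[i], name) == 0)
			return 1UL << i;
	}
	return 0;
}
```

Because the mapping is positional, inserting a name in the middle of the table without adding the matching flag bit would silently shift every later option.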
2009-08-25 12:12:56 +04:00
static struct {
	u64 (*func)(void);
	const char *name;
2012-11-14 00:18:22 +04:00
	int in_ns;	/* is this clock in nanoseconds? */
2009-08-25 12:12:56 +04:00
} trace_clocks[] = {
2014-07-17 01:05:25 +04:00
	{ trace_clock_local, "local", 1 },
	{ trace_clock_global, "global", 1 },
	{ trace_clock_counter, "counter", 0 },
2014-08-06 04:46:42 +04:00
	{ trace_clock_jiffies, "uptime", 0 },
2014-07-17 01:05:25 +04:00
	{ trace_clock, "perf", 1 },
	{ ktime_get_mono_fast_ns, "mono", 1 },
2015-05-08 17:30:39 +03:00
	{ ktime_get_raw_fast_ns, "mono_raw", 1 },
2012-11-14 00:18:21 +04:00
	ARCH_TRACE_CLOCKS
2009-08-25 12:12:56 +04:00
} ;
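trace_clocks is a table of function pointers keyed by name, with a flag saying whether the clock counts nanoseconds. The sketch below shows one way such a table can be searched by name. The clock functions and `toy_find_clock()` are hypothetical stand-ins; the kernel code that selects a clock is not part of this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical clock sources standing in for trace_clock_local() and friends. */
static uint64_t toy_clock_ns(void) { return 1000; }
static uint64_t toy_clock_counter(void) { static uint64_t n; return ++n; }

static const struct {
	uint64_t (*func)(void);
	const char *name;
	int in_ns;	/* is this clock in nanoseconds? */
} toy_clocks[] = {
	{ toy_clock_ns,      "local",   1 },
	{ toy_clock_counter, "counter", 0 },
};

/* Return the table index for a clock name, or -1 if there is no such clock. */
int toy_find_clock(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(toy_clocks) / sizeof(toy_clocks[0]); i++) {
		if (strcmp(toy_clocks[i].name, name) == 0)
			return (int)i;
	}
	return -1;
}
```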
2009-09-11 19:29:27 +04:00
/*
 * trace_parser_get_init - gets the buffer for trace parser
 */
int trace_parser_get_init(struct trace_parser *parser, int size)
{
	memset(parser, 0, sizeof(*parser));
	parser->buffer = kmalloc(size, GFP_KERNEL);
	if (!parser->buffer)
		return 1;
	parser->size = size;
	return 0;
}
/*
 * trace_parser_put - frees the buffer for trace parser
 */
void trace_parser_put(struct trace_parser *parser)
{
	kfree(parser->buffer);
}
/*
 * trace_get_user - reads the user input string separated by space
 * (matched by isspace(ch))
 *
 * For each string found the 'struct trace_parser' is updated,
 * and the function returns.
 *
 * Returns number of bytes read.
 *
 * See kernel/trace/trace.h for 'struct trace_parser' details.
 */
int trace_get_user(struct trace_parser *parser, const char __user *ubuf,
	size_t cnt, loff_t *ppos)
{
	char ch;
	size_t read = 0;
	ssize_t ret;
	if (!*ppos)
		trace_parser_clear(parser);
	ret = get_user(ch, ubuf++);
	if (ret)
		goto out;
	read++;
	cnt--;
	/*
	 * The parser is not finished with the last write,
	 * continue reading the user input without skipping spaces.
	 */
	if (!parser->cont) {
		/* skip white space */
		while (cnt && isspace(ch)) {
			ret = get_user(ch, ubuf++);
			if (ret)
				goto out;
			read++;
			cnt--;
		}
		/* only spaces were written */
		if (isspace(ch)) {
			*ppos += read;
			ret = read;
			goto out;
		}
		parser->idx = 0;
	}
	/* read the non-space input */
	while (cnt && !isspace(ch)) {
2009-09-22 09:51:54 +04:00
		if (parser->idx < parser->size - 1)
2009-09-11 19:29:27 +04:00
			parser->buffer[parser->idx++] = ch;
		else {
			ret = -EINVAL;
			goto out;
		}
		ret = get_user(ch, ubuf++);
		if (ret)
			goto out;
		read++;
		cnt--;
	}
	/* We either got finished input or we have to wait for another call. */
	if (isspace(ch)) {
		parser->buffer[parser->idx] = 0;
		parser->cont = false;
2013-10-10 06:23:23 +04:00
	} else if (parser->idx < parser->size - 1) {
2009-09-11 19:29:27 +04:00
		parser->cont = true;
		parser->buffer[parser->idx++] = ch;
2013-10-10 06:23:23 +04:00
	} else {
		ret = -EINVAL;
		goto out;
2009-09-11 19:29:27 +04:00
	}
	*ppos += read;
	ret = read;
out:
	return ret;
}
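trace_get_user() extracts one whitespace-separated token per call and uses the `cont` flag to resume a token that was split across two writes. The userspace sketch below models that state machine over in-memory chunks instead of user pointers. `struct toy_parser` and `toy_get_token()` are hypothetical, the buffer size is fixed for brevity, and error handling is reduced to a -1 return when a token does not fit.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_TOKEN_MAX 16

/* Hypothetical userspace counterpart of struct trace_parser. */
struct toy_parser {
	char buffer[TOY_TOKEN_MAX];
	size_t idx;
	bool cont;	/* previous chunk ended in the middle of a token */
};

/*
 * Consume at most one token from chunk[0..len). Returns the number of
 * bytes consumed, or -1 if the token does not fit. When the call leaves
 * p->cont false and p->idx non-zero, p->buffer holds a complete,
 * NUL-terminated token.
 */
long toy_get_token(struct toy_parser *p, const char *chunk, size_t len)
{
	size_t pos = 0;

	assert(p != NULL && chunk != NULL);

	if (!p->cont) {
		/* skip leading white space, then start a fresh token */
		while (pos < len && isspace((unsigned char)chunk[pos]))
			pos++;
		p->idx = 0;
	}

	/* copy the non-space run, always leaving room for the terminator */
	while (pos < len && !isspace((unsigned char)chunk[pos])) {
		if (p->idx >= TOY_TOKEN_MAX - 1)
			return -1;
		p->buffer[p->idx++] = chunk[pos++];
	}

	if (pos < len) {
		/* hit a separator: the token is complete */
		p->buffer[p->idx] = '\0';
		p->cont = false;
		pos++;	/* consume the separator as well */
	} else {
		/* chunk ended mid-token: wait for the next call */
		p->cont = (p->idx > 0);
	}
	return (long)pos;
}
```

Feeding "  sched_" and then "switch rest" yields the single token "sched_switch" after the second call, which is the split-write case the `cont` flag exists for.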
2014-06-25 23:54:42 +04:00
/* TODO add a seq_buf_to_buffer() */
2009-03-22 20:11:11 +03:00
static ssize_t trace_seq_to_buffer(struct trace_seq *s, void *buf, size_t cnt)
2009-02-09 09:15:56 +03:00
{
	int len;
2014-11-14 23:49:41 +03:00
	if (trace_seq_used(s) <= s->seq.readpos)
2009-02-09 09:15:56 +03:00
		return -EBUSY;
2014-11-14 23:49:41 +03:00
	len = trace_seq_used(s) - s->seq.readpos;
2009-02-09 09:15:56 +03:00
	if (cnt > len)
		cnt = len;
2014-06-25 23:54:42 +04:00
	memcpy(buf, s->buffer + s->seq.readpos, cnt);
2009-02-09 09:15:56 +03:00
2014-06-25 23:54:42 +04:00
	s->seq.readpos += cnt;
2009-02-09 09:15:56 +03:00
	return cnt;
}
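trace_seq_to_buffer() hands out the unread part of a sequence buffer in pieces, advancing a read position and reporting an error once nothing is left. A simplified userspace model is sketched below. `struct toy_seq` and `toy_seq_to_buffer()` are hypothetical, `len` plays the role of trace_seq_used(), and -1 stands in for -EBUSY.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct trace_seq. */
struct toy_seq {
	char buffer[64];
	size_t len;	/* bytes written so far */
	size_t readpos;	/* bytes already handed out */
};

/*
 * Copy up to cnt unread bytes into buf and advance the read position.
 * Returns the number of bytes copied, or -1 when nothing is left.
 */
long toy_seq_to_buffer(struct toy_seq *s, void *buf, size_t cnt)
{
	size_t avail;

	assert(s != NULL && buf != NULL);
	assert(s->len <= sizeof(s->buffer));

	if (s->len <= s->readpos)
		return -1;

	avail = s->len - s->readpos;
	if (cnt > avail)
		cnt = avail;
	memcpy(buf, s->buffer + s->readpos, cnt);
	s->readpos += cnt;
	return (long)cnt;
}
```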
2010-02-26 02:36:43 +03:00
unsigned long __read_mostly tracing_thresh ;
2009-08-28 00:52:21 +04:00
#ifdef CONFIG_TRACER_MAX_TRACE
/*
 * Copy the new maximum trace into the separate maximum-trace
 * structure. (this way the maximum trace is permanently saved,
 * for later retrieval via /sys/kernel/debug/tracing/latency_trace)
 */
static void
__update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
{
tracing: Consolidate max_tr into main trace_array structure
Currently, the latency tracers and the snapshot feature work by
having a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, the running buffer is swapped
with this snapshot buffer to save the current max latency.
The only items the max_tr really needs are a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer, that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_tr, but trace instances can now use
their own snapshot feature, instead of only the top level global_trace having
the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 18:24:35 +04:00
	struct trace_buffer *trace_buf = &tr->trace_buffer;
	struct trace_buffer *max_buf = &tr->max_buffer;
	struct trace_array_cpu *data = per_cpu_ptr(trace_buf->data, cpu);
	struct trace_array_cpu *max_data = per_cpu_ptr(max_buf->data, cpu);
2009-08-28 00:52:21 +04:00
2013-03-05 18:24:35 +04:00
	max_buf->cpu = cpu;
	max_buf->time_start = data->preempt_timestamp;
2009-08-28 00:52:21 +04:00
2014-01-14 20:28:38 +04:00
	max_data->saved_latency = tr->max_latency;
2009-09-02 20:27:41 +04:00
	max_data->critical_start = data->critical_start;
	max_data->critical_end = data->critical_end;
2009-08-28 00:52:21 +04:00
2010-03-06 00:23:50 +03:00
	memcpy(max_data->comm, tsk->comm, TASK_COMM_LEN);
2009-09-02 20:27:41 +04:00
	max_data->pid = tsk->pid;
tracing: Use current_uid() for critical time tracing
The irqsoff tracer records the max time that interrupts are disabled.
There are hooks in the assembly code that calls back into the tracer when
interrupts are disabled or enabled.
When they are enabled, the tracer checks if the amount of time they
were disabled is larger than the previous recorded max interrupts off
time. If it is, it creates a snapshot of the currently running trace
to store where the last largest interrupts off time was held and how
it happened.
During testing, this RCU lockdep dump appeared:
[ 1257.829021] ===============================
[ 1257.829021] [ INFO: suspicious RCU usage. ]
[ 1257.829021] 3.10.0-rc1-test+ #171 Tainted: G W
[ 1257.829021] -------------------------------
[ 1257.829021] /home/rostedt/work/git/linux-trace.git/include/linux/rcupdate.h:780 rcu_read_lock() used illegally while idle!
[ 1257.829021]
[ 1257.829021] other info that might help us debug this:
[ 1257.829021]
[ 1257.829021]
[ 1257.829021] RCU used illegally from idle CPU!
[ 1257.829021] rcu_scheduler_active = 1, debug_locks = 0
[ 1257.829021] RCU used illegally from extended quiescent state!
[ 1257.829021] 2 locks held by trace-cmd/4831:
[ 1257.829021] #0: (max_trace_lock){......}, at: [<ffffffff810e2b77>] stop_critical_timing+0x1a3/0x209
[ 1257.829021] #1: (rcu_read_lock){.+.+..}, at: [<ffffffff810dae5a>] __update_max_tr+0x88/0x1ee
[ 1257.829021]
[ 1257.829021] stack backtrace:
[ 1257.829021] CPU: 3 PID: 4831 Comm: trace-cmd Tainted: G W 3.10.0-rc1-test+ #171
[ 1257.829021] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
[ 1257.829021] 0000000000000001 ffff880065f49da8 ffffffff8153dd2b ffff880065f49dd8
[ 1257.829021] ffffffff81092a00 ffff88006bd78680 ffff88007add7500 0000000000000003
[ 1257.829021] ffff88006bd78680 ffff880065f49e18 ffffffff810daebf ffffffff810dae5a
[ 1257.829021] Call Trace:
[ 1257.829021] [<ffffffff8153dd2b>] dump_stack+0x19/0x1b
[ 1257.829021] [<ffffffff81092a00>] lockdep_rcu_suspicious+0x109/0x112
[ 1257.829021] [<ffffffff810daebf>] __update_max_tr+0xed/0x1ee
[ 1257.829021] [<ffffffff810dae5a>] ? __update_max_tr+0x88/0x1ee
[ 1257.829021] [<ffffffff811002b9>] ? user_enter+0xfd/0x107
[ 1257.829021] [<ffffffff810dbf85>] update_max_tr_single+0x11d/0x12d
[ 1257.829021] [<ffffffff811002b9>] ? user_enter+0xfd/0x107
[ 1257.829021] [<ffffffff810e2b15>] stop_critical_timing+0x141/0x209
[ 1257.829021] [<ffffffff8109569a>] ? trace_hardirqs_on+0xd/0xf
[ 1257.829021] [<ffffffff811002b9>] ? user_enter+0xfd/0x107
[ 1257.829021] [<ffffffff810e3057>] time_hardirqs_on+0x2a/0x2f
[ 1257.829021] [<ffffffff811002b9>] ? user_enter+0xfd/0x107
[ 1257.829021] [<ffffffff8109550c>] trace_hardirqs_on_caller+0x16/0x197
[ 1257.829021] [<ffffffff8109569a>] trace_hardirqs_on+0xd/0xf
[ 1257.829021] [<ffffffff811002b9>] user_enter+0xfd/0x107
[ 1257.829021] [<ffffffff810029b4>] do_notify_resume+0x92/0x97
[ 1257.829021] [<ffffffff8154bdca>] int_signal+0x12/0x17
What happened was entering into the user code, the interrupts were enabled
and a max interrupts off was recorded. The trace buffer was saved along with
various information about the task: comm, pid, uid, priority, etc.
The uid is recorded with task_uid(tsk). But this is a macro that uses rcu_read_lock()
to retrieve the data, and here it ran at a point where RCU is blind (user_enter).
As only the preempt and irqs off tracers can have this happen, and they both
only have the tsk == current, if tsk == current, use current_uid() instead of
task_uid(), as current_uid() does not use RCU as only current can change its uid.
This fixes the RCU suspicious splat.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-05-31 05:10:37 +04:00
	/*
	 * If tsk == current, then use current_uid(), as that does not use
	 * RCU. The irq tracer can be called out of RCU scope.
	 */
	if (tsk == current)
		max_data->uid = current_uid();
	else
		max_data->uid = task_uid(tsk);
2009-09-02 20:27:41 +04:00
	max_data->nice = tsk->static_prio - 20 - MAX_RT_PRIO;
	max_data->policy = tsk->policy;
	max_data->rt_priority = tsk->rt_priority;
2009-08-28 00:52:21 +04:00
	/* record this task's comm */
	tracing_record_cmdline(tsk);
}
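The expression `tsk->static_prio - 20 - MAX_RT_PRIO` converts a static priority into a nice value. As a worked sketch, assuming MAX_RT_PRIO is 100 and static priorities for normal tasks span 100 to 139 (both are assumptions, since neither value appears in this section), the mapping can be checked in userspace. `toy_static_prio_to_nice()` and `TOY_MAX_RT_PRIO` are hypothetical names.

```c
#include <assert.h>

#define TOY_MAX_RT_PRIO 100	/* assumed value of MAX_RT_PRIO */

/* Convert a static priority (assumed range 100..139) to a nice value (-20..19). */
int toy_static_prio_to_nice(int static_prio)
{
	assert(static_prio >= TOY_MAX_RT_PRIO &&
	       static_prio < TOY_MAX_RT_PRIO + 40);
	return static_prio - 20 - TOY_MAX_RT_PRIO;
}
```

Under those assumptions a default task with static priority 120 maps to nice 0.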
2008-05-12 23:21:00 +04:00
/**
 * update_max_tr - snapshot all trace buffers from global_trace to max_tr
 * @tr: tracer
 * @tsk: the task with the latency
 * @cpu: The cpu that initiated the trace.
 *
 * Flip the buffers between the @tr and the max_tr and record information
 * about which task was the cause of this latency.
 */
2008-05-12 23:20:51 +04:00
void
2008-05-12 23:20:42 +04:00
update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
{
2013-03-12 19:32:32 +04:00
	struct ring_buffer *buf;
2008-05-12 23:20:42 +04:00
2012-05-11 21:29:49 +04:00
	if (tr->stop_count)
2009-09-01 06:32:27 +04:00
		return;
2008-05-12 23:20:43 +04:00
	WARN_ON_ONCE(!irqs_disabled());
2013-01-22 22:35:11 +04:00
2013-03-06 03:25:02 +04:00
	if (!tr->allocated_snapshot) {
2012-12-26 06:53:00 +04:00
		/* Only the nop tracer should hit this when disabling */
2012-05-11 21:29:49 +04:00
		WARN_ON_ONCE(tr->current_trace != &nop_trace);
2013-01-22 22:35:11 +04:00
		return;
2012-12-26 06:53:00 +04:00
	}
2013-01-22 22:35:11 +04:00
2014-01-14 19:04:59 +04:00
	arch_spin_lock(&tr->max_lock);
2008-09-30 07:02:41 +04:00
2013-03-05 18:24:35 +04:00
	buf = tr->trace_buffer.buffer;
	tr->trace_buffer.buffer = tr->max_buffer.buffer;
	tr->max_buffer.buffer = buf;
2008-09-30 07:02:41 +04:00
2008-05-12 23:20:42 +04:00
	__update_max_tr(tr, tsk, cpu);
2014-01-14 19:04:59 +04:00
	arch_spin_unlock(&tr->max_lock);
2008-05-12 23:20:42 +04:00
}
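update_max_tr() takes its snapshot by exchanging the two buffer pointers under tr->max_lock, so no trace data is copied. The core of that flip can be sketched in userspace as below. `struct toy_array` and `toy_update_max()` are hypothetical, plain int arrays stand in for ring buffers, and the locking and interrupts-disabled requirement of the real function are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a trace_array with a live and a max (snapshot) buffer. */
struct toy_array {
	int *live;
	int *max;
	int max_cpu;	/* CPU that triggered the snapshot */
};

/* Swap the two buffer pointers and note which CPU triggered the swap. No data is copied. */
void toy_update_max(struct toy_array *tr, int cpu)
{
	int *tmp;

	assert(tr != NULL && tr->live != NULL && tr->max != NULL);
	tmp = tr->live;
	tr->live = tr->max;
	tr->max = tmp;
	tr->max_cpu = cpu;
}
```

After the swap, the old live buffer is preserved as the snapshot and tracing continues into what used to be the spare buffer.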
/**
 * update_max_tr_single - only copy one trace over, and reset the rest
 * @tr - tracer
 * @tsk - task with the latency
 * @cpu - the cpu of the buffer to copy.
2008-05-12 23:21:00 +04:00
 *
 * Flip the trace of a single CPU buffer between the @tr and the max_tr.
2008-05-12 23:20:42 +04:00
 */
2008-05-12 23:20:51 +04:00
void
2008-05-12 23:20:42 +04:00
update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
{
2008-09-30 07:02:41 +04:00
	int ret;
2008-05-12 23:20:42 +04:00
2012-05-11 21:29:49 +04:00
	if (tr->stop_count)
2009-09-01 06:32:27 +04:00
		return;
2008-05-12 23:20:43 +04:00
	WARN_ON_ONCE(!irqs_disabled());
2013-04-30 04:08:14 +04:00
	if (!tr->allocated_snapshot) {
2013-03-27 01:33:00 +04:00
		/* Only the nop tracer should hit this when disabling */
Tracing updates for Linux 3.10
Along with the usual minor fixes and clean ups there are a few major
changes with this pull request.
1) Multiple buffers for the ftrace facility
This feature has been requested by many people over the last few years.
I even heard that Google was about to implement it themselves. I finally
had time and cleaned up the code such that you can now create multiple
instances of the ftrace buffer and have different events go to different
buffers. This way, a low frequency event will not be lost in the noise
of a high frequency event.
Note, currently only events can go to different buffers, the tracers
(ie. function, function_graph and the latency tracers) still can only
be written to the main buffer.
2) The function tracer triggers have now been extended.
The function tracer had two triggers. One to enable tracing when a
function is hit, and one to disable tracing. Now you can record a
stack trace on a single (or many) function(s), take a snapshot of the
buffer (copy it to the snapshot buffer), and you can enable or disable
an event to be traced when a function is hit.
3) A perf clock has been added.
A "perf" clock can be chosen to be used when tracing. This will cause
ftrace to use the same clock as perf uses, and hopefully this will make
it easier to interleave the perf and ftrace data for analysis.
Merge tag 'trace-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"Along with the usual minor fixes and clean ups there are a few major
changes with this pull request.
1) Multiple buffers for the ftrace facility
This feature has been requested by many people over the last few
years. I even heard that Google was about to implement it themselves.
I finally had time and cleaned up the code such that you can now
create multiple instances of the ftrace buffer and have different
events go to different buffers. This way, a low frequency event will
not be lost in the noise of a high frequency event.
Note, currently only events can go to different buffers, the tracers
(ie function, function_graph and the latency tracers) still can only
be written to the main buffer.
2) The function tracer triggers have now been extended.
The function tracer had two triggers. One to enable tracing when a
function is hit, and one to disable tracing. Now you can record a
stack trace on a single (or many) function(s), take a snapshot of the
buffer (copy it to the snapshot buffer), and you can enable or disable
an event to be traced when a function is hit.
3) A perf clock has been added.
A "perf" clock can be chosen to be used when tracing. This will cause
ftrace to use the same clock as perf uses, and hopefully this will
make it easier to interleave the perf and ftrace data for analysis."
* tag 'trace-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (82 commits)
tracepoints: Prevent null probe from being added
tracing: Compare to 1 instead of zero for is_signed_type()
tracing: Remove obsolete macro guard _TRACE_PROFILE_INIT
ftrace: Get rid of ftrace_profile_bits
tracing: Check return value of tracing_init_dentry()
tracing: Get rid of unneeded key calculation in ftrace_hash_move()
tracing: Reset ftrace_graph_filter_enabled if count is zero
tracing: Fix off-by-one on allocating stat->pages
kernel: tracing: Use strlcpy instead of strncpy
tracing: Update debugfs README file
tracing: Fix ftrace_dump()
tracing: Rename trace_event_mutex to trace_event_sem
tracing: Fix comment about prefix in arch_syscall_match_sym_name()
tracing: Convert trace_destroy_fields() to static
tracing: Move find_event_field() into trace_events.c
tracing: Use TRACE_MAX_PRINT instead of constant
tracing: Use pr_warn_once instead of open coded implementation
ring-buffer: Add ring buffer startup selftest
tracing: Bring Documentation/trace/ftrace.txt up to date
tracing: Add "perf" trace_clock
...
Conflicts:
kernel/trace/ftrace.c
kernel/trace/trace.c
2013-04-30 00:55:38 +04:00
WARN_ON_ONCE ( tr - > current_trace ! = & nop_trace ) ;
2010-07-01 09:34:35 +04:00
return ;
2013-03-27 01:33:00 +04:00
}
2010-07-01 09:34:35 +04:00
2014-01-14 19:04:59 +04:00
arch_spin_lock ( & tr - > max_lock ) ;
2008-05-12 23:20:42 +04:00
tracing: Consolidate max_tr into main trace_array structure
Currently, the way the latency tracers and snapshot feature works
is to have a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, this snapshot buffer is used
to swap the running buffer with this buffer to save the current max
latency.
The only items needed for the max_tr is really just a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer, that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_tr, but trace instances can now use their own
snapshot feature, instead of only the top level global_trace having
the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 18:24:35 +04:00
ret = ring_buffer_swap_cpu ( tr - > max_buffer . buffer , tr - > trace_buffer . buffer , cpu ) ;
2008-09-30 07:02:41 +04:00
2009-09-04 03:13:05 +04:00
if ( ret = = - EBUSY ) {
/*
* We failed to swap the buffer due to a commit taking
* place on this CPU . We fail to record , but we reset
* the max trace buffer ( no one writes directly to it )
* and flag that it failed .
*/
2013-03-05 18:24:35 +04:00
trace_array_printk_buf ( tr - > max_buffer . buffer , _THIS_IP_ ,
2009-09-04 03:13:05 +04:00
" Failed to swap buffers due to commit in progress \n " ) ;
}
WARN_ON_ONCE ( ret & & ret ! = - EAGAIN & & ret ! = - EBUSY ) ;
2008-05-12 23:20:42 +04:00
__update_max_tr ( tr , tsk , cpu ) ;
2014-01-14 19:04:59 +04:00
arch_spin_unlock ( & tr - > max_lock ) ;
2008-05-12 23:20:42 +04:00
}
2009-08-28 00:52:21 +04:00
# endif /* CONFIG_TRACER_MAX_TRACE */
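The swap-or-fail logic in the function above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: `struct cpu_buf`, `struct two_buffers`, `swap_cpu()` and `save_max()` are hypothetical names invented for illustration, and the `committing` flag merely stands in for the condition under which `ring_buffer_swap_cpu()` returns `-EBUSY`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one CPU's ring buffer. */
struct cpu_buf {
	int id;
	bool committing;	/* a writer is in the middle of a commit */
};

/* Hypothetical pair of buffers: the live trace and the max/snapshot copy. */
struct two_buffers {
	struct cpu_buf *live;	/* plays the role of trace_buffer */
	struct cpu_buf *max;	/* plays the role of max_buffer */
	int failed_swaps;	/* how often the swap had to be skipped */
};

/* Swap the two buffer pointers unless a commit is in progress. */
static int swap_cpu(struct two_buffers *b)
{
	struct cpu_buf *tmp;

	assert(b && b->live && b->max);
	if (b->live->committing)
		return -EBUSY;

	tmp = b->live;
	b->live = b->max;
	b->max = tmp;
	return 0;
}

/* Try to save the current trace; on -EBUSY only note the failure. */
static int save_max(struct two_buffers *b)
{
	int ret = swap_cpu(b);

	if (ret == -EBUSY)
		b->failed_swaps++;
	return ret;
}
```

Because only pointers are exchanged, saving a new maximum costs the same regardless of buffer size; the price is that a busy writer makes the save fail rather than wait.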
2008-05-12 23:20:42 +04:00
2014-11-10 21:46:34 +03:00
static int wait_on_pipe ( struct trace_iterator * iter , bool full )
2012-11-02 04:54:21 +04:00
{
2013-03-01 04:59:17 +04:00
/* Iterators are static, they should be filled or empty */
if ( trace_buffer_iter ( iter , iter - > cpu_file ) )
2014-06-10 17:46:00 +04:00
return 0 ;
2012-11-02 04:54:21 +04:00
2014-11-10 21:46:34 +03:00
return ring_buffer_wait ( iter - > trace_buffer - > buffer , iter - > cpu_file ,
full ) ;
2012-11-02 04:54:21 +04:00
}
2013-03-07 20:10:56 +04:00
# ifdef CONFIG_FTRACE_STARTUP_TEST
static int run_tracer_selftest ( struct tracer * type )
{
struct trace_array * tr = & global_trace ;
struct tracer * saved_tracer = tr - > current_trace ;
int ret ;
2012-11-02 04:54:21 +04:00
2013-03-07 20:10:56 +04:00
if ( ! type - > selftest | | tracing_selftest_disabled )
return 0 ;
2012-11-02 04:54:21 +04:00
/*
2013-03-07 20:10:56 +04:00
* Run a selftest on this tracer .
* Here we reset the trace buffer , and set the current
* tracer to be this tracer . The tracer can then run some
* internal tracing to verify that everything is in order .
* If we fail , we do not register this tracer .
2012-11-02 04:54:21 +04:00
*/
2013-03-07 20:10:56 +04:00
tracing_reset_online_cpus ( & tr - > trace_buffer ) ;
2012-11-02 04:54:21 +04:00
2013-03-07 20:10:56 +04:00
tr - > current_trace = type ;
#ifdef CONFIG_TRACER_MAX_TRACE
	if (type->use_max_tr) {
		/* If we expanded the buffers, make sure the max is expanded too */
		if (ring_buffer_expanded)
			ring_buffer_resize(tr->max_buffer.buffer, trace_buf_size,
					   RING_BUFFER_ALL_CPUS);
		tr->allocated_snapshot = true;
	}
#endif
	/* the test is responsible for initializing and enabling */
	pr_info("Testing tracer %s: ", type->name);
	ret = type->selftest(type, tr);
	/* the test is responsible for resetting too */
	tr->current_trace = saved_tracer;
	if (ret) {
		printk(KERN_CONT "FAILED!\n");
		/* Add the warning after printing 'FAILED' */
		WARN_ON(1);
		return -1;
	}
	/* Only reset on passing, to avoid touching corrupted buffers */
	tracing_reset_online_cpus(&tr->trace_buffer);
#ifdef CONFIG_TRACER_MAX_TRACE
	if (type->use_max_tr) {
		tr->allocated_snapshot = false;
2012-11-02 04:54:21 +04:00
2013-03-07 20:10:56 +04:00
/* Shrink the max buffer again */
if ( ring_buffer_expanded )
ring_buffer_resize ( tr - > max_buffer . buffer , 1 ,
RING_BUFFER_ALL_CPUS ) ;
}
# endif
printk ( KERN_CONT " PASSED \n " ) ;
return 0 ;
}
# else
static inline int run_tracer_selftest ( struct tracer * type )
{
return 0 ;
2012-11-02 04:54:21 +04:00
}
2013-03-07 20:10:56 +04:00
# endif /* CONFIG_FTRACE_STARTUP_TEST */
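The save/run/restore pattern used by run_tracer_selftest() can be sketched in plain C. This is an illustrative userspace model under stated assumptions: `struct toy_tracer`, `struct toy_ctx`, `run_selftest()` and the two callbacks are hypothetical names, and the `resets` counter only stands in for the calls to tracing_reset_online_cpus().

```c
#include <assert.h>
#include <stddef.h>

struct toy_ctx;

/* Hypothetical tracer plugin with an optional selftest callback. */
struct toy_tracer {
	const char *name;
	int (*selftest)(struct toy_tracer *t, struct toy_ctx *ctx);
};

/* Hypothetical tracing context: the active tracer and a reset counter. */
struct toy_ctx {
	struct toy_tracer *cur;
	int resets;
};

/* Returns 0 if the test passed or was skipped, -1 if it failed. */
static int run_selftest(struct toy_ctx *ctx, struct toy_tracer *type,
			int disabled)
{
	struct toy_tracer *saved = ctx->cur;
	int ret;

	if (!type->selftest || disabled)
		return 0;

	ctx->resets++;			/* start from an empty buffer */
	ctx->cur = type;		/* make the tracer under test current */
	ret = type->selftest(type, ctx);
	ctx->cur = saved;		/* always restore the previous tracer */
	if (ret)
		return -1;		/* leave the buffer alone on failure */

	ctx->resets++;			/* clean up only after a pass */
	return 0;
}

static int always_pass(struct toy_tracer *t, struct toy_ctx *ctx)
{
	(void)t;
	(void)ctx;
	return 0;
}

static int always_fail(struct toy_tracer *t, struct toy_ctx *ctx)
{
	(void)t;
	(void)ctx;
	return 1;
}
```

Note that the previous tracer is restored on both paths, while the second reset is skipped on failure, mirroring the "Only reset on passing" comment in the code above.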
2012-11-02 04:54:21 +04:00
2008-05-12 23:21:00 +04:00
/**
 * register_tracer - register a tracer with the ftrace system.
 * @type: the plugin for the tracer
 *
 * Register a new plugin tracer.
 */
2008-05-12 23:20:42 +04:00
int register_tracer(struct tracer *type)
{
	struct tracer *t;
	int ret = 0;

	if (!type->name) {
		pr_info("Tracer must have a name\n");
		return -1;
	}
2010-07-10 14:06:44 +04:00
if ( strlen ( type - > name ) > = MAX_TRACER_SIZE ) {
2009-09-18 10:06:47 +04:00
pr_info ( " Tracer has a name longer than %d \n " , MAX_TRACER_SIZE ) ;
return - 1 ;
}
2008-05-12 23:20:42 +04:00
mutex_lock ( & trace_types_lock ) ;
2008-11-19 12:00:15 +03:00
2008-12-06 05:41:33 +03:00
tracing_selftest_running = true ;
2008-05-12 23:20:42 +04:00
for ( t = trace_types ; t ; t = t - > next ) {
if ( strcmp ( type - > name , t - > name ) = = 0 ) {
/* already found */
2009-09-18 10:06:47 +04:00
pr_info ( " Tracer %s already registered \n " ,
2008-05-12 23:20:42 +04:00
type - > name ) ;
ret = - 1 ;
goto out ;
}
}
2008-11-17 21:23:42 +03:00
	if (!type->set_flag)
		type->set_flag = &dummy_set_flag;
	if (!type->flags)
		type->flags = &dummy_tracer_flags;
	else if (!type->flags->opts)
		type->flags->opts = dummy_tracer_opt;
2009-02-11 04:25:00 +03:00
2013-03-07 20:10:56 +04:00
ret = run_tracer_selftest ( type ) ;
if ( ret < 0 )
goto out ;
2008-05-12 23:20:44 +04:00
2008-05-12 23:20:42 +04:00
type - > next = trace_types ;
trace_types = type ;
2008-05-12 23:20:44 +04:00
2008-05-12 23:20:42 +04:00
out :
2008-12-06 05:41:33 +03:00
tracing_selftest_running = false ;
2008-05-12 23:20:42 +04:00
mutex_unlock ( & trace_types_lock ) ;
2009-02-05 09:13:38 +03:00
if ( ret | | ! default_bootup_tracer )
goto out_unlock ;
2009-09-18 10:06:47 +04:00
if ( strncmp ( default_bootup_tracer , type - > name , MAX_TRACER_SIZE ) )
2009-02-05 09:13:38 +03:00
goto out_unlock ;
printk ( KERN_INFO " Starting tracer '%s' \n " , type - > name ) ;
/* Do we want this tracer to start on bootup? */
2013-11-07 07:42:48 +04:00
tracing_set_tracer ( & global_trace , type - > name ) ;
2009-02-05 09:13:38 +03:00
default_bootup_tracer = NULL ;
/* disable other selftests, since this will break it. */
2013-03-08 07:48:09 +04:00
tracing_selftest_disabled = true ;
2009-02-03 05:38:32 +03:00
# ifdef CONFIG_FTRACE_STARTUP_TEST
2009-02-05 09:13:38 +03:00
printk ( KERN_INFO " Disabling FTRACE selftests due to running tracer '%s' \n " ,
type - > name ) ;
2009-02-03 05:38:32 +03:00
# endif
2009-02-05 09:13:38 +03:00
out_unlock :
2008-05-12 23:20:42 +04:00
return ret ;
}
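The core of register_tracer() is a name check followed by a push onto a singly linked list. A minimal userspace sketch of just that part follows; `struct plugin`, `plugin_list`, `register_plugin()` and `MAX_NAME_LEN` are hypothetical names chosen for illustration, and locking, selftests and the boot-time tracer handling are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME_LEN 100	/* assumed limit, standing in for MAX_TRACER_SIZE */

/* Hypothetical plugin descriptor, linked through ->next. */
struct plugin {
	const char *name;
	struct plugin *next;
};

static struct plugin *plugin_list;	/* head of the registered list */

/* Returns 0 on success, -1 if the name is missing, too long or taken. */
static int register_plugin(struct plugin *p)
{
	struct plugin *t;

	if (!p->name)
		return -1;
	if (strlen(p->name) >= MAX_NAME_LEN)
		return -1;

	/* Reject duplicates by name. */
	for (t = plugin_list; t; t = t->next) {
		if (strcmp(p->name, t->name) == 0)
			return -1;
	}

	/* Push onto the front of the list. */
	p->next = plugin_list;
	plugin_list = p;
	return 0;
}
```

Pushing at the head keeps registration O(1) after the duplicate scan; the most recently registered plugin is found first when the list is walked.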
2013-03-05 18:24:35 +04:00
void tracing_reset ( struct trace_buffer * buf , int cpu )
2009-09-04 20:35:16 +04:00
{
2013-03-05 18:24:35 +04:00
struct ring_buffer * buffer = buf - > buffer ;
2009-09-04 20:35:16 +04:00
2012-12-19 11:02:34 +04:00
if ( ! buffer )
return ;
2009-09-04 20:35:16 +04:00
ring_buffer_record_disable ( buffer ) ;
/* Make sure all commits have finished */
synchronize_sched ( ) ;
2012-05-09 04:57:53 +04:00
ring_buffer_reset_cpu ( buffer , cpu ) ;
2009-09-04 20:35:16 +04:00
ring_buffer_record_enable ( buffer ) ;
}
2013-03-05 18:24:35 +04:00
void tracing_reset_online_cpus ( struct trace_buffer * buf )
2008-12-19 13:08:39 +03:00
{
2013-03-05 18:24:35 +04:00
struct ring_buffer * buffer = buf - > buffer ;
2008-12-19 13:08:39 +03:00
int cpu ;
2012-12-19 11:02:34 +04:00
if ( ! buffer )
return ;
2009-09-04 20:02:35 +04:00
ring_buffer_record_disable ( buffer ) ;
/* Make sure all commits have finished */
synchronize_sched ( ) ;
2013-08-03 05:36:16 +04:00
buf - > time_start = buffer_ftrace_now ( buf , buf - > cpu ) ;
2008-12-19 13:08:39 +03:00
for_each_online_cpu ( cpu )
2012-05-09 04:57:53 +04:00
ring_buffer_reset_cpu ( buffer , cpu ) ;
2009-09-04 20:02:35 +04:00
ring_buffer_record_enable ( buffer ) ;
2008-12-19 13:08:39 +03:00
}
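Both reset helpers above follow the same order: stop new writes, wait for commits in flight, clear the data, then allow writes again. The sketch below models that order in userspace with C11 atomics. It is an assumption-laden toy: `struct toy_buffer`, `toy_write()` and `toy_reset_all()` are hypothetical, and there is no equivalent of synchronize_sched() because the toy has no concurrent writers.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_CPUS  4
#define TOY_SLOTS 8

/* Hypothetical per-CPU buffer with a "recording disabled" counter. */
struct toy_buffer {
	atomic_int record_disabled;
	int used[TOY_CPUS];
	int data[TOY_CPUS][TOY_SLOTS];
};

/* Append a value on one CPU; fails while recording is disabled or full. */
static int toy_write(struct toy_buffer *b, int cpu, int val)
{
	if (atomic_load(&b->record_disabled) || b->used[cpu] == TOY_SLOTS)
		return -1;
	b->data[cpu][b->used[cpu]++] = val;
	return 0;
}

/* Clear every CPU's data with recording switched off around the reset. */
static void toy_reset_all(struct toy_buffer *b)
{
	int cpu;

	atomic_fetch_add(&b->record_disabled, 1);
	/* The kernel waits here for in-flight commits; the toy has none. */
	for (cpu = 0; cpu < TOY_CPUS; cpu++)
		b->used[cpu] = 0;
	atomic_fetch_sub(&b->record_disabled, 1);
}
```

Using a counter rather than a boolean lets nested disable/enable pairs compose: recording resumes only when every disabler has re-enabled it.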
2013-07-24 06:21:59 +04:00
/* Must have trace_types_lock held */
2013-03-05 08:26:06 +04:00
void tracing_reset_all_online_cpus ( void )
2009-05-07 05:54:09 +04:00
{
2013-03-05 08:26:06 +04:00
struct trace_array * tr ;
list_for_each_entry ( tr , & ftrace_trace_arrays , list ) {
2013-03-05 18:24:35 +04:00
tracing_reset_online_cpus ( & tr - > trace_buffer ) ;
# ifdef CONFIG_TRACER_MAX_TRACE
tracing_reset_online_cpus ( & tr - > max_buffer ) ;
# endif
2013-03-05 08:26:06 +04:00
}
2009-05-07 05:54:09 +04:00
}
2014-06-05 05:24:27 +04:00
# define SAVED_CMDLINES_DEFAULT 128
2009-03-18 11:03:19 +03:00
# define NO_CMDLINE_MAP UINT_MAX
2009-12-03 14:38:57 +03:00
static arch_spinlock_t trace_cmdline_lock = __ARCH_SPIN_LOCK_UNLOCKED ;
2014-06-05 05:24:27 +04:00
struct saved_cmdlines_buffer {
unsigned map_pid_to_cmdline [ PID_MAX_DEFAULT + 1 ] ;
unsigned * map_cmdline_to_pid ;
unsigned cmdline_num ;
int cmdline_idx ;
char * saved_cmdlines ;
} ;
static struct saved_cmdlines_buffer * savedcmd ;
2008-05-12 23:21:00 +04:00
/* temporarily disable recording */
2009-02-10 21:44:12 +03:00
static atomic_t trace_record_cmdline_disabled __read_mostly ;
2008-05-12 23:20:42 +04:00
2014-06-05 05:24:27 +04:00
static inline char *get_saved_cmdlines(int idx)
{
	return &savedcmd->saved_cmdlines[idx * TASK_COMM_LEN];
}
static inline void set_cmdline ( int idx , const char * cmdline )
2008-05-12 23:20:42 +04:00
{
2014-06-05 05:24:27 +04:00
memcpy ( get_saved_cmdlines ( idx ) , cmdline , TASK_COMM_LEN ) ;
}
static int allocate_cmdlines_buffer(unsigned int val,
				    struct saved_cmdlines_buffer *s)
{
	s->map_cmdline_to_pid = kmalloc(val * sizeof(*s->map_cmdline_to_pid),
					GFP_KERNEL);
	if (!s->map_cmdline_to_pid)
		return -ENOMEM;

	s->saved_cmdlines = kmalloc(val * TASK_COMM_LEN, GFP_KERNEL);
	if (!s->saved_cmdlines) {
		kfree(s->map_cmdline_to_pid);
		return -ENOMEM;
	}

	s->cmdline_idx = 0;
	s->cmdline_num = val;
	/* memset() stores 0xff in every byte, so each entry reads as NO_CMDLINE_MAP */
	memset(&s->map_pid_to_cmdline, NO_CMDLINE_MAP,
	       sizeof(s->map_pid_to_cmdline));
	memset(s->map_cmdline_to_pid, NO_CMDLINE_MAP,
	       val * sizeof(*s->map_cmdline_to_pid));

	return 0;
}
static int trace_create_savedcmd ( void )
{
int ret ;
2014-06-10 11:11:35 +04:00
savedcmd = kmalloc ( sizeof ( * savedcmd ) , GFP_KERNEL ) ;
2014-06-05 05:24:27 +04:00
if ( ! savedcmd )
return - ENOMEM ;
ret = allocate_cmdlines_buffer ( SAVED_CMDLINES_DEFAULT , savedcmd ) ;
if ( ret < 0 ) {
kfree ( savedcmd ) ;
savedcmd = NULL ;
return - ENOMEM ;
}
return 0 ;
2008-05-12 23:20:42 +04:00
}
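The saved-cmdlines buffer packs fixed-width names into one flat array and marks unused map entries with an all-ones value. The following userspace sketch shows those two ideas only. It assumes `malloc()` in place of `kmalloc()`, and `struct cmd_cache`, `cmd_cache_init()`, `cmd_slot()`, `cmd_store()`, `COMM_LEN` and `NO_MAP` are hypothetical names; the kernel's pid-to-slot map and slot recycling are not modeled.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define COMM_LEN 16		/* assumed width, standing in for TASK_COMM_LEN */
#define NO_MAP   UINT_MAX	/* marker for "slot not mapped to any pid" */

/* Hypothetical cache: slot -> pid map plus a flat array of names. */
struct cmd_cache {
	unsigned *slot_to_pid;
	unsigned num;
	char *names;		/* num * COMM_LEN bytes */
};

static int cmd_cache_init(struct cmd_cache *c, unsigned num)
{
	c->slot_to_pid = malloc(num * sizeof(*c->slot_to_pid));
	if (!c->slot_to_pid)
		return -1;

	c->names = malloc((size_t)num * COMM_LEN);
	if (!c->names) {
		free(c->slot_to_pid);	/* undo the first allocation */
		return -1;
	}

	c->num = num;
	/* Every byte becomes 0xff, so every unsigned reads back as UINT_MAX. */
	memset(c->slot_to_pid, 0xff, num * sizeof(*c->slot_to_pid));
	return 0;
}

/* Address of the name stored in slot idx. */
static char *cmd_slot(struct cmd_cache *c, unsigned idx)
{
	return &c->names[(size_t)idx * COMM_LEN];
}

/* Record pid and name in a slot; the name is truncated to fit. */
static void cmd_store(struct cmd_cache *c, unsigned idx, unsigned pid,
		      const char *comm)
{
	char *dst = cmd_slot(c, idx);

	strncpy(dst, comm, COMM_LEN - 1);
	dst[COMM_LEN - 1] = '\0';
	c->slot_to_pid[idx] = pid;
}
```

The byte-wise fill works only because the marker is all ones; a marker such as 0x00000001 could not be written with a single `memset()` call.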
2009-09-13 03:43:07 +04:00
int is_tracing_stopped ( void )
{
2012-05-11 21:29:49 +04:00
return global_trace . stop_count ;
2009-09-13 03:43:07 +04:00
}
2008-11-06 00:05:44 +03:00
/**
* tracing_start - quick start of the tracer
*
* If tracing is enabled but was stopped by tracing_stop ,
* this will start the tracer back up .
*/
void tracing_start ( void )
{
struct ring_buffer * buffer ;
unsigned long flags ;
if ( tracing_disabled )
return ;
2012-05-11 21:29:49 +04:00
raw_spin_lock_irqsave ( & global_trace . start_lock , flags ) ;
if ( - - global_trace . stop_count ) {
if ( global_trace . stop_count < 0 ) {
2009-01-22 22:26:15 +03:00
/* Someone screwed up their debugging */
WARN_ON_ONCE ( 1 ) ;
2012-05-11 21:29:49 +04:00
global_trace . stop_count = 0 ;
2009-01-22 22:26:15 +03:00
}
2008-11-06 00:05:44 +03:00
goto out ;
}
2010-03-13 03:56:00 +03:00
/* Prevent the buffers from switching */
2014-01-14 19:04:59 +04:00
arch_spin_lock ( & global_trace . max_lock ) ;
2008-11-06 00:05:44 +03:00
2013-03-05 18:24:35 +04:00
buffer = global_trace . trace_buffer . buffer ;
2008-11-06 00:05:44 +03:00
if ( buffer )
ring_buffer_record_enable ( buffer ) ;
2013-03-05 18:24:35 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
buffer = global_trace . max_buffer . buffer ;
2008-11-06 00:05:44 +03:00
if ( buffer )
ring_buffer_record_enable ( buffer ) ;
2013-03-05 18:24:35 +04:00
# endif
2008-11-06 00:05:44 +03:00
2014-01-14 19:04:59 +04:00
arch_spin_unlock ( & global_trace . max_lock ) ;
2010-03-13 03:56:00 +03:00
2008-11-06 00:05:44 +03:00
out :
2012-05-11 21:29:49 +04:00
raw_spin_unlock_irqrestore ( & global_trace . start_lock , flags ) ;
}
static void tracing_start_tr(struct trace_array *tr)
{
	struct ring_buffer *buffer;
	unsigned long flags;

	if (tracing_disabled)
		return;

	/* If global, we need to also start the max tracer */
	if (tr->flags & TRACE_ARRAY_FL_GLOBAL)
		return tracing_start();

	raw_spin_lock_irqsave(&tr->start_lock, flags);
	if (--tr->stop_count) {
		if (tr->stop_count < 0) {
			/* Someone screwed up their debugging */
			WARN_ON_ONCE(1);
			tr->stop_count = 0;
		}
		goto out;
	}
2013-03-05 18:24:35 +04:00
buffer = tr - > trace_buffer . buffer ;
2012-05-11 21:29:49 +04:00
if ( buffer )
ring_buffer_record_enable ( buffer ) ;
out :
raw_spin_unlock_irqrestore ( & tr - > start_lock , flags ) ;
2008-11-06 00:05:44 +03:00
}
/**
* tracing_stop - quick stop of the tracer
*
* Light weight way to stop tracing . Use in conjunction with
* tracing_start .
*/
void tracing_stop ( void )
{
struct ring_buffer * buffer ;
unsigned long flags ;
2012-05-11 21:29:49 +04:00
raw_spin_lock_irqsave ( & global_trace . start_lock , flags ) ;
if ( global_trace . stop_count + + )
2008-11-06 00:05:44 +03:00
goto out ;
2010-03-13 03:56:00 +03:00
/* Prevent the buffers from switching */
2014-01-14 19:04:59 +04:00
arch_spin_lock ( & global_trace . max_lock ) ;
2010-03-13 03:56:00 +03:00
2013-03-05 18:24:35 +04:00
buffer = global_trace . trace_buffer . buffer ;
2008-11-06 00:05:44 +03:00
if ( buffer )
ring_buffer_record_disable ( buffer ) ;
2013-03-05 18:24:35 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
buffer = global_trace . max_buffer . buffer ;
2008-11-06 00:05:44 +03:00
if ( buffer )
ring_buffer_record_disable ( buffer ) ;
2013-03-05 18:24:35 +04:00
# endif
2008-11-06 00:05:44 +03:00
2014-01-14 19:04:59 +04:00
arch_spin_unlock ( & global_trace . max_lock ) ;
2010-03-13 03:56:00 +03:00
2008-11-06 00:05:44 +03:00
out :
2012-05-11 21:29:49 +04:00
raw_spin_unlock_irqrestore ( & global_trace . start_lock , flags ) ;
}
static void tracing_stop_tr(struct trace_array *tr)
{
	struct ring_buffer *buffer;
	unsigned long flags;

	/* If global, we need to also stop the max tracer */
	if (tr->flags & TRACE_ARRAY_FL_GLOBAL)
		return tracing_stop();

	raw_spin_lock_irqsave(&tr->start_lock, flags);
	if (tr->stop_count++)
		goto out;
2013-03-05 18:24:35 +04:00
buffer = tr - > trace_buffer . buffer ;
2012-05-11 21:29:49 +04:00
if ( buffer )
ring_buffer_record_disable ( buffer ) ;
out :
raw_spin_unlock_irqrestore ( & tr - > start_lock , flags ) ;
2008-11-06 00:05:44 +03:00
}
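The stop path above only disables recording on the first of several nested stop requests, because of the post-incremented stop_count test. A minimal user-space sketch of that counting idea follows. The demo_* names are hypothetical, and the start side is an assumption about how a matching re-enable would balance the counter; it is not copied from this file.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a trace_array: only what the sketch needs. */
struct demo_tracer {
	int stop_count;		/* number of outstanding stop requests */
	bool recording;		/* stands in for the ring buffer record state */
};

/* Only the first stop request actually disables recording. */
static void demo_stop(struct demo_tracer *t)
{
	if (t->stop_count++)
		return;
	t->recording = false;
}

/* Assumed counterpart: only the last start request re-enables recording. */
static void demo_start(struct demo_tracer *t)
{
	if (t->stop_count == 0)
		return;		/* unbalanced start: nothing to undo */
	if (--t->stop_count)
		return;
	t->recording = true;
}
```

With two stops followed by two starts, recording stays off until the second start brings the counter back to zero.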
2008-05-12 23:20:51 +04:00
void trace_stop_cmdline_recording ( void ) ;
2008-05-12 23:20:42 +04:00
2014-05-30 17:42:39 +04:00
static int trace_save_cmdline ( struct task_struct * tsk )
2008-05-12 23:20:42 +04:00
{
2009-03-18 11:00:41 +03:00
unsigned pid , idx ;
2008-05-12 23:20:42 +04:00
if ( ! tsk - > pid | | unlikely ( tsk - > pid > PID_MAX_DEFAULT ) )
2014-05-30 17:42:39 +04:00
return 0 ;
2008-05-12 23:20:42 +04:00
/*
 * It's not the end of the world if we don't get
 * the lock, but we also don't want to spin
 * nor do we want to disable interrupts,
 * so if we miss here, then better luck next time.
 */
2009-12-02 22:01:25 +03:00
if ( ! arch_spin_trylock ( & trace_cmdline_lock ) )
2014-05-30 17:42:39 +04:00
return 0 ;
2008-05-12 23:20:42 +04:00
2014-06-05 05:24:27 +04:00
idx = savedcmd - > map_pid_to_cmdline [ tsk - > pid ] ;
2009-03-18 11:03:19 +03:00
if ( idx = = NO_CMDLINE_MAP ) {
2014-06-05 05:24:27 +04:00
idx = ( savedcmd - > cmdline_idx + 1 ) % savedcmd - > cmdline_num ;
2008-05-12 23:20:42 +04:00
2009-03-18 11:00:41 +03:00
/*
* Check whether the cmdline buffer at idx has a pid
* mapped . We are going to overwrite that entry so we
* need to clear the map_pid_to_cmdline . Otherwise we
* would read the new comm for the old pid .
*/
2014-06-05 05:24:27 +04:00
pid = savedcmd - > map_cmdline_to_pid [ idx ] ;
2009-03-18 11:00:41 +03:00
if ( pid ! = NO_CMDLINE_MAP )
2014-06-05 05:24:27 +04:00
savedcmd - > map_pid_to_cmdline [ pid ] = NO_CMDLINE_MAP ;
2008-05-12 23:20:42 +04:00
2014-06-05 05:24:27 +04:00
savedcmd - > map_cmdline_to_pid [ idx ] = tsk - > pid ;
savedcmd - > map_pid_to_cmdline [ tsk - > pid ] = idx ;
2008-05-12 23:20:42 +04:00
2014-06-05 05:24:27 +04:00
savedcmd - > cmdline_idx = idx ;
2008-05-12 23:20:42 +04:00
}
2014-06-05 05:24:27 +04:00
set_cmdline ( idx , tsk - > comm ) ;
2008-05-12 23:20:42 +04:00
2009-12-02 22:01:25 +03:00
arch_spin_unlock ( & trace_cmdline_lock ) ;
2014-05-30 17:42:39 +04:00
return 1 ;
2008-05-12 23:20:42 +04:00
}
2014-05-30 18:49:46 +04:00
static void __trace_find_cmdline ( int pid , char comm [ ] )
2008-05-12 23:20:42 +04:00
{
unsigned map ;
2009-03-17 02:20:15 +03:00
if ( ! pid ) {
strcpy ( comm , " <idle> " ) ;
return ;
}
2008-05-12 23:20:42 +04:00
2010-01-25 23:11:53 +03:00
if ( WARN_ON_ONCE ( pid < 0 ) ) {
strcpy ( comm , " <XXX> " ) ;
return ;
}
2009-03-17 02:20:15 +03:00
if ( pid > PID_MAX_DEFAULT ) {
strcpy ( comm , " <...> " ) ;
return ;
}
2008-05-12 23:20:42 +04:00
2014-06-05 05:24:27 +04:00
map = savedcmd - > map_pid_to_cmdline [ pid ] ;
2009-03-18 10:58:44 +03:00
if ( map ! = NO_CMDLINE_MAP )
2014-06-05 05:24:27 +04:00
strcpy ( comm , get_saved_cmdlines ( map ) ) ;
2009-03-18 10:58:44 +03:00
else
strcpy ( comm , " <...> " ) ;
2014-05-30 18:49:46 +04:00
}
void trace_find_cmdline ( int pid , char comm [ ] )
{
preempt_disable ( ) ;
arch_spin_lock ( & trace_cmdline_lock ) ;
__trace_find_cmdline ( pid , comm ) ;
2008-05-12 23:20:42 +04:00
2009-12-02 22:01:25 +03:00
arch_spin_unlock ( & trace_cmdline_lock ) ;
2009-05-26 19:28:02 +04:00
preempt_enable ( ) ;
2008-05-12 23:20:42 +04:00
}
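The two maps used by trace_save_cmdline() and __trace_find_cmdline() form a small fixed-size cache: a pid maps to a slot, a slot maps back to its pid, and reusing a slot first clears the old pid's forward mapping so that a stale pid never resolves to a new comm. The following is a user-space sketch of that bookkeeping only. All demo_* names and sizes are hypothetical, and the locking is left out.

```c
#include <assert.h>
#include <string.h>

#define DEMO_PID_MAX	64		/* stands in for PID_MAX_DEFAULT */
#define DEMO_CMD_SLOTS	2		/* stands in for cmdline_num */
#define DEMO_COMM_LEN	16
#define DEMO_NO_MAP	((unsigned)-1)	/* stands in for NO_CMDLINE_MAP */

struct demo_cmdlines {
	unsigned pid_to_slot[DEMO_PID_MAX + 1];
	unsigned slot_to_pid[DEMO_CMD_SLOTS];
	char comm[DEMO_CMD_SLOTS][DEMO_COMM_LEN];
	unsigned last_slot;		/* most recently assigned slot */
};

static void demo_init(struct demo_cmdlines *c)
{
	unsigned i;

	for (i = 0; i <= DEMO_PID_MAX; i++)
		c->pid_to_slot[i] = DEMO_NO_MAP;
	for (i = 0; i < DEMO_CMD_SLOTS; i++)
		c->slot_to_pid[i] = DEMO_NO_MAP;
	memset(c->comm, 0, sizeof(c->comm));
	c->last_slot = DEMO_CMD_SLOTS - 1;	/* first save lands in slot 0 */
}

/* Returns 1 if the comm was recorded, 0 if the pid is not cacheable. */
static int demo_save(struct demo_cmdlines *c, unsigned pid, const char *comm)
{
	unsigned slot, old;

	if (!pid || pid > DEMO_PID_MAX)
		return 0;

	slot = c->pid_to_slot[pid];
	if (slot == DEMO_NO_MAP) {
		slot = (c->last_slot + 1) % DEMO_CMD_SLOTS;

		/* Evict the previous owner so it cannot read the new comm. */
		old = c->slot_to_pid[slot];
		if (old != DEMO_NO_MAP)
			c->pid_to_slot[old] = DEMO_NO_MAP;

		c->slot_to_pid[slot] = pid;
		c->pid_to_slot[pid] = slot;
		c->last_slot = slot;
	}

	strncpy(c->comm[slot], comm, DEMO_COMM_LEN - 1);
	c->comm[slot][DEMO_COMM_LEN - 1] = '\0';
	return 1;
}

/* Mirrors the lookup fallbacks: "<idle>" for pid 0, "<...>" if unknown. */
static const char *demo_find(const struct demo_cmdlines *c, unsigned pid)
{
	unsigned slot;

	assert(c != NULL);
	if (!pid)
		return "<idle>";
	if (pid > DEMO_PID_MAX)
		return "<...>";

	slot = c->pid_to_slot[pid];
	return slot != DEMO_NO_MAP ? c->comm[slot] : "<...>";
}
```

With two slots, saving a third pid reuses slot 0, so a later lookup of the first pid falls back to "<...>" instead of returning the newcomer's name.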
2008-05-12 23:20:51 +04:00
void tracing_record_cmdline ( struct task_struct * tsk )
2008-05-12 23:20:42 +04:00
{
2012-05-11 22:25:30 +04:00
if ( atomic_read ( & trace_record_cmdline_disabled ) | | ! tracing_is_on ( ) )
2008-05-12 23:20:42 +04:00
return ;
2012-10-11 20:14:25 +04:00
if ( ! __this_cpu_read ( trace_cmdline_save ) )
return ;
2014-05-30 17:42:39 +04:00
if ( trace_save_cmdline ( tsk ) )
__this_cpu_write ( trace_cmdline_save , false ) ;
2008-05-12 23:20:42 +04:00
}
2008-09-16 22:56:41 +04:00
void
2008-10-01 21:14:09 +04:00
tracing_generic_entry_update ( struct trace_entry * entry , unsigned long flags ,
int pc )
2008-05-12 23:20:42 +04:00
{
struct task_struct * tsk = current ;
2008-09-30 07:02:42 +04:00
entry - > preempt_count = pc & 0xff ;
entry - > pid = ( tsk ) ? tsk - > pid : 0 ;
entry - > flags =
2008-10-24 17:42:59 +04:00
# ifdef CONFIG_TRACE_IRQFLAGS_SUPPORT
2008-08-01 20:26:40 +04:00
( irqs_disabled_flags ( flags ) ? TRACE_FLAG_IRQS_OFF : 0 ) |
2008-10-24 17:42:59 +04:00
# else
TRACE_FLAG_IRQS_NOSUPPORT |
# endif
2008-05-12 23:20:42 +04:00
( ( pc & HARDIRQ_MASK ) ? TRACE_FLAG_HARDIRQ : 0 ) |
( ( pc & SOFTIRQ_MASK ) ? TRACE_FLAG_SOFTIRQ : 0 ) |
2013-10-04 19:28:26 +04:00
( tif_need_resched ( ) ? TRACE_FLAG_NEED_RESCHED : 0 ) |
( test_preempt_need_resched ( ) ? TRACE_FLAG_PREEMPT_RESCHED : 0 ) ;
2008-05-12 23:20:42 +04:00
}
2009-08-07 03:25:54 +04:00
EXPORT_SYMBOL_GPL ( tracing_generic_entry_update ) ;
2008-05-12 23:20:42 +04:00
2009-09-02 22:17:06 +04:00
struct ring_buffer_event *
trace_buffer_lock_reserve ( struct ring_buffer * buffer ,
int type ,
unsigned long len ,
unsigned long flags , int pc )
tracing: Introduce trace_buffer_{lock_reserve,unlock_commit}
Impact: new API
These new functions do what previously was being open coded, reducing
the number of details ftrace plugin writers have to worry about.
It also standardizes the handling of stacktrace, userstacktrace and
other trace options we may introduce in the future.
With this patch, for instance, the blk tracer (and some others already
in the tree) can use the "userstacktrace" /d/tracing/trace_options
facility.
$ codiff /tmp/vmlinux.before /tmp/vmlinux.after
linux-2.6-tip/kernel/trace/trace.c:
trace_vprintk | -5
trace_graph_return | -22
trace_graph_entry | -26
trace_function | -45
__ftrace_trace_stack | -27
ftrace_trace_userstack | -29
tracing_sched_switch_trace | -66
tracing_stop | +1
trace_seq_to_user | -1
ftrace_trace_special | -63
ftrace_special | +1
tracing_sched_wakeup_trace | -70
tracing_reset_online_cpus | -1
13 functions changed, 2 bytes added, 355 bytes removed, diff: -353
linux-2.6-tip/block/blktrace.c:
__blk_add_trace | -58
1 function changed, 58 bytes removed, diff: -58
linux-2.6-tip/kernel/trace/trace.c:
trace_buffer_lock_reserve | +88
trace_buffer_unlock_commit | +86
2 functions changed, 174 bytes added, diff: +174
/tmp/vmlinux.after:
16 functions changed, 176 bytes added, 413 bytes removed, diff: -237
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 21:14:13 +03:00
{
struct ring_buffer_event * event ;
2009-09-02 22:17:06 +04:00
event = ring_buffer_lock_reserve ( buffer , len ) ;
2009-02-05 21:14:13 +03:00
	if (event != NULL) {
		struct trace_entry *ent = ring_buffer_event_data(event);

		tracing_generic_entry_update(ent, flags, pc);
		ent->type = type;
	}

	return event;
}
2012-10-11 20:14:25 +04:00
void
__buffer_unlock_commit(struct ring_buffer *buffer, struct ring_buffer_event *event)
{
	__this_cpu_write(trace_cmdline_save, true);
	ring_buffer_unlock_commit(buffer, event);
}
2009-09-02 22:17:06 +04:00
static inline void
__trace_buffer_unlock_commit ( struct ring_buffer * buffer ,
struct ring_buffer_event * event ,
2012-11-02 04:54:21 +04:00
unsigned long flags , int pc )
2009-02-05 21:14:13 +03:00
{
2012-10-11 20:14:25 +04:00
__buffer_unlock_commit ( buffer , event ) ;
2009-02-05 21:14:13 +03:00
2009-09-02 22:17:06 +04:00
ftrace_trace_stack ( buffer , flags , 6 , pc ) ;
ftrace_trace_userstack ( buffer , flags , pc ) ;
2009-03-23 01:10:46 +03:00
}
2009-09-02 22:17:06 +04:00
void trace_buffer_unlock_commit ( struct ring_buffer * buffer ,
struct ring_buffer_event * event ,
unsigned long flags , int pc )
2009-03-23 01:10:46 +03:00
{
2012-11-02 04:54:21 +04:00
__trace_buffer_unlock_commit ( buffer , event , flags , pc ) ;
2009-02-05 21:14:13 +03:00
}
2012-11-02 04:54:21 +04:00
EXPORT_SYMBOL_GPL ( trace_buffer_unlock_commit ) ;
2009-02-05 21:14:13 +03:00
2014-03-26 07:39:41 +04:00
static struct ring_buffer * temp_buffer ;
2012-08-02 18:32:10 +04:00
struct ring_buffer_event *
trace_event_buffer_lock_reserve ( struct ring_buffer * * current_rb ,
2015-05-05 17:09:53 +03:00
struct trace_event_file * trace_file ,
2012-08-02 18:32:10 +04:00
int type , unsigned long len ,
unsigned long flags , int pc )
{
2014-03-26 07:39:41 +04:00
struct ring_buffer_event * entry ;
2015-05-05 17:09:53 +03:00
* current_rb = trace_file - > tr - > trace_buffer . buffer ;
2014-03-26 07:39:41 +04:00
entry = trace_buffer_lock_reserve ( * current_rb ,
2012-08-02 18:32:10 +04:00
type , len , flags , pc ) ;
2014-03-26 07:39:41 +04:00
/*
 * If tracing is off, but we have triggers enabled,
 * we still need to look at the event data. Use the temp_buffer
 * to store the trace event for the trigger to use. It's recursion
 * safe and will not be recorded anywhere.
 */
2015-05-13 22:12:33 +03:00
if ( ! entry & & trace_file - > flags & EVENT_FILE_FL_TRIGGER_COND ) {
2014-03-26 07:39:41 +04:00
* current_rb = temp_buffer ;
entry = trace_buffer_lock_reserve ( * current_rb ,
type , len , flags , pc ) ;
}
return entry ;
2012-08-02 18:32:10 +04:00
}
EXPORT_SYMBOL_GPL ( trace_event_buffer_lock_reserve ) ;
2009-02-28 03:38:04 +03:00
struct ring_buffer_event *
2009-09-02 22:17:06 +04:00
trace_current_buffer_lock_reserve ( struct ring_buffer * * current_rb ,
int type , unsigned long len ,
2009-02-28 03:38:04 +03:00
unsigned long flags , int pc )
{
2013-03-05 18:24:35 +04:00
* current_rb = global_trace . trace_buffer . buffer ;
2009-09-02 22:17:06 +04:00
return trace_buffer_lock_reserve ( * current_rb ,
2009-02-28 03:38:04 +03:00
type , len , flags , pc ) ;
}
2009-05-06 03:22:53 +04:00
EXPORT_SYMBOL_GPL ( trace_current_buffer_lock_reserve ) ;
2009-02-28 03:38:04 +03:00
2009-09-02 22:17:06 +04:00
void trace_current_buffer_unlock_commit ( struct ring_buffer * buffer ,
struct ring_buffer_event * event ,
2009-02-28 03:38:04 +03:00
unsigned long flags , int pc )
{
2012-11-02 04:54:21 +04:00
__trace_buffer_unlock_commit ( buffer , event , flags , pc ) ;
2009-03-23 01:10:46 +03:00
}
2009-05-06 03:22:53 +04:00
EXPORT_SYMBOL_GPL ( trace_current_buffer_unlock_commit ) ;
2009-03-23 01:10:46 +03:00
2012-11-02 04:54:21 +04:00
void trace_buffer_unlock_commit_regs ( struct ring_buffer * buffer ,
struct ring_buffer_event * event ,
unsigned long flags , int pc ,
struct pt_regs * regs )
2011-06-08 11:09:34 +04:00
{
2012-10-11 20:14:25 +04:00
__buffer_unlock_commit ( buffer , event ) ;
2011-06-08 11:09:34 +04:00
ftrace_trace_stack_regs ( buffer , flags , 0 , pc , regs ) ;
ftrace_trace_userstack ( buffer , flags , pc ) ;
}
2012-11-02 04:54:21 +04:00
EXPORT_SYMBOL_GPL ( trace_buffer_unlock_commit_regs ) ;
2011-06-08 11:09:34 +04:00
2009-09-02 22:17:06 +04:00
void trace_current_buffer_discard_commit ( struct ring_buffer * buffer ,
struct ring_buffer_event * event )
2009-04-02 09:16:59 +04:00
{
2009-09-02 22:17:06 +04:00
ring_buffer_discard_commit ( buffer , event ) ;
2009-02-28 03:38:04 +03:00
}
2009-04-18 00:01:56 +04:00
EXPORT_SYMBOL_GPL ( trace_current_buffer_discard_commit ) ;
2009-02-28 03:38:04 +03:00
2008-05-12 23:20:51 +04:00
void
2009-02-05 09:13:37 +03:00
trace_function ( struct trace_array * tr ,
2008-10-01 21:14:09 +04:00
unsigned long ip , unsigned long parent_ip , unsigned long flags ,
int pc )
2008-05-12 23:20:42 +04:00
{
2015-05-05 18:45:27 +03:00
struct trace_event_call * call = & event_function ;
2013-03-05 18:24:35 +04:00
struct ring_buffer * buffer = tr - > trace_buffer . buffer ;
2008-09-30 07:02:41 +04:00
struct ring_buffer_event * event ;
2008-09-30 07:02:42 +04:00
struct ftrace_entry * entry ;
2008-05-12 23:20:42 +04:00
2008-10-01 08:29:53 +04:00
/* If we are reading the ring buffer, don't trace */
2009-10-29 16:34:15 +03:00
if ( unlikely ( __this_cpu_read ( ftrace_cpu_disabled ) ) )
2008-10-01 08:29:53 +04:00
return ;
2009-09-02 22:17:06 +04:00
event = trace_buffer_lock_reserve ( buffer , TRACE_FN , sizeof ( * entry ) ,
2009-02-05 21:14:13 +03:00
flags , pc ) ;
2008-09-30 07:02:41 +04:00
if ( ! event )
return ;
entry = ring_buffer_event_data ( event ) ;
2008-09-30 07:02:42 +04:00
entry - > ip = ip ;
entry - > parent_ip = parent_ip ;
2009-03-31 09:48:49 +04:00
2013-10-24 17:34:17 +04:00
if ( ! call_filter_check_discard ( call , entry , buffer , event ) )
2012-10-11 20:14:25 +04:00
__buffer_unlock_commit ( buffer , event ) ;
2008-05-12 23:20:42 +04:00
}
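trace_function() follows the reserve, fill, then commit-or-discard sequence that the reserve/commit helpers above were introduced to standardize. A user-space sketch of that sequence on a toy slot buffer follows. The demo_* types, the one-pending-slot rule, and the zero-ip filter are hypothetical simplifications and say nothing about how the real ring buffer is laid out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_RB_SLOTS 4

struct demo_fn_entry {
	unsigned long ip;
	unsigned long parent_ip;
};

struct demo_buffer {
	struct demo_fn_entry slots[DEMO_RB_SLOTS];
	size_t committed;	/* entries visible to readers */
	bool pending;		/* a slot is reserved but not yet committed */
	bool recording;		/* cleared when recording is disabled */
};

/* Hand out the next free slot, or NULL if the event must be dropped. */
static struct demo_fn_entry *demo_reserve(struct demo_buffer *b)
{
	if (!b->recording || b->pending || b->committed == DEMO_RB_SLOTS)
		return NULL;
	b->pending = true;
	return &b->slots[b->committed];
}

/* Publish the reserved slot. */
static void demo_commit(struct demo_buffer *b)
{
	assert(b->pending);
	b->pending = false;
	b->committed++;
}

/* Give the reserved slot back without publishing it. */
static void demo_discard(struct demo_buffer *b)
{
	assert(b->pending);
	b->pending = false;
}

/* Hypothetical filter: reject entries with a zero ip. */
static bool demo_filter_discard(const struct demo_fn_entry *e)
{
	return e->ip == 0;
}

/* Reserve, fill in place, then commit unless the filter rejects it. */
static bool demo_trace_function(struct demo_buffer *b,
				unsigned long ip, unsigned long parent_ip)
{
	struct demo_fn_entry *e = demo_reserve(b);

	if (!e)
		return false;

	e->ip = ip;
	e->parent_ip = parent_ip;

	if (demo_filter_discard(e)) {
		demo_discard(b);
		return false;
	}
	demo_commit(b);
	return true;
}
```

Filling the entry in place before the filter runs is what makes a discard step necessary: the filter needs the finished entry to decide, and a rejected entry has to be handed back rather than published.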
2009-07-29 19:51:13 +04:00
# ifdef CONFIG_STACKTRACE
2011-07-15 00:36:53 +04:00
# define FTRACE_STACK_MAX_ENTRIES (PAGE_SIZE / sizeof(unsigned long))
struct ftrace_stack {
unsigned long calls [ FTRACE_STACK_MAX_ENTRIES ] ;
} ;
static DEFINE_PER_CPU ( struct ftrace_stack , ftrace_stack ) ;
static DEFINE_PER_CPU ( int , ftrace_stack_reserve ) ;
2009-09-02 22:17:06 +04:00
static void __ftrace_trace_stack ( struct ring_buffer * buffer ,
2009-01-16 03:12:40 +03:00
unsigned long flags ,
2011-06-08 11:09:34 +04:00
int skip , int pc , struct pt_regs * regs )
2008-05-12 23:20:51 +04:00
{
2015-05-05 18:45:27 +03:00
struct trace_event_call * call = & event_kernel_stack ;
2008-09-30 07:02:41 +04:00
struct ring_buffer_event * event ;
2008-09-30 07:02:42 +04:00
struct stack_entry * entry ;
2008-05-12 23:20:51 +04:00
struct stack_trace trace ;
2011-07-15 00:36:53 +04:00
int use_stack ;
int size = FTRACE_STACK_ENTRIES ;
trace . nr_entries = 0 ;
trace . skip = skip ;
/*
 * Since events can happen in NMIs there's no safe way to
 * use the per cpu ftrace_stacks. We reserve it and if an interrupt
 * or NMI comes in, it will just have to use the default
 * FTRACE_STACK_ENTRIES.
 */
preempt_disable_notrace ( ) ;
2012-11-19 09:21:01 +04:00
use_stack = __this_cpu_inc_return ( ftrace_stack_reserve ) ;
2011-07-15 00:36:53 +04:00
/*
 * We don't need any atomic variables, just a barrier.
 * If an interrupt comes in, we don't care, because it would
 * have exited and put the counter back to what we want.
 * We just need a barrier to keep gcc from moving things
 * around.
 */
barrier ( ) ;
if ( use_stack = = 1 ) {
2014-04-29 23:17:40 +04:00
trace . entries = this_cpu_ptr ( ftrace_stack . calls ) ;
2011-07-15 00:36:53 +04:00
trace . max_entries = FTRACE_STACK_MAX_ENTRIES ;
if ( regs )
save_stack_trace_regs ( regs , & trace ) ;
else
save_stack_trace ( & trace ) ;
if ( trace . nr_entries > size )
size = trace . nr_entries ;
} else
/* From now on, use_stack is a boolean */
use_stack = 0 ;
size * = sizeof ( unsigned long ) ;
2008-05-12 23:20:51 +04:00
2009-09-02 22:17:06 +04:00
event = trace_buffer_lock_reserve ( buffer , TRACE_STACK ,
2011-07-15 00:36:53 +04:00
sizeof ( * entry ) + size , flags , pc ) ;
2008-09-30 07:02:41 +04:00
if ( ! event )
2011-07-15 00:36:53 +04:00
goto out ;
entry = ring_buffer_event_data ( event ) ;
2008-05-12 23:20:51 +04:00
2011-07-15 00:36:53 +04:00
memset ( & entry - > caller , 0 , size ) ;
if ( use_stack )
memcpy ( & entry - > caller , trace . entries ,
trace . nr_entries * sizeof ( unsigned long ) ) ;
else {
trace . max_entries = FTRACE_STACK_ENTRIES ;
trace . entries = entry - > caller ;
if ( regs )
save_stack_trace_regs ( regs , & trace ) ;
else
save_stack_trace ( & trace ) ;
}
entry - > size = trace . nr_entries ;
2008-05-12 23:20:51 +04:00
2013-10-24 17:34:17 +04:00
if ( ! call_filter_check_discard ( call , entry , buffer , event ) )
2012-10-11 20:14:25 +04:00
__buffer_unlock_commit ( buffer , event ) ;
2011-07-15 00:36:53 +04:00
out :
/* Again, don't let gcc optimize things here */
barrier ( ) ;
2012-11-19 09:21:01 +04:00
__this_cpu_dec ( ftrace_stack_reserve ) ;
2011-07-15 00:36:53 +04:00
preempt_enable_notrace ( ) ;
2008-05-12 23:20:47 +04:00
}
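__ftrace_trace_stack() uses a per-CPU nesting counter to decide who may use the large per-CPU scratch stack: only the outermost caller on a CPU sees the counter reach 1, and anything that interrupts it falls back to the small in-event array. The sketch below models that decision with a single global counter standing in for the per-CPU one. The demo_* names and sizes are hypothetical, and preemption control and compiler barriers are left out.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BIG_ENTRIES	512	/* stands in for FTRACE_STACK_MAX_ENTRIES */
#define DEMO_SMALL_ENTRIES	8	/* stands in for FTRACE_STACK_ENTRIES */

static unsigned long demo_big_stack[DEMO_BIG_ENTRIES];
static int demo_stack_reserve;	/* per-CPU in the kernel, one global here */

/*
 * Pick the storage a caller may fill and return its capacity.
 * The outermost caller gets the big scratch stack; nested callers
 * get the caller-supplied fallback array.
 */
static size_t demo_stack_enter(unsigned long **entries, unsigned long *fallback)
{
	assert(entries != NULL && fallback != NULL);

	if (++demo_stack_reserve == 1) {
		*entries = demo_big_stack;
		return DEMO_BIG_ENTRIES;
	}
	*entries = fallback;
	return DEMO_SMALL_ENTRIES;
}

/* Release the reservation taken by demo_stack_enter(). */
static void demo_stack_exit(void)
{
	demo_stack_reserve--;
}
```

Because a nested context always runs to completion before the interrupted one resumes, the counter is back at its earlier value by then, which is why a plain increment without atomics is enough for this scheme.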
2011-06-08 11:09:34 +04:00
void ftrace_trace_stack_regs ( struct ring_buffer * buffer , unsigned long flags ,
int skip , int pc , struct pt_regs * regs )
{
if ( ! ( trace_flags & TRACE_ITER_STACKTRACE ) )
return ;
__ftrace_trace_stack ( buffer , flags , skip , pc , regs ) ;
}
2009-09-02 22:17:06 +04:00
void ftrace_trace_stack ( struct ring_buffer * buffer , unsigned long flags ,
int skip , int pc )
2009-01-16 03:12:40 +03:00
{
if ( ! ( trace_flags & TRACE_ITER_STACKTRACE ) )
return ;
2011-06-08 11:09:34 +04:00
__ftrace_trace_stack ( buffer , flags , skip , pc , NULL ) ;
2009-01-16 03:12:40 +03:00
}
2009-07-29 19:51:13 +04:00
void __trace_stack ( struct trace_array * tr , unsigned long flags , int skip ,
int pc )
2008-10-01 21:14:09 +04:00
{
2013-03-05 18:24:35 +04:00
__ftrace_trace_stack ( tr - > trace_buffer . buffer , flags , skip , pc , NULL ) ;
2008-10-01 21:14:09 +04:00
}
2009-12-11 17:48:22 +03:00
/**
* trace_dump_stack - record a stack back trace in the trace buffer
2013-03-13 17:55:57 +04:00
* @ skip : Number of functions to skip ( helper handlers )
2009-12-11 17:48:22 +03:00
*/
2013-03-13 17:55:57 +04:00
void trace_dump_stack ( int skip )
2009-12-11 17:48:22 +03:00
{
unsigned long flags ;
if ( tracing_disabled | | tracing_selftest_running )
2009-12-14 23:58:33 +03:00
return ;
2009-12-11 17:48:22 +03:00
local_save_flags ( flags ) ;
2013-03-13 17:55:57 +04:00
/*
* Skip 3 more , seems to get us at the caller of
* this function .
*/
skip + = 3 ;
__ftrace_trace_stack ( global_trace . trace_buffer . buffer ,
flags , skip , preempt_count ( ) , NULL ) ;
2009-12-11 17:48:22 +03:00
}
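
The `skip` argument used by the stack helpers above can be pictured with a small userspace sketch. This is only an illustration of what "skip N frames" means, not how the kernel unwinder is implemented; `save_frames()` and its parameters are hypothetical names, and the frame values are made-up placeholders.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy the captured frames into out[], dropping
 * the first `skip` entries (the tracing helpers themselves) and
 * storing at most `max` entries. Returns the number of entries stored.
 */
static size_t save_frames(const unsigned long *frames, size_t nr,
			  size_t skip, unsigned long *out, size_t max)
{
	size_t n = 0;

	for (size_t i = skip; i < nr && n < max; i++)
		out[n++] = frames[i];

	return n;
}
```

With five captured frames and `skip` set to 3, only the last two frames are stored; a `skip` larger than the number of frames stores nothing.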
2010-11-10 14:56:12 +03:00
static DEFINE_PER_CPU ( int , user_stack_count ) ;
2009-09-02 22:17:06 +04:00
void
ftrace_trace_userstack ( struct ring_buffer * buffer , unsigned long flags , int pc )
2008-11-22 14:28:47 +03:00
{
2015-05-05 18:45:27 +03:00
struct trace_event_call * call = & event_user_stack ;
2008-11-23 13:39:06 +03:00
struct ring_buffer_event * event ;
2008-11-22 14:28:47 +03:00
struct userstack_entry * entry ;
struct stack_trace trace ;
if ( ! ( trace_flags & TRACE_ITER_USERSTACKTRACE ) )
return ;
tracing: Do not record user stack trace from NMI context
A bug was found with Li Zefan's ftrace_stress_test that caused applications
to segfault during the test.
Placing a tracing_off() in the segfault code, and examining several
traces, I found that the following was always the case. The lock tracer
was enabled (lockdep being required) and userstack was enabled. Testing
this out, I just enabled the two, but that was not good enough. I needed
to run something else that could trigger it. Running a load like hackbench
did not work, but executing a new program would. The following would
trigger the segfault within seconds:
# echo 1 > /debug/tracing/options/userstacktrace
# echo 1 > /debug/tracing/events/lock/enable
# while :; do ls > /dev/null ; done
Enabling the function graph tracer and looking at what was happening
I finally noticed that all crashes happened just after an NMI.
1) | copy_user_handle_tail() {
1) | bad_area_nosemaphore() {
1) | __bad_area_nosemaphore() {
1) | no_context() {
1) | fixup_exception() {
1) 0.319 us | search_exception_tables();
1) 0.873 us | }
[...]
1) 0.314 us | __rcu_read_unlock();
1) 0.325 us | native_apic_mem_write();
1) 0.943 us | }
1) 0.304 us | rcu_nmi_exit();
[...]
1) 0.479 us | find_vma();
1) | bad_area() {
1) | __bad_area() {
After capturing several traces of failures, all of them happened
after an NMI. Curious about this, I added a trace_printk() to the NMI
handler to read regs->ip and see where the NMI happened. It turned out
to be here:
ffffffff8135b660 <page_fault>:
ffffffff8135b660: 48 83 ec 78 sub $0x78,%rsp
ffffffff8135b664: e8 97 01 00 00 callq ffffffff8135b800 <error_entry>
What was happening is that the NMI would happen at the place that a page
fault occurred. It would call rcu_read_lock() which was traced by
the lock events, and the user_stack_trace would run. This would trigger
a page fault inside the NMI. I do not see where the CR2 register is
saved or restored in NMI handling. This means that it would corrupt
the page fault handling that the NMI interrupted.
The reason the while loop of ls helped trigger the bug, was that
each execution of ls would cause lots of pages to be faulted in, and
increase the chances of the race happening.
The simple solution is to not allow user stack traces in NMI context.
After this patch, I ran the above "ls" test for a couple of hours
without any issues. Without this patch, the bug would trigger in less
than a minute.
Cc: stable@kernel.org
Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-03-13 04:03:30 +03:00
/*
* NMIs cannot handle page faults , even with fixups .
* Saving the user stack can ( and often does ) fault .
*/
if ( unlikely ( in_nmi ( ) ) )
return ;
2008-11-22 14:28:47 +03:00
2010-11-10 14:56:12 +03:00
/*
* prevent recursion , since the user stack tracing may
* trigger other kernel events .
*/
preempt_disable ( ) ;
if ( __this_cpu_read ( user_stack_count ) )
goto out ;
__this_cpu_inc ( user_stack_count ) ;
2009-09-02 22:17:06 +04:00
event = trace_buffer_lock_reserve ( buffer , TRACE_USER_STACK ,
tracing: Introduce trace_buffer_{lock_reserve,unlock_commit}
Impact: new API
These new functions do what previously was being open coded, reducing
the number of details ftrace plugin writers have to worry about.
It also standardizes the handling of stacktrace, userstacktrace and
other trace options we may introduce in the future.
With this patch, for instance, the blk tracer (and some others already
in the tree) can use the "userstacktrace" /d/tracing/trace_options
facility.
$ codiff /tmp/vmlinux.before /tmp/vmlinux.after
linux-2.6-tip/kernel/trace/trace.c:
trace_vprintk | -5
trace_graph_return | -22
trace_graph_entry | -26
trace_function | -45
__ftrace_trace_stack | -27
ftrace_trace_userstack | -29
tracing_sched_switch_trace | -66
tracing_stop | +1
trace_seq_to_user | -1
ftrace_trace_special | -63
ftrace_special | +1
tracing_sched_wakeup_trace | -70
tracing_reset_online_cpus | -1
13 functions changed, 2 bytes added, 355 bytes removed, diff: -353
linux-2.6-tip/block/blktrace.c:
__blk_add_trace | -58
1 function changed, 58 bytes removed, diff: -58
linux-2.6-tip/kernel/trace/trace.c:
trace_buffer_lock_reserve | +88
trace_buffer_unlock_commit | +86
2 functions changed, 174 bytes added, diff: +174
/tmp/vmlinux.after:
16 functions changed, 176 bytes added, 413 bytes removed, diff: -237
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 21:14:13 +03:00
sizeof ( * entry ) , flags , pc ) ;
2008-11-22 14:28:47 +03:00
if ( ! event )
2010-12-09 10:47:56 +03:00
goto out_drop_count ;
2008-11-22 14:28:47 +03:00
entry = ring_buffer_event_data ( event ) ;
2009-09-11 19:36:23 +04:00
entry - > tgid = current - > tgid ;
2008-11-22 14:28:47 +03:00
memset ( & entry - > caller , 0 , sizeof ( entry - > caller ) ) ;
trace . nr_entries = 0 ;
trace . max_entries = FTRACE_STACK_ENTRIES ;
trace . skip = 0 ;
trace . entries = entry - > caller ;
save_stack_trace_user ( & trace ) ;
2013-10-24 17:34:17 +04:00
if ( ! call_filter_check_discard ( call , entry , buffer , event ) )
2012-10-11 20:14:25 +04:00
__buffer_unlock_commit ( buffer , event ) ;
2010-11-10 14:56:12 +03:00
2010-12-09 10:47:56 +03:00
out_drop_count :
2010-11-10 14:56:12 +03:00
__this_cpu_dec ( user_stack_count ) ;
out :
preempt_enable ( ) ;
2008-11-22 14:28:47 +03:00
}
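
The two guards in ftrace_trace_userstack() (bail out in NMI context, and drop the event if the same CPU is already recording one) can be sketched in userspace. This is a minimal illustration under stated assumptions, not the kernel code: a thread-local counter stands in for the per-CPU `user_stack_count`, and `record_user_stack()`, `in_nmi_context`, `nested_hook`, `reenter()`, and `recorded` are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins: thread-local state replaces per-CPU data. */
static _Thread_local int user_stack_count;	/* recursion guard */
static _Thread_local bool in_nmi_context;	/* pretend in_nmi() */
static int recorded;				/* entries "committed" */

/* Optional hook run while recording, to simulate a nested event. */
static void (*nested_hook)(void);

/* Returns true if an entry was recorded, false if it was dropped. */
static bool record_user_stack(void)
{
	bool ok = false;

	/* Walking a user stack may fault; not allowed in NMI context. */
	if (in_nmi_context)
		return false;

	/* Drop the event if this thread is already recording one. */
	if (user_stack_count)
		goto out;
	user_stack_count++;
	assert(user_stack_count == 1);

	if (nested_hook)
		nested_hook();	/* may call record_user_stack() again */

	recorded++;
	ok = true;

	user_stack_count--;
out:
	return ok;
}

/* Simulated nested event: tries to record while already recording. */
static bool nested_result = true;

static void reenter(void)
{
	nested_result = record_user_stack();
}
```

Setting `nested_hook = reenter` and calling `record_user_stack()` records exactly one entry: the outer call succeeds and the nested attempt is dropped by the counter check.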
2009-02-10 21:44:12 +03:00
# ifdef UNUSED
static void __trace_userstack ( struct trace_array * tr , unsigned long flags )
2008-11-22 14:28:47 +03:00
{
2009-02-05 09:13:37 +03:00
ftrace_trace_userstack ( tr - > trace_buffer . buffer , flags , preempt_count ( ) ) ;
2008-11-22 14:28:47 +03:00
}
2009-02-10 21:44:12 +03:00
# endif /* UNUSED */
2008-11-22 14:28:47 +03:00
2009-07-29 19:51:13 +04:00
# endif /* CONFIG_STACKTRACE */
tracing: Add percpu buffers for trace_printk()
Currently, trace_printk() uses a single buffer to write into
to calculate the size and format needed to save the trace. To
do this safely in an SMP environment, a spin_lock() is taken
to only allow one writer at a time to the buffer. But this could
also affect what is being traced, and add synchronization that
would not be there otherwise.
Ideally, using percpu buffers would be useful, but since trace_printk()
is only used in development, having per cpu buffers for something
never used is a waste of space. Thus, the trace_bprintk() format
section is now used for static fmts as well as dynamic ones.
Then at boot up, we can check if the section that holds the trace_printk
formats is non-empty, and if it does contain something, then we
know a trace_printk() has been added to the kernel. At this time
the trace_printk per cpu buffers are allocated. A check is also
done at module load time in case a module is added that contains a
trace_printk().
Once the buffers are allocated, they are never freed. If you use
a trace_printk() then you should know what you are doing.
A buffer is made for each type of context:
normal
softirq
irq
nmi
The context is checked and the appropriate buffer is used.
This allows for totally lockless usage of trace_printk(),
and they no longer even disable interrupts.
Requested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-09-22 22:01:55 +04:00
/* created for use with alloc_percpu */
struct trace_buffer_struct {
char buffer [ TRACE_BUF_SIZE ] ;
} ;
static struct trace_buffer_struct * trace_percpu_buffer ;
static struct trace_buffer_struct * trace_percpu_sirq_buffer ;
static struct trace_buffer_struct * trace_percpu_irq_buffer ;
static struct trace_buffer_struct * trace_percpu_nmi_buffer ;
/*
* The buffer used is dependent on the context . There is a per cpu
* buffer for normal context , softirq context , hard irq context and
* for NMI context . This allows for lockless recording .
*
* Note , if the buffers failed to be allocated , then this returns NULL
*/
static char * get_trace_buf ( void )
{
struct trace_buffer_struct * percpu_buffer ;
/*
* If we have allocated per cpu buffers , then we do not
* need to do any locking .
*/
if ( in_nmi ( ) )
percpu_buffer = trace_percpu_nmi_buffer ;
else if ( in_irq ( ) )
percpu_buffer = trace_percpu_irq_buffer ;
else if ( in_softirq ( ) )
percpu_buffer = trace_percpu_sirq_buffer ;
else
percpu_buffer = trace_percpu_buffer ;
if ( ! percpu_buffer )
return NULL ;
2012-11-13 05:53:04 +04:00
return this_cpu_ptr ( & percpu_buffer - > buffer [ 0 ] ) ;
2011-09-22 22:01:55 +04:00
}
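
The reason one buffer per context allows lockless use can be shown with a small userspace sketch. It is an illustration only, with assumptions made explicit: the context is passed in as an enum instead of being detected with in_nmi()/in_irq()/in_softirq(), the CPU is an explicit index instead of this_cpu_ptr(), and every `sketch_`/`SKETCH_` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_BUF_SIZE	64	/* hypothetical; stands in for TRACE_BUF_SIZE */
#define SKETCH_NR_CPUS	2	/* hypothetical fixed CPU count */

enum sketch_ctx { CTX_NORMAL, CTX_SOFTIRQ, CTX_IRQ, CTX_NMI, CTX_MAX };

/* One scratch buffer per (context, cpu) pair. */
static char sketch_bufs[CTX_MAX][SKETCH_NR_CPUS][SKETCH_BUF_SIZE];
static int sketch_allocated;	/* 0 until "allocation" has happened */

/* Returns the scratch buffer for this context and CPU, or NULL. */
static char *sketch_get_buf(enum sketch_ctx ctx, int cpu)
{
	assert(ctx < CTX_MAX && cpu >= 0 && cpu < SKETCH_NR_CPUS);

	if (!sketch_allocated)
		return NULL;

	return sketch_bufs[ctx][cpu];
}
```

Because an interrupt that preempts a writer on the same CPU runs in a different context, it gets a different buffer, so the interrupted writer's partially formatted data is never overwritten.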
static int alloc_percpu_trace_buffer ( void )
{
struct trace_buffer_struct * buffers ;
struct trace_buffer_struct * sirq_buffers ;
struct trace_buffer_struct * irq_buffers ;
struct trace_buffer_struct * nmi_buffers ;
buffers = alloc_percpu ( struct trace_buffer_struct ) ;
if ( ! buffers )
goto err_warn ;
sirq_buffers = alloc_percpu ( struct trace_buffer_struct ) ;
if ( ! sirq_buffers )
goto err_sirq ;
irq_buffers = alloc_percpu ( struct trace_buffer_struct ) ;
if ( ! irq_buffers )
goto err_irq ;
nmi_buffers = alloc_percpu ( struct trace_buffer_struct ) ;
if ( ! nmi_buffers )
goto err_nmi ;
trace_percpu_buffer = buffers ;
trace_percpu_sirq_buffer = sirq_buffers ;
trace_percpu_irq_buffer = irq_buffers ;
trace_percpu_nmi_buffer = nmi_buffers ;
return 0 ;
err_nmi :
free_percpu ( irq_buffers ) ;
err_irq :
free_percpu ( sirq_buffers ) ;
err_sirq :
free_percpu ( buffers ) ;
err_warn :
WARN ( 1 , " Could not allocate percpu trace_printk buffer " ) ;
return - ENOMEM ;
}
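
The goto-based unwinding in alloc_percpu_trace_buffer() follows a common pattern: allocate in order, and on failure jump to a label that frees everything allocated so far in reverse order. A minimal userspace sketch of the same shape is given below, assuming malloc() in place of alloc_percpu(); `sketch_alloc()`, `sketch_free()`, `alloc_all()`, `fail_at`, and `live_allocs` are hypothetical names added so the failure path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define NBUF 4

static void *bufs[NBUF];	/* published only if every allocation succeeds */
static int fail_at = -1;	/* hypothetical knob: allocation index to fail */
static int live_allocs;		/* outstanding allocations, for leak checks */

static void *sketch_alloc(int idx)
{
	void *p;

	if (idx == fail_at)
		return NULL;
	p = malloc(32);
	if (p)
		live_allocs++;
	return p;
}

static void sketch_free(void *p)
{
	free(p);
	live_allocs--;
}

static int alloc_all(void)
{
	void *a, *b, *c, *d;

	a = sketch_alloc(0);
	if (!a)
		goto err_a;
	b = sketch_alloc(1);
	if (!b)
		goto err_b;
	c = sketch_alloc(2);
	if (!c)
		goto err_c;
	d = sketch_alloc(3);
	if (!d)
		goto err_d;

	/* Publish the pointers only after all allocations succeeded. */
	bufs[0] = a;
	bufs[1] = b;
	bufs[2] = c;
	bufs[3] = d;
	return 0;

	/* Each label frees what was allocated before the failing step. */
err_d:
	sketch_free(c);
err_c:
	sketch_free(b);
err_b:
	sketch_free(a);
err_a:
	return -ENOMEM;
}
```

Assigning the global pointers only at the end means a partial failure never leaves a mix of valid and freed buffers visible to readers.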
2012-10-11 18:15:05 +04:00
static int buffers_allocated ;
2011-09-22 22:01:55 +04:00
void trace_printk_init_buffers ( void )
{
if ( buffers_allocated )
return ;
if ( alloc_percpu_trace_buffer ( ) )
return ;
2014-05-28 21:14:40 +04:00
/* trace_printk() is for debug use only. Don't use it in production. */
2015-01-27 19:17:20 +03:00
pr_warning ( " \n " ) ;
pr_warning ( " ********************************************************** \n " ) ;
2014-05-28 21:14:40 +04:00
pr_warning ( " ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE ** \n " ) ;
pr_warning ( " ** ** \n " ) ;
pr_warning ( " ** trace_printk() being used. Allocating extra memory. ** \n " ) ;
pr_warning ( " ** ** \n " ) ;
pr_warning ( " ** This means that this is a DEBUG kernel and it is ** \n " ) ;
2014-11-07 17:53:44 +03:00
pr_warning ( " ** unsafe for production use. ** \n " ) ;
2014-05-28 21:14:40 +04:00
pr_warning ( " ** ** \n " ) ;
pr_warning ( " ** If you see this message and you are not debugging ** \n " ) ;
pr_warning ( " ** the kernel, report this immediately to your vendor! ** \n " ) ;
pr_warning ( " ** ** \n " ) ;
pr_warning ( " ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE ** \n " ) ;
pr_warning ( " ********************************************************** \n " ) ;
2011-09-22 22:01:55 +04:00
2012-10-11 05:44:34 +04:00
/* Expand the buffers to set size */
tracing_update_buffers ( ) ;
2011-09-22 22:01:55 +04:00
buffers_allocated = 1 ;
2012-10-11 18:15:05 +04:00
/*
* trace_printk_init_buffers ( ) can be called by modules .
* If that happens , then we need to start cmdline recording
* directly here . If global_trace . trace_buffer . buffer is already
* allocated here , then this was called by module code .
*/
2013-03-05 18:24:35 +04:00
if ( global_trace . trace_buffer . buffer )
2012-10-11 18:15:05 +04:00
tracing_start_cmdline_record ( ) ;
}
void trace_printk_start_comm ( void )
{
/* Start tracing comms if trace printk is set */
if ( ! buffers_allocated )
return ;
tracing_start_cmdline_record ( ) ;
}
static void trace_printk_start_stop_comm ( int enabled )
{
if ( ! buffers_allocated )
return ;
if ( enabled )
tracing_start_cmdline_record ( ) ;
else
tracing_stop_cmdline_record ( ) ;
2011-09-22 22:01:55 +04:00
}
2009-03-06 19:21:49 +03:00
/**
2009-03-12 20:24:49 +03:00
* trace_vbprintk - write binary msg to tracing buffer
2009-03-06 19:21:49 +03:00
*
*/
2009-03-19 21:03:53 +03:00
int trace_vbprintk ( unsigned long ip , const char * fmt , va_list args )
2009-03-06 19:21:49 +03:00
{
2015-05-05 18:45:27 +03:00
struct trace_event_call * call = & event_bprint ;
2009-03-06 19:21:49 +03:00
struct ring_buffer_event * event ;
2009-09-02 22:17:06 +04:00
struct ring_buffer * buffer ;
2009-03-06 19:21:49 +03:00
struct trace_array * tr = & global_trace ;
2009-03-12 20:24:49 +03:00
struct bprint_entry * entry ;
2009-03-06 19:21:49 +03:00
unsigned long flags ;
2011-09-22 22:01:55 +04:00
char * tbuffer ;
int len = 0 , size , pc ;
2009-03-06 19:21:49 +03:00
if ( unlikely ( tracing_selftest_running | | tracing_disabled ) )
return 0 ;
/* Don't pollute graph traces with trace_vprintk internals */
pause_graph_tracing ( ) ;
pc = preempt_count ( ) ;
2010-06-03 17:36:50 +04:00
preempt_disable_notrace ( ) ;
2009-03-06 19:21:49 +03:00
2011-09-22 22:01:55 +04:00
tbuffer = get_trace_buf ( ) ;
if ( ! tbuffer ) {
len = 0 ;
2009-03-06 19:21:49 +03:00
goto out ;
2011-09-22 22:01:55 +04:00
}
2009-03-06 19:21:49 +03:00
2011-09-22 22:01:55 +04:00
len = vbin_printf ( ( u32 * ) tbuffer , TRACE_BUF_SIZE / sizeof ( int ) , fmt , args ) ;
2009-03-06 19:21:49 +03:00
2011-09-22 22:01:55 +04:00
if ( len > TRACE_BUF_SIZE / sizeof ( int ) | | len < 0 )
goto out ;
2009-03-06 19:21:49 +03:00
2011-09-22 22:01:55 +04:00
local_save_flags ( flags ) ;
2009-03-06 19:21:49 +03:00
size = sizeof ( * entry ) + sizeof ( u32 ) * len ;
tracing: Consolidate max_tr into main trace_array structure
Currently, the way the latency tracers and snapshot feature work
is to have a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, this snapshot buffer is used
to swap the running buffer with this buffer to save the current max
latency.
The only items needed for the max_tr are really just a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer, that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_tr but the instances of traces can now use
their own snapshot feature and not have just the top level global_trace have
the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
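The layout this message describes, one trace_array holding two trace_buffers that can be swapped, might look roughly like the sketch below. It is a simplified userspace model under stated assumptions: all sketch_* names are hypothetical, the per_cpu data pointers are left out, and the field name max_buffer is an assumption that the message itself does not give.

```c
#include <stdint.h>

/* Stand-in for the real ring buffer; hypothetical. */
struct sketch_ring_buffer {
	int id;
};

/* The fields the message lists, minus the per_cpu data pointers. */
struct sketch_trace_buffer {
	struct sketch_ring_buffer *buffer;
	uint64_t time_start;	/* when the max latency was triggered */
	int cpu;		/* cpu the max latency was triggered on */
};

/* One trace_array now carries both the live and the snapshot buffer. */
struct sketch_trace_array {
	struct sketch_trace_buffer trace_buffer;	/* normal trace */
	struct sketch_trace_buffer max_buffer;		/* max trace or snapshot */
};

/*
 * Record a new max latency: swap the running buffer with the snapshot
 * buffer so the trace leading up to the latency is preserved, then note
 * where and when it happened.
 */
static void sketch_update_max(struct sketch_trace_array *tr, int cpu,
			      uint64_t now)
{
	struct sketch_ring_buffer *tmp = tr->trace_buffer.buffer;

	tr->trace_buffer.buffer = tr->max_buffer.buffer;
	tr->max_buffer.buffer = tmp;
	tr->max_buffer.cpu = cpu;
	tr->max_buffer.time_start = now;
}
```

Because both buffers live inside the trace_array, every instance can take its own snapshot, which is the benefit the message calls out.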
2013-03-05 18:24:35 +04:00
buffer = tr -> trace_buffer . buffer ;
2009-09-02 22:17:06 +04:00
event = trace_buffer_lock_reserve ( buffer , TRACE_BPRINT , size ,
flags , pc ) ;
2009-03-06 19:21:49 +03:00
if ( ! event )
2011-09-22 22:01:55 +04:00
goto out ;
2009-03-06 19:21:49 +03:00
entry = ring_buffer_event_data ( event ) ;
entry -> ip = ip ;
entry -> fmt = fmt ;
2011-09-22 22:01:55 +04:00
memcpy ( entry -> buf , tbuffer , sizeof ( u32 ) * len ) ;
2013-10-24 17:34:17 +04:00
if ( ! call_filter_check_discard ( call , entry , buffer , event ) ) {
2012-10-11 20:14:25 +04:00
__buffer_unlock_commit ( buffer , event ) ;
2010-01-07 01:27:11 +03:00
ftrace_trace_stack ( buffer , flags , 6 , pc ) ;
}
2009-03-06 19:21:49 +03:00
out :
2010-06-03 17:36:50 +04:00
preempt_enable_notrace ( ) ;
2009-03-06 19:21:49 +03:00
unpause_graph_tracing ( ) ;
return len ;
}
2009-03-12 20:24:49 +03:00
EXPORT_SYMBOL_GPL ( trace_vbprintk ) ;
2013-03-05 18:24:35 +04:00
static int
__trace_array_vprintk ( struct ring_buffer * buffer ,
unsigned long ip , const char * fmt , va_list args )
2009-03-12 20:24:49 +03:00
{
2015-05-05 18:45:27 +03:00
struct trace_event_call * call = & event_print ;
2009-03-12 20:24:49 +03:00
struct ring_buffer_event * event ;
2011-09-22 22:01:55 +04:00
int len = 0 , size , pc ;
2009-03-12 20:24:49 +03:00
struct print_entry * entry ;
2011-09-22 22:01:55 +04:00
unsigned long flags ;
char * tbuffer ;
2009-03-12 20:24:49 +03:00
if ( tracing_disabled || tracing_selftest_running )
return 0 ;
2011-09-22 22:01:55 +04:00
/* Don't pollute graph traces with trace_vprintk internals */
pause_graph_tracing ( ) ;
2009-03-12 20:24:49 +03:00
pc = preempt_count ( ) ;
preempt_disable_notrace ( ) ;
2011-09-22 22:01:55 +04:00
tbuffer = get_trace_buf ( ) ;
if ( ! tbuffer ) {
len = 0 ;
2009-03-12 20:24:49 +03:00
goto out ;
2011-09-22 22:01:55 +04:00
}
2009-03-12 20:24:49 +03:00
2014-11-27 18:57:52 +03:00
len = vscnprintf ( tbuffer , TRACE_BUF_SIZE , fmt , args ) ;
2009-03-12 20:24:49 +03:00
2011-09-22 22:01:55 +04:00
local_save_flags ( flags ) ;
2009-03-12 20:24:49 +03:00
size = sizeof ( * entry ) + len + 1 ;
2009-09-02 22:17:06 +04:00
event = trace_buffer_lock_reserve ( buffer , TRACE_PRINT , size ,
2011-09-22 22:01:55 +04:00
flags , pc ) ;
2009-03-12 20:24:49 +03:00
if ( ! event )
2011-09-22 22:01:55 +04:00
goto out ;
2009-03-12 20:24:49 +03:00
entry = ring_buffer_event_data ( event ) ;
2009-11-16 22:56:13 +03:00
entry -> ip = ip ;
2009-03-12 20:24:49 +03:00
2014-11-27 18:57:52 +03:00
memcpy ( & entry -> buf , tbuffer , len + 1 ) ;
2013-10-24 17:34:17 +04:00
if ( ! call_filter_check_discard ( call , entry , buffer , event ) ) {
2012-10-11 20:14:25 +04:00
__buffer_unlock_commit ( buffer , event ) ;
2011-09-22 22:01:55 +04:00
ftrace_trace_stack ( buffer , flags , 6 , pc ) ;
2010-01-07 01:27:11 +03:00
}
2009-03-12 20:24:49 +03:00
out :
preempt_enable_notrace ( ) ;
2011-09-22 22:01:55 +04:00
unpause_graph_tracing ( ) ;
2009-03-12 20:24:49 +03:00
return len ;
}
2009-09-04 03:11:07 +04:00
2013-03-05 18:24:35 +04:00
int trace_array_vprintk ( struct trace_array * tr ,
unsigned long ip , const char * fmt , va_list args )
{
return __trace_array_vprintk ( tr -> trace_buffer . buffer , ip , fmt , args ) ;
}
int trace_array_printk ( struct trace_array * tr ,
unsigned long ip , const char * fmt , ... )
{
int ret ;
va_list ap ;
if ( ! ( trace_flags & TRACE_ITER_PRINTK ) )
return 0 ;
va_start ( ap , fmt ) ;
ret = trace_array_vprintk ( tr , ip , fmt , ap ) ;
va_end ( ap ) ;
return ret ;
}
int trace_array_printk_buf ( struct ring_buffer * buffer ,
unsigned long ip , const char * fmt , ... )
{
int ret ;
va_list ap ;
if ( ! ( trace_flags & TRACE_ITER_PRINTK ) )
return 0 ;
va_start ( ap , fmt ) ;
ret = __trace_array_vprintk ( buffer , ip , fmt , ap ) ;
va_end ( ap ) ;
return ret ;
}
2009-09-04 03:11:07 +04:00
int trace_vprintk ( unsigned long ip , const char * fmt , va_list args )
{
2009-10-09 09:41:35 +04:00
return trace_array_vprintk ( & global_trace , ip , fmt , args ) ;
2009-09-04 03:11:07 +04:00
}
2009-03-06 19:21:49 +03:00
EXPORT_SYMBOL_GPL ( trace_vprintk ) ;
2008-11-12 14:59:32 +03:00
static void trace_iterator_increment ( struct trace_iterator * iter )
2008-09-04 01:42:51 +04:00
{
2012-06-28 04:46:14 +04:00
struct ring_buffer_iter * buf_iter = trace_buffer_iter ( iter , iter -> cpu ) ;
2008-09-04 01:42:51 +04:00
iter -> idx ++ ;
2012-06-28 04:46:14 +04:00
if ( buf_iter )
ring_buffer_read ( buf_iter , NULL ) ;
2008-09-04 01:42:51 +04:00
}
2008-05-12 23:20:51 +04:00
static struct trace_entry *
2010-04-01 03:49:26 +04:00
peek_next_entry ( struct trace_iterator * iter , int cpu , u64 * ts ,
unsigned long * lost_events )
2008-08-01 20:26:41 +04:00
{
2008-09-30 07:02:41 +04:00
struct ring_buffer_event * event ;
2012-06-28 04:46:14 +04:00
struct ring_buffer_iter * buf_iter = trace_buffer_iter ( iter , cpu ) ;
2008-08-01 20:26:41 +04:00
2008-10-01 08:29:53 +04:00
if ( buf_iter )
event = ring_buffer_iter_peek ( buf_iter , ts ) ;
else
2013-03-05 18:24:35 +04:00
event = ring_buffer_peek ( iter -> trace_buffer -> buffer , cpu , ts ,
2010-04-01 03:49:26 +04:00
lost_events ) ;
2008-10-01 08:29:53 +04:00
2011-07-15 00:36:53 +04:00
if ( event ) {
iter -> ent_size = ring_buffer_event_length ( event ) ;
return ring_buffer_event_data ( event ) ;
}
iter -> ent_size = 0 ;
return NULL ;
2008-08-01 20:26:41 +04:00
}
2008-10-01 08:29:53 +04:00
2008-08-01 20:26:41 +04:00
static struct trace_entry *
2010-04-01 03:49:26 +04:00
__find_next_entry ( struct trace_iterator * iter , int * ent_cpu ,
unsigned long * missing_events , u64 * ent_ts )
2008-05-12 23:20:42 +04:00
{
2013-03-05 18:24:35 +04:00
struct ring_buffer * buffer = iter -> trace_buffer -> buffer ;
2008-05-12 23:20:42 +04:00
struct trace_entry * ent , * next = NULL ;
2010-04-05 13:11:05 +04:00
unsigned long lost_events = 0 , next_lost = 0 ;
tracing/core: introduce per cpu tracing files
Impact: split up tracing output per cpu
Currently, in the tracing debugfs directory, three files are
available to the user for extracting the trace output:
- trace is an iterator through the ring-buffer. It's a reader
but not a consumer. It doesn't block when no more traces are
available.
- trace is pretty similar to the former, except that it adds more
information such as preempt count, irq flag, ...
- trace_pipe is a reader and a consumer, it will also block
waiting for traces if necessary (heh, yes it's a pipe).
The traces coming from different cpus are currently mixed up
inside these files. Sometimes that muddles the information,
sometimes it's useful, depending on what the tracer
captures.
The tracing_cpumask file is useful to filter the output and
select only the traces captured on a custom defined set of cpus.
But it is still not powerful enough to extract one trace buffer
per cpu at the same time.
So this patch creates a new directory: /debug/tracing/per_cpu/.
Inside this directory, you will now find one trace_pipe file and
one trace file per cpu.
Which means if you have two cpus, you will have:
trace0
trace1
trace_pipe0
trace_pipe1
And of course, reading these files will have the same effect
as with the usual tracing files, except that you will only see
the traces from the given cpu.
The original all-in-one cpu trace files are still available in
their original place.
Until now, only one consumer was allowed on trace_pipe to avoid
racy consuming on the ring-buffer. Now the approach has changed a
bit: you can have only one consumer per cpu.
Which means you are allowed to read trace_pipe0 and
trace_pipe1 concurrently, but you can't have two readers on trace_pipe0 or
trace_pipe1.
Following the same logic, if there is one reader on the common
trace_pipe, you cannot have another reader on trace_pipe0 or
trace_pipe1 at the same time, because trace_pipe is in essence already
a consumer of all cpu buffers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 05:22:28 +03:00
	int cpu_file = iter->cpu_file;
2008-09-30 07:02:41 +04:00
	u64 next_ts = 0, ts;
2008-05-12 23:20:42 +04:00
	int next_cpu = -1;
2012-03-27 18:43:28 +04:00
	int next_size = 0;
2008-05-12 23:20:42 +04:00
	int cpu;
2009-02-25 05:22:28 +03:00
	/*
	 * If we are in a per_cpu trace file, don't bother iterating over
	 * all cpus; peek directly.
	 */
2013-01-24 00:22:59 +04:00
	if (cpu_file > RING_BUFFER_ALL_CPUS) {
2009-02-25 05:22:28 +03:00
		if (ring_buffer_empty_cpu(buffer, cpu_file))
			return NULL;
2010-04-01 03:49:26 +04:00
		ent = peek_next_entry(iter, cpu_file, ent_ts, missing_events);
2009-02-25 05:22:28 +03:00
		if (ent_cpu)
			*ent_cpu = cpu_file;
		return ent;
	}
2008-05-12 23:21:00 +04:00
	for_each_tracing_cpu(cpu) {
2008-08-01 20:26:41 +04:00
2008-09-30 07:02:41 +04:00
		if (ring_buffer_empty_cpu(buffer, cpu))
			continue;
2008-08-01 20:26:41 +04:00
2010-04-01 03:49:26 +04:00
		ent = peek_next_entry(iter, cpu, &ts, &lost_events);
2008-08-01 20:26:41 +04:00
2008-05-12 23:20:46 +04:00
		/*
		 * Pick the entry with the smallest timestamp:
		 */
2008-09-30 07:02:41 +04:00
		if (ent && (!next || ts < next_ts)) {
2008-05-12 23:20:42 +04:00
			next = ent;
			next_cpu = cpu;
2008-09-30 07:02:41 +04:00
			next_ts = ts;
2010-04-01 03:49:26 +04:00
			next_lost = lost_events;
2012-03-27 18:43:28 +04:00
			next_size = iter->ent_size;
2008-05-12 23:20:42 +04:00
		}
	}
2012-03-27 18:43:28 +04:00
	iter->ent_size = next_size;
2008-05-12 23:20:42 +04:00
	if (ent_cpu)
		*ent_cpu = next_cpu;
2008-09-30 07:02:41 +04:00
	if (ent_ts)
		*ent_ts = next_ts;
2010-04-01 03:49:26 +04:00
	if (missing_events)
		*missing_events = next_lost;
2008-05-12 23:20:42 +04:00
	return next;
}
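The merge step in the function above (pick the pending entry with the smallest timestamp across all per-cpu buffers) can be sketched in user space. This is only an illustration under simplifying assumptions, not the kernel code: `struct cpu_queue` and `pick_next_cpu()` are hypothetical names, and each per-cpu buffer is modeled as a plain array of timestamps sorted oldest first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one per-cpu ring buffer: sorted timestamps. */
struct cpu_queue {
	const uint64_t *ts;	/* pending timestamps, oldest first */
	size_t len;		/* number of entries in ts[] */
	size_t head;		/* index of the next unread entry */
};

/*
 * Return the index of the queue whose next entry has the smallest
 * timestamp, or -1 if every queue is empty. On success the timestamp
 * is stored in *next_ts when next_ts is non-NULL.
 */
static int pick_next_cpu(const struct cpu_queue *q, int ncpus, uint64_t *next_ts)
{
	uint64_t best = 0;
	int next_cpu = -1;
	int cpu;

	for (cpu = 0; cpu < ncpus; cpu++) {
		uint64_t ts;

		if (q[cpu].head >= q[cpu].len)
			continue;	/* this queue is empty, skip it */

		ts = q[cpu].ts[q[cpu].head];
		if (next_cpu < 0 || ts < best) {
			next_cpu = cpu;
			best = ts;
		}
	}

	if (next_cpu >= 0 && next_ts)
		*next_ts = best;
	return next_cpu;
}
```

Because the comparison is a strict `<`, two entries with equal timestamps are resolved in favor of the lower-numbered queue in this sketch.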
2008-08-01 20:26:41 +04:00
/* Find the next real entry, without updating the iterator itself */
2009-02-03 01:29:21 +03:00
struct trace_entry *trace_find_next_entry(struct trace_iterator *iter,
					  int *ent_cpu, u64 *ent_ts)
2008-05-12 23:20:42 +04:00
{
2010-04-01 03:49:26 +04:00
	return __find_next_entry(iter, ent_cpu, NULL, ent_ts);
2008-08-01 20:26:41 +04:00
}
/* Find the next real entry, and increment the iterator to the next entry */
2010-08-05 18:22:23 +04:00
void *trace_find_next_entry_inc(struct trace_iterator *iter)
2008-08-01 20:26:41 +04:00
{
2010-04-01 03:49:26 +04:00
	iter->ent = __find_next_entry(iter, &iter->cpu,
				      &iter->lost_events, &iter->ts);
2008-08-01 20:26:41 +04:00
2008-09-30 07:02:41 +04:00
	if (iter->ent)
2008-11-12 14:59:32 +03:00
		trace_iterator_increment(iter);
2008-08-01 20:26:41 +04:00
2008-09-30 07:02:41 +04:00
	return iter->ent ? iter : NULL;
2008-05-12 23:20:46 +04:00
}
2008-05-12 23:20:42 +04:00
2008-05-12 23:20:51 +04:00
static void trace_consume(struct trace_iterator *iter)
2008-05-12 23:20:46 +04:00
{
tracing: Consolidate max_tr into main trace_array structure
Currently, the way the latency tracers and snapshot feature work
is to have a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, this snapshot buffer is
swapped with the running buffer to save the current max
latency.
The only items needed for the max_tr are really just a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer, that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_tr, but trace instances can now use
their own snapshot feature, instead of only the top-level global_trace having
the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 18:24:35 +04:00
	ring_buffer_consume(iter->trace_buffer->buffer, iter->cpu, &iter->ts,
2010-04-01 03:49:26 +04:00
			    &iter->lost_events);
2008-05-12 23:20:42 +04:00
}
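The difference between a reader that only peeks (as the trace file does) and a consumer that removes what it reads (as trace_pipe does) can be illustrated with a toy buffer. This is a hedged sketch, not the ring buffer API: `struct ring`, `ring_peek()` and `ring_consume()` are hypothetical names, and the buffer is a fixed array with a single head index.

```c
#include <stddef.h>

/* Hypothetical toy buffer: entries before head are already consumed. */
struct ring {
	const int *data;
	size_t len;
	size_t head;	/* invariant: head <= len */
};

/* Non-consuming read of the entry `off` slots past head; 0 on success. */
static int ring_peek(const struct ring *r, size_t off, int *out)
{
	if (off >= r->len - r->head)
		return -1;	/* nothing there */
	*out = r->data[r->head + off];
	return 0;
}

/* Consuming read: return the oldest entry and drop it from the buffer. */
static int ring_consume(struct ring *r, int *out)
{
	if (ring_peek(r, 0, out))
		return -1;
	r->head++;
	return 0;
}
```

Peeking twice returns the same entry, while two consuming reads return successive entries. That is why two concurrent consumers on the same buffer would each see only part of the data.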
2008-05-12 23:20:51 +04:00
static void *s_next(struct seq_file *m, void *v, loff_t *pos)
2008-05-12 23:20:42 +04:00
{
	struct trace_iterator *iter = m->private;
	int i = (int)*pos;
2008-05-12 23:20:45 +04:00
	void *ent;
2008-05-12 23:20:42 +04:00
2009-12-07 17:11:39 +03:00
	WARN_ON_ONCE(iter->leftover);
2008-05-12 23:20:42 +04:00
	(*pos)++;
	/* can't go backwards */
	if (iter->idx > i)
		return NULL;
	if (iter->idx < 0)
2010-08-05 18:22:23 +04:00
		ent = trace_find_next_entry_inc(iter);
2008-05-12 23:20:42 +04:00
	else
		ent = iter;
	while (ent && iter->idx < i)
2010-08-05 18:22:23 +04:00
		ent = trace_find_next_entry_inc(iter);
2008-05-12 23:20:42 +04:00
	iter->pos = *pos;
	return ent;
}
2010-08-05 18:22:23 +04:00
void tracing_iter_reset(struct trace_iterator *iter, int cpu)
2009-09-01 19:06:29 +04:00
{
	struct ring_buffer_event *event;
	struct ring_buffer_iter *buf_iter;
	unsigned long entries = 0;
	u64 ts;
2013-03-05 18:24:35 +04:00
	per_cpu_ptr(iter->trace_buffer->data, cpu)->skipped_entries = 0;
2009-09-01 19:06:29 +04:00
2012-06-28 04:46:14 +04:00
	buf_iter = trace_buffer_iter(iter, cpu);
	if (!buf_iter)
2009-09-01 19:06:29 +04:00
		return;
	ring_buffer_iter_reset(buf_iter);
	/*
	 * We could have the case with the max latency tracers
	 * that a reset never took place on a cpu. This is evident
	 * by the timestamp being before the start of the buffer.
	 */
	while ((event = ring_buffer_iter_peek(buf_iter, &ts))) {
2013-03-05 18:24:35 +04:00
		if (ts >= iter->trace_buffer->time_start)
2009-09-01 19:06:29 +04:00
			break;
		entries++;
		ring_buffer_read(buf_iter, NULL);
	}
2013-03-05 18:24:35 +04:00
	per_cpu_ptr(iter->trace_buffer->data, cpu)->skipped_entries = entries;
2009-09-01 19:06:29 +04:00
}
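The skipping loop in tracing_iter_reset() drops the leading entries whose timestamp predates time_start and records how many were skipped. A minimal user-space sketch of that counting follows. It assumes the entries are available as a sorted array of timestamps, and `count_stale_entries()` is a hypothetical helper, not a kernel function.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the leading entries whose timestamp is before time_start.
 * ts[] is assumed to be ordered oldest first, so the scan stops at
 * the first entry that is recent enough.
 */
static size_t count_stale_entries(const uint64_t *ts, size_t len,
				  uint64_t time_start)
{
	size_t entries = 0;

	while (entries < len && ts[entries] < time_start)
		entries++;
	return entries;
}
```

The returned count plays the role of skipped_entries: the iterator starts at that offset and the stale entries are never shown.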
tracing/core: make the read callbacks reentrants
Now that several per-cpu files can be read or spliced at the
same time, we want the read/splice callbacks for tracing files to be
reentrant.
Until now, a single global mutex (trace_types_lock) serialized
the access to tracing_read_pipe(), tracing_splice_read_pipe(),
and the seq helpers.
I.e., if a user tries to read trace_pipe0 and
trace_pipe1 at the same time, the access to the function
tracing_read_pipe() is contended and one reader must wait for
the other to finish its read call.
The trace_types_lock mutex is mostly here to serialize the access
to the global current tracer (current_trace), which can be
changed concurrently. Although the iter struct keeps a private
pointer to this tracer, its callbacks can be changed by another
function.
The method used here is to no longer keep a private reference to
the tracer inside the iterator but to make a copy of it inside
the iterator. Subsequent read calls then check whether the
tracer has changed. This is not costly because the current
tracer is not expected to change often, so we use a branch
prediction hint for that.
Moreover, we add a private mutex to the iterator (there is one
iterator per file descriptor) to serialize the accesses in case
of multiple consumers per file descriptor (which would be a
silly idea from the user). Note that this is not to protect the
ring buffer, since the ring buffer already serializes
reader accesses. This is to prevent trace weirdness in
case of concurrent consumers. But these mutexes can be dropped
anyway; that would not result in any crash. Just tell me what
you think about it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 08:13:16 +03:00
/*
 * The current tracer is copied to avoid taking a global lock
 * all around.
 */
2008-05-12 23:20:42 +04:00
static void *s_start(struct seq_file *m, loff_t *pos)
{
	struct trace_iterator *iter = m->private;
2012-05-11 21:29:49 +04:00
	struct trace_array *tr = iter->tr;
2009-02-25 05:22:28 +03:00
	int cpu_file = iter->cpu_file;
2008-05-12 23:20:42 +04:00
	void *p = NULL;
	loff_t l = 0;
2008-09-30 07:02:41 +04:00
	int cpu;
2008-05-12 23:20:42 +04:00
2012-12-26 06:52:52 +04:00
	/*
	 * copy the tracer to avoid using a global lock all around.
	 * iter->trace is a copy of current_trace, the pointer to the
	 * name may be used instead of a strcmp(), as iter->trace->name
	 * will point to the same string as current_trace->name.
	 */
2008-05-12 23:20:42 +04:00
	mutex_lock(&trace_types_lock);
2012-05-11 21:29:49 +04:00
	if (unlikely(tr->current_trace && iter->trace->name != tr->current_trace->name))
		*iter->trace = *tr->current_trace;
2009-02-25 08:13:16 +03:00
	mutex_unlock(&trace_types_lock);
2008-05-12 23:20:42 +04:00
2013-03-05 18:24:35 +04:00
#ifdef CONFIG_TRACER_MAX_TRACE
2012-12-26 06:53:00 +04:00
	if (iter->snapshot && iter->trace->use_max_tr)
		return ERR_PTR(-EBUSY);
2013-03-05 18:24:35 +04:00
#endif
2012-12-26 06:53:00 +04:00
	if (!iter->snapshot)
		atomic_inc(&trace_record_cmdline_disabled);
2008-05-12 23:20:42 +04:00
	if (*pos != iter->pos) {
		iter->ent = NULL;
		iter->cpu = 0;
		iter->idx = -1;
2013-01-24 00:22:59 +04:00
		if (cpu_file == RING_BUFFER_ALL_CPUS) {
2009-02-25 05:22:28 +03:00
			for_each_tracing_cpu(cpu)
2009-09-01 19:06:29 +04:00
				tracing_iter_reset(iter, cpu);
2009-02-25 05:22:28 +03:00
		} else
2009-09-01 19:06:29 +04:00
			tracing_iter_reset(iter, cpu_file);
2008-05-12 23:20:42 +04:00
2010-03-02 12:54:50 +03:00
		iter->leftover = 0;
2008-05-12 23:20:42 +04:00
		for (p = iter; p && l < *pos; p = s_next(m, p, &l))
			;
	} else {
2009-12-07 17:11:39 +03:00
		/*
		 * If we overflowed the seq_file before, then we want
		 * to just reuse the trace_seq buffer again.
		 */
		if (iter->leftover)
			p = iter;
		else {
			l = *pos - 1;
			p = s_next(m, p, &l);
		}
2008-05-12 23:20:42 +04:00
	}
2009-05-18 15:35:34 +04:00
	trace_event_read_lock();
tracing: Consolidate protection of reader access to the ring buffer
At the beginning, access to the ring buffer was fully serialized
by trace_types_lock. Patch d7350c3f4569 gives more freedom to readers,
and patch b04cc6b1f6 adds code to protect trace_pipe and cpu#/trace_pipe.
But actually that is not enough: ring buffer readers are not always
read-only, they may consume data.
This patch serializes accesses to trace, trace_pipe, trace_pipe_raw,
cpu#/trace, cpu#/trace_pipe and cpu#/trace_pipe_raw.
It also removes tracing_reader_cpumask, which was used to protect trace_pipe.
Details:
The ring buffer serializes readers, but that is only low-level protection.
The validity of the events (which are returned by ring_buffer_peek() etc.)
is not protected by the ring buffer.
The content of events may become garbage if we allow another process to consume
these events concurrently:
A) The page of the consumed events may become a normal page
(not a reader page) in the ring buffer, and this page will be rewritten
by the events producer.
B) The page of the consumed events may become a page for splice_read,
and this page will be returned to the system.
This patch adds trace_access_lock() and trace_access_unlock() primitives.
These primitives allow multiple processes to access different cpu ring
buffers concurrently.
These primitives don't distinguish read-only and read-consume access.
Multiple read-only accesses are also serialized.
And we don't use these primitives when we open files,
we only use them when we read files.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B447D52.1050602@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-01-06 15:08:50 +03:00
trace_access_lock ( cpu_file ) ;
2008-05-12 23:20:42 +04:00
return p ;
}
static void s_stop ( struct seq_file * m , void * p )
{
2010-01-06 15:08:50 +03:00
struct trace_iterator * iter = m -> private ;
tracing: Consolidate max_tr into main trace_array structure
Currently, the way the latency tracers and snapshot feature work
is to have a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, this snapshot buffer is
swapped with the running buffer to save the current max
latency.
The only items needed for the max_tr are really just a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_trace, but the instances of traces can now use
their own snapshot feature, instead of only the top level global_trace
having the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
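The layout described above can be sketched as follows. The field list follows the commit message (buffer pointer, per_cpu data pointer, time_start, cpu); the type names with a _sketch suffix, the stand-in struct bodies and the save_max_latency() helper are assumptions made for illustration and do not reproduce the kernel definitions.

```c
#include <assert.h>

/* Simplified stand-ins so the sketch compiles on its own. */
struct ring_buffer { int id; };
struct cpu_data    { unsigned long skipped_entries; };

/* The four items the commit message says a snapshot really needs. */
struct trace_buffer_sketch {
	struct ring_buffer *buffer;      /* the ring buffer itself */
	struct cpu_data    *data;        /* per_cpu data pointers */
	unsigned long long  time_start;  /* when the max latency was triggered */
	int                 cpu;         /* cpu the max latency was triggered on */
};

/* One trace_array now carries both buffers; other fields exist only once. */
struct trace_array_sketch {
	struct trace_buffer_sketch trace_buffer; /* normal, running trace */
	struct trace_buffer_sketch max_buffer;   /* max trace or snapshot */
};

/* Hypothetical helper: keep the current trace as the new max by swapping. */
static void save_max_latency(struct trace_array_sketch *tr, int cpu,
			     unsigned long long now)
{
	struct ring_buffer *tmp = tr->trace_buffer.buffer;

	assert(tr->max_buffer.buffer != 0);
	tr->trace_buffer.buffer = tr->max_buffer.buffer;
	tr->max_buffer.buffer = tmp;
	tr->max_buffer.cpu = cpu;
	tr->max_buffer.time_start = now;
}
```

Because every trace_array owns its own pair of buffers, a snapshot no longer depends on a single global max_tr, which is what lets each trace instance have its own snapshot.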
2013-03-05 18:24:35 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
2012-12-26 06:53:00 +04:00
if ( iter -> snapshot && iter -> trace -> use_max_tr )
return ;
2013-03-05 18:24:35 +04:00
# endif
2012-12-26 06:53:00 +04:00
if ( ! iter -> snapshot )
atomic_dec ( & trace_record_cmdline_disabled ) ;
2013-03-05 18:24:35 +04:00
2010-01-06 15:08:50 +03:00
trace_access_unlock ( iter -> cpu_file ) ;
2009-05-18 15:35:34 +04:00
trace_event_read_unlock ( ) ;
2008-05-12 23:20:42 +04:00
}
2011-11-17 19:35:16 +04:00
static void
2013-03-05 18:24:35 +04:00
get_total_entries ( struct trace_buffer * buf ,
unsigned long * total , unsigned long * entries )
2011-11-17 19:35:16 +04:00
{
unsigned long count ;
int cpu ;
* total = 0 ;
* entries = 0 ;
for_each_tracing_cpu ( cpu ) {
2013-03-05 18:24:35 +04:00
count = ring_buffer_entries_cpu ( buf -> buffer , cpu ) ;
2011-11-17 19:35:16 +04:00
/*
* If this buffer has skipped entries , then we hold all
* entries for the trace and we need to ignore the
* ones before the time stamp .
*/
2013-03-05 18:24:35 +04:00
if ( per_cpu_ptr ( buf -> data , cpu ) -> skipped_entries ) {
count -= per_cpu_ptr ( buf -> data , cpu ) -> skipped_entries ;
2011-11-17 19:35:16 +04:00
/* total is the same as the entries */
* total += count ;
} else
* total += count +
2013-03-05 18:24:35 +04:00
ring_buffer_overrun_cpu ( buf -> buffer , cpu ) ;
2011-11-17 19:35:16 +04:00
* entries += count ;
}
}
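The accounting done by get_total_entries() can be tried outside the kernel with plain arrays. In this sketch, struct cpu_counts and total_entries() are hypothetical stand-ins: the three fields replace the values that ring_buffer_entries_cpu(), ring_buffer_overrun_cpu() and skipped_entries provide in the function above, and the cpu loop is an ordinary index loop.

```c
#include <assert.h>
#include <stddef.h>

/* Per-cpu inputs, standing in for the ring buffer queries. */
struct cpu_counts {
	unsigned long in_buffer; /* entries currently held for this cpu */
	unsigned long overrun;   /* entries overwritten before being read */
	unsigned long skipped;   /* entries before the start time stamp */
};

static void total_entries(const struct cpu_counts *c, size_t ncpus,
			  unsigned long *total, unsigned long *entries)
{
	assert(total && entries);
	*total = 0;
	*entries = 0;

	for (size_t i = 0; i < ncpus; i++) {
		unsigned long count = c[i].in_buffer;

		if (c[i].skipped) {
			/* Ignore entries from before the time stamp;
			 * written and in-buffer counts are then equal. */
			count -= c[i].skipped;
			*total += count;
		} else {
			/* Written = still in buffer + overwritten. */
			*total += count + c[i].overrun;
		}
		*entries += count;
	}
}
```

The two results feed the "entries-in-buffer/entries-written" header line: entries is what a reader can still see, total is what was ever recorded.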
2008-05-12 23:20:51 +04:00
static void print_lat_help_header ( struct seq_file * m )
2008-05-12 23:20:42 +04:00
{
2014-11-08 23:42:11 +03:00
seq_puts ( m , " # _------=> CPU# \n "
" # / _-----=> irqs-off \n "
" # | / _----=> need-resched \n "
" # || / _---=> hardirq/softirq \n "
" # ||| / _--=> preempt-depth \n "
" # |||| / delay \n "
" # cmd pid ||||| time | caller \n "
" # \\ / ||||| \\ | / \n " ) ;
2008-05-12 23:20:42 +04:00
}
2013-03-05 18:24:35 +04:00
static void print_event_info ( struct trace_buffer * buf , struct seq_file * m )
2008-05-12 23:20:42 +04:00
{
2011-11-17 19:35:16 +04:00
unsigned long total ;
unsigned long entries ;
2013-03-05 18:24:35 +04:00
get_total_entries ( buf , & total , & entries ) ;
2011-11-17 19:35:16 +04:00
seq_printf ( m , " # entries-in-buffer/entries-written: %lu/%lu #P:%d \n " ,
entries , total , num_online_cpus ( ) ) ;
seq_puts ( m , " # \n " ) ;
}
2013-03-05 18:24:35 +04:00
static void print_func_help_header ( struct trace_buffer * buf , struct seq_file * m )
2011-11-17 19:35:16 +04:00
{
2013-03-05 18:24:35 +04:00
print_event_info ( buf , m ) ;
2014-11-08 23:42:11 +03:00
seq_puts ( m , " # TASK-PID CPU# TIMESTAMP FUNCTION \n "
" # | | | | | \n " ) ;
2008-05-12 23:20:42 +04:00
}
2013-03-05 18:24:35 +04:00
static void print_func_help_header_irq ( struct trace_buffer * buf , struct seq_file * m )
tracing: Add irq, preempt-count and need resched info to default trace output
People keep asking how to get the preempt count, irq, and need resched info
and we keep telling them to enable the latency format. Some developers think
that traces without this info are completely useless, and for a lot of tasks
they are useless.
The first option was to enable the latency trace as the default format, but
the header for the latency format is pretty useless for most tracers and
it also prints the timestamp in straight microseconds from the time the trace
started. This is sometimes more difficult to read, as the default trace uses
seconds from the start of boot up.
Latency format:
# tracer: nop
#
# nop latency trace v1.1.5 on 3.2.0-rc1-test+
# --------------------------------------------------------------------
# latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule
migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch
migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch
migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule
default format:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule
migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule
migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule
The latency format header has latency information that is pretty meaningless
for most tracers, although some of the header is useful, and we can add that
later to the default format as well.
What is really useful in the latency format is the irqs-off, need-resched,
hard/softirq context and the preempt count.
This commit adds the option irq-info, which is on by default, that adds this
information:
# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<idle>-0 [000] d..2 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] d..2 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] d..2 49.309309: need_resched <-mwait_idle
<idle>-0 [000] d..2 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] d..2 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] d..2 49.309315: need_resched <-mwait_idle
If a user wants the old format, they can disable the 'irq-info' option:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<idle>-0 [000] 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] 49.309309: need_resched <-mwait_idle
<idle>-0 [000] 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] 49.309315: need_resched <-mwait_idle
Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
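The four-character column that irq-info adds (for example "d..2" in the output above) can be produced by a formatter along these lines. This is a sketch only: the SK_* flag bits and format_irq_info() are hypothetical, the 'd' and the depth digit are taken from the sample output, and the 'N', 'h', 's' and 'H' letters are assumed from the usual ftrace latency-format convention rather than from this commit. Printing a single hex digit for the preempt depth is a simplification.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bits for this sketch, not the kernel's values. */
enum {
	SK_IRQS_OFF     = 1 << 0,
	SK_NEED_RESCHED = 1 << 1,
	SK_HARDIRQ      = 1 << 2,
	SK_SOFTIRQ      = 1 << 3,
};

/* Fill out[] with: irqs-off, need-resched, hardirq/softirq, preempt-depth. */
static void format_irq_info(char out[5], unsigned flags, unsigned preempt_depth)
{
	assert(out);
	memset(out, '.', 4);	/* '.' means "not set" in every column */
	out[4] = '\0';

	if (flags & SK_IRQS_OFF)
		out[0] = 'd';
	if (flags & SK_NEED_RESCHED)
		out[1] = 'N';

	if ((flags & SK_HARDIRQ) && (flags & SK_SOFTIRQ))
		out[2] = 'H';	/* hard irq arrived inside a soft irq */
	else if (flags & SK_HARDIRQ)
		out[2] = 'h';
	else if (flags & SK_SOFTIRQ)
		out[2] = 's';

	if (preempt_depth)	/* simplification: low nibble only */
		out[3] = "0123456789abcdef"[preempt_depth & 0xf];
}
```

Keeping the columns fixed-width is what lets the header arrows ("irqs-off", "need-resched", and so on) line up with the data rows.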
2011-11-17 18:34:33 +04:00
{
2013-03-05 18:24:35 +04:00
print_event_info ( buf , m ) ;
2014-11-08 23:42:11 +03:00
seq_puts ( m , " # _-----=> irqs-off \n "
" # / _----=> need-resched \n "
" # | / _---=> hardirq/softirq \n "
" # || / _--=> preempt-depth \n "
" # ||| / delay \n "
" # TASK-PID CPU# |||| TIMESTAMP FUNCTION \n "
" # | | | |||| | | \n " ) ;
tracing: Add irq, preempt-count and need resched info to default trace output
People keep asking how to get the preempt count, irq, and need resched info
and we keep telling them to enable the latency format. Some developers think
that traces without this info is completely useless, and for a lot of tasks
it is useless.
The first option was to enable the latency trace as the default format, but
the header for the latency format is pretty useless for most tracers and
it also does the timestamp in straight microseconds from the time the trace
started. This is sometimes more difficult to read as the default trace is
seconds from the start of boot up.
Latency format:
# tracer: nop
#
# nop latency trace v1.1.5 on 3.2.0-rc1-test+
# --------------------------------------------------------------------
# latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule
migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch
migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch
migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule
default format:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule
migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule
migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule
The latency format header has latency information that is pretty meaningless
for most tracers. Although some of the header is useful, and we can add that
later to the default format as well.
What is really useful with the latency format is the irqs-off, need-resched
hard/softirq context and the preempt count.
This commit adds the option irq-info which is on by default that adds this
information:
# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<idle>-0 [000] d..2 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] d..2 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] d..2 49.309309: need_resched <-mwait_idle
<idle>-0 [000] d..2 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] d..2 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] d..2 49.309315: need_resched <-mwait_idle
If a user wants the old format, they can disable the 'irq-info' option:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<idle>-0 [000] 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] 49.309309: need_resched <-mwait_idle
<idle>-0 [000] 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] 49.309315: need_resched <-mwait_idle
Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
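The four irq-info columns described above can be decoded mechanically. The following is a minimal user-space sketch, not kernel code. The struct and function names are hypothetical, and the character meanings ('d' for irqs-off, 'N' for need-resched, 'h'/'s'/'H' for hardirq, softirq or both, a digit for preempt-depth, '.' for unset) are assumed from the ftrace documentation rather than stated in this commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space view of the four irq-info columns. */
struct irq_info {
	bool irqs_off;      /* column 1: 'd' (assumed) */
	bool need_resched;  /* column 2: 'N' (assumed) */
	bool hardirq;       /* column 3: 'h' or 'H' (assumed) */
	bool softirq;       /* column 3: 's' or 'H' (assumed) */
	int  preempt_depth; /* column 4: decimal digit, 0 when '.' */
};

/*
 * Parse a four-character field such as "d..2".
 * Returns 0 on success and -1 on malformed input.
 */
int parse_irq_info(const char *field, struct irq_info *out)
{
	if (field == NULL || out == NULL || strlen(field) != 4)
		return -1;

	out->irqs_off = (field[0] == 'd');
	out->need_resched = (field[1] == 'N');
	out->hardirq = (field[2] == 'h' || field[2] == 'H');
	out->softirq = (field[2] == 's' || field[2] == 'H');

	if (field[3] == '.')
		out->preempt_depth = 0;
	else if (field[3] >= '0' && field[3] <= '9')
		out->preempt_depth = field[3] - '0';
	else
		return -1;

	return 0;
}
```

For the "d..2" field in the sample output above, this sketch would report interrupts disabled and a preempt depth of 2. Flag characters not listed here are treated as unset.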
2011-11-17 18:34:33 +04:00
}
2008-05-12 23:20:42 +04:00
2010-04-02 21:01:22 +04:00
void
2008-05-12 23:20:42 +04:00
print_trace_header ( struct seq_file * m , struct trace_iterator * iter )
{
unsigned long sym_flags = ( trace_flags & TRACE_ITER_SYM_MASK ) ;
tracing: Consolidate max_tr into main trace_array structure
Currently, the latency tracers and the snapshot feature work by having
a separate trace_array called "max_tr" that holds the snapshot buffer.
For latency tracers, this snapshot buffer is swapped with the running
buffer to save the current max latency.
The only items needed for the max_tr are really just a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_trace, but the instances of traces can now use
their own snapshot feature, instead of only the top level global_trace
having the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
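The layout described in this commit message can be sketched in plain user-space C. This is only an illustration under stated assumptions: the type names ending in _sketch, the member name max_buffer, and the snapshot_swap() helper are hypothetical and heavily simplified. They are not the kernel's actual definitions or its update path.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the new trace_buffer structure. */
struct trace_buffer_sketch {
	void     *buffer;     /* the ring buffer itself */
	void     *data;       /* per-CPU data pointers */
	uint64_t  time_start; /* when the max latency was triggered */
	int       cpu;        /* CPU the max latency occurred on */
};

/* Each trace_array carries two buffers instead of relying on a global max_tr. */
struct trace_array_sketch {
	struct trace_buffer_sketch trace_buffer; /* normal running trace */
	struct trace_buffer_sketch max_buffer;   /* max trace or snapshot (assumed name) */
};

/*
 * Save a new max latency: swap the running ring buffer with the snapshot
 * ring buffer, then record when and where the latency was seen.
 */
void snapshot_swap(struct trace_array_sketch *tr, uint64_t now, int cpu)
{
	void *tmp = tr->trace_buffer.buffer;

	tr->trace_buffer.buffer = tr->max_buffer.buffer;
	tr->max_buffer.buffer = tmp;
	tr->max_buffer.time_start = now;
	tr->max_buffer.cpu = cpu;
}
```

Because both buffers live inside the trace_array, every instance can take its own snapshot without touching a shared global structure, which is the point the commit message makes.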
2013-03-05 18:24:35 +04:00
struct trace_buffer * buf = iter - > trace_buffer ;
struct trace_array_cpu * data = per_cpu_ptr ( buf - > data , buf - > cpu ) ;
2012-05-11 21:29:49 +04:00
struct tracer * type = iter - > trace ;
2011-11-17 19:35:16 +04:00
unsigned long entries ;
unsigned long total ;
2008-05-12 23:20:42 +04:00
const char * name = " preemption " ;
2013-02-02 03:38:47 +04:00
name = type - > name ;
2008-05-12 23:20:42 +04:00
2013-03-05 18:24:35 +04:00
get_total_entries ( buf , & total , & entries ) ;
2008-05-12 23:20:42 +04:00
ftrace: tracing header should put '#' at the beginning of a line
In a recent discussion, Andrew Morton pointed out that the tracing header
should put '#' at the beginning of each line.
Then the header can easily be filtered out with the following grep usage:
cat trace | grep -v '^#'
The wakeup trace has the same header problem.
Comparison of headers displayed:
before this patch:
# tracer: wakeup
#
wakeup latency trace v1.1.5 on 2.6.29-rc7-tip-tip
--------------------------------------------------------------------
latency: 19059 us, #21277/21277, CPU#1 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4)
-----------------
| task: kondemand/1-1644 (uid:0 nice:-5 policy:0 rt_prio:0)
-----------------
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| /
# ||||| delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
irqbalan-1887 1d.s. 0us : 1887:120:R + [001] 1644:115:S kondemand/1
irqbalan-1887 1d.s. 1us : default_wake_function <-autoremove_wake_function
irqbalan-1887 1d.s. 2us : check_preempt_wakeup <-try_to_wake_up
after this patch:
# tracer: wakeup
#
# wakeup latency trace v1.1.5 on 2.6.29-rc7-tip-tip
# --------------------------------------------------------------------
# latency: 529 us, #530/530, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: kondemand/0-1641 (uid:0 nice:-5 policy:0 rt_prio:0)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| /
# ||||| delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
sshd-2496 0d.s. 0us : 2496:120:R + [000] 1641:115:S kondemand/0
sshd-2496 0d.s. 1us : default_wake_function <-autoremove_wake_function
sshd-2496 0d.s. 1us : check_preempt_wakeup <-try_to_wake_up
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20090308124421.23C3.A69D9226@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
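The grep usage above works because every header line now begins with '#'. The same filtering can be sketched in C for tools that post-process trace output. This is a minimal user-space example, and the function name strip_trace_header is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy every line of `in` that does not start with '#' into `out`.
 * Returns the number of lines kept, or -1 if the arguments are invalid
 * or `out` is too small (its contents are then unspecified).
 */
int strip_trace_header(const char *in, char *out, size_t out_size)
{
	size_t used = 0;
	int kept = 0;

	if (in == NULL || out == NULL || out_size == 0)
		return -1;

	while (*in != '\0') {
		/* A line runs up to and including '\n', or to end of input. */
		const char *eol = strchr(in, '\n');
		size_t len = eol ? (size_t)(eol - in) + 1 : strlen(in);

		if (in[0] != '#') {
			/* Leave room for the terminating NUL. */
			if (used + len >= out_size)
				return -1;
			memcpy(out + used, in, len);
			used += len;
			kept++;
		}
		in += len;
	}

	out[used] = '\0';
	return kept;
}
```

Applied to the "after this patch" output above, only the event lines (those starting with the task name) would survive. With the "before" output, the unprefixed banner lines would leak through, which is the problem the patch fixes.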
2009-03-08 07:12:43 +03:00
seq_printf ( m , " # %s latency trace v1.1.5 on %s \n " ,
2008-05-12 23:20:42 +04:00
name , UTS_RELEASE ) ;
2009-03-08 07:12:43 +03:00
seq_puts ( m , " # ----------------------------------- "
2008-05-12 23:20:42 +04:00
" --------------------------------- \n " ) ;
2009-03-08 07:12:43 +03:00
seq_printf ( m , " # latency: %lu us, #%lu/%lu, CPU#%d | "
2008-05-12 23:20:42 +04:00
" (M:%s VP:%d, KP:%d, SP:%d HP:%d " ,
2008-05-12 23:20:44 +04:00
nsecs_to_usecs ( data - > saved_latency ) ,
2008-05-12 23:20:42 +04:00
entries ,
2008-05-12 23:20:43 +04:00
total ,
2013-03-05 18:24:35 +04:00
buf - > cpu ,
2008-05-12 23:20:42 +04:00
# if defined(CONFIG_PREEMPT_NONE)
" server " ,
# elif defined(CONFIG_PREEMPT_VOLUNTARY)
" desktop " ,
2008-07-11 04:58:12 +04:00
# elif defined(CONFIG_PREEMPT)
2008-05-12 23:20:42 +04:00
" preempt " ,
# else
" unknown " ,
# endif
/* These are reserved for later use */
0 , 0 , 0 , 0 ) ;
# ifdef CONFIG_SMP
seq_printf ( m , " #P:%d) \n " , num_online_cpus ( ) ) ;
# else
seq_puts ( m , " ) \n " ) ;
# endif
2009-03-08 07:12:43 +03:00
seq_puts ( m , " # ----------------- \n " ) ;
seq_printf ( m , " # | task: %.16s-%d "
2008-05-12 23:20:42 +04:00
" (uid:%d nice:%ld policy:%ld rt_prio:%ld) \n " ,
2012-03-14 03:02:19 +04:00
data - > comm , data - > pid ,
from_kuid_munged ( seq_user_ns ( m ) , data - > uid ) , data - > nice ,
2008-05-12 23:20:42 +04:00
data - > policy , data - > rt_priority ) ;
2009-03-08 07:12:43 +03:00
seq_puts ( m , " # ----------------- \n " ) ;
2008-05-12 23:20:42 +04:00
if ( data - > critical_start ) {
2009-03-08 07:12:43 +03:00
seq_puts ( m , " # => started at: " ) ;
2008-05-12 23:20:46 +04:00
seq_print_ip_sym ( & iter - > seq , data - > critical_start , sym_flags ) ;
trace_print_seq ( m , & iter - > seq ) ;
2009-03-08 07:12:43 +03:00
seq_puts ( m , " \n # => ended at: " ) ;
2008-05-12 23:20:46 +04:00
seq_print_ip_sym ( & iter - > seq , data - > critical_end , sym_flags ) ;
trace_print_seq ( m , & iter - > seq ) ;
2009-09-02 20:27:41 +04:00
seq_puts ( m , " \n # \n " ) ;
2008-05-12 23:20:42 +04:00
}
2009-03-08 07:12:43 +03:00
seq_puts ( m , " # \n " ) ;
2008-05-12 23:20:42 +04:00
}
2008-11-08 06:36:02 +03:00
static void test_cpu_buff_start ( struct trace_iterator * iter )
{
struct trace_seq * s = & iter - > seq ;
2008-11-13 01:52:38 +03:00
if ( ! ( trace_flags & TRACE_ITER_ANNOTATE ) )
return ;
if ( ! ( iter - > iter_flags & TRACE_FILE_ANNOTATE ) )
return ;
2009-01-01 02:42:23 +03:00
if ( cpumask_test_cpu ( iter - > cpu , iter - > started ) )
2008-11-08 06:36:02 +03:00
return ;
2013-03-05 18:24:35 +04:00
if ( per_cpu_ptr ( iter - > trace_buffer - > data , iter - > cpu ) - > skipped_entries )
2009-09-01 19:06:29 +04:00
return ;
2009-01-01 02:42:23 +03:00
cpumask_set_cpu ( iter - > cpu , iter - > started ) ;
2009-04-02 00:53:08 +04:00
/* Don't print started cpu buffer for the first entry of the trace */
if ( iter - > idx > 1 )
trace_seq_printf ( s , " ##### CPU %u buffer started #### \n " ,
iter - > cpu ) ;
2008-11-08 06:36:02 +03:00
}
2008-09-29 22:18:34 +04:00
static enum print_line_t print_trace_fmt ( struct trace_iterator * iter )
2008-05-12 23:20:42 +04:00
{
2008-05-12 23:20:46 +04:00
struct trace_seq * s = & iter - > seq ;
2008-05-12 23:20:42 +04:00
unsigned long sym_flags = ( trace_flags & TRACE_ITER_SYM_MASK ) ;
2008-05-12 23:20:45 +04:00
struct trace_entry * entry ;
2008-12-24 07:24:13 +03:00
struct trace_event * event ;
2008-05-12 23:20:42 +04:00
2008-05-12 23:20:45 +04:00
entry = iter - > ent ;
2008-08-01 20:26:41 +04:00
2008-11-08 06:36:02 +03:00
test_cpu_buff_start ( iter ) ;
2009-02-03 01:29:21 +03:00
event = ftrace_find_event ( entry - > type ) ;
2008-05-12 23:20:42 +04:00
2009-02-03 01:29:21 +03:00
if ( trace_flags & TRACE_ITER_CONTEXT_INFO ) {
2014-11-12 18:29:54 +03:00
if ( iter - > iter_flags & TRACE_FILE_LAT_FMT )
trace_print_lat_context ( iter ) ;
else
trace_print_context ( iter ) ;
2009-02-03 01:29:21 +03:00
}
2008-05-12 23:20:42 +04:00
2014-11-12 18:29:54 +03:00
if ( trace_seq_has_overflowed ( s ) )
return TRACE_TYPE_PARTIAL_LINE ;
2009-02-05 01:16:39 +03:00
if ( event )
2010-04-23 02:46:14 +04:00
return event - > funcs - > trace ( iter , sym_flags , event ) ;
2009-02-04 01:20:41 +03:00
2014-11-12 18:29:54 +03:00
trace_seq_printf ( s , " Unknown type %d \n " , entry - > type ) ;
2008-11-22 14:28:47 +03:00
2014-11-12 18:29:54 +03:00
return trace_handle_return ( s ) ;
2008-05-12 23:20:42 +04:00
}
2008-09-29 22:18:34 +04:00
static enum print_line_t print_raw_fmt ( struct trace_iterator * iter )
2008-05-12 23:20:47 +04:00
{
struct trace_seq * s = & iter - > seq ;
struct trace_entry * entry ;
2008-12-24 07:24:13 +03:00
struct trace_event * event ;
2008-05-12 23:20:47 +04:00
entry = iter - > ent ;
2008-08-01 20:26:41 +04:00
2014-11-12 18:29:54 +03:00
if ( trace_flags & TRACE_ITER_CONTEXT_INFO )
trace_seq_printf ( s , " %d %d %llu " ,
entry - > pid , iter - > cpu , iter - > ts ) ;
if ( trace_seq_has_overflowed ( s ) )
return TRACE_TYPE_PARTIAL_LINE ;
2008-05-12 23:20:47 +04:00
2008-12-24 07:24:13 +03:00
event = ftrace_find_event ( entry - > type ) ;
2009-02-05 01:16:39 +03:00
if ( event )
2010-04-23 02:46:14 +04:00
return event - > funcs - > raw ( iter , 0 , event ) ;
2009-02-04 01:20:41 +03:00
2014-11-12 18:29:54 +03:00
trace_seq_printf ( s , " %d ? \n " , entry - > type ) ;
2008-09-30 07:02:42 +04:00
2014-11-12 18:29:54 +03:00
return trace_handle_return ( s ) ;
2008-05-12 23:20:47 +04:00
}
2008-09-29 22:18:34 +04:00
static enum print_line_t print_hex_fmt ( struct trace_iterator * iter )
2008-05-12 23:20:49 +04:00
{
struct trace_seq * s = & iter - > seq ;
unsigned char newline = ' \n ' ;
struct trace_entry * entry ;
2008-12-24 07:24:13 +03:00
struct trace_event * event ;
2008-05-12 23:20:49 +04:00
entry = iter - > ent ;
2008-08-01 20:26:41 +04:00
2009-02-03 01:29:21 +03:00
if ( trace_flags & TRACE_ITER_CONTEXT_INFO ) {
2014-11-12 18:29:54 +03:00
SEQ_PUT_HEX_FIELD ( s , entry - > pid ) ;
SEQ_PUT_HEX_FIELD ( s , iter - > cpu ) ;
SEQ_PUT_HEX_FIELD ( s , iter - > ts ) ;
if ( trace_seq_has_overflowed ( s ) )
return TRACE_TYPE_PARTIAL_LINE ;
2009-02-03 01:29:21 +03:00
}
2008-05-12 23:20:49 +04:00
2008-12-24 07:24:13 +03:00
event = ftrace_find_event ( entry - > type ) ;
2009-02-05 01:16:39 +03:00
if ( event ) {
2010-04-23 02:46:14 +04:00
enum print_line_t ret = event - > funcs - > hex ( iter , 0 , event ) ;
2009-02-04 01:20:41 +03:00
if ( ret ! = TRACE_TYPE_HANDLED )
return ret ;
}
2008-10-01 18:52:51 +04:00
2014-11-12 18:29:54 +03:00
SEQ_PUT_FIELD ( s , newline ) ;
2008-05-12 23:20:49 +04:00
2014-11-12 18:29:54 +03:00
return trace_handle_return ( s ) ;
2008-05-12 23:20:49 +04:00
}
2008-09-29 22:18:34 +04:00
static enum print_line_t print_bin_fmt ( struct trace_iterator * iter )
2008-05-12 23:20:47 +04:00
{
struct trace_seq * s = & iter - > seq ;
struct trace_entry * entry ;
2008-12-24 07:24:13 +03:00
struct trace_event * event ;
2008-05-12 23:20:47 +04:00
entry = iter - > ent ;
2008-08-01 20:26:41 +04:00
2009-02-03 01:29:21 +03:00
if ( trace_flags & TRACE_ITER_CONTEXT_INFO ) {
2014-11-12 18:29:54 +03:00
SEQ_PUT_FIELD ( s , entry - > pid ) ;
SEQ_PUT_FIELD ( s , iter - > cpu ) ;
SEQ_PUT_FIELD ( s , iter - > ts ) ;
if ( trace_seq_has_overflowed ( s ) )
return TRACE_TYPE_PARTIAL_LINE ;
2009-02-03 01:29:21 +03:00
}
2008-05-12 23:20:47 +04:00
2008-12-24 07:24:13 +03:00
event = ftrace_find_event ( entry - > type ) ;
2010-04-23 02:46:14 +04:00
return event ? event - > funcs - > binary ( iter , 0 , event ) :
TRACE_TYPE_HANDLED ;
2008-05-12 23:20:47 +04:00
}
2010-04-02 21:01:22 +04:00
int trace_empty ( struct trace_iterator * iter )
2008-05-12 23:20:42 +04:00
{
2012-06-28 04:46:14 +04:00
struct ring_buffer_iter * buf_iter ;
2008-05-12 23:20:42 +04:00
int cpu ;
2009-03-12 02:52:30 +03:00
/* If we are looking at one CPU buffer, only check that one */
2013-01-24 00:22:59 +04:00
if ( iter - > cpu_file ! = RING_BUFFER_ALL_CPUS ) {
2009-03-12 02:52:30 +03:00
cpu = iter - > cpu_file ;
2012-06-28 04:46:14 +04:00
buf_iter = trace_buffer_iter ( iter , cpu ) ;
if ( buf_iter ) {
if ( ! ring_buffer_iter_empty ( buf_iter ) )
2009-03-12 02:52:30 +03:00
return 0 ;
} else {
2013-03-05 18:24:35 +04:00
if ( ! ring_buffer_empty_cpu ( iter - > trace_buffer - > buffer , cpu ) )
2009-03-12 02:52:30 +03:00
return 0 ;
}
return 1 ;
}
2008-05-12 23:21:00 +04:00
for_each_tracing_cpu ( cpu ) {
2012-06-28 04:46:14 +04:00
buf_iter = trace_buffer_iter ( iter , cpu ) ;
if ( buf_iter ) {
if ( ! ring_buffer_iter_empty ( buf_iter ) )
2008-10-01 08:29:53 +04:00
return 0 ;
} else {
2013-03-05 18:24:35 +04:00
if ( ! ring_buffer_empty_cpu ( iter - > trace_buffer - > buffer , cpu ) )
2008-10-01 08:29:53 +04:00
return 0 ;
}
2008-05-12 23:20:42 +04:00
}
2008-10-01 08:29:53 +04:00
2008-09-30 20:13:45 +04:00
return 1 ;
2008-05-12 23:20:42 +04:00
}
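The emptiness check above can be modeled in user space. This is a minimal sketch, not the kernel code: `model_trace_empty()`, the `pending[]` array, and `MODEL_ALL_CPUS` are hypothetical stand-ins for `trace_empty()`, the per-CPU ring buffers, and `RING_BUFFER_ALL_CPUS`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for RING_BUFFER_ALL_CPUS. */
#define MODEL_ALL_CPUS (-1)

/*
 * pending[cpu] is the number of unread entries in that CPU's buffer.
 * Returns 1 when the selected view is empty, 0 otherwise.
 */
static int model_trace_empty(const unsigned long *pending, int nr_cpus,
			     int cpu_file)
{
	int cpu;

	assert(pending != NULL && nr_cpus > 0);

	/* A per-CPU file only looks at its own buffer. */
	if (cpu_file != MODEL_ALL_CPUS) {
		assert(cpu_file >= 0 && cpu_file < nr_cpus);
		return pending[cpu_file] == 0;
	}

	/* The top-level file is empty only if every CPU buffer is empty. */
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (pending[cpu] != 0)
			return 0;
	}
	return 1;
}
```

The early return for a single CPU mirrors the structure of the function above: a per-CPU reader never pays for a walk over all CPUs.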
2009-05-18 15:35:34 +04:00
/* Called with trace_event_read_lock() held. */
2010-08-05 18:22:23 +04:00
enum print_line_t print_trace_line ( struct trace_iterator * iter )
2008-05-12 23:20:47 +04:00
{
2008-09-29 22:18:34 +04:00
enum print_line_t ret ;
2014-11-12 18:29:54 +03:00
if ( iter - > lost_events ) {
trace_seq_printf ( & iter - > seq , " CPU:%d [LOST %lu EVENTS] \n " ,
iter - > cpu , iter - > lost_events ) ;
if ( trace_seq_has_overflowed ( & iter - > seq ) )
return TRACE_TYPE_PARTIAL_LINE ;
}
2010-04-01 03:49:26 +04:00
2008-09-29 22:18:34 +04:00
if ( iter - > trace & & iter - > trace - > print_line ) {
ret = iter - > trace - > print_line ( iter ) ;
if ( ret ! = TRACE_TYPE_UNHANDLED )
return ret ;
}
2008-05-23 23:37:28 +04:00
2013-03-09 06:02:34 +04:00
if ( iter - > ent - > type = = TRACE_BPUTS & &
trace_flags & TRACE_ITER_PRINTK & &
trace_flags & TRACE_ITER_PRINTK_MSGONLY )
return trace_print_bputs_msg_only ( iter ) ;
2009-03-12 20:24:49 +03:00
if ( iter - > ent - > type = = TRACE_BPRINT & &
trace_flags & TRACE_ITER_PRINTK & &
trace_flags & TRACE_ITER_PRINTK_MSGONLY )
2009-03-19 19:20:38 +03:00
return trace_print_bprintk_msg_only ( iter ) ;
2009-03-12 20:24:49 +03:00
2008-12-13 22:18:13 +03:00
if ( iter - > ent - > type = = TRACE_PRINT & &
trace_flags & TRACE_ITER_PRINTK & &
trace_flags & TRACE_ITER_PRINTK_MSGONLY )
2009-03-19 19:20:38 +03:00
return trace_print_printk_msg_only ( iter ) ;
2008-12-13 22:18:13 +03:00
2008-05-12 23:20:47 +04:00
if ( trace_flags & TRACE_ITER_BIN )
return print_bin_fmt ( iter ) ;
2008-05-12 23:20:49 +04:00
if ( trace_flags & TRACE_ITER_HEX )
return print_hex_fmt ( iter ) ;
2008-05-12 23:20:47 +04:00
if ( trace_flags & TRACE_ITER_RAW )
return print_raw_fmt ( iter ) ;
return print_trace_fmt ( iter ) ;
}
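The tail of print_trace_line() picks an output format by testing the flags in a fixed order, so BIN wins over HEX, and HEX over RAW. A minimal sketch of that priority chain follows; the `MODEL_*` names and bit values are hypothetical and do not reproduce the kernel's `TRACE_ITER_*` definitions.

```c
#include <assert.h>

/* Hypothetical flag bits, chosen only for this sketch. */
enum {
	MODEL_ITER_BIN = 1 << 0,
	MODEL_ITER_HEX = 1 << 1,
	MODEL_ITER_RAW = 1 << 2,
};

enum model_fmt {
	MODEL_FMT_DEFAULT,
	MODEL_FMT_BIN,
	MODEL_FMT_HEX,
	MODEL_FMT_RAW,
};

/* The first matching flag wins; later flags are never consulted. */
static enum model_fmt model_pick_format(unsigned int flags)
{
	if (flags & MODEL_ITER_BIN)
		return MODEL_FMT_BIN;
	if (flags & MODEL_ITER_HEX)
		return MODEL_FMT_HEX;
	if (flags & MODEL_ITER_RAW)
		return MODEL_FMT_RAW;
	return MODEL_FMT_DEFAULT;
}

/* Usage: with both HEX and RAW set, HEX is selected. */
static void model_format_example(void)
{
	assert(model_pick_format(MODEL_ITER_HEX | MODEL_ITER_RAW) ==
	       MODEL_FMT_HEX);
}
```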
tracing/latency: Fix header output for latency tracers
If the graph tracer (CONFIG_FUNCTION_GRAPH_TRACER) or even the
function tracer (CONFIG_FUNCTION_TRACER) is not set, the latency tracers
do not display a proper latency header.
The involved/fixed latency tracers are:
wakeup_rt
wakeup
preemptirqsoff
preemptoff
irqsoff
The patch adds proper handling of tracer configuration options for latency
tracers and displays the correct header info accordingly.
* The current output (for wakeup tracer) with both graph and function
tracers disabled is:
# tracer: wakeup
#
<idle>-0 0d.h5 1us+: 0:120:R + [000] 7: 0:R watchdog/0
<idle>-0 0d.h5 3us+: ttwu_do_activate.clone.1 <-try_to_wake_up
...
* The fixed output is:
# tracer: wakeup
#
# wakeup latency trace v1.1.5 on 3.1.0-tip+
# --------------------------------------------------------------------
# latency: 55 us, #4/4, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
# -----------------
# | task: migration/0-6 (uid:0 nice:0 policy:1 rt_prio:99)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
cat-1129 0d..4 1us : 1129:120:R + [000] 6: 0:R migration/0
cat-1129 0d..4 2us+: ttwu_do_activate.clone.1 <-try_to_wake_up
* The current output (for wakeup tracer) with only function
tracer enabled is:
# tracer: wakeup
#
cat-1140 0d..4 1us+: 1140:120:R + [000] 6: 0:R migration/0
cat-1140 0d..4 2us : ttwu_do_activate.clone.1 <-try_to_wake_up
* The fixed output is:
# tracer: wakeup
#
# wakeup latency trace v1.1.5 on 3.1.0-tip+
# --------------------------------------------------------------------
# latency: 207 us, #109/109, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
# -----------------
# | task: watchdog/1-12 (uid:0 nice:0 policy:1 rt_prio:99)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
<idle>-0 1d.h5 1us+: 0:120:R + [001] 12: 0:R watchdog/1
<idle>-0 1d.h5 3us : ttwu_do_activate.clone.1 <-try_to_wake_up
Link: http://lkml.kernel.org/r/20111107150849.GE1807@m.brq.redhat.com
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-11-07 19:08:49 +04:00
void trace_latency_header(struct seq_file *m)
{
	struct trace_iterator *iter = m->private;

	/* print nothing if the buffers are empty */
	if (trace_empty(iter))
		return;

	if (iter->iter_flags & TRACE_FILE_LAT_FMT)
		print_trace_header(m, iter);

	if (!(trace_flags & TRACE_ITER_VERBOSE))
		print_lat_help_header(m);
}
2010-04-02 21:01:22 +04:00
void trace_default_header ( struct seq_file * m )
{
struct trace_iterator * iter = m - > private ;
2011-06-03 18:58:49 +04:00
if ( ! ( trace_flags & TRACE_ITER_CONTEXT_INFO ) )
return ;
2010-04-02 21:01:22 +04:00
if ( iter - > iter_flags & TRACE_FILE_LAT_FMT ) {
/* print nothing if the buffers are empty */
if ( trace_empty ( iter ) )
return ;
print_trace_header ( m , iter ) ;
if ( ! ( trace_flags & TRACE_ITER_VERBOSE ) )
print_lat_help_header ( m ) ;
} else {
tracing: Add irq, preempt-count and need resched info to default trace output
People keep asking how to get the preempt count, irq, and need resched info
and we keep telling them to enable the latency format. Some developers think
that traces without this info are completely useless, and for a lot of tasks
they are.
The first option was to enable the latency trace as the default format, but
the header for the latency format is pretty useless for most tracers and
it also does the timestamp in straight microseconds from the time the trace
started. This is sometimes more difficult to read as the default trace is
seconds from the start of boot up.
Latency format:
# tracer: nop
#
# nop latency trace v1.1.5 on 3.2.0-rc1-test+
# --------------------------------------------------------------------
# latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# -----------------
# | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule
migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch
migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch
migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch
migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule
default format:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule
migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch
migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch
migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule
migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule
The latency format header has latency information that is pretty meaningless
for most tracers. Although some of the header is useful, and we can add that
later to the default format as well.
What is really useful in the latency format is the irqs-off, need-resched,
and hard/softirq context, plus the preempt count.
This commit adds the irq-info option, which is on by default and adds this
information:
# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<idle>-0 [000] d..2 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] d..2 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] d..2 49.309309: need_resched <-mwait_idle
<idle>-0 [000] d..2 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] d..2 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] d..2 49.309315: need_resched <-mwait_idle
If a user wants the old format, they can disable the 'irq-info' option:
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<idle>-0 [000] 49.309305: cpuidle_get_driver <-cpuidle_idle_call
<idle>-0 [000] 49.309307: mwait_idle <-cpu_idle
<idle>-0 [000] 49.309309: need_resched <-mwait_idle
<idle>-0 [000] 49.309310: test_ti_thread_flag <-need_resched
<idle>-0 [000] 49.309312: trace_power_start.constprop.13 <-mwait_idle
<idle>-0 [000] 49.309313: trace_cpu_idle <-mwait_idle
<idle>-0 [000] 49.309315: need_resched <-mwait_idle
Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-11-17 18:34:33 +04:00
if ( ! ( trace_flags & TRACE_ITER_VERBOSE ) ) {
if ( trace_flags & TRACE_ITER_IRQ_INFO )
2013-03-05 18:24:35 +04:00
print_func_help_header_irq ( iter - > trace_buffer , m ) ;
2011-11-17 18:34:33 +04:00
else
2013-03-05 18:24:35 +04:00
print_func_help_header ( iter - > trace_buffer , m ) ;
2011-11-17 18:34:33 +04:00
}
2010-04-02 21:01:22 +04:00
}
}
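The header selection in trace_default_header() reduces to a small decision function. The sketch below is a user-space model under stated assumptions: the `MODEL_*` flag bits and `HDR_*` result bits are hypothetical names introduced here, and the function returns a bitmask of which headers would be printed instead of printing them.

```c
#include <assert.h>

/* Hypothetical option bits for this sketch. */
enum {
	MODEL_CONTEXT_INFO = 1 << 0,
	MODEL_VERBOSE      = 1 << 1,
	MODEL_IRQ_INFO     = 1 << 2,
};

/* Hypothetical result bits: which header helpers would run. */
enum {
	HDR_NONE          = 0,
	HDR_LAT_SUMMARY   = 1 << 0,	/* models print_trace_header() */
	HDR_LAT_HELP      = 1 << 1,	/* models print_lat_help_header() */
	HDR_FUNC_HELP     = 1 << 2,	/* models print_func_help_header() */
	HDR_FUNC_HELP_IRQ = 1 << 3,	/* models print_func_help_header_irq() */
};

static unsigned int model_default_header(unsigned int flags, int lat_fmt,
					 int empty)
{
	unsigned int out = HDR_NONE;

	/* Without context info, no header is printed at all. */
	if (!(flags & MODEL_CONTEXT_INFO))
		return HDR_NONE;

	if (lat_fmt) {
		/* Latency format prints nothing for empty buffers. */
		if (empty)
			return HDR_NONE;
		out |= HDR_LAT_SUMMARY;
		if (!(flags & MODEL_VERBOSE))
			out |= HDR_LAT_HELP;
		return out;
	}

	/* Default format: verbose mode suppresses the column legend. */
	if (flags & MODEL_VERBOSE)
		return HDR_NONE;
	return (flags & MODEL_IRQ_INFO) ? HDR_FUNC_HELP_IRQ : HDR_FUNC_HELP;
}

/* Usage: default format with irq-info on selects the irq legend. */
static void model_header_example(void)
{
	assert(model_default_header(MODEL_CONTEXT_INFO | MODEL_IRQ_INFO, 0, 0) ==
	       HDR_FUNC_HELP_IRQ);
}
```

Note that only the latency-format branch consults the emptiness check; the default format prints its legend even when the buffers are empty.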
2011-09-30 05:26:16 +04:00
static void test_ftrace_alive(struct seq_file *m)
{
	if (!ftrace_is_dead())
		return;
2014-11-08 23:42:11 +03:00
	seq_puts(m, "# WARNING: FUNCTION TRACING IS CORRUPTED\n"
		    "# MAY BE MISSING FUNCTION EVENTS\n");
2011-09-30 05:26:16 +04:00
}
2013-03-05 19:25:16 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
2013-03-05 23:35:11 +04:00
static void show_snapshot_main_help ( struct seq_file * m )
2013-03-05 19:25:16 +04:00
{
2014-11-08 23:42:11 +03:00
seq_puts ( m , " # echo 0 > snapshot : Clears and frees snapshot buffer \n "
" # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated. \n "
" # Takes a snapshot of the main buffer. \n "
" # echo 2 > snapshot : Clears snapshot buffer (but does not allocate or free) \n "
" # (Doesn't have to be '2' works with any number that \n "
" # is not a '0' or '1') \n " ) ;
2013-03-05 19:25:16 +04:00
}
2013-03-05 23:35:11 +04:00
static void show_snapshot_percpu_help ( struct seq_file * m )
{
2014-11-08 23:42:10 +03:00
seq_puts ( m , " # echo 0 > snapshot : Invalid for per_cpu snapshot file. \n " ) ;
2013-03-05 23:35:11 +04:00
# ifdef CONFIG_RING_BUFFER_ALLOW_SWAP
2014-11-08 23:42:11 +03:00
seq_puts ( m , " # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated. \n "
" # Takes a snapshot of the main buffer for this cpu. \n " ) ;
2013-03-05 23:35:11 +04:00
# else
2014-11-08 23:42:11 +03:00
seq_puts ( m , " # echo 1 > snapshot : Not supported with this kernel. \n "
" # Must use main snapshot file to allocate. \n " ) ;
2013-03-05 23:35:11 +04:00
# endif
2014-11-08 23:42:11 +03:00
seq_puts ( m , " # echo 2 > snapshot : Clears this cpu's snapshot buffer (but does not allocate) \n "
" # (Doesn't have to be '2' works with any number that \n "
" # is not a '0' or '1') \n " ) ;
2013-03-05 23:35:11 +04:00
}
2013-03-05 19:25:16 +04:00
static void print_snapshot_help ( struct seq_file * m , struct trace_iterator * iter )
{
2013-03-06 03:25:02 +04:00
if ( iter - > tr - > allocated_snapshot )
2014-11-08 23:42:10 +03:00
seq_puts ( m , " # \n # * Snapshot is allocated * \n # \n " ) ;
2013-03-05 19:25:16 +04:00
else
2014-11-08 23:42:10 +03:00
seq_puts ( m , " # \n # * Snapshot is freed * \n # \n " ) ;
2013-03-05 19:25:16 +04:00
2014-11-08 23:42:10 +03:00
seq_puts ( m , " # Snapshot commands: \n " ) ;
2013-03-05 23:35:11 +04:00
if ( iter - > cpu_file = = RING_BUFFER_ALL_CPUS )
show_snapshot_main_help ( m ) ;
else
show_snapshot_percpu_help ( m ) ;
2013-03-05 19:25:16 +04:00
}
# else
/* Should never be called */
static inline void print_snapshot_help ( struct seq_file * m , struct trace_iterator * iter ) { }
# endif
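The help text above describes how a value written to the snapshot file is interpreted. A minimal sketch of that mapping, derived only from the help strings: the enum and function names are hypothetical, and the CONFIG_RING_BUFFER_ALLOW_SWAP distinction for the per-CPU file is deliberately left out.

```c
#include <assert.h>

/* Hypothetical action names for this sketch. */
enum model_snapshot_action {
	MODEL_SNAPSHOT_INVALID,	/* rejected for this file */
	MODEL_SNAPSHOT_FREE,	/* clear and free the snapshot buffer */
	MODEL_SNAPSHOT_TAKE,	/* allocate if needed, then take a snapshot */
	MODEL_SNAPSHOT_CLEAR,	/* clear without allocating or freeing */
};

/*
 * val is the number written to the snapshot file; per_cpu is nonzero
 * when the write targets a per_cpu snapshot file.
 */
static enum model_snapshot_action model_snapshot_cmd(long val, int per_cpu)
{
	switch (val) {
	case 0:
		/* Freeing is only valid through the main snapshot file. */
		return per_cpu ? MODEL_SNAPSHOT_INVALID : MODEL_SNAPSHOT_FREE;
	case 1:
		return MODEL_SNAPSHOT_TAKE;
	default:
		/* Any number other than 0 or 1 just clears the buffer. */
		return MODEL_SNAPSHOT_CLEAR;
	}
}

/* Usage: '2' is only a convention, any other number clears as well. */
static void model_snapshot_example(void)
{
	assert(model_snapshot_cmd(2, 0) == model_snapshot_cmd(42, 0));
}
```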
2008-05-12 23:20:42 +04:00
static int s_show ( struct seq_file * m , void * v )
{
struct trace_iterator * iter = v ;
2009-12-07 17:11:39 +03:00
int ret ;
2008-05-12 23:20:42 +04:00
if ( iter - > ent = = NULL ) {
if ( iter - > tr ) {
seq_printf ( m , " # tracer: %s \n " , iter - > trace - > name ) ;
seq_puts ( m , " # \n " ) ;
2011-09-30 05:26:16 +04:00
test_ftrace_alive ( m ) ;
2008-05-12 23:20:42 +04:00
}
2013-03-05 19:25:16 +04:00
if ( iter - > snapshot & & trace_empty ( iter ) )
print_snapshot_help ( m , iter ) ;
else if ( iter - > trace & & iter - > trace - > print_header )
2008-11-25 11:12:31 +03:00
iter - > trace - > print_header ( m ) ;
2010-04-02 21:01:22 +04:00
else
trace_default_header ( m ) ;
2009-12-07 17:11:39 +03:00
} else if ( iter - > leftover ) {
/*
* If we filled the seq_file buffer earlier , we
* want to just show it now .
*/
ret = trace_print_seq ( m , & iter - > seq ) ;
/* ret should this time be zero, but you never know */
iter - > leftover = ret ;
2008-05-12 23:20:42 +04:00
} else {
2008-05-12 23:20:47 +04:00
print_trace_line ( iter ) ;
2009-12-07 17:11:39 +03:00
ret = trace_print_seq ( m , & iter - > seq ) ;
/*
* If we overflow the seq_file buffer , then it will
* ask us for this data again at start up .
* Use that instead .
* ret is 0 if seq_file write succeeded .
* - 1 otherwise .
*/
iter - > leftover = ret ;
2008-05-12 23:20:42 +04:00
}
return 0 ;
}
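The leftover handling in s_show() keeps an already formatted line when the output buffer is full and replays it on the next call. The sketch below models that idea in user space. `model_sink`, `model_iter`, `model_print_seq()` and `model_show()` are hypothetical stand-ins, and the buffer sizes are arbitrary.

```c
#include <assert.h>
#include <string.h>

struct model_sink {		/* stands in for the seq_file output buffer */
	char buf[16];
	size_t len;
};

struct model_iter {		/* stands in for the iterator's seq + leftover */
	char seq[16];
	int leftover;
};

/* Returns 0 when the whole line fits, -1 (writing nothing) otherwise. */
static int model_print_seq(struct model_sink *s, const char *line)
{
	size_t n = strlen(line);

	if (n > sizeof(s->buf) - s->len)
		return -1;
	memcpy(s->buf + s->len, line, n);
	s->len += n;
	return 0;
}

/* On a leftover replay, the line argument is ignored. */
static void model_show(struct model_iter *it, struct model_sink *s,
		       const char *line)
{
	if (!it->leftover) {
		/* Fresh entry: format it into the private buffer first. */
		assert(strlen(line) < sizeof(it->seq));
		strcpy(it->seq, line);
	}
	/* Either way, try to hand the private buffer to the sink. */
	it->leftover = model_print_seq(s, it->seq);
}
```

Because the line is formatted into the iterator's private buffer before it is copied out, a failed copy loses nothing: the next call finds `leftover` set and emits the saved text instead of formatting a new entry.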
tracing: Introduce trace_create_cpu_file() and tracing_get_cpu()
Every "file_operations" used by tracing_init_debugfs_percpu is buggy.
f_op->open/etc does:
1. struct trace_cpu *tc = inode->i_private;
struct trace_array *tr = tc->tr;
2. trace_array_get(tr) or fail;
3. do_something(tc);
But tc (and tr) can be already freed before trace_array_get() is called.
And it doesn't matter whether this file is per-cpu or it was created by
init_tracer_debugfs(), free_percpu() or kfree() are equally bad.
Note that even 1. is not safe, the freed memory can be unmapped. But even
if it was safe trace_array_get() can wrongly succeed if we also race with
the next new_instance_create() which can re-allocate the same tr, or tc
was overwritten and ->tr points to the valid tr. In this case 3. uses the
freed/reused memory.
Add the new trivial helper, trace_create_cpu_file() which simply calls
trace_create_file() and encodes "cpu" in "struct inode". Another helper,
tracing_get_cpu() will be used to read cpu_nr-or-RING_BUFFER_ALL_CPUS.
The patch abuses ->i_cdev to encode the number, it is never used unless
the file is S_ISCHR(). But we could use something else, say, i_bytes or
even ->d_fsdata. In any case this hack is hidden inside these 2 helpers,
it would be trivial to change them if needed.
This patch only changes tracing_init_debugfs_percpu() to use the new
trace_create_cpu_file(), the next patches will change file_operations.
Note: tracing_get_cpu(inode) is always safe but you can't trust the
result unless trace_array_get() was called, without trace_types_lock
which acts as a barrier it can wrongly return RING_BUFFER_ALL_CPUS.
Link: http://lkml.kernel.org/r/20130723152554.GA23710@redhat.com
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-23 19:25:54 +04:00
/*
 * Should be used after trace_array_get(), trace_types_lock
 * ensures that i_cdev was already initialized.
 */
static inline int tracing_get_cpu(struct inode *inode)
{
	if (inode->i_cdev) /* See trace_create_cpu_file() */
		return (long)inode->i_cdev - 1;
	return RING_BUFFER_ALL_CPUS;
}
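tracing_get_cpu() decodes a CPU number that was stored in a pointer-sized field as cpu + 1, so that a NULL field can mean "all CPUs". A minimal user-space sketch of that encoding follows. `model_encode_cpu()`, `model_decode_cpu()` and `MODEL_ALL_CPUS` are hypothetical names, and the encode side is inferred from the decode shown above, not copied from trace_create_cpu_file().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for RING_BUFFER_ALL_CPUS. */
#define MODEL_ALL_CPUS (-1)

/* Store cpu + 1 so that CPU 0 is distinguishable from an unset field. */
static void *model_encode_cpu(int cpu)
{
	assert(cpu >= 0);
	return (void *)(intptr_t)(cpu + 1);
}

/* A NULL cookie means the file is not tied to a single CPU. */
static int model_decode_cpu(const void *cookie)
{
	if (cookie)
		return (int)((intptr_t)cookie - 1);
	return MODEL_ALL_CPUS;
}
```

The off-by-one is the whole trick: without it, a per-CPU file for CPU 0 would encode to NULL and be read back as the all-CPUs file.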
2009-09-23 03:43:43 +04:00
static const struct seq_operations tracer_seq_ops = {
2008-05-12 23:20:46 +04:00
	.start		= s_start,
	.next		= s_next,
	.stop		= s_stop,
	.show		= s_show,
2008-05-12 23:20:42 +04:00
};
2008-05-12 23:20:51 +04:00
static struct trace_iterator *
2013-07-23 19:26:10 +04:00
__tracing_open ( struct inode * inode , struct file * file , bool snapshot )
2008-05-12 23:20:42 +04:00
{
2013-07-23 19:26:10 +04:00
struct trace_array * tr = inode - > i_private ;
2008-05-12 23:20:42 +04:00
struct trace_iterator * iter ;
2012-04-25 12:23:39 +04:00
int cpu ;
2008-05-12 23:20:42 +04:00
2009-02-27 08:12:38 +03:00
if ( tracing_disabled )
return ERR_PTR ( - ENODEV ) ;
2008-05-12 23:20:44 +04:00
2012-04-25 12:23:39 +04:00
iter = __seq_open_private ( file , & tracer_seq_ops , sizeof ( * iter ) ) ;
2009-02-27 08:12:38 +03:00
if ( ! iter )
return ERR_PTR ( - ENOMEM ) ;
2008-05-12 23:20:42 +04:00
2015-06-09 10:32:35 +03:00
iter - > buffer_iter = kcalloc ( nr_cpu_ids , sizeof ( * iter - > buffer_iter ) ,
2012-06-28 04:46:14 +04:00
GFP_KERNEL ) ;
2012-07-11 10:35:08 +04:00
if ( ! iter - > buffer_iter )
goto release ;
tracing/core: make the read callbacks reentrants
Now that several per-cpu files can be read or spliced at the same
time, we want the read/splice callbacks for tracing files to be
reentrant.
Until now, a single global mutex (trace_types_lock) serialized
the access to tracing_read_pipe(), tracing_splice_read_pipe(),
and the seq helpers.
That is, if a user tries to read trace_pipe0 and trace_pipe1 at
the same time, the access to tracing_read_pipe() is contended
and one reader must wait for the other to finish its read call.
The trace_types_lock mutex is mostly here to serialize access
to the global current tracer (current_trace), which can be
changed concurrently. Although the iter struct keeps a private
pointer to this tracer, its callbacks can be changed by another
function.
The method used here is to no longer keep a private reference to
the tracer inside the iterator but to make a copy of it inside
the iterator. Subsequent read calls then check whether the
tracer has changed. This is not costly because the current
tracer is not expected to change often, so we use a branch
prediction hint for that.
Moreover, we add a private mutex to the iterator (there is one
iterator per file descriptor) to serialize accesses in case
of multiple consumers per file descriptor (which would be a
silly idea from the user). Note that this is not to protect the
ring buffer, since the ring buffer already serializes reader
accesses. It is to prevent weird traces in case of concurrent
consumers. But these mutexes can be dropped anyway, that would
not result in any crash. Just tell me what you think about it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 08:13:16 +03:00
/*
* We make a copy of the current tracer to avoid concurrent
* changes on it while we are reading .
*/
2008-05-12 23:20:42 +04:00
mutex_lock ( & trace_types_lock ) ;
2009-02-25 08:13:16 +03:00
iter->trace = kzalloc(sizeof(*iter->trace), GFP_KERNEL);
2009-02-27 08:12:38 +03:00
if (!iter->trace)
2009-02-25 08:13:16 +03:00
goto fail;
2009-02-27 08:12:38 +03:00
2012-05-11 21:29:49 +04:00
*iter->trace = *tr->current_trace;
2009-02-25 08:13:16 +03:00
2009-06-15 10:58:26 +04:00
if (!zalloc_cpumask_var(&iter->started, GFP_KERNEL))
2009-04-02 00:53:08 +04:00
goto fail;
tracing: Consolidate max_tr into main trace_array structure
Currently, the latency tracers and the snapshot feature work by
having a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, the running buffer is
swapped with this snapshot buffer to save the current max
latency.
The only items max_tr really needs are a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by
max_tr, making it mostly bloat.
This change removes max_tr completely and adds a new structure
called trace_buffer that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from max_tr, but trace instances can now use
their own snapshot feature, instead of only the top-level global_trace having
the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 18:24:35 +04:00
iter->tr = tr;
#ifdef CONFIG_TRACER_MAX_TRACE
2012-05-11 21:29:49 +04:00
/* Currently only the top directory has a snapshot */
if (tr->current_trace->print_max || snapshot)
2013-03-05 18:24:35 +04:00
iter->trace_buffer = &tr->max_buffer;
2008-05-12 23:20:42 +04:00
else
2013-03-05 18:24:35 +04:00
#endif
iter->trace_buffer = &tr->trace_buffer;
2012-12-26 06:53:00 +04:00
iter->snapshot = snapshot;
2008-05-12 23:20:42 +04:00
iter->pos = -1;
2013-07-23 19:26:10 +04:00
iter->cpu_file = tracing_get_cpu(inode);
2009-02-25 08:13:16 +03:00
mutex_init(&iter->mutex);
2008-05-12 23:20:42 +04:00
2008-11-25 11:12:31 +03:00
/* Notify the tracer early; before we stop tracing. */
if (iter->trace && iter->trace->open)
2008-12-11 15:53:26 +03:00
iter->trace->open(iter);
2008-11-25 11:12:31 +03:00
2008-11-13 01:52:38 +03:00
/* Annotate start of buffers if we had overruns */
2013-03-05 18:24:35 +04:00
if (ring_buffer_overruns(iter->trace_buffer->buffer))
2008-11-13 01:52:38 +03:00
iter->iter_flags |= TRACE_FILE_ANNOTATE;
2012-11-14 00:18:22 +04:00
/* Output in nanoseconds only if we are using a clock in nanoseconds. */
2013-04-23 05:32:39 +04:00
if (trace_clocks[tr->clock_id].in_ns)
2012-11-14 00:18:22 +04:00
iter->iter_flags |= TRACE_FILE_TIME_IN_NS;
2012-12-26 06:53:00 +04:00
/* stop the trace while dumping if we are not opening "snapshot" */
if (!iter->snapshot)
2012-05-11 21:29:49 +04:00
tracing_stop_tr(tr);
2009-09-01 19:06:29 +04:00
2013-01-24 00:22:59 +04:00
if (iter->cpu_file == RING_BUFFER_ALL_CPUS) {
tracing/core: introduce per cpu tracing files
Impact: split up tracing output per cpu
Currently, in the tracing debugfs directory, three files are
available to the user for extracting the trace output:
- trace is an iterator through the ring-buffer. It's a reader
but not a consumer. It doesn't block when no more traces are
available.
- trace is pretty similar to the former, except that it adds more
information such as preempt count, irq flag, ...
- trace_pipe is a reader and a consumer, it will also block
waiting for traces if necessary (heh, yes it's a pipe).
The traces coming from different cpus are currently mixed up
inside these files. Sometimes that muddles the information,
sometimes it's useful, depending on what the tracer
captures.
The tracing_cpumask file is useful to filter the output and
select only the traces captured on a custom-defined set of cpus.
But it is still not powerful enough to extract one trace buffer
per cpu at the same time.
So this patch creates a new directory: /debug/tracing/per_cpu/.
Inside this directory, you will now find one trace_pipe file and
one trace file per cpu.
Which means if you have two cpus, you will have:
trace0
trace1
trace_pipe0
trace_pipe1
And of course, reading these files will have the same effect
as with the usual tracing files, except that you will only see
the traces from the given cpu.
The original all-in-one cpu trace files are still available in
their original place.
Until now, only one consumer was allowed on trace_pipe to avoid
racy consuming on the ring-buffer. Now the approach has changed a
bit: you can have only one consumer per cpu.
Which means you are allowed to read trace_pipe0 and trace_pipe1
concurrently, but you can't have two readers on trace_pipe0 or
trace_pipe1.
Following the same logic, if there is one reader on the common
trace_pipe, you cannot have another reader on trace_pipe0 or
trace_pipe1 at the same time, because trace_pipe is in essence
already a consumer of all cpu buffers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 05:22:28 +03:00
for_each_tracing_cpu(cpu) {
iter->buffer_iter[cpu] =
2013-03-05 18:24:35 +04:00
ring_buffer_read_prepare(iter->trace_buffer->buffer, cpu);
2010-04-21 02:47:11 +04:00
}
ring_buffer_read_prepare_sync();
for_each_tracing_cpu(cpu) {
ring_buffer_read_start(iter->buffer_iter[cpu]);
2009-09-01 19:06:29 +04:00
tracing_iter_reset(iter, cpu);
2009-02-25 05:22:28 +03:00
}
} else {
cpu = iter->cpu_file;
2008-09-30 07:02:41 +04:00
iter->buffer_iter[cpu] =
2013-03-05 18:24:35 +04:00
ring_buffer_read_prepare(iter->trace_buffer->buffer, cpu);
2010-04-21 02:47:11 +04:00
ring_buffer_read_prepare_sync();
ring_buffer_read_start(iter->buffer_iter[cpu]);
2009-09-01 19:06:29 +04:00
tracing_iter_reset(iter, cpu);
2008-09-30 07:02:41 +04:00
}
2008-05-12 23:20:42 +04:00
mutex_unlock(&trace_types_lock);
return iter;
2008-09-30 07:02:41 +04:00
2009-02-25 08:13:16 +03:00
fail:
2008-09-30 07:02:41 +04:00
mutex_unlock(&trace_types_lock);
2009-02-25 08:13:16 +03:00
kfree(iter->trace);
2012-06-28 04:46:14 +04:00
kfree(iter->buffer_iter);
2012-07-11 10:35:08 +04:00
release:
2012-04-25 12:23:39 +04:00
seq_release_private(inode, file);
return ERR_PTR(-ENOMEM);
2008-05-12 23:20:42 +04:00
}
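The copy-the-tracer approach described in the "make the read callbacks reentrants" message can be sketched in user space roughly as follows. This is only an illustration under stated assumptions, not the kernel implementation: `struct demo_tracer`, `struct demo_iter`, `demo_iter_open()`, `demo_iter_read()`, `demo_set_tracer()` and `demo_iter_release()` are hypothetical names, pthread mutexes stand in for kernel mutexes, and `__builtin_expect` (a GCC/Clang builtin) stands in for a branch prediction hint.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tracer description. */
struct demo_tracer {
	const char *name;
};

/* Hypothetical stand-in for a per-file-descriptor iterator. */
struct demo_iter {
	struct demo_tracer *trace;	/* private copy, not a shared pointer */
	pthread_mutex_t mutex;		/* serializes readers of this descriptor */
};

static pthread_mutex_t demo_types_lock = PTHREAD_MUTEX_INITIALIZER;
static struct demo_tracer demo_nop = { .name = "nop" };
static struct demo_tracer *demo_current = &demo_nop;

/* Allocate an iterator and copy the current tracer under the global lock. */
struct demo_iter *demo_iter_open(void)
{
	struct demo_iter *iter = calloc(1, sizeof(*iter));

	if (!iter)
		return NULL;
	iter->trace = malloc(sizeof(*iter->trace));
	if (!iter->trace) {
		free(iter);
		return NULL;
	}
	pthread_mutex_lock(&demo_types_lock);
	*iter->trace = *demo_current;
	pthread_mutex_unlock(&demo_types_lock);
	pthread_mutex_init(&iter->mutex, NULL);
	return iter;
}

/* Switch the global tracer, as another file operation might do. */
void demo_set_tracer(struct demo_tracer *t)
{
	pthread_mutex_lock(&demo_types_lock);
	demo_current = t;
	pthread_mutex_unlock(&demo_types_lock);
}

/* Refresh the private copy if the tracer changed, then read from the copy. */
const char *demo_iter_read(struct demo_iter *iter)
{
	const char *name;

	/* The global lock is held only for the short check-and-copy step. */
	pthread_mutex_lock(&demo_types_lock);
	if (__builtin_expect(iter->trace->name != demo_current->name, 0))
		*iter->trace = *demo_current;
	pthread_mutex_unlock(&demo_types_lock);

	/* The rest of the read only contends with readers of this iterator. */
	pthread_mutex_lock(&iter->mutex);
	name = iter->trace->name;
	pthread_mutex_unlock(&iter->mutex);
	return name;
}

/* Tear down everything demo_iter_open() set up. */
void demo_iter_release(struct demo_iter *iter)
{
	pthread_mutex_destroy(&iter->mutex);
	free(iter->trace);
	free(iter);
}
```

In this sketch two iterators opened on different files never wait on each other for the bulk of a read; they only meet briefly on the global lock while checking whether the tracer changed.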
int tracing_open_generic(struct inode *inode, struct file *filp)
{
2008-05-12 23:20:44 +04:00
if (tracing_disabled)
return -ENODEV;
2008-05-12 23:20:42 +04:00
filp->private_data = inode->i_private;
return 0;
}
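The "Consolidate max_tr" message describes a trace_array that carries two buffers, one for the normal trace and one for the max trace or snapshot, and the open path above picks one of them. A minimal user-space sketch of that selection, assuming simplified hypothetical types (`struct demo_buffer`, `struct demo_array`) and a `print_max` flag stored directly on the array rather than on the current tracer:

```c
#include <stdbool.h>

/* Hypothetical, reduced version of the per-buffer state named in the message. */
struct demo_buffer {
	const char *label;		/* illustration only */
	unsigned long long time_start;	/* when the max latency was triggered */
	int cpu;			/* cpu the max latency was triggered on */
};

/* Hypothetical, reduced trace array holding both buffers. */
struct demo_array {
	struct demo_buffer trace_buffer;	/* normal, live trace */
	struct demo_buffer max_buffer;		/* max trace or snapshot */
	bool print_max;				/* assumption: kept on the array here */
};

/* Pick the buffer a reader should iterate over. */
const struct demo_buffer *demo_select_buffer(const struct demo_array *tr,
					     bool snapshot)
{
	if (tr->print_max || snapshot)
		return &tr->max_buffer;
	return &tr->trace_buffer;
}
```

Because each array owns its own pair of buffers in this layout, nothing in the selection depends on a single global snapshot array.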
2013-10-19 04:15:54 +04:00
bool tracing_is_disabled(void)
{
return (tracing_disabled) ? true : false;
}
2013-07-02 07:34:22 +04:00
/*
 * Open and update trace_array ref count.
 * Must have the current trace_array passed to it.
 */
2013-07-03 04:30:52 +04:00
static int tracing_open_generic_tr(struct inode *inode, struct file *filp)
2013-07-02 07:34:22 +04:00
{
struct trace_array *tr = inode->i_private;
if (tracing_disabled)
return -ENODEV;
if (trace_array_get(tr) < 0)
return -ENODEV;
filp->private_data = inode->i_private;
return 0;
}
2009-02-10 21:44:12 +03:00
static int tracing_release(struct inode *inode, struct file *file)
2008-05-12 23:20:42 +04:00
{
2013-07-23 19:26:10 +04:00
struct trace_array *tr = inode->i_private;
2010-09-28 06:04:53 +04:00
struct seq_file *m = file->private_data;
2009-03-18 17:40:24 +03:00
struct trace_iterator *iter;
2008-09-30 07:02:41 +04:00
int cpu;
2008-05-12 23:20:42 +04:00
2013-07-02 06:50:29 +04:00
if (!(file->f_mode & FMODE_READ)) {
2013-07-23 19:26:10 +04:00
trace_array_put(tr);
2009-03-18 17:40:24 +03:00
return 0;
2013-07-02 06:50:29 +04:00
}
2009-03-18 17:40:24 +03:00
2013-07-23 19:26:10 +04:00
/* Writes do not use seq_file */
2009-03-18 17:40:24 +03:00
iter = m->private;
2008-05-12 23:20:42 +04:00
mutex_lock(&trace_types_lock);
2013-03-07 00:27:24 +04:00
2008-09-30 07:02:41 +04:00
for_each_tracing_cpu(cpu) {
if (iter->buffer_iter[cpu])
ring_buffer_read_finish(iter->buffer_iter[cpu]);
}
2008-05-12 23:20:42 +04:00
if (iter->trace && iter->trace->close)
iter->trace->close(iter);
2012-12-26 06:53:00 +04:00
if (!iter->snapshot)
/* reenable tracing if it was previously enabled */
2012-05-11 21:29:49 +04:00
tracing_start_tr(tr);
2013-07-18 22:18:44 +04:00
__trace_array_put(tr);
2008-05-12 23:20:42 +04:00
mutex_unlock(&trace_types_lock);
2009-02-25 08:13:16 +03:00
mutex_destroy(&iter->mutex);
2009-04-02 00:53:08 +04:00
free_cpumask_var(iter->started);
2009-02-25 08:13:16 +03:00
kfree(iter->trace);
2012-06-28 04:46:14 +04:00
kfree(iter->buffer_iter);
2012-04-25 12:23:39 +04:00
seq_release_private(inode, file);
2013-07-02 06:50:29 +04:00
2008-05-12 23:20:42 +04:00
return 0;
}
2013-07-02 07:34:22 +04:00
static int tracing_release_generic_tr(struct inode *inode, struct file *file)
{
struct trace_array *tr = inode->i_private;
trace_array_put(tr);
2008-05-12 23:20:42 +04:00
return 0;
}
2013-07-02 07:34:22 +04:00
static int tracing_single_release_tr(struct inode *inode, struct file *file)
{
struct trace_array *tr = inode->i_private;
trace_array_put(tr);
return single_release(inode, file);
}
2008-05-12 23:20:42 +04:00
static int tracing_open ( struct inode * inode , struct file * file )
{
2013-07-23 19:26:10 +04:00
struct trace_array * tr = inode - > i_private ;
2009-02-27 08:12:38 +03:00
struct trace_iterator * iter ;
int ret = 0 ;
2008-05-12 23:20:42 +04:00
2013-07-02 06:50:29 +04:00
if ( trace_array_get ( tr ) < 0 )
return - ENODEV ;
2009-03-18 17:40:24 +03:00
/* If this file was opened for write with O_TRUNC, then erase contents */
2013-07-23 19:26:10 +04:00
if ( ( file - > f_mode & FMODE_WRITE ) & & ( file - > f_flags & O_TRUNC ) ) {
int cpu = tracing_get_cpu ( inode ) ;
if ( cpu = = RING_BUFFER_ALL_CPUS )
tracing: Consolidate max_tr into main trace_array structure
Currently, the latency tracers and the snapshot feature work by
having a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, this snapshot buffer is used
to swap the running buffer with this buffer to save the current max
latency.
The only items needed for the max_tr are really just a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer, that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_tr, but trace instances can now use
their own snapshot feature, rather than only the top level global_trace having
the snapshot feature and latency tracers to itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 18:24:35 +04:00
tracing_reset_online_cpus ( & tr - > trace_buffer ) ;
2009-03-18 17:40:24 +03:00
else
2013-07-23 19:26:10 +04:00
tracing_reset ( & tr - > trace_buffer , cpu ) ;
2009-03-18 17:40:24 +03:00
}
2008-05-12 23:20:42 +04:00
2009-03-18 17:40:24 +03:00
if ( file - > f_mode & FMODE_READ ) {
2013-07-23 19:26:10 +04:00
iter = __tracing_open ( inode , file , false ) ;
2009-03-18 17:40:24 +03:00
if ( IS_ERR ( iter ) )
ret = PTR_ERR ( iter ) ;
else if ( trace_flags & TRACE_ITER_LATENCY_FMT )
iter - > iter_flags | = TRACE_FILE_LAT_FMT ;
}
2013-07-02 06:50:29 +04:00
if ( ret < 0 )
trace_array_put ( tr ) ;
2008-05-12 23:20:42 +04:00
return ret ;
}
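The layout described in the "Consolidate max_tr" commit message above (one trace_array holding two trace_buffers) can be sketched as below. This is a simplified user-space model under stated assumptions: the field types are placeholders, and `save_max()` and its parameters are hypothetical names for illustration only, not kernel functions.

```c
#include <assert.h>

/* Reduced model of the structure named in the commit message. */
struct trace_buffer {
	void *buffer;                    /* stand-in for the ring buffer pointer */
	void *data;                      /* stand-in for the per-CPU data pointers */
	unsigned long long time_start;   /* when the max latency was triggered */
	int cpu;                         /* CPU the max latency was triggered on */
};

struct trace_array {
	struct trace_buffer trace_buffer;   /* normal trace */
	struct trace_buffer max_buffer;     /* max-latency trace or snapshot */
};

/*
 * Hypothetical helper: record a new maximum by swapping the running
 * buffer with the max buffer, so no trace data has to be copied.
 */
static void save_max(struct trace_array *tr, int cpu, unsigned long long when)
{
	void *tmp = tr->trace_buffer.buffer;

	tr->trace_buffer.buffer = tr->max_buffer.buffer;
	tr->max_buffer.buffer = tmp;
	tr->max_buffer.time_start = when;
	tr->max_buffer.cpu = cpu;
}
```

Since each trace_array carries its own max_buffer, every instance can take a snapshot independently of the top-level array.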
2013-11-07 07:42:48 +04:00
/*
* Some tracers are not suitable for instance buffers .
* A tracer is always available for the global array ( toplevel )
* or if it explicitly states that it is .
*/
static bool
trace_ok_for_array ( struct tracer * t , struct trace_array * tr )
{
return ( tr - > flags & TRACE_ARRAY_FL_GLOBAL ) | | t - > allow_instances ;
}
/* Find the next tracer that this trace array may use */
static struct tracer *
get_tracer_for_array ( struct trace_array * tr , struct tracer * t )
{
while ( t & & ! trace_ok_for_array ( t , tr ) )
t = t - > next ;
return t ;
}
2008-05-12 23:20:51 +04:00
static void *
2008-05-12 23:20:42 +04:00
t_next ( struct seq_file * m , void * v , loff_t * pos )
{
2013-11-07 07:42:48 +04:00
struct trace_array * tr = m - > private ;
2009-06-24 05:53:44 +04:00
struct tracer * t = v ;
2008-05-12 23:20:42 +04:00
( * pos ) + + ;
if ( t )
2013-11-07 07:42:48 +04:00
t = get_tracer_for_array ( tr , t - > next ) ;
2008-05-12 23:20:42 +04:00
return t ;
}
static void * t_start ( struct seq_file * m , loff_t * pos )
{
2013-11-07 07:42:48 +04:00
struct trace_array * tr = m - > private ;
2009-06-24 05:53:44 +04:00
struct tracer * t ;
2008-05-12 23:20:42 +04:00
loff_t l = 0 ;
mutex_lock ( & trace_types_lock ) ;
2013-11-07 07:42:48 +04:00
t = get_tracer_for_array ( tr , trace_types ) ;
for ( ; t & & l < * pos ; t = t_next ( m , t , & l ) )
;
2008-05-12 23:20:42 +04:00
return t ;
}
static void t_stop ( struct seq_file * m , void * p )
{
mutex_unlock ( & trace_types_lock ) ;
}
static int t_show ( struct seq_file * m , void * v )
{
struct tracer * t = v ;
if ( ! t )
return 0 ;
2014-11-08 23:42:10 +03:00
seq_puts ( m , t - > name ) ;
2008-05-12 23:20:42 +04:00
if ( t - > next )
seq_putc ( m , ' ' ) ;
else
seq_putc ( m , ' \n ' ) ;
return 0 ;
}
2009-09-23 03:43:43 +04:00
static const struct seq_operations show_traces_seq_ops = {
2008-05-12 23:20:46 +04:00
. start = t_start ,
. next = t_next ,
. stop = t_stop ,
. show = t_show ,
2008-05-12 23:20:42 +04:00
} ;
static int show_traces_open ( struct inode * inode , struct file * file )
{
2013-11-07 07:42:48 +04:00
struct trace_array * tr = inode - > i_private ;
struct seq_file * m ;
int ret ;
2008-05-12 23:20:44 +04:00
if ( tracing_disabled )
return - ENODEV ;
2013-11-07 07:42:48 +04:00
ret = seq_open ( file , & show_traces_seq_ops ) ;
if ( ret )
return ret ;
m = file - > private_data ;
m - > private = tr ;
return 0 ;
2008-05-12 23:20:42 +04:00
}
2009-03-18 17:40:24 +03:00
static ssize_t
tracing_write_stub ( struct file * filp , const char __user * ubuf ,
size_t count , loff_t * ppos )
{
return count ;
}
2013-12-22 02:39:40 +04:00
loff_t tracing_lseek ( struct file * file , loff_t offset , int whence )
2010-11-25 02:13:16 +03:00
{
2013-12-22 02:39:40 +04:00
int ret ;
2010-11-25 02:13:16 +03:00
if ( file - > f_mode & FMODE_READ )
2013-12-22 02:39:40 +04:00
ret = seq_lseek ( file , offset , whence ) ;
2010-11-25 02:13:16 +03:00
else
2013-12-22 02:39:40 +04:00
file - > f_pos = ret = 0 ;
return ret ;
2010-11-25 02:13:16 +03:00
}
2009-03-06 05:44:55 +03:00
static const struct file_operations tracing_fops = {
2008-05-12 23:20:46 +04:00
. open = tracing_open ,
. read = seq_read ,
2009-03-18 17:40:24 +03:00
. write = tracing_write_stub ,
2013-12-22 02:39:40 +04:00
. llseek = tracing_lseek ,
2008-05-12 23:20:46 +04:00
. release = tracing_release ,
2008-05-12 23:20:42 +04:00
} ;
2009-03-06 05:44:55 +03:00
static const struct file_operations show_traces_fops = {
2008-05-12 23:20:52 +04:00
. open = show_traces_open ,
. read = seq_read ,
. release = seq_release ,
2010-07-08 01:40:11 +04:00
. llseek = seq_lseek ,
2008-05-12 23:20:52 +04:00
} ;
2008-05-12 23:20:52 +04:00
/*
* The tracer itself will not take this lock , but still we want
* to provide a consistent cpumask to user - space :
*/
static DEFINE_MUTEX ( tracing_cpumask_update_lock ) ;
/*
* Temporary storage for the character representation of the
* CPU bitmask ( and one more byte for the newline ) :
*/
static char mask_str [ NR_CPUS + 1 ] ;
2008-05-12 23:20:52 +04:00
static ssize_t
tracing_cpumask_read ( struct file * filp , char __user * ubuf ,
size_t count , loff_t * ppos )
{
2013-08-08 20:47:45 +04:00
struct trace_array * tr = file_inode ( filp ) - > i_private ;
2008-05-12 23:20:52 +04:00
int len ;
2008-05-12 23:20:52 +04:00
mutex_lock ( & tracing_cpumask_update_lock ) ;
2008-05-12 23:20:52 +04:00
2015-02-14 01:37:39 +03:00
len = snprintf ( mask_str , count , " %*pb \n " ,
cpumask_pr_args ( tr - > tracing_cpumask ) ) ;
if ( len > = count ) {
2008-05-12 23:20:52 +04:00
count = - EINVAL ;
goto out_err ;
}
count = simple_read_from_buffer ( ubuf , count , ppos , mask_str , NR_CPUS + 1 ) ;
out_err :
2008-05-12 23:20:52 +04:00
mutex_unlock ( & tracing_cpumask_update_lock ) ;
return count ;
}
static ssize_t
tracing_cpumask_write ( struct file * filp , const char __user * ubuf ,
size_t count , loff_t * ppos )
{
2013-08-08 20:47:45 +04:00
struct trace_array * tr = file_inode ( filp ) - > i_private ;
2009-01-01 02:42:22 +03:00
cpumask_var_t tracing_cpumask_new ;
2012-05-11 21:29:49 +04:00
int err , cpu ;
2009-01-01 02:42:22 +03:00
if ( ! alloc_cpumask_var ( & tracing_cpumask_new , GFP_KERNEL ) )
return - ENOMEM ;
2008-05-12 23:20:52 +04:00
2009-01-01 02:42:22 +03:00
err = cpumask_parse_user ( ubuf , count , tracing_cpumask_new ) ;
2008-05-12 23:20:52 +04:00
if ( err )
2008-05-12 23:20:52 +04:00
goto err_unlock ;
2009-06-15 06:56:42 +04:00
mutex_lock ( & tracing_cpumask_update_lock ) ;
2008-12-02 23:34:05 +03:00
local_irq_disable ( ) ;
2014-01-14 19:04:59 +04:00
arch_spin_lock ( & tr - > max_lock ) ;
2008-05-12 23:21:00 +04:00
for_each_tracing_cpu ( cpu ) {
2008-05-12 23:20:52 +04:00
/*
* Increase / decrease the disabled counter if we are
* about to flip a bit in the cpumask :
*/
2013-08-08 20:47:45 +04:00
if ( cpumask_test_cpu ( cpu , tr - > tracing_cpumask ) & &
2009-01-01 02:42:22 +03:00
! cpumask_test_cpu ( cpu , tracing_cpumask_new ) ) {
2013-03-05 18:24:35 +04:00
atomic_inc ( & per_cpu_ptr ( tr - > trace_buffer . data , cpu ) - > disabled ) ;
ring_buffer_record_disable_cpu ( tr - > trace_buffer . buffer , cpu ) ;
2008-05-12 23:20:52 +04:00
}
2013-08-08 20:47:45 +04:00
if ( ! cpumask_test_cpu ( cpu , tr - > tracing_cpumask ) & &
2009-01-01 02:42:22 +03:00
cpumask_test_cpu ( cpu , tracing_cpumask_new ) ) {
2013-03-05 18:24:35 +04:00
atomic_dec ( & per_cpu_ptr ( tr - > trace_buffer . data , cpu ) - > disabled ) ;
ring_buffer_record_enable_cpu ( tr - > trace_buffer . buffer , cpu ) ;
2008-05-12 23:20:52 +04:00
}
}
2014-01-14 19:04:59 +04:00
arch_spin_unlock ( & tr - > max_lock ) ;
2008-12-02 23:34:05 +03:00
local_irq_enable ( ) ;
2008-05-12 23:20:52 +04:00
2013-08-08 20:47:45 +04:00
cpumask_copy ( tr - > tracing_cpumask , tracing_cpumask_new ) ;
2008-05-12 23:20:52 +04:00
mutex_unlock ( & tracing_cpumask_update_lock ) ;
2009-01-01 02:42:22 +03:00
free_cpumask_var ( tracing_cpumask_new ) ;
2008-05-12 23:20:52 +04:00
return count ;
2008-05-12 23:20:52 +04:00
err_unlock :
2009-06-15 06:56:42 +04:00
free_cpumask_var ( tracing_cpumask_new ) ;
2008-05-12 23:20:52 +04:00
return err ;
2008-05-12 23:20:52 +04:00
}
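The counter adjustment in tracing_cpumask_write() above only touches CPUs whose bit is about to flip. A minimal user-space sketch of that idea follows, assuming a plain unsigned int as the mask and a small fixed CPU count; `NCPUS`, `disabled[]` and `update_mask()` are hypothetical names, and the locking and ring-buffer enable/disable calls are omitted.

```c
#include <assert.h>

#define NCPUS 8   /* hypothetical CPU count for this sketch */

static int disabled[NCPUS];   /* per-CPU "disabled" counters */

/* Replace *mask with new_mask, adjusting counters only for flipped bits. */
static void update_mask(unsigned int *mask, unsigned int new_mask)
{
	for (int cpu = 0; cpu < NCPUS; cpu++) {
		unsigned int bit = 1u << cpu;

		if ((*mask & bit) && !(new_mask & bit))
			disabled[cpu]++;   /* CPU leaves the mask: stop recording */
		if (!(*mask & bit) && (new_mask & bit))
			disabled[cpu]--;   /* CPU rejoins the mask: resume recording */
	}
	*mask = new_mask;
}
```

CPUs whose bit is the same in both masks keep their counter unchanged, which is why writing the same mask twice has no effect.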
2009-03-06 05:44:55 +03:00
static const struct file_operations tracing_cpumask_fops = {
2013-08-08 20:47:45 +04:00
. open = tracing_open_generic_tr ,
2008-05-12 23:20:52 +04:00
. read = tracing_cpumask_read ,
. write = tracing_cpumask_write ,
2013-08-08 20:47:45 +04:00
. release = tracing_release_generic_tr ,
2010-07-08 01:40:11 +04:00
. llseek = generic_file_llseek ,
2008-05-12 23:20:42 +04:00
} ;
2009-12-08 06:15:59 +03:00
static int tracing_trace_options_show ( struct seq_file * m , void * v )
2008-05-12 23:20:42 +04:00
{
2009-02-27 07:55:58 +03:00
struct tracer_opt * trace_opts ;
2012-05-11 21:29:49 +04:00
struct trace_array * tr = m - > private ;
2009-02-27 07:55:58 +03:00
u32 tracer_flags ;
int i ;
2008-11-17 21:23:42 +03:00
2009-02-27 07:55:58 +03:00
mutex_lock ( & trace_types_lock ) ;
2012-05-11 21:29:49 +04:00
tracer_flags = tr - > current_trace - > flags - > val ;
trace_opts = tr - > current_trace - > flags - > opts ;
2009-02-27 07:55:58 +03:00
2008-05-12 23:20:42 +04:00
for ( i = 0 ; trace_options [ i ] ; i + + ) {
if ( trace_flags & ( 1 < < i ) )
2009-12-08 06:15:59 +03:00
seq_printf ( m , " %s \n " , trace_options [ i ] ) ;
2008-05-12 23:20:42 +04:00
else
2009-12-08 06:15:59 +03:00
seq_printf ( m , " no%s \n " , trace_options [ i ] ) ;
2008-05-12 23:20:42 +04:00
}
2008-11-17 21:23:42 +03:00
for ( i = 0 ; trace_opts [ i ] . name ; i + + ) {
if ( tracer_flags & trace_opts [ i ] . bit )
2009-12-08 06:15:59 +03:00
seq_printf ( m , " %s \n " , trace_opts [ i ] . name ) ;
2008-11-17 21:23:42 +03:00
else
2009-12-08 06:15:59 +03:00
seq_printf ( m , " no%s \n " , trace_opts [ i ] . name ) ;
2008-11-17 21:23:42 +03:00
}
2009-02-27 07:55:58 +03:00
mutex_unlock ( & trace_types_lock ) ;
2008-11-17 21:23:42 +03:00
2009-12-08 06:15:59 +03:00
return 0 ;
2008-05-12 23:20:42 +04:00
}
2014-01-10 20:13:54 +04:00
static int __set_tracer_option ( struct trace_array * tr ,
2009-12-08 06:17:06 +03:00
struct tracer_flags * tracer_flags ,
struct tracer_opt * opts , int neg )
{
2014-01-10 20:13:54 +04:00
struct tracer * trace = tr - > current_trace ;
2009-12-08 06:17:06 +03:00
int ret ;
2008-05-12 23:20:42 +04:00
2014-01-10 20:13:54 +04:00
ret = trace - > set_flag ( tr , tracer_flags - > val , opts - > bit , ! neg ) ;
2009-12-08 06:17:06 +03:00
if ( ret )
return ret ;
if ( neg )
tracer_flags - > val & = ~ opts - > bit ;
else
tracer_flags - > val | = opts - > bit ;
return 0 ;
2008-05-12 23:20:42 +04:00
}
2008-11-17 21:23:42 +03:00
/* Try to assign a tracer specific option */
2014-01-10 20:13:54 +04:00
static int set_tracer_option ( struct trace_array * tr , char * cmp , int neg )
2008-11-17 21:23:42 +03:00
{
2014-01-10 20:13:54 +04:00
struct tracer * trace = tr - > current_trace ;
2009-08-07 14:53:21 +04:00
struct tracer_flags * tracer_flags = trace - > flags ;
2008-11-17 21:23:42 +03:00
struct tracer_opt * opts = NULL ;
2009-12-08 06:17:06 +03:00
int i ;
2008-11-17 21:23:42 +03:00
2009-08-07 14:53:21 +04:00
for ( i = 0 ; tracer_flags - > opts [ i ] . name ; i + + ) {
opts = & tracer_flags - > opts [ i ] ;
2008-11-17 21:23:42 +03:00
2009-12-08 06:17:06 +03:00
if ( strcmp ( cmp , opts - > name ) = = 0 )
2014-01-10 20:13:54 +04:00
return __set_tracer_option ( tr , trace - > flags , opts , neg ) ;
2008-11-17 21:23:42 +03:00
}
2009-12-08 06:17:06 +03:00
return - EINVAL ;
2008-11-17 21:23:42 +03:00
}
2013-03-14 23:03:53 +04:00
/* Some tracers require overwrite to stay enabled */
int trace_keep_overwrite ( struct tracer * tracer , u32 mask , int set )
{
if ( tracer - > enabled & & ( mask & TRACE_ITER_OVERWRITE ) & & ! set )
return - 1 ;
return 0 ;
}
2012-05-11 21:29:49 +04:00
int set_tracer_flag ( struct trace_array * tr , unsigned int mask , int enabled )
2009-03-18 01:09:55 +03:00
{
/* do nothing if the flag is already in the requested state */
if ( ! ! ( trace_flags & mask ) = = ! ! enabled )
2013-03-14 23:03:53 +04:00
return 0 ;
/* Give the tracer a chance to approve the change */
2012-05-11 21:29:49 +04:00
if ( tr - > current_trace - > flag_changed )
2014-01-11 02:51:01 +04:00
if ( tr - > current_trace - > flag_changed ( tr , mask , ! ! enabled ) )
2013-03-14 23:03:53 +04:00
return - EINVAL ;
2009-03-18 01:09:55 +03:00
if ( enabled )
trace_flags | = mask ;
else
trace_flags & = ~ mask ;
2010-07-02 07:07:32 +04:00
if ( mask = = TRACE_ITER_RECORD_CMD )
trace_event_enable_cmd_record ( enabled ) ;
2010-12-09 00:46:47 +03:00
2013-03-14 22:20:54 +04:00
if ( mask = = TRACE_ITER_OVERWRITE ) {
2013-03-05 18:24:35 +04:00
ring_buffer_change_overwrite ( tr - > trace_buffer . buffer , enabled ) ;
2013-03-14 22:20:54 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
2013-03-05 18:24:35 +04:00
ring_buffer_change_overwrite ( tr - > max_buffer . buffer , enabled ) ;
2013-03-14 22:20:54 +04:00
# endif
}
2012-10-11 18:15:05 +04:00
if ( mask = = TRACE_ITER_PRINTK )
trace_printk_start_stop_comm ( enabled ) ;
2013-03-14 23:03:53 +04:00
return 0 ;
2009-03-18 01:09:55 +03:00
}
2012-05-11 21:29:49 +04:00
static int trace_set_options ( struct trace_array * tr , char * option )
2008-05-12 23:20:42 +04:00
{
2009-12-08 06:17:06 +03:00
char * cmp ;
2008-05-12 23:20:42 +04:00
int neg = 0 ;
2013-03-14 23:03:53 +04:00
int ret = - ENODEV ;
2008-05-12 23:20:42 +04:00
int i ;
2012-11-02 06:56:07 +04:00
cmp = strstrip ( option ) ;
2008-05-12 23:20:42 +04:00
2009-12-08 06:17:06 +03:00
if ( strncmp ( cmp , " no " , 2 ) = = 0 ) {
2008-05-12 23:20:42 +04:00
neg = 1 ;
cmp + = 2 ;
}
2013-03-14 21:50:56 +04:00
mutex_lock ( & trace_types_lock ) ;
2008-05-12 23:20:42 +04:00
for ( i = 0 ; trace_options [ i ] ; i + + ) {
2009-12-08 06:17:06 +03:00
if ( strcmp ( cmp , trace_options [ i ] ) = = 0 ) {
2012-05-11 21:29:49 +04:00
ret = set_tracer_flag ( tr , 1 < < i , ! neg ) ;
2008-05-12 23:20:42 +04:00
break ;
}
}
2008-11-17 21:23:42 +03:00
/* If no option could be set, test the specific tracer options */
2013-03-14 21:50:56 +04:00
if ( ! trace_options [ i ] )
2014-01-10 20:13:54 +04:00
ret = set_tracer_option ( tr , cmp , neg ) ;
2013-03-14 21:50:56 +04:00
mutex_unlock ( & trace_types_lock ) ;
2008-05-12 23:20:42 +04:00
2012-11-02 06:56:07 +04:00
return ret ;
}
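The "no" prefix handling in trace_set_options() above can be tried out in user space with a sketch like the following. The option table and `set_option()` are illustrative assumptions, not the kernel's trace_options array; whitespace stripping, locking and the tracer-specific fallback are omitted.

```c
#include <assert.h>
#include <string.h>

/* Illustrative option table; bit i of *flags corresponds to entry i. */
static const char *const options[] = {
	"print-parent", "sym-offset", "overwrite", NULL
};

/* Set or clear the named option. Returns 0 on success, -1 if unknown. */
static int set_option(unsigned int *flags, const char *opt)
{
	int neg = 0;

	/* A leading "no" means "clear this option". */
	if (strncmp(opt, "no", 2) == 0) {
		neg = 1;
		opt += 2;
	}

	for (int i = 0; options[i]; i++) {
		if (strcmp(opt, options[i]) != 0)
			continue;
		if (neg)
			*flags &= ~(1u << i);
		else
			*flags |= 1u << i;
		return 0;
	}
	return -1;
}
```

One consequence of this parsing order is that an option whose own name begins with "no" could not be matched by its plain name, since the prefix is stripped before the lookup.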
static ssize_t
tracing_trace_options_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
2012-05-11 21:29:49 +04:00
struct seq_file * m = filp - > private_data ;
struct trace_array * tr = m - > private ;
2012-11-02 06:56:07 +04:00
char buf [ 64 ] ;
2013-03-14 23:03:53 +04:00
int ret ;
2012-11-02 06:56:07 +04:00
if ( cnt > = sizeof ( buf ) )
return - EINVAL ;
if ( copy_from_user ( & buf , ubuf , cnt ) )
return - EFAULT ;
2013-01-10 05:54:17 +04:00
buf [ cnt ] = 0 ;
2012-05-11 21:29:49 +04:00
ret = trace_set_options ( tr , buf ) ;
2013-03-14 23:03:53 +04:00
if ( ret < 0 )
return ret ;
2012-11-02 06:56:07 +04:00
2009-10-24 03:36:16 +04:00
* ppos + = cnt ;
2008-05-12 23:20:42 +04:00
return cnt ;
}
2009-12-08 06:15:59 +03:00
static int tracing_trace_options_open ( struct inode * inode , struct file * file )
{
2013-07-02 07:34:22 +04:00
struct trace_array * tr = inode - > i_private ;
2013-07-18 22:18:44 +04:00
int ret ;
2013-07-02 07:34:22 +04:00
2009-12-08 06:15:59 +03:00
if ( tracing_disabled )
return - ENODEV ;
2012-05-11 21:29:49 +04:00
2013-07-02 07:34:22 +04:00
if ( trace_array_get ( tr ) < 0 )
return - ENODEV ;
2013-07-18 22:18:44 +04:00
ret = single_open ( file , tracing_trace_options_show , inode - > i_private ) ;
if ( ret < 0 )
trace_array_put ( tr ) ;
return ret ;
2009-12-08 06:15:59 +03:00
}
2009-03-06 05:44:55 +03:00
static const struct file_operations tracing_iter_fops = {
2009-12-08 06:15:59 +03:00
. open = tracing_trace_options_open ,
. read = seq_read ,
. llseek = seq_lseek ,
2013-07-02 07:34:22 +04:00
. release = tracing_single_release_tr ,
2008-11-13 01:52:37 +03:00
. write = tracing_trace_options_write ,
2008-05-12 23:20:42 +04:00
} ;
2008-05-12 23:20:45 +04:00
static const char readme_msg [ ] =
" tracing mini-HOWTO: \n \n "
2013-03-16 01:23:20 +04:00
" # echo 0 > tracing_on : quick way to disable tracing \n "
" # echo 1 > tracing_on : quick way to re-enable tracing \n \n "
" Important files: \n "
" trace \t \t \t - The static contents of the buffer \n "
" \t \t \t To clear the buffer write into this file: echo > trace \n "
" trace_pipe \t \t - A consuming read to see the contents of the buffer \n "
" current_tracer \t - function and latency tracers \n "
" available_tracers \t - list of configured tracers for current_tracer \n "
" buffer_size_kb \t - view and modify size of per cpu buffer \n "
" buffer_total_size_kb - view total size of all cpu buffers \n \n "
" trace_clock \t \t - change the clock used to order events \n "
" local: Per cpu clock but may not be synced across CPUs \n "
" global: Synced across CPUs but slows tracing down. \n "
" counter: Not a clock, but just an increment \n "
" uptime: Jiffy counter from time of boot \n "
" perf: Same clock that perf events use \n "
# ifdef CONFIG_X86_64
" x86-tsc: TSC cycle counter \n "
# endif
" \n trace_marker \t \t - Writes into this file are written into the kernel buffer \n "
" tracing_cpumask \t - Limit which CPUs to trace \n "
" instances \t \t - Make sub-buffers with: mkdir instances/foo \n "
" \t \t \t Remove sub-buffer with rmdir \n "
" trace_options \t \t - Set format or modify how tracing happens \n "
2014-01-23 09:10:04 +04:00
" \t \t \t Disable an option by adding a prefix 'no' to the \n "
" \t \t \t option name \n "
2014-06-05 05:24:27 +04:00
" saved_cmdlines_size \t - echo command number in here to store comm-pid list \n "
2013-03-16 01:23:20 +04:00
# ifdef CONFIG_DYNAMIC_FTRACE
" \n available_filter_functions - list of functions that can be filtered on \n "
2014-01-23 09:10:04 +04:00
" set_ftrace_filter \t - echo function name in here to only trace these \n "
" \t \t \t functions \n "
" \t accepts: func_full_name, *func_end, func_begin*, *func_middle* \n "
" \t modules: Can select a group via module \n "
" \t Format: :mod:<module-name> \n "
" \t example: echo :mod:ext3 > set_ftrace_filter \n "
" \t triggers: a command to perform when function is hit \n "
" \t Format: <function>:<trigger>[:count] \n "
" \t trigger: traceon, traceoff \n "
" \t \t enable_event:<system>:<event> \n "
" \t \t disable_event:<system>:<event> \n "
2013-03-16 01:23:20 +04:00
# ifdef CONFIG_STACKTRACE
2014-01-23 09:10:04 +04:00
" \t \t stacktrace \n "
2013-03-16 01:23:20 +04:00
# endif
# ifdef CONFIG_TRACER_SNAPSHOT
2014-01-23 09:10:04 +04:00
" \t \t snapshot \n "
2013-03-16 01:23:20 +04:00
# endif
2014-04-11 06:43:37 +04:00
" \t \t dump \n "
" \t \t cpudump \n "
2014-01-23 09:10:04 +04:00
" \t example: echo do_fault:traceoff > set_ftrace_filter \n "
" \t echo do_trap:traceoff:3 > set_ftrace_filter \n "
" \t The first one will disable tracing every time do_fault is hit \n "
" \t The second will disable tracing at most 3 times when do_trap is hit \n "
" \t The first time do_trap is hit and it disables tracing, the \n "
" \t counter will decrement to 2. If tracing is already disabled, \n "
" \t the counter will not decrement. It only decrements when the \n "
" \t trigger did work \n "
" \t To remove trigger without count: \n "
" \t echo '!<function>:<trigger>' > set_ftrace_filter \n "
" \t To remove trigger with a count: \n "
" \t echo '!<function>:<trigger>:0' > set_ftrace_filter \n "
2013-03-16 01:23:20 +04:00
" set_ftrace_notrace \t - echo function name in here to never trace. \n "
2014-01-23 09:10:04 +04:00
" \t accepts: func_full_name, *func_end, func_begin*, *func_middle* \n "
" \t modules: Can select a group via module command :mod: \n "
" \t Does not accept triggers \n "
2013-03-16 01:23:20 +04:00
# endif /* CONFIG_DYNAMIC_FTRACE */
# ifdef CONFIG_FUNCTION_TRACER
2014-01-23 09:10:04 +04:00
" set_ftrace_pid \t - Write pid(s) to only function trace those pids \n "
" \t \t (function) \n "
2013-03-16 01:23:20 +04:00
# endif
# ifdef CONFIG_FUNCTION_GRAPH_TRACER
" set_graph_function \t - Trace the nested calls of a function (function_graph) \n "
2014-06-12 20:23:53 +04:00
" set_graph_notrace \t - Do not trace the nested calls of a function (function_graph) \n "
2013-03-16 01:23:20 +04:00
" max_graph_depth \t - Trace a limited depth of nested calls (0 is unlimited) \n "
# endif
# ifdef CONFIG_TRACER_SNAPSHOT
2014-01-23 09:10:04 +04:00
" \n snapshot \t \t - Like 'trace' but shows the content of the static \n "
" \t \t \t snapshot buffer. Read the contents for more \n "
" \t \t \t information \n "
2013-03-16 01:23:20 +04:00
# endif
2013-07-15 12:32:34 +04:00
# ifdef CONFIG_STACK_TRACER
2013-03-16 01:23:20 +04:00
" stack_trace \t \t - Shows the max stack trace when active \n "
" stack_max_size \t - Shows current max stack size that was traced \n "
2014-01-23 09:10:04 +04:00
" \t \t \t Write into this file to reset the max size (trigger a \n "
" \t \t \t new trace) \n "
2013-03-16 01:23:20 +04:00
# ifdef CONFIG_DYNAMIC_FTRACE
2014-01-23 09:10:04 +04:00
" stack_trace_filter \t - Like set_ftrace_filter but limits what stack_trace \n "
" \t \t \t traces \n "
2013-03-16 01:23:20 +04:00
# endif
2013-07-15 12:32:34 +04:00
# endif /* CONFIG_STACK_TRACER */
2014-01-18 01:11:44 +04:00
" events/ \t \t - Directory containing all trace event subsystems: \n "
" enable \t \t - Write 0/1 to enable/disable tracing of all events \n "
" events/<system>/ \t - Directory containing all trace events for <system>: \n "
2014-01-23 09:10:04 +04:00
" enable \t \t - Write 0/1 to enable/disable tracing of all <system> \n "
" \t \t \t events \n "
2014-01-18 01:11:44 +04:00
" filter \t \t - If set, only events passing filter are traced \n "
2014-01-23 09:10:04 +04:00
" events/<system>/<event>/ \t - Directory containing control files for \n "
" \t \t \t <event>: \n "
2014-01-18 01:11:44 +04:00
" enable \t \t - Write 0/1 to enable/disable tracing of <event> \n "
" filter \t \t - If set, only events passing filter are traced \n "
" trigger \t \t - If set, a command to perform when event is hit \n "
2014-01-23 09:10:04 +04:00
" \t Format: <trigger>[:count][if <filter>] \n "
" \t trigger: traceon, traceoff \n "
" \t enable_event:<system>:<event> \n "
" \t disable_event:<system>:<event> \n "
2014-01-18 01:11:44 +04:00
# ifdef CONFIG_STACKTRACE
2014-01-23 09:10:04 +04:00
" \t \t stacktrace \n "
2014-01-18 01:11:44 +04:00
# endif
# ifdef CONFIG_TRACER_SNAPSHOT
2014-01-23 09:10:04 +04:00
" \t \t snapshot \n "
2014-01-18 01:11:44 +04:00
# endif
2014-01-23 09:10:04 +04:00
" \t example: echo traceoff > events/block/block_unplug/trigger \n "
" \t echo traceoff:3 > events/block/block_unplug/trigger \n "
" \t echo 'enable_event:kmem:kmalloc:3 if nr_rq > 1' > \\ \n "
" \t events/block/block_unplug/trigger \n "
" \t The first disables tracing every time block_unplug is hit. \n "
" \t The second disables tracing the first 3 times block_unplug is hit. \n "
" \t The third enables the kmalloc event the first 3 times block_unplug \n "
" \t is hit and the 'nr_rq' event field has a value greater than 1. \n "
" \t Like function triggers, the counter is only decremented if it \n "
" \t enabled or disabled tracing. \n "
" \t To remove a trigger without a count: \n "
" \t echo '!<trigger>' > <system>/<event>/trigger \n "
" \t To remove a trigger with a count: \n "
" \t echo '!<trigger>:0' > <system>/<event>/trigger \n "
" \t Filters can be ignored when removing a trigger. \n "
2008-05-12 23:20:45 +04:00
;
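The counted-trigger rule spelled out in the help text above (the counter only decrements when the trigger actually did work) can be modeled as below. This is a user-space sketch under stated assumptions: `struct trigger`, `tracing_on` and `traceoff_hit()` are hypothetical names, and using -1 for "no count given" is a convention chosen for this sketch.

```c
#include <assert.h>

/* Hypothetical model of a "<function>:traceoff[:count]" trigger. */
struct trigger {
	long count;   /* remaining firings; -1 means unlimited */
};

static int tracing_on = 1;

/* Called when the function is hit. Returns 1 if the trigger did work. */
static int traceoff_hit(struct trigger *t)
{
	if (!tracing_on)
		return 0;   /* already disabled: counter is not decremented */
	if (t->count == 0)
		return 0;   /* count exhausted: trigger no longer fires */
	if (t->count > 0)
		t->count--;
	tracing_on = 0;
	return 1;
}
```

With a count of 3, the trigger can turn tracing off at most three times, and hits that arrive while tracing is already off do not use up the count.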
static ssize_t
tracing_readme_read ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
return simple_read_from_buffer ( ubuf , cnt , ppos ,
readme_msg , strlen ( readme_msg ) ) ;
}
2009-03-06 05:44:55 +03:00
static const struct file_operations tracing_readme_fops = {
2008-05-12 23:20:52 +04:00
. open = tracing_open_generic ,
. read = tracing_readme_read ,
2010-07-08 01:40:11 +04:00
. llseek = generic_file_llseek ,
2008-05-12 23:20:45 +04:00
} ;
2014-02-20 12:44:31 +04:00
static void * saved_cmdlines_next ( struct seq_file * m , void * v , loff_t * pos )
{
unsigned int * ptr = v ;
2009-04-11 00:04:48 +04:00
2014-02-20 12:44:31 +04:00
if ( * pos | | m - > count )
ptr + + ;
2009-04-11 00:04:48 +04:00
2014-02-20 12:44:31 +04:00
( * pos ) + + ;
2009-04-11 00:04:48 +04:00
2014-06-05 05:24:27 +04:00
for ( ; ptr < & savedcmd - > map_cmdline_to_pid [ savedcmd - > cmdline_num ] ;
ptr + + ) {
2014-02-20 12:44:31 +04:00
if ( * ptr = = - 1 | | * ptr = = NO_CMDLINE_MAP )
continue ;
2009-04-11 00:04:48 +04:00
2014-02-20 12:44:31 +04:00
return ptr ;
}
2009-04-11 00:04:48 +04:00
2014-02-20 12:44:31 +04:00
return NULL ;
}
static void * saved_cmdlines_start ( struct seq_file * m , loff_t * pos )
{
void * v ;
loff_t l = 0 ;
2009-04-11 00:04:48 +04:00
2014-05-30 18:49:46 +04:00
preempt_disable ( ) ;
arch_spin_lock ( & trace_cmdline_lock ) ;
2014-06-05 05:24:27 +04:00
v = & savedcmd - > map_cmdline_to_pid [ 0 ] ;
2014-02-20 12:44:31 +04:00
while ( l < = * pos ) {
v = saved_cmdlines_next ( m , v , & l ) ;
if ( ! v )
return NULL ;
2009-04-11 00:04:48 +04:00
}
2014-02-20 12:44:31 +04:00
return v ;
}
static void saved_cmdlines_stop ( struct seq_file * m , void * v )
{
2014-05-30 18:49:46 +04:00
arch_spin_unlock ( & trace_cmdline_lock ) ;
preempt_enable ( ) ;
2014-02-20 12:44:31 +04:00
}
2009-04-11 00:04:48 +04:00
2014-02-20 12:44:31 +04:00
static int saved_cmdlines_show ( struct seq_file * m , void * v )
{
char buf [ TASK_COMM_LEN ] ;
unsigned int * pid = v ;
2009-04-11 00:04:48 +04:00
2014-05-30 18:49:46 +04:00
__trace_find_cmdline ( * pid , buf ) ;
2014-02-20 12:44:31 +04:00
seq_printf ( m , " %d %s \n " , * pid , buf ) ;
return 0 ;
}
static const struct seq_operations tracing_saved_cmdlines_seq_ops = {
. start = saved_cmdlines_start ,
. next = saved_cmdlines_next ,
. stop = saved_cmdlines_stop ,
. show = saved_cmdlines_show ,
} ;
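The start/next pair above walks a sparse array and skips unused slots. The following is a minimal userspace sketch of that iteration idea, not the kernel code: `demo_table`, `demo_skip`, `demo_next`, `demo_start` and `DEMO_EMPTY` are hypothetical names, the locking is omitted, and the position handling is simplified compared with the `*pos || m->count` check used by saved_cmdlines_next().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel for an unused slot (stands in for NO_CMDLINE_MAP). */
#define DEMO_EMPTY ((unsigned int)-1)

/* Hypothetical table: a fixed array in which some slots are unused. */
struct demo_table {
	unsigned int *slots;
	size_t nslots;
};

/* Return the first used slot at or after ptr, or NULL at the end. */
static unsigned int *demo_skip(const struct demo_table *t, unsigned int *ptr)
{
	for (; ptr < t->slots + t->nslots; ptr++) {
		if (*ptr != DEMO_EMPTY)
			return ptr;
	}
	return NULL;
}

/* Advance from the used slot v to the next used slot and bump *pos. */
static unsigned int *demo_next(const struct demo_table *t, unsigned int *v,
			       long *pos)
{
	(*pos)++;
	return demo_skip(t, v + 1);
}

/* Position the iterator on the pos-th used slot (0-based), or NULL. */
static unsigned int *demo_start(const struct demo_table *t, long pos)
{
	unsigned int *v = demo_skip(t, t->slots);
	long l = 0;

	while (v && l < pos)
		v = demo_next(t, v, &l);
	return v;
}
```

As in the seq_file callbacks, a restarted read only needs the saved position: demo_start() replays demo_next() until it reaches the requested entry.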
static int tracing_saved_cmdlines_open ( struct inode * inode , struct file * filp )
{
if ( tracing_disabled )
return - ENODEV ;
return seq_open ( filp , & tracing_saved_cmdlines_seq_ops ) ;
2009-04-11 00:04:48 +04:00
}
static const struct file_operations tracing_saved_cmdlines_fops = {
2014-02-20 12:44:31 +04:00
. open = tracing_saved_cmdlines_open ,
. read = seq_read ,
. llseek = seq_lseek ,
. release = seq_release ,
2009-04-11 00:04:48 +04:00
} ;
2014-06-05 05:24:27 +04:00
static ssize_t
tracing_saved_cmdlines_size_read ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
char buf [ 64 ] ;
int r ;
/* arch_spin_lock ( ) does not disable preemption by itself */
preempt_disable ( ) ;
arch_spin_lock ( & trace_cmdline_lock ) ;
2014-06-10 11:11:35 +04:00
r = scnprintf ( buf , sizeof ( buf ) , " %u \n " , savedcmd - > cmdline_num ) ;
2014-06-05 05:24:27 +04:00
arch_spin_unlock ( & trace_cmdline_lock ) ;
preempt_enable ( ) ;
return simple_read_from_buffer ( ubuf , cnt , ppos , buf , r ) ;
}
static void free_saved_cmdlines_buffer ( struct saved_cmdlines_buffer * s )
{
kfree ( s - > saved_cmdlines ) ;
kfree ( s - > map_cmdline_to_pid ) ;
kfree ( s ) ;
}
static int tracing_resize_saved_cmdlines ( unsigned int val )
{
struct saved_cmdlines_buffer * s , * savedcmd_temp ;
2014-06-10 11:11:35 +04:00
s = kmalloc ( sizeof ( * s ) , GFP_KERNEL ) ;
2014-06-05 05:24:27 +04:00
if ( ! s )
return - ENOMEM ;
if ( allocate_cmdlines_buffer ( val , s ) < 0 ) {
kfree ( s ) ;
return - ENOMEM ;
}
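/* arch_spin_lock ( ) does not disable preemption by itself */
preempt_disable ( ) ;
arch_spin_lock ( & trace_cmdline_lock ) ;
savedcmd_temp = savedcmd ;
savedcmd = s ;
arch_spin_unlock ( & trace_cmdline_lock ) ;
preempt_enable ( ) ;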
free_saved_cmdlines_buffer ( savedcmd_temp ) ;
return 0 ;
}
static ssize_t
tracing_saved_cmdlines_size_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
unsigned long val ;
int ret ;
ret = kstrtoul_from_user ( ubuf , cnt , 10 , & val ) ;
if ( ret )
return ret ;
/* must have at least 1 entry and no more than PID_MAX_DEFAULT */
if ( ! val | | val > PID_MAX_DEFAULT )
return - EINVAL ;
ret = tracing_resize_saved_cmdlines ( ( unsigned int ) val ) ;
if ( ret < 0 )
return ret ;
* ppos + = cnt ;
return cnt ;
}
static const struct file_operations tracing_saved_cmdlines_size_fops = {
. open = tracing_open_generic ,
. read = tracing_saved_cmdlines_size_read ,
. write = tracing_saved_cmdlines_size_write ,
} ;
2015-04-01 00:23:45 +03:00
# ifdef CONFIG_TRACE_ENUM_MAP_FILE
static union trace_enum_map_item *
update_enum_map ( union trace_enum_map_item * ptr )
{
if ( ! ptr - > map . enum_string ) {
if ( ptr - > tail . next ) {
ptr = ptr - > tail . next ;
/* Set ptr to the next real item (skip head) */
ptr + + ;
} else
return NULL ;
}
return ptr ;
}
static void * enum_map_next ( struct seq_file * m , void * v , loff_t * pos )
{
union trace_enum_map_item * ptr = v ;
/*
* Paranoid! If ptr points to end, we don't want to increment past it.
* This really should never happen.
*/
ptr = update_enum_map ( ptr ) ;
if ( WARN_ON_ONCE ( ! ptr ) )
return NULL ;
ptr + + ;
( * pos ) + + ;
ptr = update_enum_map ( ptr ) ;
return ptr ;
}
static void * enum_map_start ( struct seq_file * m , loff_t * pos )
{
union trace_enum_map_item * v ;
loff_t l = 0 ;
mutex_lock ( & trace_enum_mutex ) ;
v = trace_enum_maps ;
if ( v )
v + + ;
while ( v & & l < * pos ) {
v = enum_map_next ( m , v , & l ) ;
}
return v ;
}
static void enum_map_stop ( struct seq_file * m , void * v )
{
mutex_unlock ( & trace_enum_mutex ) ;
}
static int enum_map_show ( struct seq_file * m , void * v )
{
union trace_enum_map_item * ptr = v ;
seq_printf ( m , " %s %ld (%s) \n " ,
ptr - > map . enum_string , ptr - > map . enum_value ,
ptr - > map . system ) ;
return 0 ;
}
static const struct seq_operations tracing_enum_map_seq_ops = {
. start = enum_map_start ,
. next = enum_map_next ,
. stop = enum_map_stop ,
. show = enum_map_show ,
} ;
static int tracing_enum_map_open ( struct inode * inode , struct file * filp )
{
if ( tracing_disabled )
return - ENODEV ;
return seq_open ( filp , & tracing_enum_map_seq_ops ) ;
}
static const struct file_operations tracing_enum_map_fops = {
. open = tracing_enum_map_open ,
. read = seq_read ,
. llseek = seq_lseek ,
. release = seq_release ,
} ;
static inline union trace_enum_map_item *
trace_enum_jmp_to_tail ( union trace_enum_map_item * ptr )
{
/* Return tail of array given the head */
return ptr + ptr - > head . length + 1 ;
}
static void
trace_insert_enum_map_file ( struct module * mod , struct trace_enum_map * * start ,
int len )
{
struct trace_enum_map * * stop ;
struct trace_enum_map * * map ;
union trace_enum_map_item * map_array ;
union trace_enum_map_item * ptr ;
stop = start + len ;
/*
* The trace_enum_maps contains the map plus a head and tail item ,
* where the head holds the module and length of array , and the
* tail holds a pointer to the next list .
*/
map_array = kmalloc ( sizeof ( * map_array ) * ( len + 2 ) , GFP_KERNEL ) ;
if ( ! map_array ) {
pr_warning ( " Unable to allocate trace enum mapping \n " ) ;
return ;
}
mutex_lock ( & trace_enum_mutex ) ;
if ( ! trace_enum_maps )
trace_enum_maps = map_array ;
else {
ptr = trace_enum_maps ;
for ( ; ; ) {
ptr = trace_enum_jmp_to_tail ( ptr ) ;
if ( ! ptr - > tail . next )
break ;
ptr = ptr - > tail . next ;
}
ptr - > tail . next = map_array ;
}
map_array - > head . mod = mod ;
map_array - > head . length = len ;
map_array + + ;
for ( map = start ; ( unsigned long ) map < ( unsigned long ) stop ; map + + ) {
map_array - > map = * * map ;
map_array + + ;
}
memset ( map_array , 0 , sizeof ( * map_array ) ) ;
mutex_unlock ( & trace_enum_mutex ) ;
}
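The layout described in the comment above (a head item, the map entries, then a tail item that links to the next array) can be sketched in userspace as follows. This is an illustration under assumptions, not the kernel definitions: `demo_map`, `demo_item` and the `demo_*` helpers are hypothetical, there is no locking, entry names are assumed to be non-NULL, and the sketch relies on calloc() producing NULL pointers for the zeroed tail.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical map entry; a NULL name is what marks a tail item. */
struct demo_map {
	const char *name;
	long value;
};

/* One array slot is a head, a map entry, or a tail, as in the comment above. */
union demo_item {
	struct demo_map map;
	struct { void *owner; size_t length; } head;
	/* "end" overlays map.name and stays NULL so walkers can spot the tail */
	struct { const char *end; union demo_item *next; } tail;
};

/* Given a chunk's head, return its tail. */
static union demo_item *demo_jmp_to_tail(union demo_item *head)
{
	return head + head->head.length + 1;
}

/* Copy len entries into a new chunk and link it at the end of *list. */
static int demo_append(union demo_item **list, const struct demo_map *maps,
		       size_t len)
{
	union demo_item *chunk = calloc(len + 2, sizeof(*chunk));
	size_t i;

	if (!chunk)
		return -1;
	chunk->head.length = len;
	for (i = 0; i < len; i++)
		chunk[i + 1].map = maps[i];
	/* chunk[len + 1] is the tail; calloc() left it zeroed. */

	if (!*list) {
		*list = chunk;
	} else {
		union demo_item *ptr = *list;

		for (;;) {
			ptr = demo_jmp_to_tail(ptr);
			if (!ptr->tail.next)
				break;
			ptr = ptr->tail.next;
		}
		ptr->tail.next = chunk;
	}
	return 0;
}

/* Count real entries, hopping over each tail and the following head. */
static size_t demo_count(union demo_item *list)
{
	union demo_item *ptr = list ? list + 1 : NULL;
	size_t n = 0;

	while (ptr) {
		if (!ptr->map.name) {
			if (!ptr->tail.next)
				break;
			ptr = ptr->tail.next + 1;
			continue;
		}
		n++;
		ptr++;
	}
	return n;
}

/* Free every chunk in the list. */
static void demo_free(union demo_item *list)
{
	while (list) {
		union demo_item *next = demo_jmp_to_tail(list)->tail.next;

		free(list);
		list = next;
	}
}
```

Keeping the head, entries and tail in one allocation means a reader can step through entries with plain pointer increments and only follows a link when it reaches a NULL name.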
static void trace_create_enum_file ( struct dentry * d_tracer )
{
trace_create_file ( " enum_map " , 0444 , d_tracer ,
NULL , & tracing_enum_map_fops ) ;
}
# else /* CONFIG_TRACE_ENUM_MAP_FILE */
static inline void trace_create_enum_file ( struct dentry * d_tracer ) { }
static inline void trace_insert_enum_map_file ( struct module * mod ,
struct trace_enum_map * * start , int len ) { }
# endif /* !CONFIG_TRACE_ENUM_MAP_FILE */
static void trace_insert_enum_map ( struct module * mod ,
struct trace_enum_map * * start , int len )
tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values
Several tracepoints use the helper functions __print_symbolic() or
__print_flags() and pass in enums that do the mapping between the
binary data stored and the value to print. This works well for reading
the ASCII trace files, but when the data is read via userspace tools
such as perf and trace-cmd, the conversion of the binary value to a
human string format is lost if an enum is used, as userspace does not
have access to what the ENUM is.
For example, the tracepoint trace_tlb_flush() has:
__print_symbolic(REC->reason,
{ TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
{ TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
{ TLB_LOCAL_SHOOTDOWN, "local shootdown" },
{ TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })
Which maps the enum values to the strings they represent. But perf and
trace-cmd do not know what value TLB_LOCAL_MM_SHOOTDOWN is, and would
not be able to map it.
With TRACE_DEFINE_ENUM(), developers can place these in the event header
files and ftrace will convert the enums to their values:
By adding:
TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);
$ cat /sys/kernel/debug/tracing/events/tlb/tlb_flush/format
[...]
__print_symbolic(REC->reason,
{ 0, "flush on task switch" },
{ 1, "remote shootdown" },
{ 2, "local shootdown" },
{ 3, "local mm shootdown" })
The above is what userspace expects to see, and tools do not need to
be modified to parse them.
Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org
Cc: Guilherme Cox <cox@computer.org>
Cc: Tony Luck <tony.luck@gmail.com>
Cc: Xie XiuQi <xiexiuqi@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-03-25 00:58:09 +03:00
{
struct trace_enum_map * * map ;
if ( len < = 0 )
return ;
map = start ;
trace_event_enum_update ( map , len ) ;
2015-04-01 00:23:45 +03:00
trace_insert_enum_map_file ( mod , start , len ) ;
2015-03-25 00:58:09 +03:00
}
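The commit message above describes replacing enum names in an event's print format with their numeric values. A rough userspace sketch of that substitution follows. It is not the kernel's trace_event_enum_update(): `demo_enum` and `demo_replace_enums` are hypothetical names, escaped quotes inside strings are not handled, and the output is written to a separate buffer rather than edited in place.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical name/value pair, standing in for a registered enum. */
struct demo_enum {
	const char *name;
	long value;
};

/*
 * Copy src into dst (capacity cap), replacing every whole identifier that
 * matches a name in maps[] with its decimal value. Text inside double
 * quotes is copied unchanged. Returns 0 on success, -1 if dst is too small.
 */
static int demo_replace_enums(const char *src, char *dst, size_t cap,
			      const struct demo_enum *maps, size_t nmaps)
{
	size_t out = 0;
	int quoted = 0;

	if (cap == 0)
		return -1;

	while (*src) {
		if (*src == '"')
			quoted = !quoted;

		if (!quoted && (isalpha((unsigned char)*src) || *src == '_')) {
			size_t len = 1, i;
			int n;

			/* Measure the whole identifier starting at src. */
			while (isalnum((unsigned char)src[len]) || src[len] == '_')
				len++;

			for (i = 0; i < nmaps; i++) {
				if (strlen(maps[i].name) == len &&
				    !strncmp(src, maps[i].name, len))
					break;
			}

			if (i < nmaps) {
				/* Known enum: emit its value instead of its name. */
				n = snprintf(dst + out, cap - out, "%ld",
					     maps[i].value);
				if (n < 0 || (size_t)n >= cap - out)
					return -1;
			} else {
				/* Unknown identifier: copy it through. */
				if (len >= cap - out)
					return -1;
				memcpy(dst + out, src, len);
				n = (int)len;
			}
			out += (size_t)n;
			src += len;
			continue;
		}

		if (out + 1 >= cap)
			return -1;
		dst[out++] = *src++;
	}
	dst[out] = '\0';
	return 0;
}
```

Matching whole identifiers matters here: a name that is only a prefix of a longer identifier, or that appears inside a quoted description string, must be left alone.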
2008-05-12 23:20:42 +04:00
static ssize_t
tracing_set_trace_read ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
2012-05-11 21:29:49 +04:00
struct trace_array * tr = filp - > private_data ;
2009-09-18 10:06:47 +04:00
char buf [ MAX_TRACER_SIZE + 2 ] ;
2008-05-12 23:20:42 +04:00
int r ;
mutex_lock ( & trace_types_lock ) ;
2012-05-11 21:29:49 +04:00
r = sprintf ( buf , " %s \n " , tr - > current_trace - > name ) ;
2008-05-12 23:20:42 +04:00
mutex_unlock ( & trace_types_lock ) ;
2008-05-12 23:20:46 +04:00
return simple_read_from_buffer ( ubuf , cnt , ppos , buf , r ) ;
2008-05-12 23:20:42 +04:00
}
2009-02-05 23:02:00 +03:00
int tracer_init ( struct tracer * t , struct trace_array * tr )
{
tracing: Consolidate max_tr into main trace_array structure
Currently, the way the latency tracers and snapshot feature works
is to have a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, this snapshot buffer is used
to swap the running buffer with this buffer to save the current max
latency.
The only items needed for the max_tr is really just a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer, that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array, now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_trace but the instances of traces can now use
their own snapshot feature and not have just the top level global_trace have
the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 18:24:35 +04:00
tracing_reset_online_cpus ( & tr - > trace_buffer ) ;
2009-02-05 23:02:00 +03:00
return t - > init ( tr ) ;
}
2013-03-05 18:24:35 +04:00
static void set_buffer_entries ( struct trace_buffer * buf , unsigned long val )
2012-02-03 00:00:41 +04:00
{
int cpu ;
2013-03-06 06:13:47 +04:00
2012-02-03 00:00:41 +04:00
for_each_tracing_cpu ( cpu )
2013-03-05 18:24:35 +04:00
per_cpu_ptr ( buf - > data , cpu ) - > entries = val ;
2012-02-03 00:00:41 +04:00
}
2013-03-05 18:24:35 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
2012-10-17 06:56:16 +04:00
/* resize @tr's buffer to the size of @size_tr's entries */
2013-03-05 18:24:35 +04:00
static int resize_buffer_duplicate_size ( struct trace_buffer * trace_buf ,
struct trace_buffer * size_buf , int cpu_id )
2012-10-17 06:56:16 +04:00
{
int cpu , ret = 0 ;
if ( cpu_id = = RING_BUFFER_ALL_CPUS ) {
for_each_tracing_cpu ( cpu ) {
2013-03-05 18:24:35 +04:00
ret = ring_buffer_resize ( trace_buf - > buffer ,
per_cpu_ptr ( size_buf - > data , cpu ) - > entries , cpu ) ;
2012-10-17 06:56:16 +04:00
if ( ret < 0 )
break ;
2013-03-05 18:24:35 +04:00
per_cpu_ptr ( trace_buf - > data , cpu ) - > entries =
per_cpu_ptr ( size_buf - > data , cpu ) - > entries ;
2012-10-17 06:56:16 +04:00
}
} else {
2013-03-05 18:24:35 +04:00
ret = ring_buffer_resize ( trace_buf - > buffer ,
per_cpu_ptr ( size_buf - > data , cpu_id ) - > entries , cpu_id ) ;
2012-10-17 06:56:16 +04:00
if ( ret = = 0 )
2013-03-05 18:24:35 +04:00
per_cpu_ptr ( trace_buf - > data , cpu_id ) - > entries =
per_cpu_ptr ( size_buf - > data , cpu_id ) - > entries ;
2012-10-17 06:56:16 +04:00
}
return ret ;
}
2013-03-05 18:24:35 +04:00
# endif /* CONFIG_TRACER_MAX_TRACE */
2012-10-17 06:56:16 +04:00
2012-05-11 21:29:49 +04:00
static int __tracing_resize_ring_buffer ( struct trace_array * tr ,
unsigned long size , int cpu )
2009-03-11 20:42:01 +03:00
{
int ret ;
/*
* If kernel or user changes the size of the ring buffer
2009-03-12 18:21:08 +03:00
* we use the size that was given , and we can forget about
* expanding it later .
2009-03-11 20:42:01 +03:00
*/
2013-03-08 07:48:09 +04:00
ring_buffer_expanded = true ;
2009-03-11 20:42:01 +03:00
2012-10-11 05:44:34 +04:00
/* May be called before buffers are initialized */
2013-03-05 18:24:35 +04:00
if ( ! tr - > trace_buffer . buffer )
2012-10-11 05:44:34 +04:00
return 0 ;
2013-03-05 18:24:35 +04:00
ret = ring_buffer_resize ( tr - > trace_buffer . buffer , size , cpu ) ;
2009-03-11 20:42:01 +03:00
if ( ret < 0 )
return ret ;
2013-03-05 18:24:35 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
2012-05-11 21:29:49 +04:00
if ( ! ( tr - > flags & TRACE_ARRAY_FL_GLOBAL ) | |
! tr - > current_trace - > use_max_tr )
2010-07-01 09:34:35 +04:00
goto out ;
2013-03-05 18:24:35 +04:00
ret = ring_buffer_resize ( tr - > max_buffer . buffer , size , cpu ) ;
2009-03-11 20:42:01 +03:00
if ( ret < 0 ) {
2013-03-05 18:24:35 +04:00
int r = resize_buffer_duplicate_size ( & tr - > trace_buffer ,
& tr - > trace_buffer , cpu ) ;
2009-03-11 20:42:01 +03:00
if ( r < 0 ) {
2009-03-12 18:21:08 +03:00
/*
* AARGH! We are left with a max buffer of a
* different size!
* The max buffer is our "snapshot" buffer.
* When a tracer needs a snapshot (one of the
* latency tracers), it swaps the max buffer
* with the saved snapshot. We succeeded in
* updating the size of the main buffer, but failed to
* update the size of the max buffer. When we then tried
* to reset the main buffer to its original size, we
* failed there too. This is very unlikely to
* happen, but if it does, warn and kill all
* tracing.
*/
2009-03-11 20:42:01 +03:00
WARN_ON ( 1 ) ;
tracing_disabled = 1 ;
}
return ret ;
}
2012-02-03 00:00:41 +04:00
if ( cpu = = RING_BUFFER_ALL_CPUS )
2013-03-05 18:24:35 +04:00
set_buffer_entries ( & tr - > max_buffer , size ) ;
2012-02-03 00:00:41 +04:00
else
2013-03-05 18:24:35 +04:00
per_cpu_ptr ( tr - > max_buffer . data , cpu ) - > entries = size ;
2012-02-03 00:00:41 +04:00
2010-07-01 09:34:35 +04:00
out :
2013-03-05 18:24:35 +04:00
# endif /* CONFIG_TRACER_MAX_TRACE */
2012-02-03 00:00:41 +04:00
if ( cpu = = RING_BUFFER_ALL_CPUS )
2013-03-05 18:24:35 +04:00
set_buffer_entries ( & tr - > trace_buffer , size ) ;
2012-02-03 00:00:41 +04:00
else
2013-03-05 18:24:35 +04:00
per_cpu_ptr ( tr - > trace_buffer . data , cpu ) - > entries = size ;
2009-03-11 20:42:01 +03:00
return ret ;
}
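The failure handling in the resize path above (grow the main buffer, then the snapshot buffer, and roll the main buffer back if the second step fails) can be illustrated in userspace. This is only a minimal sketch under stated assumptions: struct demo_buffer, struct demo_array, demo_resize() and the fail_next_resize test hook are hypothetical stand-ins, not the kernel's ring buffer API, and the rollback here is a plain resize to the old size rather than the per-cpu duplicate-size helper the kernel calls.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a ring buffer: only its size matters here. */
struct demo_buffer {
	size_t entries;
	bool fail_next_resize;	/* test hook: force the next resize to fail */
};

/* Hypothetical stand-in for a trace array with a main and a snapshot buffer. */
struct demo_array {
	struct demo_buffer main_buf;
	struct demo_buffer snap_buf;
	bool disabled;		/* set when the two sizes cannot be reconciled */
};

/* Resize one buffer; returns 0 on success, -1 on (simulated) failure. */
static int demo_resize(struct demo_buffer *buf, size_t entries)
{
	if (buf->fail_next_resize) {
		buf->fail_next_resize = false;
		return -1;
	}
	buf->entries = entries;
	return 0;
}

/* Resize both buffers, keeping them the same size even on failure. */
static int demo_resize_both(struct demo_array *arr, size_t entries)
{
	size_t old = arr->main_buf.entries;
	int ret;

	ret = demo_resize(&arr->main_buf, entries);
	if (ret < 0)
		return ret;

	ret = demo_resize(&arr->snap_buf, entries);
	if (ret < 0) {
		/* Roll the main buffer back so both stay the same size. */
		if (demo_resize(&arr->main_buf, old) < 0)
			arr->disabled = true;	/* sizes differ: give up */
		return ret;
	}
	return 0;
}
```

The point of the rollback is that a later buffer swap is only safe when both buffers have the same size, so a half-applied resize is treated as worse than no resize at all.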
2012-05-11 21:29:49 +04:00
static ssize_t tracing_resize_ring_buffer ( struct trace_array * tr ,
unsigned long size , int cpu_id )
2011-06-14 04:51:57 +04:00
{
2012-05-04 05:59:50 +04:00
int ret = size ;
2011-06-14 04:51:57 +04:00
mutex_lock ( & trace_types_lock ) ;
2012-02-03 00:00:41 +04:00
if ( cpu_id ! = RING_BUFFER_ALL_CPUS ) {
/* make sure this cpu is enabled in the mask */
if ( ! cpumask_test_cpu ( cpu_id , tracing_buffer_mask ) ) {
ret = - EINVAL ;
goto out ;
}
}
2011-06-14 04:51:57 +04:00
2012-05-11 21:29:49 +04:00
ret = __tracing_resize_ring_buffer ( tr , size , cpu_id ) ;
2011-06-14 04:51:57 +04:00
if ( ret < 0 )
ret = - ENOMEM ;
2012-02-03 00:00:41 +04:00
out :
2011-06-14 04:51:57 +04:00
mutex_unlock ( & trace_types_lock ) ;
return ret ;
}
2010-07-01 09:34:35 +04:00
2009-03-11 21:33:00 +03:00
/**
* tracing_update_buffers - used by tracing facility to expand ring buffers
*
* To save memory on systems that have tracing configured in but never
* use it, the ring buffers are set to a minimum size. Once a user
* starts to use the tracing facility, they need to grow to their
* default size.
*
* This function is to be called when a tracer is about to be used.
*/
int tracing_update_buffers ( void )
{
int ret = 0 ;
2009-03-12 18:33:20 +03:00
mutex_lock ( & trace_types_lock ) ;
2009-03-11 21:33:00 +03:00
if ( ! ring_buffer_expanded )
2012-05-11 21:29:49 +04:00
ret = __tracing_resize_ring_buffer ( & global_trace , trace_buf_size ,
2012-02-03 00:00:41 +04:00
RING_BUFFER_ALL_CPUS ) ;
2009-03-12 18:33:20 +03:00
mutex_unlock ( & trace_types_lock ) ;
2009-03-11 21:33:00 +03:00
return ret ;
}
2009-02-27 07:43:05 +03:00
struct trace_option_dentry ;
static struct trace_option_dentry *
2012-05-11 21:29:49 +04:00
create_trace_option_files ( struct trace_array * tr , struct tracer * tracer ) ;
2009-02-27 07:43:05 +03:00
static void
destroy_trace_option_files ( struct trace_option_dentry * topts ) ;
2014-01-14 17:43:01 +04:00
/*
* Used to clear out the tracer before deletion of an instance .
* Must have trace_types_lock held .
*/
static void tracing_set_nop ( struct trace_array * tr )
{
if ( tr - > current_trace = = & nop_trace )
return ;
2014-01-14 17:52:35 +04:00
tr - > current_trace - > enabled - - ;
2014-01-14 17:43:01 +04:00
if ( tr - > current_trace - > reset )
tr - > current_trace - > reset ( tr ) ;
tr - > current_trace = & nop_trace ;
}
2015-02-03 20:45:53 +03:00
static void update_tracer_options ( struct trace_array * tr , struct tracer * t )
2008-05-12 23:20:42 +04:00
{
2009-02-27 07:43:05 +03:00
static struct trace_option_dentry * topts ;
2015-02-03 20:45:53 +03:00
/* Only enable if the directory has been created already. */
if ( ! tr - > dir )
return ;
/* Currently, only the top instance has options */
if ( ! ( tr - > flags & TRACE_ARRAY_FL_GLOBAL ) )
return ;
destroy_trace_option_files ( topts ) ;
topts = create_trace_option_files ( tr , t ) ;
}
static int tracing_set_tracer ( struct trace_array * tr , const char * buf )
{
2008-05-12 23:20:42 +04:00
struct tracer * t ;
2013-03-05 18:24:35 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
2013-01-22 22:35:11 +04:00
bool had_max_tr ;
2013-03-05 18:24:35 +04:00
# endif
2008-11-01 21:57:37 +03:00
int ret = 0 ;
2008-05-12 23:20:42 +04:00
2009-03-12 18:33:20 +03:00
mutex_lock ( & trace_types_lock ) ;
2009-03-11 20:42:01 +03:00
if ( ! ring_buffer_expanded ) {
2012-05-11 21:29:49 +04:00
ret = __tracing_resize_ring_buffer ( tr , trace_buf_size ,
2012-02-03 00:00:41 +04:00
RING_BUFFER_ALL_CPUS ) ;
2009-03-11 20:42:01 +03:00
if ( ret < 0 )
2009-03-16 00:10:39 +03:00
goto out ;
2009-03-11 20:42:01 +03:00
ret = 0 ;
}
2008-05-12 23:20:42 +04:00
for ( t = trace_types ; t ; t = t - > next ) {
if ( strcmp ( t - > name , buf ) = = 0 )
break ;
}
2008-10-05 00:04:44 +04:00
if ( ! t ) {
ret = - EINVAL ;
goto out ;
}
2012-05-11 21:29:49 +04:00
if ( t = = tr - > current_trace )
2008-05-12 23:20:42 +04:00
goto out ;
2013-11-07 07:42:48 +04:00
/* Some tracers are only allowed for the top level buffer */
if ( ! trace_ok_for_array ( t , tr ) ) {
ret = - EINVAL ;
goto out ;
}
2014-12-16 04:13:31 +03:00
/* If trace pipe files are being read, we can't change the tracer */
if ( tr - > current_trace - > ref ) {
ret = - EBUSY ;
goto out ;
}
2008-11-12 23:24:24 +03:00
trace_branch_disable ( ) ;
2013-03-14 23:03:53 +04:00
2014-01-14 17:52:35 +04:00
tr - > current_trace - > enabled - - ;
2013-03-14 23:03:53 +04:00
2012-05-11 21:29:49 +04:00
if ( tr - > current_trace - > reset )
tr - > current_trace - > reset ( tr ) ;
2013-01-22 22:35:11 +04:00
2013-03-05 18:24:35 +04:00
/* Current trace needs to be nop_trace before synchronize_sched */
2012-05-11 21:29:49 +04:00
tr - > current_trace = & nop_trace ;
2013-01-22 22:35:11 +04:00
2013-03-06 03:25:02 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
had_max_tr = tr - > allocated_snapshot ;
2013-01-22 22:35:11 +04:00
if ( had_max_tr & & ! t - > use_max_tr ) {
/*
* We need to make sure that update_max_tr() sees that
* current_trace changed to nop_trace to keep it from
* swapping the buffers after we resize it.
* update_max_tr() is called with interrupts disabled,
* so a synchronize_sched() is sufficient.
*/
synchronize_sched ( ) ;
2013-03-12 19:17:54 +04:00
free_snapshot ( tr ) ;
2010-07-01 09:34:35 +04:00
}
2013-03-05 18:24:35 +04:00
# endif
2015-02-03 20:45:53 +03:00
update_tracer_options ( tr , t ) ;
2013-03-05 18:24:35 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
2013-01-22 22:35:11 +04:00
if ( t - > use_max_tr & & ! had_max_tr ) {
2013-03-12 19:17:54 +04:00
ret = alloc_snapshot ( tr ) ;
2012-10-17 06:56:16 +04:00
if ( ret < 0 )
goto out ;
2010-07-01 09:34:35 +04:00
}
2013-03-05 18:24:35 +04:00
# endif
2009-02-27 07:43:05 +03:00
2008-11-16 07:57:26 +03:00
if ( t - > init ) {
2009-02-05 23:02:00 +03:00
ret = tracer_init ( t , tr ) ;
2008-11-16 07:57:26 +03:00
if ( ret )
goto out ;
}
2008-05-12 23:20:42 +04:00
2012-05-11 21:29:49 +04:00
tr - > current_trace = t ;
2014-01-14 17:52:35 +04:00
tr - > current_trace - > enabled + + ;
2008-11-12 23:24:24 +03:00
trace_branch_enable ( tr ) ;
2008-05-12 23:20:42 +04:00
out :
mutex_unlock ( & trace_types_lock ) ;
2008-11-01 21:57:37 +03:00
return ret ;
}
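The name lookup at the heart of tracing_set_tracer() is a walk over a singly linked list that ends either on a match or on NULL. The following is a minimal userspace sketch of that lookup only; struct demo_tracer, demo_find_tracer() and demo_set_tracer() are hypothetical names, and the real function additionally handles locking, buffer expansion, snapshot allocation and the tracer's init/reset callbacks.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down tracer: just a name and a list link. */
struct demo_tracer {
	const char *name;
	struct demo_tracer *next;
};

/* Walk the singly linked list; return the tracer called name, or NULL. */
static struct demo_tracer *demo_find_tracer(struct demo_tracer *head,
					    const char *name)
{
	struct demo_tracer *t;

	for (t = head; t; t = t->next) {
		if (strcmp(t->name, name) == 0)
			break;
	}
	return t;
}

/*
 * Switch *cur to the tracer called name.
 * Returns 0 on success (including "already selected"), -EINVAL if unknown.
 */
static int demo_set_tracer(struct demo_tracer *head, struct demo_tracer **cur,
			   const char *name)
{
	struct demo_tracer *t = demo_find_tracer(head, name);

	if (!t)
		return -EINVAL;
	if (t != *cur)
		*cur = t;
	return 0;
}
```

Selecting the tracer that is already current is treated as a successful no-op, which mirrors the early "goto out" with ret still 0 in the function above.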
static ssize_t
tracing_set_trace_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
2013-11-07 07:42:48 +04:00
struct trace_array * tr = filp - > private_data ;
2009-09-18 10:06:47 +04:00
char buf [ MAX_TRACER_SIZE + 1 ] ;
2008-11-01 21:57:37 +03:00
int i ;
size_t ret ;
2008-11-16 07:53:19 +03:00
int err ;
ret = cnt ;
2008-11-01 21:57:37 +03:00
2009-09-18 10:06:47 +04:00
if ( cnt > MAX_TRACER_SIZE )
cnt = MAX_TRACER_SIZE ;
2008-11-01 21:57:37 +03:00
if ( copy_from_user ( & buf , ubuf , cnt ) )
return - EFAULT ;
buf [ cnt ] = 0 ;
/* strip ending whitespace. */
for ( i = cnt - 1 ; i > 0 & & isspace ( buf [ i ] ) ; i - - )
buf [ i ] = 0 ;
2013-11-07 07:42:48 +04:00
err = tracing_set_tracer ( tr , buf ) ;
2008-11-16 07:53:19 +03:00
if ( err )
return err ;
2008-11-01 21:57:37 +03:00
2009-10-24 03:36:16 +04:00
* ppos + = ret ;
2008-05-12 23:20:42 +04:00
2008-10-05 00:04:44 +04:00
return ret ;
2008-05-12 23:20:42 +04:00
}
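tracing_set_trace_write() clamps the user-supplied length, NUL-terminates the copy and then strips trailing whitespace, so that a write such as "nop\n" matches the tracer named "nop". A minimal userspace sketch of that clamp-and-strip step follows. DEMO_MAX_NAME and demo_copy_name() are hypothetical names (DEMO_MAX_NAME merely stands in for MAX_TRACER_SIZE, whose value is not shown here), and memcpy() replaces copy_from_user().

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_NAME 100	/* hypothetical cap, stands in for MAX_TRACER_SIZE */

/*
 * Copy at most DEMO_MAX_NAME bytes of src into dst (which must hold
 * DEMO_MAX_NAME + 1 bytes), terminate it, and strip trailing whitespace.
 * Returns the length of the resulting string.
 */
static size_t demo_copy_name(char *dst, const char *src, size_t cnt)
{
	size_t i;

	if (cnt > DEMO_MAX_NAME)
		cnt = DEMO_MAX_NAME;
	memcpy(dst, src, cnt);
	dst[cnt] = '\0';

	/* Walk back from the end; like the loop above, index 0 is left alone. */
	for (i = cnt; i > 1 && isspace((unsigned char)dst[i - 1]); i--)
		dst[i - 1] = '\0';

	return strlen(dst);
}
```

The cast to unsigned char before isspace() avoids undefined behavior for bytes with the high bit set on platforms where plain char is signed.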
static ssize_t
2014-07-18 15:17:27 +04:00
tracing_nsecs_read ( unsigned long * ptr , char __user * ubuf ,
size_t cnt , loff_t * ppos )
2008-05-12 23:20:42 +04:00
{
char buf [ 64 ] ;
int r ;
2008-05-12 23:21:00 +04:00
r = snprintf ( buf , sizeof ( buf ) , "%ld\n" ,
2008-05-12 23:20:42 +04:00
* ptr = = ( unsigned long ) - 1 ? - 1 : nsecs_to_usecs ( * ptr ) ) ;
2008-05-12 23:21:00 +04:00
if ( r > sizeof ( buf ) )
r = sizeof ( buf ) ;
2008-05-12 23:20:46 +04:00
return simple_read_from_buffer ( ubuf , cnt , ppos , buf , r ) ;
2008-05-12 23:20:42 +04:00
}
static ssize_t
2014-07-18 15:17:27 +04:00
tracing_nsecs_write ( unsigned long * ptr , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
2008-05-12 23:20:42 +04:00
{
2009-02-10 21:44:34 +03:00
unsigned long val ;
2008-05-12 23:21:00 +04:00
int ret ;
2008-05-12 23:20:42 +04:00
2011-06-07 23:58:27 +04:00
ret = kstrtoul_from_user ( ubuf , cnt , 10 , & val ) ;
if ( ret )
2008-05-12 23:21:00 +04:00
return ret ;
2008-05-12 23:20:42 +04:00
* ptr = val * 1000 ;
return cnt ;
}
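The two helpers above store the value in nanoseconds but expose it to userspace in microseconds, with (unsigned long)-1 printed as -1. A minimal userspace sketch of that round trip is shown below. demo_usecs_to_nsecs() and demo_format_usecs() are hypothetical names, and the truncating division by 1000 is an assumption about what nsecs_to_usecs() does, since its definition is not part of this section.

```c
#include <stddef.h>
#include <stdio.h>

/* Write side: userspace supplies microseconds, storage is nanoseconds. */
static unsigned long demo_usecs_to_nsecs(unsigned long usecs)
{
	return usecs * 1000;
}

/*
 * Read side: format the stored nanosecond value as microseconds.
 * The all-ones sentinel is shown as -1 instead of being converted.
 * Returns what snprintf() returns.
 */
static int demo_format_usecs(char *buf, size_t len, unsigned long nsecs)
{
	/* Assumption: the conversion truncates, i.e. nsecs / 1000. */
	long shown = nsecs == (unsigned long)-1 ? -1L : (long)(nsecs / 1000);

	return snprintf(buf, len, "%ld\n", shown);
}
```

Keeping the sentinel out of the conversion matters because dividing the all-ones value by 1000 would otherwise print a large, meaningless number instead of -1.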
2014-07-18 15:17:27 +04:00
static ssize_t
tracing_thresh_read ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
return tracing_nsecs_read ( & tracing_thresh , ubuf , cnt , ppos ) ;
}
static ssize_t
tracing_thresh_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
struct trace_array * tr = filp - > private_data ;
int ret ;
mutex_lock ( & trace_types_lock ) ;
ret = tracing_nsecs_write ( & tracing_thresh , ubuf , cnt , ppos ) ;
if ( ret < 0 )
goto out ;
if ( tr - > current_trace - > update_thresh ) {
ret = tr - > current_trace - > update_thresh ( tr ) ;
if ( ret < 0 )
goto out ;
}
ret = cnt ;
out :
mutex_unlock ( & trace_types_lock ) ;
return ret ;
}
static ssize_t
tracing_max_lat_read ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
return tracing_nsecs_read ( filp - > private_data , ubuf , cnt , ppos ) ;
}
static ssize_t
tracing_max_lat_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
return tracing_nsecs_write ( filp - > private_data , ubuf , cnt , ppos ) ;
}
2008-05-12 23:20:46 +04:00
static int tracing_open_pipe ( struct inode * inode , struct file * filp )
{
2013-07-23 19:25:57 +04:00
struct trace_array * tr = inode - > i_private ;
2008-05-12 23:20:46 +04:00
struct trace_iterator * iter ;
tracing/core: introduce per cpu tracing files
Impact: split up tracing output per cpu
Currently, in the tracing debugfs directory, three files are
available to the user for extracting the trace output:
- trace is an iterator through the ring-buffer. It's a reader
but not a consumer. It doesn't block when no more traces are
available.
- trace pretty similar to the former, except that it adds more
information such as preempt count, irq flag, ...
- trace_pipe is a reader and a consumer, it will also block
waiting for traces if necessary (heh, yes it's a pipe).
The traces coming from different cpus are currently mixed up
inside these files. Sometimes that muddles the information,
sometimes it's useful, depending on what the tracer
captures.
The tracing_cpumask file is useful to filter the output and
select only the traces captured on a custom-defined set of cpus.
But it is still not powerful enough to extract one trace buffer
per cpu at the same time.
So this patch creates a new directory: /debug/tracing/per_cpu/.
Inside this directory, you will now find one trace_pipe file and
one trace file per cpu.
Which means if you have two cpus, you will have:
trace0
trace1
trace_pipe0
trace_pipe1
And of course, reading these files will have the same effect
as with the usual tracing files, except that you will only see
the traces from the given cpu.
The original all-in-one cpu trace files are still available in
their original place.
Until now, only one consumer was allowed on trace_pipe to avoid
racy consuming on the ring-buffer. Now the approach has changed a
bit: you can have only one consumer per cpu.
Which means you are allowed to read trace_pipe0 and trace_pipe1
concurrently, but you can't have two readers on trace_pipe0 or
trace_pipe1.
Following the same logic, if there is one reader on the common
trace_pipe, you cannot have another reader on trace_pipe0 or
trace_pipe1 at the same time, because trace_pipe is in essence
already a consumer of all cpu buffers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 05:22:28 +03:00
int ret = 0 ;
2008-05-12 23:20:46 +04:00
if ( tracing_disabled )
return - ENODEV ;
2013-07-02 07:34:22 +04:00
if ( trace_array_get ( tr ) < 0 )
return - ENODEV ;
2009-02-25 05:22:28 +03:00
mutex_lock ( & trace_types_lock ) ;
2008-05-12 23:20:46 +04:00
/* create a buffer to store the information to pass to userspace */
iter = kzalloc ( sizeof ( * iter ) , GFP_KERNEL ) ;
2009-02-25 05:22:28 +03:00
if ( ! iter ) {
ret = - ENOMEM ;
2013-07-18 22:18:44 +04:00
__trace_array_put ( tr ) ;
2009-02-25 05:22:28 +03:00
goto out ;
}
2008-05-12 23:20:46 +04:00
2014-06-25 23:54:42 +04:00
trace_seq_init ( & iter - > seq ) ;
2014-12-16 06:31:07 +03:00
iter - > trace = tr - > current_trace ;
tracing/core: make the read callbacks reentrants
Now that several per-cpu files can be read or spliced at the
same time, we want the read/splice callbacks for tracing files to be
reentrant.
Until now, a single global mutex (trace_types_lock) serialized
access to tracing_read_pipe(), tracing_splice_read_pipe(),
and the seq helpers.
That is, if a user tries to read trace_pipe0 and
trace_pipe1 at the same time, access to the function
tracing_read_pipe() is contended and one reader must wait for
the other to finish its read call.
The trace_types_lock mutex is mostly here to serialize access
to the global current tracer (current_trace), which can be
changed concurrently. Although the iter struct keeps a private
pointer to this tracer, its callbacks can be changed by another
function.
The method used here is to no longer keep a private reference to
the tracer inside the iterator but to make a copy of it inside
the iterator. It then checks on subsequent read calls whether the
tracer has changed. This is not costly because the current
tracer is not expected to change often, so we use a branch
prediction hint for that.
Moreover, we add a private mutex to the iterator (there is one
iterator per file descriptor) to serialize accesses in case
of multiple consumers per file descriptor (which would be a
silly idea on the user's part). Note that this is not to protect the
ring buffer, since the ring buffer already serializes
reader accesses. This is to prevent weird traces in
case of concurrent consumers. But these mutexes can be dropped
anyway; that would not result in any crash. Just tell me what
you think about it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 08:13:16 +03:00
2009-01-01 02:42:23 +03:00
if ( ! alloc_cpumask_var ( & iter - > started , GFP_KERNEL ) ) {
2009-02-25 05:22:28 +03:00
ret = - ENOMEM ;
2009-02-25 08:13:16 +03:00
goto fail ;
2009-01-01 02:42:23 +03:00
}
2008-11-08 06:36:02 +03:00
/* trace pipe does not show start of buffer */
2009-01-01 02:42:23 +03:00
cpumask_setall ( iter - > started ) ;
2008-11-08 06:36:02 +03:00
2009-06-01 23:16:05 +04:00
if ( trace_flags & TRACE_ITER_LATENCY_FMT )
iter - > iter_flags | = TRACE_FILE_LAT_FMT ;
2012-11-14 00:18:22 +04:00
/* Output in nanoseconds only if we are using a clock in nanoseconds. */
2013-04-23 05:32:39 +04:00
if ( trace_clocks [ tr - > clock_id ] . in_ns )
2012-11-14 00:18:22 +04:00
iter - > iter_flags | = TRACE_FILE_TIME_IN_NS ;
2013-07-23 19:25:57 +04:00
iter - > tr = tr ;
iter - > trace_buffer = & tr - > trace_buffer ;
iter - > cpu_file = tracing_get_cpu ( inode ) ;
2009-02-25 08:13:16 +03:00
mutex_init ( & iter - > mutex ) ;
2008-05-12 23:20:46 +04:00
filp - > private_data = iter ;
2008-05-12 23:21:01 +04:00
if ( iter - > trace - > pipe_open )
iter - > trace - > pipe_open ( iter ) ;
2010-07-08 01:40:11 +04:00
nonseekable_open ( inode , filp ) ;
2014-12-16 04:13:31 +03:00
tr - > current_trace - > ref + + ;
2009-02-25 05:22:28 +03:00
out :
mutex_unlock ( & trace_types_lock ) ;
return ret ;
2009-02-25 08:13:16 +03:00
fail :
/* iter->trace is borrowed from tr->current_trace; it must not be freed here */
kfree ( iter ) ;
2013-07-02 07:34:22 +04:00
__trace_array_put ( tr ) ;
2009-02-25 08:13:16 +03:00
mutex_unlock ( & trace_types_lock ) ;
return ret ;
2008-05-12 23:20:46 +04:00
}
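The one-consumer-per-cpu rule described in the commit message for the per-cpu files can be modeled in userspace as a bitmask claim. This is only a sketch of the rule as stated there, not the kernel's implementation (the open path shown above does not contain such a check); `struct pipe_readers`, `ALL_CPUS`, `reader_claim()` and `reader_release()` are hypothetical names, and the model assumes at most 64 cpus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per cpu buffer that currently has a consumer. */
struct pipe_readers {
	uint64_t busy;        /* bit n set: the buffer of cpu n has a consumer */
	unsigned int nr_cpus; /* assumed to be in the range 1..64 */
};

#define ALL_CPUS (-1) /* stands in for the common trace_pipe file */

/* Buffers a reader needs: every buffer for ALL_CPUS, otherwise just one. */
static uint64_t reader_mask(const struct pipe_readers *r, int cpu)
{
	if (cpu == ALL_CPUS)
		return r->nr_cpus >= 64 ? UINT64_MAX
					: (UINT64_C(1) << r->nr_cpus) - 1;
	return UINT64_C(1) << cpu;
}

/* Record the reader and return true only if no needed buffer is taken. */
static bool reader_claim(struct pipe_readers *r, int cpu)
{
	uint64_t mask;

	assert(r->nr_cpus >= 1 && r->nr_cpus <= 64);
	assert(cpu == ALL_CPUS || (cpu >= 0 && (unsigned int)cpu < r->nr_cpus));

	mask = reader_mask(r, cpu);
	if (r->busy & mask)
		return false;
	r->busy |= mask;
	return true;
}

/* Give back the buffers claimed for this reader. */
static void reader_release(struct pipe_readers *r, int cpu)
{
	r->busy &= ~reader_mask(r, cpu);
}
```

With two cpus, claiming cpu 0 and cpu 1 separately succeeds, a second claim on cpu 0 fails, and a claim for `ALL_CPUS` succeeds only once both per-cpu readers have been released.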
static int tracing_release_pipe ( struct inode * inode , struct file * file )
{
struct trace_iterator * iter = file - > private_data ;
2013-07-23 19:25:57 +04:00
struct trace_array * tr = inode - > i_private ;
2008-05-12 23:20:46 +04:00
2009-02-25 05:22:28 +03:00
mutex_lock ( & trace_types_lock ) ;
2014-12-16 04:13:31 +03:00
tr - > current_trace - > ref - - ;
2009-12-09 20:37:43 +03:00
if ( iter - > trace - > pipe_close )
2009-12-07 17:06:24 +03:00
iter - > trace - > pipe_close ( iter ) ;
2009-02-25 05:22:28 +03:00
mutex_unlock ( & trace_types_lock ) ;
2009-01-01 02:42:23 +03:00
free_cpumask_var ( iter - > started ) ;
2009-02-25 08:13:16 +03:00
mutex_destroy ( & iter - > mutex ) ;
2008-05-12 23:20:46 +04:00
kfree ( iter ) ;
2013-07-02 07:34:22 +04:00
trace_array_put ( tr ) ;
2008-05-12 23:20:46 +04:00
return 0 ;
}
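The open and release paths above pair `tr->current_trace->ref++` with `tr->current_trace->ref--`. A minimal userspace sketch of how such a count can pin the current tracer while a pipe is open follows. It assumes the count is checked when the tracer is switched and that the switch is refused with -EBUSY; that check is not part of the code shown here, and every name in the sketch (`tracer_model`, `model_set_tracer()`, and so on) is hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the tracer and trace_array structures. */
struct tracer_model {
	const char *name;
	int ref; /* number of open pipe readers using this tracer */
};

struct trace_array_model {
	struct tracer_model *current_trace;
};

/* Mirrors the open path: pin the tracer the iterator points at. */
static void model_pipe_open(struct trace_array_model *tr)
{
	tr->current_trace->ref++;
}

/* Mirrors the release path: drop the pin taken at open time. */
static void model_pipe_release(struct trace_array_model *tr)
{
	assert(tr->current_trace->ref > 0);
	tr->current_trace->ref--;
}

/* Assumed policy: refuse to switch tracers while any reader holds a pin. */
static int model_set_tracer(struct trace_array_model *tr,
			    struct tracer_model *next)
{
	if (tr->current_trace == next)
		return 0;
	if (tr->current_trace->ref)
		return -EBUSY;
	tr->current_trace = next;
	return 0;
}
```

Under this assumption the iterator can keep a plain pointer to the tracer instead of a private copy, because the tracer cannot be swapped out from under an open reader.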
2008-05-12 23:20:49 +04:00
static unsigned int
2013-02-28 18:17:16 +04:00
trace_poll ( struct trace_iterator * iter , struct file * filp , poll_table * poll_table )
2008-05-12 23:20:49 +04:00
{
2013-03-01 04:59:17 +04:00
/* Iterators are static, they should be filled or empty */
if ( trace_buffer_iter ( iter , iter - > cpu_file ) )
return POLLIN | POLLRDNORM ;
2008-05-12 23:20:49 +04:00
2013-03-01 04:59:17 +04:00
if ( trace_flags & TRACE_ITER_BLOCK )
2008-05-12 23:20:49 +04:00
/*
* Always select as readable when in blocking mode
*/
return POLLIN | POLLRDNORM ;
2013-03-01 04:59:17 +04:00
else
tracing: Consolidate max_tr into main trace_array structure
Currently, the way the latency tracers and snapshot feature work
is to have a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, this snapshot buffer is used
to swap the running buffer with this buffer to save the current max
latency.
The only items needed for the max_tr are really just a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_trace but the instances of traces can now use
their own snapshot feature and not have just the top level global_trace have
the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 18:24:35 +04:00
return ring_buffer_poll_wait ( iter - > trace_buffer - > buffer , iter - > cpu_file ,
2013-03-01 04:59:17 +04:00
filp , poll_table ) ;
2008-05-12 23:20:49 +04:00
}
2013-02-28 18:17:16 +04:00
static unsigned int
tracing_poll_pipe ( struct file * filp , poll_table * poll_table )
{
struct trace_iterator * iter = filp - > private_data ;
return trace_poll ( iter , filp , poll_table ) ;
2008-05-12 23:20:49 +04:00
}
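From userspace, the poll callback above is what a reader reaches through poll(2) on the pipe file. The sketch below shows the calling side only; `wait_readable()` is a hypothetical helper, and any readable descriptor (an ordinary pipe, for instance) can stand in for trace_pipe when trying it out.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait until fd is readable.
 * Returns 1 if data is ready, 0 on timeout, -1 on error or hang-up.
 */
static int wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int n;

	assert(fd >= 0);

	n = poll(&pfd, 1, timeout_ms);
	if (n < 0)
		return -1; /* poll() itself failed, errno is set */
	if (n == 0)
		return 0;  /* nothing became readable in time */
	return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A reader that opened the file with O_NONBLOCK would typically call `wait_readable()` first and only then call read(2), so that it does not spin on -EAGAIN.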
2014-12-16 06:31:07 +03:00
/* Must be called with iter->mutex held. */
2009-02-09 09:15:55 +03:00
static int tracing_wait_pipe ( struct file * filp )
2008-05-12 23:20:46 +04:00
{
struct trace_iterator * iter = filp - > private_data ;
2014-06-10 17:46:00 +04:00
int ret ;
2008-05-12 23:20:46 +04:00
while ( trace_empty ( iter ) ) {
2008-05-12 23:20:58 +04:00
2008-05-12 23:21:01 +04:00
if ( ( filp - > f_flags & O_NONBLOCK ) ) {
2009-02-09 09:15:55 +03:00
return - EAGAIN ;
2008-05-12 23:21:01 +04:00
}
2008-05-12 23:20:58 +04:00
2008-05-12 23:20:46 +04:00
/*
2013-01-14 06:54:11 +04:00
* We block until we read something and tracing is disabled .
2008-05-12 23:20:46 +04:00
* We still block if tracing is disabled , but we have never
* read anything . This allows a user to cat this file , and
* then enable tracing . But after we have read something ,
* we give an EOF when tracing is again disabled .
*
* iter - > pos will be 0 if we haven ' t read anything .
*/
2013-07-01 23:58:24 +04:00
if ( ! tracing_is_on ( ) & & iter - > pos )
2008-05-12 23:20:46 +04:00
break ;
2014-04-30 00:07:28 +04:00
mutex_unlock ( & iter - > mutex ) ;
2014-11-10 21:46:34 +03:00
ret = wait_on_pipe ( iter , false ) ;
2014-04-30 00:07:28 +04:00
mutex_lock ( & iter - > mutex ) ;
2014-06-10 17:46:00 +04:00
if ( ret )
return ret ;
2008-05-12 23:20:46 +04:00
}
2009-02-09 09:15:55 +03:00
return 1 ;
}
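The blocking rules in tracing_wait_pipe() can be restated as a pure decision function, which makes the four cases easier to see. This is a userspace sketch of one loop iteration under the conditions visible in the code above; `enum wait_action` and `pipe_wait_step()` are hypothetical names, and the actual sleeping and locking are left out.

```c
#include <stdbool.h>

/* What one pass of the wait loop decides to do. */
enum wait_action {
	WAIT_DATA_READY, /* buffer is not empty: go on and read */
	WAIT_EAGAIN,     /* empty and O_NONBLOCK: fail with -EAGAIN */
	WAIT_EOF,        /* empty, tracing off, something already read */
	WAIT_SLEEP       /* empty: drop the mutex and wait for data */
};

/*
 * empty:      the iterator has nothing to consume
 * nonblock:   the file was opened with O_NONBLOCK
 * tracing_on: tracing is currently enabled
 * pos:        nonzero once the reader has consumed something
 */
static enum wait_action pipe_wait_step(bool empty, bool nonblock,
				       bool tracing_on, long pos)
{
	if (!empty)
		return WAIT_DATA_READY;
	if (nonblock)
		return WAIT_EAGAIN;
	if (!tracing_on && pos)
		return WAIT_EOF;
	return WAIT_SLEEP;
}
```

The `WAIT_SLEEP` result for a reader with `pos == 0` while tracing is off is the case that lets a user cat the file first and enable tracing afterwards.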
/*
* Consumer reader .
*/
static ssize_t
tracing_read_pipe ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
struct trace_iterator * iter = filp - > private_data ;
ssize_t sret ;
/* return any leftover data */
sret = trace_seq_to_user ( & iter - > seq , ubuf , cnt ) ;
if ( sret ! = - EBUSY )
return sret ;
2009-03-02 22:04:40 +03:00
trace_seq_init ( & iter - > seq ) ;
2009-02-09 09:15:55 +03:00
2009-02-25 08:13:16 +03:00
/*
* Avoid more than one consumer on a single file descriptor
* This is just a matter of traces coherency , the ring buffer itself
* is protected .
*/
mutex_lock ( & iter - > mutex ) ;
2009-02-09 09:15:55 +03:00
if ( iter - > trace - > read ) {
sret = iter - > trace - > read ( iter , filp , ubuf , cnt , ppos ) ;
if ( sret )
goto out ;
}
waitagain :
sret = tracing_wait_pipe ( filp ) ;
if ( sret < = 0 )
goto out ;
2008-05-12 23:20:46 +04:00
/* stop when tracing is finished */
2009-02-09 09:15:55 +03:00
if ( trace_empty ( iter ) ) {
sret = 0 ;
2008-05-12 23:21:01 +04:00
goto out ;
2009-02-09 09:15:55 +03:00
}
2008-05-12 23:20:46 +04:00
if ( cnt > = PAGE_SIZE )
cnt = PAGE_SIZE - 1 ;
2008-05-12 23:21:01 +04:00
/* reset all but tr, trace, and overruns */
memset ( & iter - > seq , 0 ,
sizeof ( struct trace_iterator ) -
offsetof ( struct trace_iterator , seq ) ) ;
2013-08-02 21:16:43 +04:00
cpumask_clear ( iter - > started ) ;
2008-05-12 23:21:01 +04:00
iter - > pos = - 1 ;
2008-05-12 23:20:46 +04:00
2009-05-18 15:35:34 +04:00
trace_event_read_lock ( ) ;
tracing: Consolidate protection of reader access to the ring buffer
At the beginning, access to the ring buffer was fully serialized
by trace_types_lock. Patch d7350c3f4569 gives more freedom to readers,
and patch b04cc6b1f6 adds code to protect trace_pipe and cpu#/trace_pipe.
But that is not enough: ring buffer readers are not always
read-only; they may consume data.
This patch serializes accesses to trace, trace_pipe, trace_pipe_raw,
cpu#/trace, cpu#/trace_pipe and cpu#/trace_pipe_raw,
and removes tracing_reader_cpumask, which was used to protect trace_pipe.
Details:
The ring buffer serializes readers, but that is only low-level protection.
The validity of the events (returned by ring_buffer_peek() etc.)
is not protected by the ring buffer.
The content of events may become garbage if we allow another process to consume
these events concurrently:
A) The page of the consumed events may become a normal page
(not the reader page) in the ring buffer, and this page will be rewritten
by the events producer.
B) The page of the consumed events may become a page for splice_read,
and this page will be returned to the system.
This patch adds the trace_access_lock() and trace_access_unlock() primitives.
These primitives allow multiple processes to access different cpu ring buffers
concurrently.
These primitives don't distinguish read-only and read-consume access.
Multiple read-only accesses are also serialized.
And we don't use these primitives when we open files;
we only use them when we read files.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B447D52.1050602@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-01-06 15:08:50 +03:00
trace_access_lock ( iter - > cpu_file ) ;
2010-08-05 18:22:23 +04:00
while ( trace_find_next_entry_inc ( iter ) ! = NULL ) {
2008-09-29 22:18:34 +04:00
enum print_line_t ret ;
2014-11-14 23:49:41 +03:00
int save_len = iter - > seq . seq . len ;
2008-05-12 23:20:48 +04:00
2008-05-12 23:20:47 +04:00
ret = print_trace_line ( iter ) ;
2008-09-29 22:18:34 +04:00
if ( ret = = TRACE_TYPE_PARTIAL_LINE ) {
2008-05-12 23:20:48 +04:00
/* don't print partial lines */
2014-11-14 23:49:41 +03:00
iter - > seq . seq . len = save_len ;
2008-05-12 23:20:46 +04:00
break ;
2008-05-12 23:20:48 +04:00
}
2009-02-06 20:30:44 +03:00
if ( ret ! = TRACE_TYPE_NO_CONSUME )
trace_consume ( iter ) ;
2008-05-12 23:20:46 +04:00
2014-11-14 23:49:41 +03:00
if ( trace_seq_used ( & iter - > seq ) > = cnt )
2008-05-12 23:20:46 +04:00
break ;
2011-03-25 14:05:18 +03:00
/*
* Setting the full flag means we reached the trace_seq buffer
* size and we should leave by partial output condition above .
* One of the trace_seq_ * functions is not used properly .
*/
WARN_ONCE ( iter - > seq . full , " full flag set for trace type %d " ,
iter - > ent - > type ) ;
2008-05-12 23:20:46 +04:00
}
2010-01-06 15:08:50 +03:00
trace_access_unlock ( iter - > cpu_file ) ;
2009-05-18 15:35:34 +04:00
trace_event_read_unlock ( ) ;
2008-05-12 23:20:46 +04:00
/* Now copy what we have to the user */
2008-05-12 23:21:02 +04:00
sret = trace_seq_to_user ( & iter - > seq , ubuf , cnt ) ;
2014-11-14 23:49:41 +03:00
if ( iter - > seq . seq . readpos > = trace_seq_used ( & iter - > seq ) )
2009-03-02 22:04:40 +03:00
trace_seq_init ( & iter - > seq ) ;
2008-09-29 22:23:48 +04:00
/*
2011-03-31 05:57:33 +04:00
* If there was nothing to send to user , in spite of consuming trace
2008-09-29 22:23:48 +04:00
* entries , go back to wait for more entries .
*/
2008-05-12 23:21:02 +04:00
if ( sret = = - EBUSY )
2008-09-29 22:23:48 +04:00
goto waitagain ;
2008-05-12 23:20:46 +04:00
2008-05-12 23:21:01 +04:00
out :
2009-02-25 08:13:16 +03:00
mutex_unlock ( & iter - > mutex ) ;
2008-05-12 23:21:01 +04:00
2008-05-12 23:21:02 +04:00
return sret ;
2008-05-12 23:20:46 +04:00
}
2009-02-09 09:15:56 +03:00
static void tracing_spd_release_pipe ( struct splice_pipe_desc * spd ,
unsigned int idx )
{
__free_page ( spd - > pages [ idx ] ) ;
}
2009-12-16 03:46:48 +03:00
static const struct pipe_buf_operations tracing_pipe_buf_ops = {
2009-02-09 20:06:29 +03:00
. can_merge = 0 ,
. confirm = generic_pipe_buf_confirm ,
tracing: Fix buggered tee(2) on tracing_pipe
In kernel/trace/trace.c we have this:
static void tracing_pipe_buf_release(struct pipe_inode_info *pipe,
struct pipe_buffer *buf)
{
__free_page(buf->page);
}
static const struct pipe_buf_operations tracing_pipe_buf_ops = {
.can_merge = 0,
.map = generic_pipe_buf_map,
.unmap = generic_pipe_buf_unmap,
.confirm = generic_pipe_buf_confirm,
.release = tracing_pipe_buf_release,
.steal = generic_pipe_buf_steal,
.get = generic_pipe_buf_get,
};
with
void generic_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf)
{
page_cache_get(buf->page);
}
and I don't see anything that would've prevented tee(2) called on the pipe
that got stuff spliced into it from that sucker. ->ops->get() will be
called, then buf gets copied into target pipe's ->bufs[] and eventually
readers get to both copies of the buffer. With
get_page(page)
look at that page
__free_page(page)
look at that page
__free_page(page)
which is not a good thing, to put it mildly. AFAICS, that ought to use
the normal generic_pipe_buf_release() (aka page_cache_release(buf->page)),
shouldn't it?
[
SDR - As trace_pipe just allocates the page with alloc_page(GFP_KERNEL),
and doesn't do anything special with it (no LRU logic). The __free_page()
should be fine, as it won't actually free a page that still has a reference count.
Maybe there's a chance to leak memory? Anyway, This change is at a minimum
good for being symmetric with generic_pipe_buf_get, it is fine to add.
]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[ SDR - Removed no longer used tracing_pipe_buf_release ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-17 16:53:39 +04:00
. release = generic_pipe_buf_release ,
2009-02-09 20:06:29 +03:00
. steal = generic_pipe_buf_steal ,
. get = generic_pipe_buf_get ,
2009-02-09 09:15:56 +03:00
} ;
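The commit message quoted above argues that . release should mirror . get so that a buffer duplicated by tee(2) is only freed when the last holder drops it. Here is a minimal userspace model of that symmetry, assuming a plain integer reference count. The struct and helpers (sketch_page, sketch_page_get, sketch_page_put) are hypothetical illustrations and not kernel APIs.

```c
#include <assert.h>

/* Hypothetical stand-in for a reference-counted page. */
struct sketch_page {
	int refcount;  /* number of current holders */
	int freed;     /* set to 1 once the last holder has dropped it */
};

/* A freshly allocated page starts with one holder. */
static void sketch_page_init(struct sketch_page *p)
{
	p->refcount = 1;
	p->freed = 0;
}

/* Models the ->get() side: a second pipe now points at the same page. */
static void sketch_page_get(struct sketch_page *p)
{
	assert(p->refcount > 0);
	p->refcount++;
}

/*
 * Models the ->release() side: drop one reference and free only when
 * it was the last one. Returns 1 if this call freed the page.
 */
static int sketch_page_put(struct sketch_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0) {
		p->freed = 1;
		return 1;
	}
	return 0;
}
```

In this model one get followed by two puts frees the page exactly once, on the second put, which is the pairing the commit asks for.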
2009-02-09 20:06:29 +03:00
static size_t
2009-02-11 04:51:30 +03:00
tracing_fill_pipe_page ( size_t rem , struct trace_iterator * iter )
2009-02-09 20:06:29 +03:00
{
size_t count ;
2014-11-17 21:12:22 +03:00
int save_len ;
2009-02-09 20:06:29 +03:00
int ret ;
/* Seq buffer is page-sized, exactly what we need. */
for ( ; ; ) {
2014-11-17 21:12:22 +03:00
save_len = iter - > seq . seq . len ;
2009-02-09 20:06:29 +03:00
ret = print_trace_line ( iter ) ;
2014-11-17 21:12:22 +03:00
if ( trace_seq_has_overflowed ( & iter - > seq ) ) {
iter - > seq . seq . len = save_len ;
2009-02-09 20:06:29 +03:00
break ;
}
2014-11-17 21:12:22 +03:00
/*
* This should not be hit , because it should only
* be set if the iter - > seq overflowed . But check it
* anyway to be safe .
*/
2009-02-09 20:06:29 +03:00
if ( ret = = TRACE_TYPE_PARTIAL_LINE ) {
2014-11-17 21:12:22 +03:00
iter - > seq . seq . len = save_len ;
break ;
}
2014-11-14 23:49:41 +03:00
count = trace_seq_used ( & iter - > seq ) - save_len ;
2014-11-17 21:12:22 +03:00
if ( rem < count ) {
rem = 0 ;
iter - > seq . seq . len = save_len ;
2009-02-09 20:06:29 +03:00
break ;
}
2009-07-28 16:17:22 +04:00
if ( ret ! = TRACE_TYPE_NO_CONSUME )
trace_consume ( iter ) ;
2009-02-09 20:06:29 +03:00
rem - = count ;
2010-08-05 18:22:23 +04:00
if ( ! trace_find_next_entry_inc ( iter ) ) {
2009-02-09 20:06:29 +03:00
rem = 0 ;
iter - > ent = NULL ;
break ;
}
}
return rem ;
}
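tracing_fill_pipe_page ( ) above records save_len before printing each line and restores it when the line overflows the seq buffer or exceeds the remaining byte budget, so only whole lines are ever emitted. The same pattern is sketched below in plain userspace C, assuming a small fixed-size buffer. sketch_seq, sketch_seq_puts and sketch_fill are hypothetical names, and the 32-byte buffer size is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUF_SIZE 32  /* arbitrary size chosen for the sketch */

struct sketch_seq {
	char buf[SKETCH_BUF_SIZE];
	size_t len;
};

/*
 * Append str. On overflow, copy what fits, mark the buffer full and
 * return 0, which leaves a partial line behind for the caller to undo.
 */
static int sketch_seq_puts(struct sketch_seq *s, const char *str)
{
	size_t n = strlen(str);
	size_t room;

	assert(s->len <= SKETCH_BUF_SIZE);
	room = SKETCH_BUF_SIZE - s->len;
	if (n > room) {
		memcpy(s->buf + s->len, str, room);
		s->len = SKETCH_BUF_SIZE;
		return 0;
	}
	memcpy(s->buf + s->len, str, n);
	s->len += n;
	return 1;
}

/*
 * Emit whole lines until the buffer or the byte budget rem runs out.
 * Returns the remaining budget; *kept receives the number of lines kept.
 */
static size_t sketch_fill(struct sketch_seq *s, const char *const *lines,
			  size_t nlines, size_t rem, size_t *kept)
{
	size_t i;

	*kept = 0;
	for (i = 0; i < nlines; i++) {
		size_t save_len = s->len;
		size_t count;

		if (!sketch_seq_puts(s, lines[i])) {
			s->len = save_len;  /* drop the partial line */
			break;
		}
		count = s->len - save_len;
		if (rem < count) {
			rem = 0;
			s->len = save_len;  /* line exceeds the budget */
			break;
		}
		rem -= count;
		(*kept)++;
	}
	if (i == nlines)
		rem = 0;  /* nothing left to emit */
	return rem;
}
```

A nonzero return value means the buffer filled up while budget remained, which tells the caller to flush the buffer and call again with a fresh one.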
2009-02-09 09:15:56 +03:00
static ssize_t tracing_splice_read_pipe ( struct file * filp ,
loff_t * ppos ,
struct pipe_inode_info * pipe ,
size_t len ,
unsigned int flags )
{
2010-05-20 12:43:18 +04:00
struct page * pages_def [ PIPE_DEF_BUFFERS ] ;
struct partial_page partial_def [ PIPE_DEF_BUFFERS ] ;
2009-02-09 09:15:56 +03:00
struct trace_iterator * iter = filp - > private_data ;
struct splice_pipe_desc spd = {
2010-05-20 12:43:18 +04:00
. pages = pages_def ,
. partial = partial_def ,
2009-02-09 20:06:29 +03:00
. nr_pages = 0 , /* This gets updated below. */
2012-06-12 17:24:40 +04:00
. nr_pages_max = PIPE_DEF_BUFFERS ,
2009-02-09 20:06:29 +03:00
. flags = flags ,
. ops = & tracing_pipe_buf_ops ,
. spd_release = tracing_spd_release_pipe ,
2009-02-09 09:15:56 +03:00
} ;
ssize_t ret ;
2009-02-09 20:06:29 +03:00
size_t rem ;
2009-02-09 09:15:56 +03:00
unsigned int i ;
2010-05-20 12:43:18 +04:00
if ( splice_grow_spd ( pipe , & spd ) )
return - ENOMEM ;
2009-02-25 08:13:16 +03:00
mutex_lock ( & iter - > mutex ) ;
2009-02-09 09:15:56 +03:00
if ( iter - > trace - > splice_read ) {
ret = iter - > trace - > splice_read ( iter , filp ,
ppos , pipe , len , flags ) ;
if ( ret )
2009-02-09 20:06:29 +03:00
goto out_err ;
2009-02-09 09:15:56 +03:00
}
ret = tracing_wait_pipe ( filp ) ;
if ( ret < = 0 )
2009-02-09 20:06:29 +03:00
goto out_err ;
2009-02-09 09:15:56 +03:00
2010-08-05 18:22:23 +04:00
if ( ! iter - > ent & & ! trace_find_next_entry_inc ( iter ) ) {
2009-02-09 09:15:56 +03:00
ret = - EFAULT ;
2009-02-09 20:06:29 +03:00
goto out_err ;
2009-02-09 09:15:56 +03:00
}
2009-05-18 15:35:34 +04:00
trace_event_read_lock ( ) ;
2010-01-06 15:08:50 +03:00
trace_access_lock ( iter - > cpu_file ) ;
2009-05-18 15:35:34 +04:00
2009-02-09 09:15:56 +03:00
/* Fill as many pages as possible. */
2014-04-11 20:01:03 +04:00
for ( i = 0 , rem = len ; i < spd . nr_pages_max & & rem ; i + + ) {
2010-05-20 12:43:18 +04:00
spd . pages [ i ] = alloc_page ( GFP_KERNEL ) ;
if ( ! spd . pages [ i ] )
2009-02-09 20:06:29 +03:00
break ;
2009-02-09 09:15:56 +03:00
2009-02-11 04:51:30 +03:00
rem = tracing_fill_pipe_page ( rem , iter ) ;
2009-02-09 09:15:56 +03:00
/* Copy the data into the page, so we can start over. */
ret = trace_seq_to_buffer ( & iter - > seq ,
2010-05-20 12:43:18 +04:00
page_address ( spd . pages [ i ] ) ,
2014-11-14 23:49:41 +03:00
trace_seq_used ( & iter - > seq ) ) ;
2009-02-09 09:15:56 +03:00
if ( ret < 0 ) {
2010-05-20 12:43:18 +04:00
__free_page ( spd . pages [ i ] ) ;
2009-02-09 09:15:56 +03:00
break ;
}
2010-05-20 12:43:18 +04:00
spd . partial [ i ] . offset = 0 ;
2014-11-14 23:49:41 +03:00
spd . partial [ i ] . len = trace_seq_used ( & iter - > seq ) ;
2009-02-09 09:15:56 +03:00
2009-03-02 22:04:40 +03:00
trace_seq_init ( & iter - > seq ) ;
2009-02-09 09:15:56 +03:00
}
2010-01-06 15:08:50 +03:00
trace_access_unlock ( iter - > cpu_file ) ;
2009-05-18 15:35:34 +04:00
trace_event_read_unlock ( ) ;
2009-02-25 08:13:16 +03:00
mutex_unlock ( & iter - > mutex ) ;
2009-02-09 09:15:56 +03:00
spd . nr_pages = i ;
2010-05-20 12:43:18 +04:00
ret = splice_to_pipe ( pipe , & spd ) ;
out :
2012-06-12 17:24:40 +04:00
splice_shrink_spd ( & spd ) ;
2010-05-20 12:43:18 +04:00
return ret ;
2009-02-09 09:15:56 +03:00
2009-02-09 20:06:29 +03:00
out_err :
2009-02-25 08:13:16 +03:00
mutex_unlock ( & iter - > mutex ) ;
2010-05-20 12:43:18 +04:00
goto out ;
2009-02-09 09:15:56 +03:00
}
2008-05-12 23:20:59 +04:00
static ssize_t
tracing_entries_read ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
2013-07-23 19:26:06 +04:00
struct inode * inode = file_inode ( filp ) ;
struct trace_array * tr = inode - > i_private ;
int cpu = tracing_get_cpu ( inode ) ;
2012-02-03 00:00:41 +04:00
char buf [ 64 ] ;
int r = 0 ;
ssize_t ret ;
2008-05-12 23:20:59 +04:00
2009-03-12 20:53:25 +03:00
mutex_lock ( & trace_types_lock ) ;
2012-02-03 00:00:41 +04:00
2013-07-23 19:26:06 +04:00
if ( cpu = = RING_BUFFER_ALL_CPUS ) {
2012-02-03 00:00:41 +04:00
int cpu , buf_size_same ;
unsigned long size ;
size = 0 ;
buf_size_same = 1 ;
/* check if all cpu sizes are same */
for_each_tracing_cpu ( cpu ) {
/* fill in the size from first enabled cpu */
if ( size = = 0 )
tracing: Consolidate max_tr into main trace_array structure
Currently, the latency tracers and the snapshot feature work
by having a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, this snapshot buffer is
swapped with the running buffer to save the current max
latency.
The only items max_tr really needs are a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by
max_tr, making max_tr mostly bloat.
This change removes max_tr completely and adds a new structure
called trace_buffer that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from max_tr, but trace instances can now use
their own snapshot feature, instead of only the top level global_trace having
the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 18:24:35 +04:00
size = per_cpu_ptr ( tr - > trace_buffer . data , cpu ) - > entries ;
if ( size ! = per_cpu_ptr ( tr - > trace_buffer . data , cpu ) - > entries ) {
2012-02-03 00:00:41 +04:00
buf_size_same = 0 ;
break ;
}
}
if ( buf_size_same ) {
if ( ! ring_buffer_expanded )
r = sprintf ( buf , " %lu (expanded: %lu) \n " ,
size > > 10 ,
trace_buf_size > > 10 ) ;
else
r = sprintf ( buf , " %lu \n " , size > > 10 ) ;
} else
r = sprintf ( buf , " X \n " ) ;
} else
2013-07-23 19:26:06 +04:00
r = sprintf ( buf , " %lu \n " , per_cpu_ptr ( tr - > trace_buffer . data , cpu ) - > entries > > 10 ) ;
2012-02-03 00:00:41 +04:00
2009-03-12 20:53:25 +03:00
mutex_unlock ( & trace_types_lock ) ;
2012-02-03 00:00:41 +04:00
ret = simple_read_from_buffer ( ubuf , cnt , ppos , buf , r ) ;
return ret ;
2008-05-12 23:20:59 +04:00
}
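For the all-cpus case, tracing_entries_read ( ) above reports one size in KB when every per-cpu buffer has the same size and prints "X" otherwise. That check is sketched here as a standalone helper, assuming the per-cpu sizes arrive as a plain array of byte counts. sketch_format_size and its parameters are hypothetical, and the "(expanded: ...)" variant is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write the common per-cpu buffer size in KB into buf, or "X" when the
 * cpus disagree. Returns the number of characters written, as
 * snprintf() does.
 */
static int sketch_format_size(char *buf, size_t buflen,
			      const unsigned long *entries, size_t ncpus)
{
	unsigned long size = 0;
	size_t cpu;

	assert(buf != NULL && buflen > 0);

	for (cpu = 0; cpu < ncpus; cpu++) {
		/* take the size from the first cpu with a nonzero size */
		if (size == 0)
			size = entries[cpu];
		if (size != entries[cpu])
			return snprintf(buf, buflen, "X\n");
	}
	/* sizes are tracked in bytes and reported in KB */
	return snprintf(buf, buflen, "%lu\n", size >> 10);
}
```

The right shift by 10 matches the left shift in tracing_entries_write ( ), so a value written in KB reads back unchanged.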
static ssize_t
tracing_entries_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
2013-07-23 19:26:06 +04:00
struct inode * inode = file_inode ( filp ) ;
struct trace_array * tr = inode - > i_private ;
2008-05-12 23:20:59 +04:00
unsigned long val ;
2011-06-14 04:51:57 +04:00
int ret ;
2008-05-12 23:20:59 +04:00
2011-06-07 23:58:27 +04:00
ret = kstrtoul_from_user ( ubuf , cnt , 10 , & val ) ;
if ( ret )
2008-05-12 23:21:00 +04:00
return ret ;
2008-05-12 23:20:59 +04:00
/* must have at least 1 entry */
if ( ! val )
return - EINVAL ;
2008-11-13 08:09:35 +03:00
/* value is in KB */
val < < = 10 ;
2013-07-23 19:26:06 +04:00
ret = tracing_resize_ring_buffer ( tr , val , tracing_get_cpu ( inode ) ) ;
2011-06-14 04:51:57 +04:00
if ( ret < 0 )
return ret ;
2008-05-12 23:20:59 +04:00
2009-10-24 03:36:16 +04:00
* ppos + = cnt ;
2008-05-12 23:20:59 +04:00
2011-06-14 04:51:57 +04:00
return cnt ;
}
2008-11-11 05:46:00 +03:00
2011-08-17 01:46:15 +04:00
static ssize_t
tracing_total_entries_read ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
struct trace_array * tr = filp - > private_data ;
char buf [ 64 ] ;
int r , cpu ;
unsigned long size = 0 , expanded_size = 0 ;
mutex_lock ( & trace_types_lock ) ;
for_each_tracing_cpu ( cpu ) {
2013-03-05 18:24:35 +04:00
size + = per_cpu_ptr ( tr - > trace_buffer . data , cpu ) - > entries > > 10 ;
2011-08-17 01:46:15 +04:00
if ( ! ring_buffer_expanded )
expanded_size + = trace_buf_size > > 10 ;
}
if ( ring_buffer_expanded )
r = sprintf ( buf , " %lu \n " , size ) ;
else
r = sprintf ( buf , " %lu (expanded: %lu) \n " , size , expanded_size ) ;
mutex_unlock ( & trace_types_lock ) ;
return simple_read_from_buffer ( ubuf , cnt , ppos , buf , r ) ;
}
2011-06-14 04:51:57 +04:00
static ssize_t
tracing_free_buffer_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
/*
* There is no need to read what the user has written , this function
* is just to make sure that there is no error when " echo " is used
*/
* ppos + = cnt ;
2008-05-12 23:20:59 +04:00
return cnt ;
}
2011-06-14 04:51:57 +04:00
static int
tracing_free_buffer_release ( struct inode * inode , struct file * filp )
{
2012-05-11 21:29:49 +04:00
struct trace_array * tr = inode - > i_private ;
2011-06-15 06:44:07 +04:00
/* disable tracing ? */
if ( trace_flags & TRACE_ITER_STOP_ON_FREE )
2013-08-03 05:36:15 +04:00
tracer_tracing_off ( tr ) ;
2011-06-14 04:51:57 +04:00
/* resize the ring buffer to 0 */
2012-05-11 21:29:49 +04:00
tracing_resize_ring_buffer ( tr , 0 , RING_BUFFER_ALL_CPUS ) ;
2011-06-14 04:51:57 +04:00
2013-07-02 07:34:22 +04:00
trace_array_put ( tr ) ;
2011-06-14 04:51:57 +04:00
return 0 ;
}
2008-09-16 23:06:42 +04:00
static ssize_t
tracing_mark_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * fpos )
{
2011-09-22 19:50:27 +04:00
unsigned long addr = ( unsigned long ) ubuf ;
2013-07-02 02:31:24 +04:00
struct trace_array * tr = filp - > private_data ;
2011-09-22 19:50:27 +04:00
struct ring_buffer_event * event ;
struct ring_buffer * buffer ;
struct print_entry * entry ;
unsigned long irq_flags ;
struct page * pages [ 2 ] ;
2012-05-12 07:28:49 +04:00
void * map_page [ 2 ] ;
2011-09-22 19:50:27 +04:00
int nr_pages = 1 ;
ssize_t written ;
int offset ;
int size ;
int len ;
int ret ;
2012-05-12 07:28:49 +04:00
int i ;
2008-09-16 23:06:42 +04:00
2008-11-08 06:36:02 +03:00
if ( tracing_disabled )
2008-09-16 23:06:42 +04:00
return - EINVAL ;
2012-09-08 05:12:19 +04:00
if ( ! ( trace_flags & TRACE_ITER_MARKERS ) )
return - EINVAL ;
2008-09-16 23:06:42 +04:00
if ( cnt > TRACE_BUF_SIZE )
cnt = TRACE_BUF_SIZE ;
2011-09-22 19:50:27 +04:00
/*
* Userspace is injecting traces into the kernel trace buffer .
* We want to be as non intrusive as possible .
* To do so , we do not want to allocate any special buffers
* or take any locks , but instead write the userspace data
* straight into the ring buffer .
*
* First we need to pin the userspace buffer into memory .
* It is most likely already resident , because userspace just
* referenced it , but there ' s no guarantee . By using get_user_pages_fast ( )
* and kmap_atomic / kunmap_atomic ( ) we can get access to the
* pages directly . We then write the data directly into the
* ring buffer .
*/
BUILD_BUG_ON ( TRACE_BUF_SIZE > = PAGE_SIZE ) ;
2008-09-16 23:06:42 +04:00
2011-09-22 19:50:27 +04:00
/* check if we cross pages */
if ( ( addr & PAGE_MASK ) ! = ( ( addr + cnt ) & PAGE_MASK ) )
nr_pages = 2 ;
offset = addr & ( PAGE_SIZE - 1 ) ;
addr & = PAGE_MASK ;
ret = get_user_pages_fast ( addr , nr_pages , 0 , pages ) ;
if ( ret < nr_pages ) {
while ( - - ret > = 0 )
put_page ( pages [ ret ] ) ;
written = - EFAULT ;
goto out ;
2008-09-16 23:06:42 +04:00
}
2011-09-22 19:50:27 +04:00
2012-05-12 07:28:49 +04:00
for ( i = 0 ; i < nr_pages ; i + + )
map_page [ i ] = kmap_atomic ( pages [ i ] ) ;
2011-09-22 19:50:27 +04:00
local_save_flags ( irq_flags ) ;
size = sizeof ( * entry ) + cnt + 2 ; /* possible \n added */
2013-07-02 02:31:24 +04:00
buffer = tr - > trace_buffer . buffer ;
2011-09-22 19:50:27 +04:00
event = trace_buffer_lock_reserve ( buffer , TRACE_PRINT , size ,
irq_flags , preempt_count ( ) ) ;
if ( ! event ) {
/* Ring buffer disabled, return as if not open for write */
written = - EBADF ;
goto out_unlock ;
2008-09-16 23:06:42 +04:00
}
2011-09-22 19:50:27 +04:00
entry = ring_buffer_event_data ( event ) ;
entry - > ip = _THIS_IP_ ;
if ( nr_pages = = 2 ) {
len = PAGE_SIZE - offset ;
2012-05-12 07:28:49 +04:00
memcpy ( & entry - > buf , map_page [ 0 ] + offset , len ) ;
memcpy ( & entry - > buf [ len ] , map_page [ 1 ] , cnt - len ) ;
2009-11-16 22:56:13 +03:00
} else
2012-05-12 07:28:49 +04:00
memcpy ( & entry - > buf , map_page [ 0 ] + offset , cnt ) ;
2008-09-16 23:06:42 +04:00
2011-09-22 19:50:27 +04:00
if ( entry - > buf [ cnt - 1 ] ! = ' \n ' ) {
entry - > buf [ cnt ] = ' \n ' ;
entry - > buf [ cnt + 1 ] = ' \0 ' ;
} else
entry - > buf [ cnt ] = ' \0 ' ;
2012-10-11 20:14:25 +04:00
__buffer_unlock_commit ( buffer , event ) ;
2008-09-16 23:06:42 +04:00
2011-09-22 19:50:27 +04:00
written = cnt ;
2008-09-16 23:06:42 +04:00
2011-09-22 19:50:27 +04:00
* fpos + = written ;
tracing: Sanitize value returned from write(trace_marker, "...", len)
When userspace code writes non-new-line-terminated string to trace_marker
file, write handler appends new-line and returns number of bytes written
to trace buffer, so
write(fd, "abc", 3) will return 4
That's unexpected and unfortunately it confuses glibc's fprintf function.
Example:
int main() {
fprintf(stderr, "abc");
return 0;
}
$ gcc test.c -o test
$ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
$ ./test 2>/sys/kernel/debug/tracing/trace_marker
results in infinite loop:
write(fd, "abc", 3) = 4
write(fd, "", 1) = 0
write(fd, "", 1) = 0
write(fd, "", 1) = 0
write(fd, "", 1) = 0
write(fd, "", 1) = 0
write(fd, "", 1) = 0
write(fd, "", 1) = 0
(...)
...and kernel trace buffer full of empty markers.
Fix it by sanitizing write return value.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
LKML-Reference: <20100727231801.GB2826@joi.lan>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-07-28 03:18:01 +04:00
2011-09-22 19:50:27 +04:00
out_unlock :
2014-12-18 05:50:56 +03:00
for ( i = nr_pages - 1 ; i > = 0 ; i - - ) {
2012-05-12 07:28:49 +04:00
kunmap_atomic ( map_page [ i ] ) ;
put_page ( pages [ i ] ) ;
}
2011-09-22 19:50:27 +04:00
out :
2010-07-28 03:18:01 +04:00
return written ;
2008-09-16 23:06:42 +04:00
}
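The page-split copy and newline handling in tracing_mark_write() can be tried out in userspace. The following is only a sketch under assumptions: the page size is fixed at 4096, plain char arrays stand in for the kmapped pages, and all sketch_* names are hypothetical helpers, not kernel functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL                       /* assumed page size */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Same test as the handler: do addr and addr + cnt lie in different pages? */
static int sketch_nr_pages(uintptr_t addr, size_t cnt)
{
	return ((addr & SKETCH_PAGE_MASK) !=
		((addr + cnt) & SKETCH_PAGE_MASK)) ? 2 : 1;
}

/*
 * Copy cnt bytes (cnt >= 1) that start at `offset` inside page0 and may
 * continue at the start of page1, then terminate the text the way the
 * handler does. dst needs room for cnt + 2 bytes.
 * Returns cnt, not the number of bytes stored, so an appended '\n'
 * never inflates the value reported to the writer.
 */
static size_t sketch_copy_marker(char *dst, const char *page0,
				 const char *page1, size_t offset,
				 size_t cnt, int nr_pages)
{
	if (nr_pages == 2) {
		size_t len = SKETCH_PAGE_SIZE - offset;

		memcpy(dst, page0 + offset, len);
		memcpy(dst + len, page1, cnt - len);
	} else {
		memcpy(dst, page0 + offset, cnt);
	}

	if (dst[cnt - 1] != '\n') {
		dst[cnt] = '\n';
		dst[cnt + 1] = '\0';
	} else {
		dst[cnt] = '\0';
	}
	return cnt;
}
```

Calling sketch_nr_pages() with an address two bytes before a page boundary and a length of 3 reports two pages, and sketch_copy_marker() then stitches the two halves together before appending the newline.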
2009-12-08 06:16:11 +03:00
static int tracing_clock_show ( struct seq_file * m , void * v )
2009-08-25 12:12:56 +04:00
{
2012-05-11 21:29:49 +04:00
struct trace_array * tr = m - > private ;
2009-08-25 12:12:56 +04:00
int i ;
for ( i = 0 ; i < ARRAY_SIZE ( trace_clocks ) ; i + + )
2009-12-08 06:16:11 +03:00
seq_printf ( m ,
2009-08-25 12:12:56 +04:00
" %s%s%s%s " , i ? " " : " " ,
2012-05-11 21:29:49 +04:00
i = = tr - > clock_id ? " [ " : " " , trace_clocks [ i ] . name ,
i = = tr - > clock_id ? " ] " : " " ) ;
2009-12-08 06:16:11 +03:00
seq_putc ( m , ' \n ' ) ;
2009-08-25 12:12:56 +04:00
2009-12-08 06:16:11 +03:00
return 0 ;
2009-08-25 12:12:56 +04:00
}
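The output format of tracing_clock_show() (names separated by spaces, with the active clock in brackets) can be reproduced with a small userspace sketch. The helper name sketch_format_clocks() and the caller-supplied name table are assumptions made for illustration; snprintf() into a fixed buffer replaces seq_printf().

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write all names separated by single spaces, bracketing names[selected],
 * and finish with a newline. Returns the number of characters written
 * (excluding the terminating NUL) or -1 if the output does not fit.
 */
static int sketch_format_clocks(char *out, size_t outsz,
				const char *const *names, size_t n,
				size_t selected)
{
	size_t used = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int r = snprintf(out + used, outsz - used, "%s%s%s%s",
				 i ? " " : "",
				 i == selected ? "[" : "", names[i],
				 i == selected ? "]" : "");

		if (r < 0 || (size_t)r >= outsz - used)
			return -1;	/* truncated */
		used += (size_t)r;
	}

	if (used + 2 > outsz)
		return -1;
	out[used++] = '\n';
	out[used] = '\0';
	return (int)used;
}
```

With three illustrative names "local", "global" and "counter" and index 1 selected, the buffer holds "local [global] counter" followed by a newline.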
2014-02-11 08:38:46 +04:00
static int tracing_set_clock ( struct trace_array * tr , const char * clockstr )
2009-08-25 12:12:56 +04:00
{
int i ;
for ( i = 0 ; i < ARRAY_SIZE ( trace_clocks ) ; i + + ) {
if ( strcmp ( trace_clocks [ i ] . name , clockstr ) = = 0 )
break ;
}
if ( i = = ARRAY_SIZE ( trace_clocks ) )
return - EINVAL ;
mutex_lock ( & trace_types_lock ) ;
2012-05-11 21:29:49 +04:00
tr - > clock_id = i ;
tracing: Consolidate max_tr into main trace_array structure
Currently, the way the latency tracers and snapshot feature works
is to have a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, this snapshot buffer is used
to swap the running buffer with this buffer to save the current max
latency.
The only items needed for the max_tr is really just a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer, that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array, now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_trace but the instances of traces can now use
their own snapshot feature and not have just the top level global_trace have
the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 18:24:35 +04:00
ring_buffer_set_clock ( tr - > trace_buffer . buffer , trace_clocks [ i ] . func ) ;
2009-08-25 12:12:56 +04:00
2012-10-12 03:27:52 +04:00
/*
* New clock may not be consistent with the previous clock .
* Reset the buffer so that it doesn ' t have incomparable timestamps .
*/
2013-08-03 05:36:16 +04:00
tracing_reset_online_cpus ( & tr - > trace_buffer ) ;
2013-03-05 18:24:35 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
if ( tr - > flags & TRACE_ARRAY_FL_GLOBAL & & tr - > max_buffer . buffer )
ring_buffer_set_clock ( tr - > max_buffer . buffer , trace_clocks [ i ] . func ) ;
2013-08-03 05:36:16 +04:00
tracing_reset_online_cpus ( & tr - > max_buffer ) ;
2013-03-05 18:24:35 +04:00
# endif
2012-10-12 03:27:52 +04:00
2009-08-25 12:12:56 +04:00
mutex_unlock ( & trace_types_lock ) ;
2014-02-11 08:38:46 +04:00
return 0 ;
}
static ssize_t tracing_clock_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * fpos )
{
struct seq_file * m = filp - > private_data ;
struct trace_array * tr = m - > private ;
char buf [ 64 ] ;
const char * clockstr ;
int ret ;
if ( cnt > = sizeof ( buf ) )
return - EINVAL ;
if ( copy_from_user ( & buf , ubuf , cnt ) )
return - EFAULT ;
buf [ cnt ] = 0 ;
clockstr = strstrip ( buf ) ;
ret = tracing_set_clock ( tr , clockstr ) ;
if ( ret )
return ret ;
2009-08-25 12:12:56 +04:00
* fpos + = cnt ;
return cnt ;
}
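tracing_clock_write() and tracing_set_clock() together follow a common pattern: bound the input, NUL-terminate it, strip whitespace, then look the name up in a table. A userspace sketch of that pattern might look as follows; sketch_parse_clock() is a hypothetical name, memcpy() stands in for copy_from_user(), and the whitespace stripping is a hand-written approximation of strstrip().

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the matching name, or -EINVAL. */
static int sketch_parse_clock(const char *in, size_t cnt,
			      const char *const *names, size_t n)
{
	char buf[64];
	char *s, *end;
	size_t i;

	/* Reject input that would not leave room for the terminator. */
	if (cnt >= sizeof(buf))
		return -EINVAL;

	memcpy(buf, in, cnt);
	buf[cnt] = '\0';

	/* Strip leading and trailing whitespace (e.g. the '\n' from echo). */
	s = buf;
	while (isspace((unsigned char)*s))
		s++;
	end = s + strlen(s);
	while (end > s && isspace((unsigned char)end[-1]))
		end--;
	*end = '\0';

	for (i = 0; i < n; i++) {
		if (strcmp(names[i], s) == 0)
			return (int)i;
	}
	return -EINVAL;
}
```

The size check uses >= rather than >, mirroring the handler: a 64-byte write would leave no space for the trailing NUL.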
2009-12-08 06:16:11 +03:00
static int tracing_clock_open ( struct inode * inode , struct file * file )
{
2013-07-02 07:34:22 +04:00
struct trace_array * tr = inode - > i_private ;
int ret ;
2009-12-08 06:16:11 +03:00
if ( tracing_disabled )
return - ENODEV ;
2012-05-11 21:29:49 +04:00
2013-07-02 07:34:22 +04:00
if ( trace_array_get ( tr ) )
return - ENODEV ;
ret = single_open ( file , tracing_clock_show , inode - > i_private ) ;
if ( ret < 0 )
trace_array_put ( tr ) ;
return ret ;
2009-12-08 06:16:11 +03:00
}
2013-03-06 01:18:16 +04:00
struct ftrace_buffer_info {
struct trace_iterator iter ;
void * spare ;
unsigned int read ;
} ;
2012-12-26 06:53:00 +04:00
# ifdef CONFIG_TRACER_SNAPSHOT
static int tracing_snapshot_open ( struct inode * inode , struct file * file )
{
2013-07-23 19:26:10 +04:00
struct trace_array * tr = inode - > i_private ;
2012-12-26 06:53:00 +04:00
struct trace_iterator * iter ;
2012-05-11 21:29:49 +04:00
struct seq_file * m ;
2012-12-26 06:53:00 +04:00
int ret = 0 ;
2013-07-02 06:50:29 +04:00
if ( trace_array_get ( tr ) < 0 )
return - ENODEV ;
2012-12-26 06:53:00 +04:00
if ( file - > f_mode & FMODE_READ ) {
2013-07-23 19:26:10 +04:00
iter = __tracing_open ( inode , file , true ) ;
2012-12-26 06:53:00 +04:00
if ( IS_ERR ( iter ) )
ret = PTR_ERR ( iter ) ;
2012-05-11 21:29:49 +04:00
} else {
/* Writes still need the seq_file to hold the private data */
2013-07-18 22:18:44 +04:00
ret = - ENOMEM ;
2012-05-11 21:29:49 +04:00
m = kzalloc ( sizeof ( * m ) , GFP_KERNEL ) ;
if ( ! m )
2013-07-18 22:18:44 +04:00
goto out ;
2012-05-11 21:29:49 +04:00
iter = kzalloc ( sizeof ( * iter ) , GFP_KERNEL ) ;
if ( ! iter ) {
kfree ( m ) ;
2013-07-18 22:18:44 +04:00
goto out ;
2012-05-11 21:29:49 +04:00
}
2013-07-18 22:18:44 +04:00
ret = 0 ;
2013-07-02 06:50:29 +04:00
iter - > tr = tr ;
2013-07-23 19:26:10 +04:00
iter - > trace_buffer = & tr - > max_buffer ;
iter - > cpu_file = tracing_get_cpu ( inode ) ;
2012-05-11 21:29:49 +04:00
m - > private = iter ;
file - > private_data = m ;
2012-12-26 06:53:00 +04:00
}
2013-07-18 22:18:44 +04:00
out :
2013-07-02 06:50:29 +04:00
if ( ret < 0 )
trace_array_put ( tr ) ;
2012-12-26 06:53:00 +04:00
return ret ;
}
static ssize_t
tracing_snapshot_write ( struct file * filp , const char __user * ubuf , size_t cnt ,
loff_t * ppos )
{
2012-05-11 21:29:49 +04:00
struct seq_file * m = filp - > private_data ;
struct trace_iterator * iter = m - > private ;
struct trace_array * tr = iter - > tr ;
2012-12-26 06:53:00 +04:00
unsigned long val ;
int ret ;
ret = tracing_update_buffers ( ) ;
if ( ret < 0 )
return ret ;
ret = kstrtoul_from_user ( ubuf , cnt , 10 , & val ) ;
if ( ret )
return ret ;
mutex_lock ( & trace_types_lock ) ;
2012-05-11 21:29:49 +04:00
if ( tr - > current_trace - > use_max_tr ) {
2012-12-26 06:53:00 +04:00
ret = - EBUSY ;
goto out ;
}
switch ( val ) {
case 0 :
2013-03-05 23:35:11 +04:00
if ( iter - > cpu_file ! = RING_BUFFER_ALL_CPUS ) {
ret = - EINVAL ;
break ;
2012-12-26 06:53:00 +04:00
}
2013-03-12 19:17:54 +04:00
if ( tr - > allocated_snapshot )
free_snapshot ( tr ) ;
2012-12-26 06:53:00 +04:00
break ;
case 1 :
2013-03-05 23:35:11 +04:00
/* Only allow per-cpu swap if the ring buffer supports it */
# ifndef CONFIG_RING_BUFFER_ALLOW_SWAP
if ( iter - > cpu_file ! = RING_BUFFER_ALL_CPUS ) {
ret = - EINVAL ;
break ;
}
# endif
2013-03-06 03:25:02 +04:00
if ( ! tr - > allocated_snapshot ) {
2013-03-12 19:17:54 +04:00
ret = alloc_snapshot ( tr ) ;
2012-12-26 06:53:00 +04:00
if ( ret < 0 )
break ;
}
local_irq_disable ( ) ;
/* Now, we're going to swap */
2013-03-05 23:35:11 +04:00
if ( iter - > cpu_file = = RING_BUFFER_ALL_CPUS )
2013-03-06 06:23:55 +04:00
update_max_tr ( tr , current , smp_processor_id ( ) ) ;
2013-03-05 23:35:11 +04:00
else
2013-03-06 06:23:55 +04:00
update_max_tr_single ( tr , current , iter - > cpu_file ) ;
2012-12-26 06:53:00 +04:00
local_irq_enable ( ) ;
break ;
default :
2013-03-06 03:25:02 +04:00
if ( tr - > allocated_snapshot ) {
2013-03-05 23:35:11 +04:00
if ( iter - > cpu_file = = RING_BUFFER_ALL_CPUS )
tracing_reset_online_cpus ( & tr - > max_buffer ) ;
else
tracing_reset ( & tr - > max_buffer , iter - > cpu_file ) ;
}
2012-12-26 06:53:00 +04:00
break ;
}
if ( ret > = 0 ) {
* ppos + = cnt ;
ret = cnt ;
}
out :
mutex_unlock ( & trace_types_lock ) ;
return ret ;
}
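The switch in tracing_snapshot_write() gives the written value three meanings: 0 frees the snapshot buffer, 1 allocates it if needed and swaps, and any other value clears an already allocated snapshot. The sketch below models only that state machine. It deliberately leaves out the -EBUSY check, the per-CPU restrictions and all locking, and the struct and function names are invented for illustration.

```c
#include <stdbool.h>

struct sketch_snapshot {
	bool allocated;	/* models tr->allocated_snapshot */
	int swaps;	/* how many swaps were requested */
	int clears;	/* how many times the snapshot was reset */
};

static void sketch_snapshot_write(struct sketch_snapshot *s,
				  unsigned long val)
{
	switch (val) {
	case 0:
		/* free the snapshot buffer if there is one */
		s->allocated = false;
		break;
	case 1:
		/* allocate on first use, then swap with the live buffer */
		if (!s->allocated)
			s->allocated = true;
		s->swaps++;
		break;
	default:
		/* clearing is a no-op unless a snapshot exists */
		if (s->allocated)
			s->clears++;
		break;
	}
}
```

One consequence visible in the model: writing 2 before any snapshot was taken changes nothing, because there is no buffer to clear yet.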
2012-05-11 21:29:49 +04:00
static int tracing_snapshot_release ( struct inode * inode , struct file * file )
{
struct seq_file * m = file - > private_data ;
2013-07-02 06:50:29 +04:00
int ret ;
ret = tracing_release ( inode , file ) ;
2012-05-11 21:29:49 +04:00
if ( file - > f_mode & FMODE_READ )
2013-07-02 06:50:29 +04:00
return ret ;
2012-05-11 21:29:49 +04:00
/* If write only, the seq_file is just a stub */
if ( m )
kfree ( m - > private ) ;
kfree ( m ) ;
return 0 ;
}
2013-03-06 01:18:16 +04:00
static int tracing_buffers_open ( struct inode * inode , struct file * filp ) ;
static ssize_t tracing_buffers_read ( struct file * filp , char __user * ubuf ,
size_t count , loff_t * ppos ) ;
static int tracing_buffers_release ( struct inode * inode , struct file * file ) ;
static ssize_t tracing_buffers_splice_read ( struct file * file , loff_t * ppos ,
struct pipe_inode_info * pipe , size_t len , unsigned int flags ) ;
static int snapshot_raw_open ( struct inode * inode , struct file * filp )
{
struct ftrace_buffer_info * info ;
int ret ;
ret = tracing_buffers_open ( inode , filp ) ;
if ( ret < 0 )
return ret ;
info = filp - > private_data ;
if ( info - > iter . trace - > use_max_tr ) {
tracing_buffers_release ( inode , filp ) ;
return - EBUSY ;
}
info - > iter . snapshot = true ;
info - > iter . trace_buffer = & info - > iter . tr - > max_buffer ;
return ret ;
}
2012-12-26 06:53:00 +04:00
# endif /* CONFIG_TRACER_SNAPSHOT */
2014-07-18 15:17:27 +04:00
static const struct file_operations tracing_thresh_fops = {
. open = tracing_open_generic ,
. read = tracing_thresh_read ,
. write = tracing_thresh_write ,
. llseek = generic_file_llseek ,
} ;
2009-03-06 05:44:55 +03:00
static const struct file_operations tracing_max_lat_fops = {
2008-05-12 23:20:46 +04:00
. open = tracing_open_generic ,
. read = tracing_max_lat_read ,
. write = tracing_max_lat_write ,
2010-07-08 01:40:11 +04:00
. llseek = generic_file_llseek ,
2008-05-12 23:20:42 +04:00
} ;
2009-03-06 05:44:55 +03:00
static const struct file_operations set_tracer_fops = {
2008-05-12 23:20:46 +04:00
. open = tracing_open_generic ,
. read = tracing_set_trace_read ,
. write = tracing_set_trace_write ,
2010-07-08 01:40:11 +04:00
. llseek = generic_file_llseek ,
2008-05-12 23:20:42 +04:00
} ;
2009-03-06 05:44:55 +03:00
static const struct file_operations tracing_pipe_fops = {
2008-05-12 23:20:46 +04:00
. open = tracing_open_pipe ,
2008-05-12 23:20:49 +04:00
. poll = tracing_poll_pipe ,
2008-05-12 23:20:46 +04:00
. read = tracing_read_pipe ,
2009-02-09 09:15:56 +03:00
. splice_read = tracing_splice_read_pipe ,
2008-05-12 23:20:46 +04:00
. release = tracing_release_pipe ,
2010-07-08 01:40:11 +04:00
. llseek = no_llseek ,
2008-05-12 23:20:46 +04:00
} ;
2009-03-06 05:44:55 +03:00
static const struct file_operations tracing_entries_fops = {
2013-07-23 19:26:06 +04:00
. open = tracing_open_generic_tr ,
2008-05-12 23:20:59 +04:00
. read = tracing_entries_read ,
. write = tracing_entries_write ,
2010-07-08 01:40:11 +04:00
. llseek = generic_file_llseek ,
2013-07-23 19:26:06 +04:00
. release = tracing_release_generic_tr ,
2008-05-12 23:20:59 +04:00
} ;
2011-08-17 01:46:15 +04:00
static const struct file_operations tracing_total_entries_fops = {
2013-07-02 07:34:22 +04:00
. open = tracing_open_generic_tr ,
2011-08-17 01:46:15 +04:00
. read = tracing_total_entries_read ,
. llseek = generic_file_llseek ,
2013-07-02 07:34:22 +04:00
. release = tracing_release_generic_tr ,
2011-08-17 01:46:15 +04:00
} ;
2011-06-14 04:51:57 +04:00
static const struct file_operations tracing_free_buffer_fops = {
2013-07-02 07:34:22 +04:00
. open = tracing_open_generic_tr ,
2011-06-14 04:51:57 +04:00
. write = tracing_free_buffer_write ,
. release = tracing_free_buffer_release ,
} ;
2009-03-06 05:44:55 +03:00
static const struct file_operations tracing_mark_fops = {
2013-07-02 07:34:22 +04:00
. open = tracing_open_generic_tr ,
2008-09-16 23:06:42 +04:00
. write = tracing_mark_write ,
2010-07-08 01:40:11 +04:00
. llseek = generic_file_llseek ,
2013-07-02 07:34:22 +04:00
. release = tracing_release_generic_tr ,
2008-09-16 23:06:42 +04:00
} ;
2009-08-25 12:12:56 +04:00
static const struct file_operations trace_clock_fops = {
2009-12-08 06:16:11 +03:00
. open = tracing_clock_open ,
. read = seq_read ,
. llseek = seq_lseek ,
2013-07-02 07:34:22 +04:00
. release = tracing_single_release_tr ,
2009-08-25 12:12:56 +04:00
. write = tracing_clock_write ,
} ;
2012-12-26 06:53:00 +04:00
# ifdef CONFIG_TRACER_SNAPSHOT
static const struct file_operations snapshot_fops = {
. open = tracing_snapshot_open ,
. read = seq_read ,
. write = tracing_snapshot_write ,
2013-12-22 02:39:40 +04:00
. llseek = tracing_lseek ,
2012-05-11 21:29:49 +04:00
. release = tracing_snapshot_release ,
2012-12-26 06:53:00 +04:00
} ;
2013-03-06 01:18:16 +04:00
static const struct file_operations snapshot_raw_fops = {
. open = snapshot_raw_open ,
. read = tracing_buffers_read ,
. release = tracing_buffers_release ,
. splice_read = tracing_buffers_splice_read ,
. llseek = no_llseek ,
2008-12-02 06:20:19 +03:00
} ;
2013-03-06 01:18:16 +04:00
# endif /* CONFIG_TRACER_SNAPSHOT */
2008-12-02 06:20:19 +03:00
static int tracing_buffers_open ( struct inode * inode , struct file * filp )
{
2013-07-23 19:26:00 +04:00
struct trace_array * tr = inode - > i_private ;
2008-12-02 06:20:19 +03:00
struct ftrace_buffer_info * info ;
2013-07-02 07:34:22 +04:00
int ret ;
2008-12-02 06:20:19 +03:00
if ( tracing_disabled )
return - ENODEV ;
2013-07-02 07:34:22 +04:00
if ( trace_array_get ( tr ) < 0 )
return - ENODEV ;
2008-12-02 06:20:19 +03:00
info = kzalloc ( sizeof ( * info ) , GFP_KERNEL ) ;
2013-07-02 07:34:22 +04:00
if ( ! info ) {
trace_array_put ( tr ) ;
2008-12-02 06:20:19 +03:00
return - ENOMEM ;
2013-07-02 07:34:22 +04:00
}
2008-12-02 06:20:19 +03:00
2013-03-07 00:27:24 +04:00
mutex_lock ( & trace_types_lock ) ;
2013-02-28 18:17:16 +04:00
info - > iter . tr = tr ;
2013-07-23 19:26:00 +04:00
info - > iter . cpu_file = tracing_get_cpu ( inode ) ;
2013-02-28 22:44:11 +04:00
info - > iter . trace = tr - > current_trace ;
2013-03-05 18:24:35 +04:00
info - > iter . trace_buffer = & tr - > trace_buffer ;
2013-02-28 18:17:16 +04:00
info - > spare = NULL ;
2008-12-02 06:20:19 +03:00
/* Force reading ring buffer for first read */
2013-02-28 18:17:16 +04:00
info - > read = ( unsigned int ) - 1 ;
2008-12-02 06:20:19 +03:00
filp - > private_data = info ;
2014-12-16 04:13:31 +03:00
tr - > current_trace - > ref + + ;
2013-03-07 00:27:24 +04:00
mutex_unlock ( & trace_types_lock ) ;
2013-07-02 07:34:22 +04:00
ret = nonseekable_open ( inode , filp ) ;
if ( ret < 0 )
trace_array_put ( tr ) ;
return ret ;
2008-12-02 06:20:19 +03:00
}
2013-02-28 18:17:16 +04:00
static unsigned int
tracing_buffers_poll ( struct file * filp , poll_table * poll_table )
{
struct ftrace_buffer_info * info = filp - > private_data ;
struct trace_iterator * iter = & info - > iter ;
return trace_poll ( iter , filp , poll_table ) ;
}
2008-12-02 06:20:19 +03:00
static ssize_t
tracing_buffers_read ( struct file * filp , char __user * ubuf ,
size_t count , loff_t * ppos )
{
struct ftrace_buffer_info * info = filp - > private_data ;
2013-02-28 18:17:16 +04:00
struct trace_iterator * iter = & info - > iter ;
2008-12-02 06:20:19 +03:00
ssize_t ret ;
2013-03-06 01:18:16 +04:00
ssize_t size ;
2008-12-02 06:20:19 +03:00
2009-03-05 03:10:05 +03:00
if ( ! count )
return 0 ;
2013-03-06 01:18:16 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
2014-12-16 06:31:07 +03:00
if ( iter - > snapshot & & iter - > tr - > current_trace - > use_max_tr )
return - EBUSY ;
2013-03-06 01:18:16 +04:00
# endif
2009-04-02 11:16:59 +04:00
if ( ! info - > spare )
2013-03-05 18:24:35 +04:00
info - > spare = ring_buffer_alloc_read_page ( iter - > trace_buffer - > buffer ,
iter - > cpu_file ) ;
2009-04-02 11:16:59 +04:00
if ( ! info - > spare )
2014-12-16 06:31:07 +03:00
return - ENOMEM ;
2009-04-02 11:16:59 +04:00
2008-12-02 06:20:19 +03:00
/* Do we have previous read data to read? */
if ( info - > read < PAGE_SIZE )
goto read ;
2013-02-28 22:44:11 +04:00
again :
2013-02-28 18:17:16 +04:00
trace_access_lock ( iter - > cpu_file ) ;
2013-03-05 18:24:35 +04:00
ret = ring_buffer_read_page ( iter - > trace_buffer - > buffer ,
2008-12-02 06:20:19 +03:00
& info - > spare ,
count ,
2013-02-28 18:17:16 +04:00
iter - > cpu_file , 0 ) ;
trace_access_unlock ( iter - > cpu_file ) ;
2008-12-02 06:20:19 +03:00
2013-02-28 22:44:11 +04:00
if ( ret < 0 ) {
if ( trace_empty ( iter ) ) {
2014-12-16 06:31:07 +03:00
if ( ( filp - > f_flags & O_NONBLOCK ) )
return - EAGAIN ;
2014-11-10 21:46:34 +03:00
ret = wait_on_pipe ( iter , false ) ;
2014-12-16 06:31:07 +03:00
if ( ret )
return ret ;
2013-02-28 22:44:11 +04:00
goto again ;
}
2014-12-16 06:31:07 +03:00
return 0 ;
2013-02-28 22:44:11 +04:00
}
2011-10-14 18:44:25 +04:00
info - > read = 0 ;
2013-02-28 22:44:11 +04:00
read :
2008-12-02 06:20:19 +03:00
size = PAGE_SIZE - info - > read ;
if ( size > count )
size = count ;
ret = copy_to_user ( ubuf , info - > spare + info - > read , size ) ;
2014-12-16 06:31:07 +03:00
if ( ret = = size )
return - EFAULT ;
2009-03-05 03:10:05 +03:00
size - = ret ;
2008-12-02 06:20:19 +03:00
* ppos + = size ;
info - > read + = size ;
return size ;
}
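The tail of tracing_buffers_read() depends on copy_to_user() returning the number of bytes it could not copy: a complete failure becomes -EFAULT, while a partial copy is reported as a short read. The userspace sketch below imitates that accounting. sketch_copy() is a hypothetical stand-in whose `limit` parameter simulates a fault part-way through, and the reader struct is likewise invented for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for copy_to_user(): copies at most `limit` bytes and
 * returns how many bytes were NOT copied. */
static size_t sketch_copy(char *dst, const char *src, size_t n, size_t limit)
{
	size_t done = n < limit ? n : limit;

	memcpy(dst, src, done);
	return n - done;
}

struct sketch_reader {
	const char *page;	/* models info->spare */
	size_t page_size;	/* models PAGE_SIZE */
	size_t read;		/* models info->read */
};

/* Caller guarantees count > 0 and r->read < r->page_size. */
static long sketch_read(struct sketch_reader *r, char *dst,
			size_t count, size_t limit)
{
	size_t size = r->page_size - r->read;
	size_t left;

	if (size > count)
		size = count;

	left = sketch_copy(dst, r->page + r->read, size, limit);
	if (left == size)
		return -EFAULT;	/* nothing was copied at all */

	size -= left;		/* report only what was delivered */
	r->read += size;
	return (long)size;
}
```

Because the read offset advances only by the delivered amount, a later call resumes exactly where the partial copy stopped.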
static int tracing_buffers_release ( struct inode * inode , struct file * file )
{
struct ftrace_buffer_info * info = file - > private_data ;
2013-02-28 18:17:16 +04:00
struct trace_iterator * iter = & info - > iter ;
2008-12-02 06:20:19 +03:00
2013-03-07 00:27:24 +04:00
mutex_lock ( & trace_types_lock ) ;
2014-12-16 04:13:31 +03:00
iter - > tr - > current_trace - > ref - - ;
2013-07-02 06:50:29 +04:00
__trace_array_put ( iter - > tr ) ;
2008-12-02 06:20:19 +03:00
2009-04-02 11:16:59 +04:00
if ( info - > spare )
2013-03-05 18:24:35 +04:00
ring_buffer_free_read_page ( iter - > trace_buffer - > buffer , info - > spare ) ;
2008-12-02 06:20:19 +03:00
kfree ( info ) ;
2013-03-07 00:27:24 +04:00
mutex_unlock ( & trace_types_lock ) ;
2008-12-02 06:20:19 +03:00
return 0 ;
}
struct buffer_ref {
struct ring_buffer * buffer ;
void * page ;
int ref ;
} ;
static void buffer_pipe_buf_release ( struct pipe_inode_info * pipe ,
struct pipe_buffer * buf )
{
struct buffer_ref * ref = ( struct buffer_ref * ) buf - > private ;
if ( - - ref - > ref )
return ;
ring_buffer_free_read_page ( ref - > buffer , ref - > page ) ;
kfree ( ref ) ;
buf - > private = 0 ;
}
static void buffer_pipe_buf_get ( struct pipe_inode_info * pipe ,
struct pipe_buffer * buf )
{
struct buffer_ref * ref = ( struct buffer_ref * ) buf - > private ;
ref - > ref + + ;
}
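buffer_pipe_buf_get() and buffer_pipe_buf_release() form a plain, non-atomic reference count around a read page: each get increments the count, and the release that drops it to zero frees the page and the wrapper. A minimal userspace sketch of the same pairing, with malloc()/free() standing in for the ring buffer page allocator and hypothetical sketch_* names, could be:

```c
#include <stddef.h>
#include <stdlib.h>

struct sketch_ref {
	void *page;	/* models ref->page */
	int ref;	/* models ref->ref */
};

/* Allocate a wrapper holding one reference; returns NULL on failure. */
static struct sketch_ref *sketch_ref_new(size_t page_size)
{
	struct sketch_ref *r = malloc(sizeof(*r));

	if (!r)
		return NULL;
	r->page = malloc(page_size);
	if (!r->page) {
		free(r);
		return NULL;
	}
	r->ref = 1;
	return r;
}

static void sketch_ref_get(struct sketch_ref *r)
{
	r->ref++;
}

/* Drop one reference; returns 1 if this call freed the object. */
static int sketch_ref_put(struct sketch_ref *r)
{
	if (--r->ref)
		return 0;
	free(r->page);
	free(r);
	return 1;
}
```

As in the source, the counter is an ordinary int, so this sketch is only safe when gets and puts are serialized by the caller.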
/* Pipe buffer operations for a buffer. */
2009-12-16 03:46:48 +03:00
static const struct pipe_buf_operations buffer_pipe_buf_ops = {
2008-12-02 06:20:19 +03:00
. can_merge = 0 ,
. confirm = generic_pipe_buf_confirm ,
. release = buffer_pipe_buf_release ,
2012-08-09 16:31:10 +04:00
. steal = generic_pipe_buf_steal ,
2008-12-02 06:20:19 +03:00
. get = buffer_pipe_buf_get ,
} ;
/*
* Callback from splice_to_pipe ( ) : release the pages left at the end
* of the spd if we hit an error while filling the pipe .
*/
static void buffer_spd_release ( struct splice_pipe_desc * spd , unsigned int i )
{
struct buffer_ref * ref =
( struct buffer_ref * ) spd - > partial [ i ] . private ;
if ( - - ref - > ref )
return ;
ring_buffer_free_read_page ( ref - > buffer , ref - > page ) ;
kfree ( ref ) ;
spd - > partial [ i ] . private = 0 ;
}
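The get/release pair above follows a plain reference-counting pattern: each holder of the page bumps `ref`, and whoever drops the count to zero frees the page and the tracking structure. A minimal user-space sketch of that pattern follows. All `demo_*` names are hypothetical stand-ins, and `malloc`/`free` take the place of the ring-buffer page helpers; it is not the kernel implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for struct buffer_ref. */
struct demo_ref {
	void *page;	/* stands in for the ring-buffer read page */
	int ref;	/* number of current holders */
};

/* Counts how many pages were released; for illustration only. */
static int demo_pages_freed;

/* Allocate a tracked page with a single initial holder. */
static struct demo_ref *demo_ref_new(void)
{
	struct demo_ref *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;
	r->page = malloc(64);
	if (!r->page) {
		free(r);
		return NULL;
	}
	r->ref = 1;
	return r;
}

/* Mirrors buffer_pipe_buf_get(): one more holder. */
static void demo_ref_get(struct demo_ref *r)
{
	r->ref++;
}

/*
 * Mirrors buffer_pipe_buf_release(): drop one holder and free
 * everything once the last holder is gone. Returns 1 if freed.
 */
static int demo_ref_put(struct demo_ref *r)
{
	if (--r->ref)
		return 0;
	free(r->page);
	free(r);
	demo_pages_freed++;
	return 1;
}
```

With one extra `demo_ref_get()`, the first `demo_ref_put()` only decrements the count and the second one frees the page, which is the same ordering the pipe code relies on when a buffer is duplicated.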
static ssize_t
tracing_buffers_splice_read ( struct file * file , loff_t * ppos ,
struct pipe_inode_info * pipe , size_t len ,
unsigned int flags )
{
struct ftrace_buffer_info * info = file - > private_data ;
2013-02-28 18:17:16 +04:00
struct trace_iterator * iter = & info - > iter ;
2010-05-20 12:43:18 +04:00
struct partial_page partial_def [ PIPE_DEF_BUFFERS ] ;
struct page * pages_def [ PIPE_DEF_BUFFERS ] ;
2008-12-02 06:20:19 +03:00
struct splice_pipe_desc spd = {
2010-05-20 12:43:18 +04:00
. pages = pages_def ,
. partial = partial_def ,
2012-06-12 17:24:40 +04:00
. nr_pages_max = PIPE_DEF_BUFFERS ,
2008-12-02 06:20:19 +03:00
. flags = flags ,
. ops = & buffer_pipe_buf_ops ,
. spd_release = buffer_spd_release ,
} ;
struct buffer_ref * ref ;
2009-04-29 08:23:13 +04:00
int entries , size , i ;
2014-11-07 00:26:07 +03:00
ssize_t ret = 0 ;
2008-12-02 06:20:19 +03:00
2013-03-06 01:18:16 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
2014-12-16 06:31:07 +03:00
if ( iter - > snapshot & & iter - > tr - > current_trace - > use_max_tr )
return - EBUSY ;
2013-03-06 01:18:16 +04:00
# endif
2014-12-16 06:31:07 +03:00
if ( splice_grow_spd ( pipe , & spd ) )
return - ENOMEM ;
2010-05-20 12:43:18 +04:00
2014-12-16 06:31:07 +03:00
if ( * ppos & ( PAGE_SIZE - 1 ) ) {
/* Undo splice_grow_spd ( ) before bailing out . */
splice_shrink_spd ( & spd ) ;
return - EINVAL ;
}
tracing: fix splice return too large
I got these from strace:
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 16384
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
I wanted to splice_read 4096 bytes, but it returned 8192 or more.
This is because the return value of tracing_buffers_splice_read()
does not include the "zero out any left over data" bytes, while
tracing_buffers_read() does include them. Make the two
consistent.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49D46674.9030804@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-02 11:17:08 +04:00
if ( len & ( PAGE_SIZE - 1 ) ) {
2014-12-16 06:31:07 +03:00
if ( len < PAGE_SIZE ) {
/* Undo splice_grow_spd ( ) before bailing out . */
splice_shrink_spd ( & spd ) ;
return - EINVAL ;
}
2009-04-02 11:17:08 +04:00
len & = PAGE_MASK ;
}
2013-02-28 18:17:16 +04:00
again :
trace_access_lock ( iter - > cpu_file ) ;
tracing: Consolidate max_tr into main trace_array structure
Currently, the latency tracers and the snapshot feature work by
having a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, this snapshot buffer is
swapped with the running buffer to save the current max
latency.
The only items the max_tr really needs are a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that records
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely and adds a new structure
called trace_buffer that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. This not only removes the bloat of
the max_tr, it also lets each trace instance use its own snapshot
feature, instead of reserving the snapshot feature and latency tracers
for the top level global_trace.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 18:24:35 +04:00
entries = ring_buffer_entries_cpu ( iter - > trace_buffer - > buffer , iter - > cpu_file ) ;
2009-04-29 08:23:13 +04:00
2014-04-11 20:01:03 +04:00
for ( i = 0 ; i < spd . nr_pages_max & & len & & entries ; i + + , len - = PAGE_SIZE ) {
2008-12-02 06:20:19 +03:00
struct page * page ;
int r ;
ref = kzalloc ( sizeof ( * ref ) , GFP_KERNEL ) ;
2014-11-07 00:26:07 +03:00
if ( ! ref ) {
ret = - ENOMEM ;
2008-12-02 06:20:19 +03:00
break ;
2014-11-07 00:26:07 +03:00
}
2008-12-02 06:20:19 +03:00
2009-04-29 08:16:21 +04:00
ref - > ref = 1 ;
2013-03-05 18:24:35 +04:00
ref - > buffer = iter - > trace_buffer - > buffer ;
2013-02-28 18:17:16 +04:00
ref - > page = ring_buffer_alloc_read_page ( ref - > buffer , iter - > cpu_file ) ;
2008-12-02 06:20:19 +03:00
if ( ! ref - > page ) {
2014-11-07 00:26:07 +03:00
ret = - ENOMEM ;
2008-12-02 06:20:19 +03:00
kfree ( ref ) ;
break ;
}
r = ring_buffer_read_page ( ref - > buffer , & ref - > page ,
2013-02-28 18:17:16 +04:00
len , iter - > cpu_file , 1 ) ;
2008-12-02 06:20:19 +03:00
if ( r < 0 ) {
2011-05-04 04:56:42 +04:00
ring_buffer_free_read_page ( ref - > buffer , ref - > page ) ;
2008-12-02 06:20:19 +03:00
kfree ( ref ) ;
break ;
}
/*
* Zero out any leftover data ; this page is going to
* userland .
*/
size = ring_buffer_page_len ( ref - > page ) ;
if ( size < PAGE_SIZE )
memset ( ref - > page + size , 0 , PAGE_SIZE - size ) ;
page = virt_to_page ( ref - > page ) ;
spd . pages [ i ] = page ;
spd . partial [ i ] . len = PAGE_SIZE ;
spd . partial [ i ] . offset = 0 ;
spd . partial [ i ] . private = ( unsigned long ) ref ;
spd . nr_pages + + ;
2009-04-02 11:17:08 +04:00
* ppos + = PAGE_SIZE ;
2009-04-29 08:23:13 +04:00
2013-03-05 18:24:35 +04:00
entries = ring_buffer_entries_cpu ( iter - > trace_buffer - > buffer , iter - > cpu_file ) ;
2008-12-02 06:20:19 +03:00
}
2013-02-28 18:17:16 +04:00
trace_access_unlock ( iter - > cpu_file ) ;
2008-12-02 06:20:19 +03:00
spd . nr_pages = i ;
/* did we read anything? */
if ( ! spd . nr_pages ) {
2014-11-07 00:26:07 +03:00
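if ( ret ) {
2014-12-16 06:31:07 +03:00
/* Undo splice_grow_spd ( ) before bailing out . */
splice_shrink_spd ( & spd ) ;
return ret ;
}
if ( ( file - > f_flags & O_NONBLOCK ) | | ( flags & SPLICE_F_NONBLOCK ) ) {
splice_shrink_spd ( & spd ) ;
return - EAGAIN ;
}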
2014-11-07 00:26:07 +03:00
2014-11-10 21:46:34 +03:00
ret = wait_on_pipe ( iter , true ) ;
2014-06-10 17:46:00 +04:00
if ( ret ) {
2014-12-16 06:31:07 +03:00
/* Undo splice_grow_spd ( ) before bailing out . */
splice_shrink_spd ( & spd ) ;
return ret ;
}
2014-11-10 21:46:34 +03:00
2013-02-28 18:17:16 +04:00
goto again ;
2008-12-02 06:20:19 +03:00
}
ret = splice_to_pipe ( pipe , & spd ) ;
2012-06-12 17:24:40 +04:00
splice_shrink_spd ( & spd ) ;
2013-03-06 01:18:16 +04:00
2008-12-02 06:20:19 +03:00
return ret ;
}
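The length handling in tracing_buffers_splice_read() only ever hands out whole pages: a request shorter than one page is rejected, and a longer request that is not page-aligned is rounded down to a page boundary. A small user-space sketch of that rule follows. The `DEMO_*` constants and `demo_align_len()` are hypothetical names, and a 4096-byte page is assumed purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for this sketch; the kernel uses PAGE_SIZE. */
#define DEMO_PAGE_SIZE 4096UL
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))

/*
 * Round a requested length down to whole pages.
 * Returns 0 and stores the usable length in *out, or -1 when the
 * request is an unaligned length smaller than one page (the case
 * where the kernel code returns -EINVAL).
 */
static int demo_align_len(size_t len, size_t *out)
{
	if (len & (DEMO_PAGE_SIZE - 1)) {
		if (len < DEMO_PAGE_SIZE)
			return -1;
		len &= DEMO_PAGE_MASK;
	}
	*out = len;
	return 0;
}
```

Under these assumptions a 5000-byte request is served as 4096 bytes, while a 100-byte request is refused instead of being silently rounded to zero.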
static const struct file_operations tracing_buffers_fops = {
. open = tracing_buffers_open ,
. read = tracing_buffers_read ,
2013-02-28 18:17:16 +04:00
. poll = tracing_buffers_poll ,
2008-12-02 06:20:19 +03:00
. release = tracing_buffers_release ,
. splice_read = tracing_buffers_splice_read ,
. llseek = no_llseek ,
} ;
2009-04-30 02:03:45 +04:00
static ssize_t
tracing_stats_read ( struct file * filp , char __user * ubuf ,
size_t count , loff_t * ppos )
{
2013-07-23 19:26:03 +04:00
struct inode * inode = file_inode ( filp ) ;
struct trace_array * tr = inode - > i_private ;
2013-03-05 18:24:35 +04:00
struct trace_buffer * trace_buf = & tr - > trace_buffer ;
2013-07-23 19:26:03 +04:00
int cpu = tracing_get_cpu ( inode ) ;
2009-04-30 02:03:45 +04:00
struct trace_seq * s ;
unsigned long cnt ;
2011-08-17 01:46:16 +04:00
unsigned long long t ;
unsigned long usec_rem ;
2009-04-30 02:03:45 +04:00
2009-06-15 06:57:28 +04:00
s = kmalloc ( sizeof ( * s ) , GFP_KERNEL ) ;
2009-04-30 02:03:45 +04:00
if ( ! s )
2009-11-12 00:26:35 +03:00
return - ENOMEM ;
2009-04-30 02:03:45 +04:00
trace_seq_init ( s ) ;
2013-03-05 18:24:35 +04:00
cnt = ring_buffer_entries_cpu ( trace_buf - > buffer , cpu ) ;
2009-04-30 02:03:45 +04:00
trace_seq_printf ( s , " entries: %ld \n " , cnt ) ;
2013-03-05 18:24:35 +04:00
cnt = ring_buffer_overrun_cpu ( trace_buf - > buffer , cpu ) ;
2009-04-30 02:03:45 +04:00
trace_seq_printf ( s , " overrun: %ld \n " , cnt ) ;
2013-03-05 18:24:35 +04:00
cnt = ring_buffer_commit_overrun_cpu ( trace_buf - > buffer , cpu ) ;
2009-04-30 02:03:45 +04:00
trace_seq_printf ( s , " commit overrun: %ld \n " , cnt ) ;
2013-03-05 18:24:35 +04:00
cnt = ring_buffer_bytes_cpu ( trace_buf - > buffer , cpu ) ;
2011-08-17 01:46:16 +04:00
trace_seq_printf ( s , " bytes: %ld \n " , cnt ) ;
2013-04-23 05:32:39 +04:00
if ( trace_clocks [ tr - > clock_id ] . in_ns ) {
2012-11-14 00:18:23 +04:00
/* local or global for trace_clock */
2013-03-05 18:24:35 +04:00
t = ns2usecs ( ring_buffer_oldest_event_ts ( trace_buf - > buffer , cpu ) ) ;
2012-11-14 00:18:23 +04:00
usec_rem = do_div ( t , USEC_PER_SEC ) ;
trace_seq_printf ( s , " oldest event ts: %5llu.%06lu \n " ,
t , usec_rem ) ;
2013-03-05 18:24:35 +04:00
t = ns2usecs ( ring_buffer_time_stamp ( trace_buf - > buffer , cpu ) ) ;
2012-11-14 00:18:23 +04:00
usec_rem = do_div ( t , USEC_PER_SEC ) ;
trace_seq_printf ( s , " now ts: %5llu.%06lu \n " , t , usec_rem ) ;
} else {
/* counter or tsc mode for trace_clock */
trace_seq_printf ( s , " oldest event ts: %llu \n " ,
2013-03-05 18:24:35 +04:00
ring_buffer_oldest_event_ts ( trace_buf - > buffer , cpu ) ) ;
2011-08-17 01:46:16 +04:00
2012-11-14 00:18:23 +04:00
trace_seq_printf ( s , " now ts: %llu \n " ,
2013-03-05 18:24:35 +04:00
ring_buffer_time_stamp ( trace_buf - > buffer , cpu ) ) ;
2012-11-14 00:18:23 +04:00
}
2011-08-17 01:46:16 +04:00
2013-03-05 18:24:35 +04:00
cnt = ring_buffer_dropped_events_cpu ( trace_buf - > buffer , cpu ) ;
2011-07-16 01:23:58 +04:00
trace_seq_printf ( s , " dropped events: %ld \n " , cnt ) ;
2013-03-05 18:24:35 +04:00
cnt = ring_buffer_read_events_cpu ( trace_buf - > buffer , cpu ) ;
2013-01-30 02:45:49 +04:00
trace_seq_printf ( s , " read events: %ld \n " , cnt ) ;
2014-11-14 23:49:41 +03:00
count = simple_read_from_buffer ( ubuf , count , ppos ,
s - > buffer , trace_seq_used ( s ) ) ;
2009-04-30 02:03:45 +04:00
kfree ( s ) ;
return count ;
}
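For nanosecond-based trace clocks, tracing_stats_read() prints timestamps as seconds and microseconds: the value is converted to microseconds, and do_div() then leaves the whole seconds in `t` and returns the microsecond remainder. A user-space sketch of the same arithmetic follows. The `demo_*` names are hypothetical, plain `/` and `%` stand in for do_div(), and the round-to-nearest conversion in `demo_ns2usecs()` is an assumption about how ns2usecs() behaves.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_USEC_PER_SEC 1000000ULL

/* Assumption: the conversion rounds to the nearest microsecond. */
static uint64_t demo_ns2usecs(uint64_t ns)
{
	return (ns + 500) / 1000;
}

/*
 * Format a nanosecond timestamp as "seconds.microseconds" using the
 * same "%5llu.%06lu" layout as the stats file. Returns the value
 * returned by snprintf().
 */
static int demo_format_ts(uint64_t ns, char *buf, size_t size)
{
	uint64_t t = demo_ns2usecs(ns);
	/* do_div(t, USEC_PER_SEC) yields this remainder ... */
	unsigned long usec_rem = (unsigned long)(t % DEMO_USEC_PER_SEC);

	/* ... and leaves the quotient (whole seconds) in t. */
	t /= DEMO_USEC_PER_SEC;
	return snprintf(buf, size, "%5llu.%06lu",
			(unsigned long long)t, usec_rem);
}
```

The `%06lu` width matters: without the zero padding, a remainder of 1 microsecond would print as `.1` and read as a tenth of a second.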
static const struct file_operations tracing_stats_fops = {
2013-07-23 19:26:03 +04:00
. open = tracing_open_generic_tr ,
2009-04-30 02:03:45 +04:00
. read = tracing_stats_read ,
2010-07-08 01:40:11 +04:00
. llseek = generic_file_llseek ,
2013-07-23 19:26:03 +04:00
. release = tracing_release_generic_tr ,
2009-04-30 02:03:45 +04:00
} ;
2008-05-12 23:20:42 +04:00
# ifdef CONFIG_DYNAMIC_FTRACE
2008-10-30 23:08:33 +03:00
int __weak ftrace_arch_read_dyn_info ( char * buf , int size )
{
return 0 ;
}
2008-05-12 23:20:42 +04:00
static ssize_t
2008-10-30 23:08:33 +03:00
tracing_read_dyn_info ( struct file * filp , char __user * ubuf ,
2008-05-12 23:20:42 +04:00
size_t cnt , loff_t * ppos )
{
2008-10-31 07:03:22 +03:00
static char ftrace_dyn_info_buffer [ 1024 ] ;
static DEFINE_MUTEX ( dyn_info_mutex ) ;
2008-05-12 23:20:42 +04:00
unsigned long * p = filp - > private_data ;
2008-10-30 23:08:33 +03:00
char * buf = ftrace_dyn_info_buffer ;
2008-10-31 07:03:22 +03:00
int size = ARRAY_SIZE ( ftrace_dyn_info_buffer ) ;
2008-05-12 23:20:42 +04:00
int r ;
2008-10-30 23:08:33 +03:00
mutex_lock ( & dyn_info_mutex ) ;
r = sprintf ( buf , " %lu " , * p ) ;
2008-05-12 23:20:46 +04:00
2008-10-31 07:03:22 +03:00
r + = ftrace_arch_read_dyn_info ( buf + r , ( size - 1 ) - r ) ;
2008-10-30 23:08:33 +03:00
buf [ r + + ] = ' \n ' ;
r = simple_read_from_buffer ( ubuf , cnt , ppos , buf , r ) ;
mutex_unlock ( & dyn_info_mutex ) ;
return r ;
2008-05-12 23:20:42 +04:00
}
2009-03-06 05:44:55 +03:00
static const struct file_operations tracing_dyn_info_fops = {
2008-05-12 23:20:46 +04:00
. open = tracing_open_generic ,
2008-10-30 23:08:33 +03:00
. read = tracing_read_dyn_info ,
2010-07-08 01:40:11 +04:00
. llseek = generic_file_llseek ,
2008-05-12 23:20:42 +04:00
} ;
2013-03-12 19:49:18 +04:00
# endif /* CONFIG_DYNAMIC_FTRACE */
2008-05-12 23:20:42 +04:00
2013-03-12 19:49:18 +04:00
# if defined(CONFIG_TRACER_SNAPSHOT) && defined(CONFIG_DYNAMIC_FTRACE)
static void
ftrace_snapshot ( unsigned long ip , unsigned long parent_ip , void * * data )
{
tracing_snapshot ( ) ;
}
2008-05-12 23:20:42 +04:00
2013-03-12 19:49:18 +04:00
static void
ftrace_count_snapshot ( unsigned long ip , unsigned long parent_ip , void * * data )
2008-05-12 23:20:42 +04:00
{
2013-03-12 19:49:18 +04:00
unsigned long * count = ( unsigned long * ) data ;
if ( ! * count )
return ;
2008-05-12 23:20:42 +04:00
2013-03-12 19:49:18 +04:00
if ( * count ! = - 1 )
( * count ) - - ;
tracing_snapshot ( ) ;
}
static int
ftrace_snapshot_print ( struct seq_file * m , unsigned long ip ,
struct ftrace_probe_ops * ops , void * data )
{
long count = ( long ) data ;
seq_printf ( m , " %ps: " , ( void * ) ip ) ;
2014-11-08 23:42:10 +03:00
seq_puts ( m , " snapshot " ) ;
2013-03-12 19:49:18 +04:00
if ( count = = - 1 )
2014-11-08 23:42:10 +03:00
seq_puts ( m , " :unlimited \n " ) ;
2013-03-12 19:49:18 +04:00
else
seq_printf ( m , " :count=%ld \n " , count ) ;
return 0 ;
}
static struct ftrace_probe_ops snapshot_probe_ops = {
. func = ftrace_snapshot ,
. print = ftrace_snapshot_print ,
} ;
static struct ftrace_probe_ops snapshot_count_probe_ops = {
. func = ftrace_count_snapshot ,
. print = ftrace_snapshot_print ,
} ;
static int
ftrace_trace_snapshot_callback ( struct ftrace_hash * hash ,
char * glob , char * cmd , char * param , int enable )
{
struct ftrace_probe_ops * ops ;
void * count = ( void * ) - 1 ;
char * number ;
int ret ;
/* hash funcs only work with set_ftrace_filter */
if ( ! enable )
return - EINVAL ;
ops = param ? & snapshot_count_probe_ops : & snapshot_probe_ops ;
if ( glob [ 0 ] == '!' ) {
unregister_ftrace_function_probe_func ( glob + 1 , ops ) ;
return 0 ;
}
if ( ! param )
goto out_reg ;
number = strsep ( & param , ":" ) ;
if ( ! strlen ( number ) )
goto out_reg ;
/*
* We use the callback data field ( which is a pointer )
* as our counter .
*/
ret = kstrtoul ( number , 0 , ( unsigned long * ) & count ) ;
if ( ret )
return ret ;
out_reg :
ret = register_ftrace_function_probe ( glob , ops , count ) ;
if ( ret >= 0 )
alloc_snapshot ( & global_trace ) ;
return ret < 0 ? ret : 0 ;
}
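The parameter handling in ftrace_trace_snapshot_callback() (take the text up to the first ':', treat a missing or empty value as "unlimited", otherwise parse a number) can be approximated in userspace. This is a sketch under stated assumptions: `parse_snapshot_count` and `COUNT_UNLIMITED` are hypothetical names, and strtoul() stands in for the kernel-only kstrtoul(), which is stricter about signs and whitespace.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinel mirroring the (void *)-1 default used for count. */
#define COUNT_UNLIMITED ((unsigned long)-1)

/*
 * Hypothetical helper: parse the optional count that follows the command
 * name. A NULL or empty parameter means "unlimited". Returns 0 on success
 * or a negative errno value on malformed input.
 */
static int parse_snapshot_count(const char *param, unsigned long *count)
{
	char buf[32];
	const char *colon;
	char *end;
	size_t len;
	unsigned long val;

	*count = COUNT_UNLIMITED;
	if (!param)
		return 0;

	/* Take the text up to the first ':' (or the whole string). */
	colon = strchr(param, ':');
	len = colon ? (size_t)(colon - param) : strlen(param);
	if (len == 0)
		return 0;
	if (len >= sizeof(buf))
		return -EINVAL;
	memcpy(buf, param, len);
	buf[len] = '\0';

	/* Base 0 accepts decimal, octal (0...) and hex (0x...) input. */
	errno = 0;
	val = strtoul(buf, &end, 0);
	if (end == buf || *end != '\0')
		return -EINVAL;
	if (errno == ERANGE)
		return -ERANGE;

	*count = val;
	return 0;
}
```

On a parse error the output count is left at COUNT_UNLIMITED and the caller is expected to check the return value before registering anything, much as the callback above returns early when kstrtoul() fails.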
static struct ftrace_func_command ftrace_snapshot_cmd = {
. name = "snapshot" ,
. func = ftrace_trace_snapshot_callback ,
} ;
2013-10-24 17:34:18 +04:00
static __init int register_snapshot_cmd ( void )
2013-03-12 19:49:18 +04:00
{
return register_ftrace_command ( & ftrace_snapshot_cmd ) ;
}
# else
2013-10-24 17:34:18 +04:00
static inline __init int register_snapshot_cmd ( void ) { return 0 ; }
2013-03-12 19:49:18 +04:00
# endif /* defined(CONFIG_TRACER_SNAPSHOT) && defined(CONFIG_DYNAMIC_FTRACE) */
2008-05-12 23:20:42 +04:00
2015-01-27 05:00:48 +03:00
static struct dentry * tracing_get_dentry ( struct trace_array * tr )
2008-05-12 23:20:42 +04:00
{
2015-01-20 20:13:40 +03:00
if ( WARN_ON ( ! tr->dir ) )
return ERR_PTR ( - ENODEV ) ;
/* Top directory uses NULL as the parent */
if ( tr->flags & TRACE_ARRAY_FL_GLOBAL )
return NULL ;
/* All sub buffers have a descriptor */
2012-05-11 21:29:49 +04:00
return tr->dir ;
2008-05-12 23:20:42 +04:00
}
2012-05-11 21:29:49 +04:00
static struct dentry * tracing_dentry_percpu ( struct trace_array * tr , int cpu )
tracing/core: introduce per cpu tracing files
Impact: split up tracing output per cpu
Currently, in the tracing debugfs directory, three files are
available for extracting the trace output:
- trace is an iterator through the ring-buffer. It's a reader
but not a consumer. It doesn't block when no more traces are
available.
- trace is pretty similar to the former, except that it adds more
information such as preempt count, irq flag, ...
- trace_pipe is a reader and a consumer, it will also block
waiting for traces if necessary (heh, yes it's a pipe).
The traces coming from different cpus are currently mixed up
inside these files. Sometimes it messes up the information,
sometimes it's useful, depending on what the tracer
captures.
The tracing_cpumask file is useful to filter the output and
select only the traces captured on a custom-defined set of cpus.
But it is still not powerful enough to extract one trace buffer
per cpu at the same time.
So this patch creates a new directory: /debug/tracing/per_cpu/.
Inside this directory, you will now find one trace_pipe file and
one trace file per cpu.
Which means if you have two cpus, you will have:
trace0
trace1
trace_pipe0
trace_pipe1
And of course, reading these files will have the same effect
as with the usual tracing files, except that you will only see
the traces from the given cpu.
The original all-in-one cpu trace files are still available in
their original place.
Until now, only one consumer was allowed on trace_pipe to avoid
racy consuming on the ring-buffer. Now the approach has changed a
bit: you can have only one consumer per cpu.
Which means you are allowed to read trace_pipe0 and trace_pipe1
concurrently. But you can't have two readers on trace_pipe0 or
trace_pipe1.
Following the same logic, if there is one reader on the common
trace_pipe, you cannot have another reader on trace_pipe0 or
trace_pipe1 at the same time, because trace_pipe is in essence
already a consumer of all cpu buffers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-25 05:22:28 +03:00
{
struct dentry * d_tracer ;
2012-05-11 21:29:49 +04:00
if ( tr->percpu_dir )
return tr->percpu_dir ;
2009-02-25 05:22:28 +03:00
2015-01-27 05:00:48 +03:00
d_tracer = tracing_get_dentry ( tr ) ;
2015-01-20 19:14:16 +03:00
if ( IS_ERR ( d_tracer ) )
2009-02-25 05:22:28 +03:00
return NULL ;
2015-01-20 20:13:40 +03:00
tr->percpu_dir = tracefs_create_dir ( "per_cpu" , d_tracer ) ;
2009-02-25 05:22:28 +03:00
2012-05-11 21:29:49 +04:00
WARN_ONCE ( ! tr->percpu_dir ,
2015-01-20 20:13:40 +03:00
"Could not create tracefs directory 'per_cpu/%d'\n" , cpu ) ;
2009-02-25 05:22:28 +03:00
2012-05-11 21:29:49 +04:00
return tr->percpu_dir ;
2009-02-25 05:22:28 +03:00
}
tracing: Introduce trace_create_cpu_file() and tracing_get_cpu()
Every "file_operations" used by tracing_init_debugfs_percpu is buggy.
f_op->open/etc does:
1. struct trace_cpu *tc = inode->i_private;
struct trace_array *tr = tc->tr;
2. trace_array_get(tr) or fail;
3. do_something(tc);
But tc (and tr) can be already freed before trace_array_get() is called.
And it doesn't matter whether this file is per-cpu or it was created by
init_tracer_debugfs(), free_percpu() or kfree() are equally bad.
Note that even 1. is not safe, the freed memory can be unmapped. But even
if it was safe trace_array_get() can wrongly succeed if we also race with
the next new_instance_create() which can re-allocate the same tr, or tc
was overwritten and ->tr points to the valid tr. In this case 3. uses the
freed/reused memory.
Add the new trivial helper, trace_create_cpu_file() which simply calls
trace_create_file() and encodes "cpu" in "struct inode". Another helper,
tracing_get_cpu() will be used to read cpu_nr-or-RING_BUFFER_ALL_CPUS.
The patch abuses ->i_cdev to encode the number, it is never used unless
the file is S_ISCHR(). But we could use something else, say, i_bytes or
even ->d_fsdata. In any case this hack is hidden inside these 2 helpers,
it would be trivial to change them if needed.
This patch only changes tracing_init_debugfs_percpu() to use the new
trace_create_cpu_file(), the next patches will change file_operations.
Note: tracing_get_cpu(inode) is always safe but you can't trust the
result unless trace_array_get() was called, without trace_types_lock
which acts as a barrier it can wrongly return RING_BUFFER_ALL_CPUS.
Link: http://lkml.kernel.org/r/20130723152554.GA23710@redhat.com
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-23 19:25:54 +04:00
static struct dentry *
trace_create_cpu_file ( const char * name , umode_t mode , struct dentry * parent ,
void * data , long cpu , const struct file_operations * fops )
{
struct dentry * ret = trace_create_file ( name , mode , parent , data , fops ) ;
if ( ret ) /* See tracing_get_cpu() */
2015-03-18 01:26:16 +03:00
d_inode ( ret )->i_cdev = ( void * ) ( cpu + 1 ) ;
2013-07-23 19:25:54 +04:00
return ret ;
}
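The "cpu + 1" encoding that trace_create_cpu_file() stores in ->i_cdev can be illustrated with a small userspace mock. Everything named here (`struct fake_inode`, `encode_cpu`, `decode_cpu`, `ALL_CPUS`) is hypothetical and only stands in for the kernel structures; the decode side is a guess at what a helper such as tracing_get_cpu() has to do, assuming the "all CPUs" value is -1.

```c
#include <stdint.h>

/* Assumed sentinel for "all CPUs" (taken to be -1, as an assumption). */
#define ALL_CPUS (-1L)

/* Hypothetical stand-in for the one inode field the trick relies on. */
struct fake_inode {
	void *i_cdev;
};

/* Store cpu + 1 so that CPU 0 is distinguishable from "never set" (NULL). */
static void encode_cpu(struct fake_inode *inode, long cpu)
{
	inode->i_cdev = (void *)(intptr_t)(cpu + 1);
}

/* Undo the offset; a NULL field means the file is not tied to one CPU. */
static long decode_cpu(const struct fake_inode *inode)
{
	if (inode->i_cdev)
		return (long)(intptr_t)inode->i_cdev - 1;
	return ALL_CPUS;
}
```

The offset by one is what makes the scheme work: a file created without the helper leaves the field NULL and decodes to ALL_CPUS, while CPU 0 is stored as 1 and decodes back to 0.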
2012-05-11 21:29:49 +04:00
static void
2015-01-20 20:13:40 +03:00
tracing_init_tracefs_percpu ( struct trace_array * tr , long cpu )
2009-02-25 05:22:28 +03:00
{
2012-05-11 21:29:49 +04:00
struct dentry * d_percpu = tracing_dentry_percpu ( tr , cpu ) ;
2009-03-27 02:25:38 +03:00
struct dentry * d_cpu ;
2010-10-21 05:51:26 +04:00
char cpu_dir [ 30 ] ; /* 30 characters should be more than enough */
2009-02-25 05:22:28 +03:00
2012-04-23 05:11:57 +04:00
if ( ! d_percpu )
return ;
2010-10-21 05:51:26 +04:00
snprintf ( cpu_dir , sizeof ( cpu_dir ) , "cpu%ld" , cpu ) ;
2015-01-20 20:13:40 +03:00
d_cpu = tracefs_create_dir ( cpu_dir , d_percpu ) ;
2009-02-26 02:41:38 +03:00
if ( ! d_cpu ) {
2015-01-20 20:13:40 +03:00
pr_warning ( "Could not create tracefs '%s' entry\n" , cpu_dir ) ;
2009-02-26 02:41:38 +03:00
return ;
}
2009-02-25 05:22:28 +03:00
2009-02-26 02:41:38 +03:00
/* per cpu trace_pipe */
2013-07-23 19:25:54 +04:00
trace_create_cpu_file ( "trace_pipe" , 0444 , d_cpu ,
2013-07-23 19:25:57 +04:00
tr , cpu , & tracing_pipe_fops ) ;
2009-02-25 05:22:28 +03:00
/* per cpu trace */
tracing: Introduce trace_create_cpu_file() and tracing_get_cpu()
Every "file_operations" used by tracing_init_debugfs_percpu is buggy.
f_op->open/etc does:
1. struct trace_cpu *tc = inode->i_private;
struct trace_array *tr = tc->tr;
2. trace_array_get(tr) or fail;
3. do_something(tc);
But tc (and tr) can be already freed before trace_array_get() is called.
And it doesn't matter whether this file is per-cpu or it was created by
init_tracer_debugfs(), free_percpu() or kfree() are equally bad.
Note that even 1. is not safe, the freed memory can be unmapped. But even
if it was safe trace_array_get() can wrongly succeed if we also race with
the next new_instance_create() which can re-allocate the same tr, or tc
was overwritten and ->tr points to the valid tr. In this case 3. uses the
freed/reused memory.
Add the new trivial helper, trace_create_cpu_file() which simply calls
trace_create_file() and encodes "cpu" in "struct inode". Another helper,
tracing_get_cpu() will be used to read cpu_nr-or-RING_BUFFER_ALL_CPUS.
The patch abuses ->i_cdev to encode the number, it is never used unless
the file is S_ISCHR(). But we could use something else, say, i_bytes or
even ->d_fsdata. In any case this hack is hidden inside these 2 helpers,
it would be trivial to change them if needed.
This patch only changes tracing_init_debugfs_percpu() to use the new
trace_create_cpu_file(), the next patches will change file_operations.
Note: tracing_get_cpu(inode) is always safe to call, but you can't trust
the result unless trace_array_get() was called. Without trace_types_lock,
which acts as a barrier, it can wrongly return RING_BUFFER_ALL_CPUS.
Link: http://lkml.kernel.org/r/20130723152554.GA23710@redhat.com
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
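The encoding trick described in the message above can be illustrated in user space. This is a sketch, not the kernel helpers themselves: struct inode_sketch, encode_cpu_sketch(), get_cpu_sketch() and ALL_CPUS_SKETCH are hypothetical stand-ins, and the cpu + 1 offset is an assumption chosen so that cpu 0 stays distinguishable from an unset (NULL) field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALL_CPUS_SKETCH (-1)	/* stands in for RING_BUFFER_ALL_CPUS */

/* Hypothetical stand-in for struct inode: only the abused field. */
struct inode_sketch {
	void *i_cdev;
};

/* Store cpu + 1 so that cpu 0 is distinguishable from "not set" (NULL). */
static void encode_cpu_sketch(struct inode_sketch *inode, long cpu)
{
	inode->i_cdev = (void *)(intptr_t)(cpu + 1);
}

/* Decode the cpu number, or report "all cpus" when nothing was encoded. */
static int get_cpu_sketch(const struct inode_sketch *inode)
{
	if (inode->i_cdev)
		return (int)((intptr_t)inode->i_cdev - 1);
	return ALL_CPUS_SKETCH;
}
```

Because the value lives in the inode rather than behind a pointer to a separately allocated structure, reading it never dereferences memory that may already have been freed, which is the property the message relies on.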
2013-07-23 19:25:54 +04:00
trace_create_cpu_file ( " trace " , 0644 , d_cpu ,
2013-07-23 19:26:10 +04:00
tr , cpu , & tracing_fops ) ;
2009-03-13 07:37:42 +03:00
2013-07-23 19:25:54 +04:00
trace_create_cpu_file ( " trace_pipe_raw " , 0444 , d_cpu ,
2013-07-23 19:26:00 +04:00
tr , cpu , & tracing_buffers_fops ) ;
2009-03-13 07:37:42 +03:00
2013-07-23 19:25:54 +04:00
trace_create_cpu_file ( " stats " , 0444 , d_cpu ,
2013-07-23 19:26:03 +04:00
tr , cpu , & tracing_stats_fops ) ;
2012-02-03 00:00:41 +04:00
2013-07-23 19:25:54 +04:00
trace_create_cpu_file ( " buffer_size_kb " , 0444 , d_cpu ,
2013-07-23 19:26:06 +04:00
tr , cpu , & tracing_entries_fops ) ;
2013-03-05 23:35:11 +04:00
# ifdef CONFIG_TRACER_SNAPSHOT
2013-07-23 19:25:54 +04:00
trace_create_cpu_file ( " snapshot " , 0644 , d_cpu ,
2013-07-23 19:26:10 +04:00
tr , cpu , & snapshot_fops ) ;
2013-03-06 01:18:16 +04:00
2013-07-23 19:25:54 +04:00
trace_create_cpu_file ( " snapshot_raw " , 0444 , d_cpu ,
2013-07-23 19:26:00 +04:00
tr , cpu , & snapshot_raw_fops ) ;
2013-03-05 23:35:11 +04:00
# endif
2009-02-25 05:22:28 +03:00
}
2008-05-12 23:20:44 +04:00
# ifdef CONFIG_FTRACE_SELFTEST
/* Let selftest have access to static functions in this file */
# include "trace_selftest.c"
# endif
2009-02-27 07:43:05 +03:00
struct trace_option_dentry {
struct tracer_opt * opt ;
struct tracer_flags * flags ;
2012-05-11 21:29:49 +04:00
struct trace_array * tr ;
2009-02-27 07:43:05 +03:00
struct dentry * entry ;
} ;
static ssize_t
trace_options_read ( struct file * filp , char __user * ubuf , size_t cnt ,
loff_t * ppos )
{
struct trace_option_dentry * topt = filp - > private_data ;
char * buf ;
if ( topt - > flags - > val & topt - > opt - > bit )
buf = " 1 \n " ;
else
buf = " 0 \n " ;
return simple_read_from_buffer ( ubuf , cnt , ppos , buf , 2 ) ;
}
static ssize_t
trace_options_write ( struct file * filp , const char __user * ubuf , size_t cnt ,
loff_t * ppos )
{
struct trace_option_dentry * topt = filp - > private_data ;
unsigned long val ;
int ret ;
2011-06-07 23:58:27 +04:00
ret = kstrtoul_from_user ( ubuf , cnt , 10 , & val ) ;
if ( ret )
2009-02-27 07:43:05 +03:00
return ret ;
2009-12-08 06:17:06 +03:00
if ( val ! = 0 & & val ! = 1 )
return - EINVAL ;
2009-02-27 07:43:05 +03:00
2009-12-08 06:17:06 +03:00
if ( ! ! ( topt - > flags - > val & topt - > opt - > bit ) ! = val ) {
2009-02-27 07:43:05 +03:00
mutex_lock ( & trace_types_lock ) ;
2014-01-10 20:13:54 +04:00
ret = __set_tracer_option ( topt - > tr , topt - > flags ,
2009-12-22 06:35:16 +03:00
topt - > opt , ! val ) ;
2009-02-27 07:43:05 +03:00
mutex_unlock ( & trace_types_lock ) ;
if ( ret )
return ret ;
}
* ppos + = cnt ;
return cnt ;
}
static const struct file_operations trace_options_fops = {
. open = tracing_open_generic ,
. read = trace_options_read ,
. write = trace_options_write ,
2010-07-08 01:40:11 +04:00
. llseek = generic_file_llseek ,
2009-02-27 07:43:05 +03:00
} ;
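The write handler above accepts only a decimal 0 or 1 and touches the option only when the requested state differs from the current one. A user-space sketch of that validation follows; strtoul() stands in for kstrtoul_from_user(), the flag is flipped directly instead of going through __set_tracer_option(), and struct flags_sketch and option_write_sketch() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the tracer_flags / tracer_opt pair. */
struct flags_sketch {
	unsigned int val;
};

/*
 * Parse a decimal 0 or 1 from text and make the given bit match it.
 * Returns 0 on success or -EINVAL for anything other than 0 or 1.
 */
static int option_write_sketch(struct flags_sketch *flags, unsigned int bit,
			       const char *text)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(text, &end, 10);
	if (errno || end == text)
		return -EINVAL;
	if (*end == '\n')	/* tolerate the newline that echo appends */
		end++;
	if (*end != '\0')
		return -EINVAL;
	if (val != 0 && val != 1)
		return -EINVAL;

	/* Only touch the flag when the requested state differs. */
	if (!!(flags->val & bit) != val) {
		if (val)
			flags->val |= bit;
		else
			flags->val &= ~bit;
	}
	return 0;
}
```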
2009-02-27 06:19:12 +03:00
static ssize_t
trace_options_core_read ( struct file * filp , char __user * ubuf , size_t cnt ,
loff_t * ppos )
{
long index = ( long ) filp - > private_data ;
char * buf ;
if ( trace_flags & ( 1 < < index ) )
buf = " 1 \n " ;
else
buf = " 0 \n " ;
return simple_read_from_buffer ( ubuf , cnt , ppos , buf , 2 ) ;
}
static ssize_t
trace_options_core_write ( struct file * filp , const char __user * ubuf , size_t cnt ,
loff_t * ppos )
{
2012-05-11 21:29:49 +04:00
struct trace_array * tr = & global_trace ;
2009-02-27 06:19:12 +03:00
long index = ( long ) filp - > private_data ;
unsigned long val ;
int ret ;
2011-06-07 23:58:27 +04:00
ret = kstrtoul_from_user ( ubuf , cnt , 10 , & val ) ;
if ( ret )
2009-02-27 06:19:12 +03:00
return ret ;
2009-08-07 14:55:48 +04:00
if ( val ! = 0 & & val ! = 1 )
2009-02-27 06:19:12 +03:00
return - EINVAL ;
2013-03-14 21:50:56 +04:00
mutex_lock ( & trace_types_lock ) ;
2012-05-11 21:29:49 +04:00
ret = set_tracer_flag ( tr , 1 < < index , val ) ;
2013-03-14 21:50:56 +04:00
mutex_unlock ( & trace_types_lock ) ;
2009-02-27 06:19:12 +03:00
2013-03-14 23:03:53 +04:00
if ( ret < 0 )
return ret ;
2009-02-27 06:19:12 +03:00
* ppos + = cnt ;
return cnt ;
}
static const struct file_operations trace_options_core_fops = {
. open = tracing_open_generic ,
. read = trace_options_core_read ,
. write = trace_options_core_write ,
2010-07-08 01:40:11 +04:00
. llseek = generic_file_llseek ,
2009-02-27 06:19:12 +03:00
} ;
2009-03-27 02:25:38 +03:00
struct dentry * trace_create_file ( const char * name ,
2011-07-24 12:33:43 +04:00
umode_t mode ,
2009-03-27 02:25:38 +03:00
struct dentry * parent ,
void * data ,
const struct file_operations * fops )
{
struct dentry * ret ;
2015-01-20 20:13:40 +03:00
ret = tracefs_create_file ( name , mode , parent , data , fops ) ;
2009-03-27 02:25:38 +03:00
if ( ! ret )
2015-01-20 20:13:40 +03:00
pr_warning ( " Could not create tracefs '%s' entry \n " , name ) ;
2009-03-27 02:25:38 +03:00
return ret ;
}
2012-05-11 21:29:49 +04:00
static struct dentry * trace_options_init_dentry ( struct trace_array * tr )
2009-02-27 06:19:12 +03:00
{
struct dentry * d_tracer ;
2012-05-11 21:29:49 +04:00
if ( tr - > options )
return tr - > options ;
2009-02-27 06:19:12 +03:00
2015-01-27 05:00:48 +03:00
d_tracer = tracing_get_dentry ( tr ) ;
2015-01-20 19:14:16 +03:00
if ( IS_ERR ( d_tracer ) )
2009-02-27 06:19:12 +03:00
return NULL ;
2015-01-20 20:13:40 +03:00
tr - > options = tracefs_create_dir ( " options " , d_tracer ) ;
2012-05-11 21:29:49 +04:00
if ( ! tr - > options ) {
2015-01-20 20:13:40 +03:00
pr_warning ( " Could not create tracefs directory 'options' \n " ) ;
2009-02-27 06:19:12 +03:00
return NULL ;
}
2012-05-11 21:29:49 +04:00
return tr - > options ;
2009-02-27 06:19:12 +03:00
}
2009-02-27 07:43:05 +03:00
static void
2012-05-11 21:29:49 +04:00
create_trace_option_file ( struct trace_array * tr ,
struct trace_option_dentry * topt ,
2009-02-27 07:43:05 +03:00
struct tracer_flags * flags ,
struct tracer_opt * opt )
{
struct dentry * t_options ;
2012-05-11 21:29:49 +04:00
t_options = trace_options_init_dentry ( tr ) ;
2009-02-27 07:43:05 +03:00
if ( ! t_options )
return ;
topt - > flags = flags ;
topt - > opt = opt ;
2012-05-11 21:29:49 +04:00
topt - > tr = tr ;
2009-02-27 07:43:05 +03:00
2009-03-27 02:25:38 +03:00
topt - > entry = trace_create_file ( opt - > name , 0644 , t_options , topt ,
2009-02-27 07:43:05 +03:00
& trace_options_fops ) ;
}
static struct trace_option_dentry *
2012-05-11 21:29:49 +04:00
create_trace_option_files ( struct trace_array * tr , struct tracer * tracer )
2009-02-27 07:43:05 +03:00
{
struct trace_option_dentry * topts ;
struct tracer_flags * flags ;
struct tracer_opt * opts ;
int cnt ;
if ( ! tracer )
return NULL ;
flags = tracer - > flags ;
if ( ! flags | | ! flags - > opts )
return NULL ;
opts = flags - > opts ;
for ( cnt = 0 ; opts [ cnt ] . name ; cnt + + )
;
2009-02-27 18:51:10 +03:00
topts = kcalloc ( cnt + 1 , sizeof ( * topts ) , GFP_KERNEL ) ;
2009-02-27 07:43:05 +03:00
if ( ! topts )
return NULL ;
for ( cnt = 0 ; opts [ cnt ] . name ; cnt + + )
2012-05-11 21:29:49 +04:00
create_trace_option_file ( tr , & topts [ cnt ] , flags ,
2009-02-27 07:43:05 +03:00
& opts [ cnt ] ) ;
return topts ;
}
static void
destroy_trace_option_files ( struct trace_option_dentry * topts )
{
int cnt ;
if ( ! topts )
return ;
2014-06-26 21:14:31 +04:00
for ( cnt = 0 ; topts [ cnt ] . opt ; cnt + + )
2015-01-20 20:13:40 +03:00
tracefs_remove ( topts [ cnt ] . entry ) ;
2009-02-27 07:43:05 +03:00
kfree ( topts ) ;
}
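The create/destroy pair above relies on a zeroed terminator: the option table ends at an entry with a NULL name, and the allocated array gets one extra zero-filled slot so that the destroy loop can stop at the first NULL opt. A user-space sketch of the same pattern, with calloc() in place of kcalloc() and hypothetical *_sketch names:

```c
#include <assert.h>
#include <stdlib.h>

struct opt_sketch {
	const char *name;	/* NULL name terminates the option table */
};

struct entry_sketch {
	const struct opt_sketch *opt;	/* NULL opt terminates the entry array */
};

/* Count the options, then allocate one extra zeroed slot as terminator. */
static struct entry_sketch *create_entries_sketch(const struct opt_sketch *opts)
{
	struct entry_sketch *entries;
	size_t cnt;

	for (cnt = 0; opts[cnt].name; cnt++)
		;
	entries = calloc(cnt + 1, sizeof(*entries));
	if (!entries)
		return NULL;
	for (cnt = 0; opts[cnt].name; cnt++)
		entries[cnt].opt = &opts[cnt];
	return entries;
}

/* Walk until the zeroed terminator; returns how many entries were seen. */
static size_t destroy_entries_sketch(struct entry_sketch *entries)
{
	size_t cnt;

	if (!entries)
		return 0;
	for (cnt = 0; entries[cnt].opt; cnt++)
		;	/* the kernel code removes the tracefs entry here */
	free(entries);
	return cnt;
}
```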
2009-02-27 06:19:12 +03:00
static struct dentry *
2012-05-11 21:29:49 +04:00
create_trace_option_core_file ( struct trace_array * tr ,
const char * option , long index )
2009-02-27 06:19:12 +03:00
{
struct dentry * t_options ;
2012-05-11 21:29:49 +04:00
t_options = trace_options_init_dentry ( tr ) ;
2009-02-27 06:19:12 +03:00
if ( ! t_options )
return NULL ;
2009-03-27 02:25:38 +03:00
return trace_create_file ( option , 0644 , t_options , ( void * ) index ,
2009-02-27 06:19:12 +03:00
& trace_options_core_fops ) ;
}
2012-05-11 21:29:49 +04:00
static __init void create_trace_options_dir ( struct trace_array * tr )
2009-02-27 06:19:12 +03:00
{
struct dentry * t_options ;
int i ;
2012-05-11 21:29:49 +04:00
t_options = trace_options_init_dentry ( tr ) ;
2009-02-27 06:19:12 +03:00
if ( ! t_options )
return ;
2009-03-27 02:25:38 +03:00
for ( i = 0 ; trace_options [ i ] ; i + + )
2012-05-11 21:29:49 +04:00
create_trace_option_core_file ( tr , trace_options [ i ] , i ) ;
2009-02-27 06:19:12 +03:00
}
2012-02-23 00:50:28 +04:00
static ssize_t
rb_simple_read ( struct file * filp , char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
2012-04-16 23:41:28 +04:00
struct trace_array * tr = filp - > private_data ;
2012-02-23 00:50:28 +04:00
char buf [ 64 ] ;
int r ;
2013-07-01 23:58:24 +04:00
r = tracer_tracing_is_on ( tr ) ;
2012-02-23 00:50:28 +04:00
r = sprintf ( buf , " %d \n " , r ) ;
return simple_read_from_buffer ( ubuf , cnt , ppos , buf , r ) ;
}
static ssize_t
rb_simple_write ( struct file * filp , const char __user * ubuf ,
size_t cnt , loff_t * ppos )
{
2012-04-16 23:41:28 +04:00
struct trace_array * tr = filp - > private_data ;
tracing: Consolidate max_tr into main trace_array structure
Currently, the latency tracers and the snapshot feature work by
having a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, this snapshot buffer is
swapped with the running buffer to save the current max
latency.
The only items needed for the max_tr are really just a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer, that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_trace, but trace instances can now use
their own snapshot feature, rather than only the top level global_trace
having the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 18:24:35 +04:00
struct ring_buffer * buffer = tr - > trace_buffer . buffer ;
2012-02-23 00:50:28 +04:00
unsigned long val ;
int ret ;
ret = kstrtoul_from_user ( ubuf , cnt , 10 , & val ) ;
if ( ret )
return ret ;
if ( buffer ) {
2013-01-12 01:14:10 +04:00
mutex_lock ( & trace_types_lock ) ;
if ( val ) {
2013-07-01 23:58:24 +04:00
tracer_tracing_on ( tr ) ;
2012-05-11 21:29:49 +04:00
if ( tr - > current_trace - > start )
tr - > current_trace - > start ( tr ) ;
2013-01-12 01:14:10 +04:00
} else {
2013-07-01 23:58:24 +04:00
tracer_tracing_off ( tr ) ;
2012-05-11 21:29:49 +04:00
if ( tr - > current_trace - > stop )
tr - > current_trace - > stop ( tr ) ;
2013-01-12 01:14:10 +04:00
}
mutex_unlock ( & trace_types_lock ) ;
2012-02-23 00:50:28 +04:00
}
( * ppos ) + + ;
return cnt ;
}
static const struct file_operations rb_simple_fops = {
2013-07-02 07:34:22 +04:00
. open = tracing_open_generic_tr ,
2012-02-23 00:50:28 +04:00
. read = rb_simple_read ,
. write = rb_simple_write ,
2013-07-02 07:34:22 +04:00
. release = tracing_release_generic_tr ,
2012-02-23 00:50:28 +04:00
. llseek = default_llseek ,
} ;
2012-08-04 00:10:49 +04:00
struct dentry * trace_instance_dir ;
static void
2015-01-20 20:13:40 +03:00
init_tracer_tracefs ( struct trace_array * tr , struct dentry * d_tracer ) ;
2012-08-04 00:10:49 +04:00
2013-03-08 07:48:09 +04:00
static int
allocate_trace_buffer ( struct trace_array * tr , struct trace_buffer * buf , int size )
2012-08-04 00:10:49 +04:00
{
enum ring_buffer_flags rb_flags ;
2013-03-06 06:13:47 +04:00
rb_flags = trace_flags & TRACE_ITER_OVERWRITE ? RB_FL_OVERWRITE : 0 ;
2014-01-14 19:19:46 +04:00
buf - > tr = tr ;
2013-03-08 07:48:09 +04:00
buf - > buffer = ring_buffer_alloc ( size , rb_flags ) ;
if ( ! buf - > buffer )
return - ENOMEM ;
2013-03-06 06:13:47 +04:00
2013-03-08 07:48:09 +04:00
buf - > data = alloc_percpu ( struct trace_array_cpu ) ;
if ( ! buf - > data ) {
ring_buffer_free ( buf - > buffer ) ;
return - ENOMEM ;
}
2013-03-06 06:13:47 +04:00
/* Allocate the first page for all buffers */
set_buffer_entries ( & tr - > trace_buffer ,
ring_buffer_size ( tr - > trace_buffer . buffer , 0 ) ) ;
2013-03-08 07:48:09 +04:00
return 0 ;
}
2013-03-06 06:13:47 +04:00
2013-03-08 07:48:09 +04:00
static int allocate_trace_buffers ( struct trace_array * tr , int size )
{
int ret ;
2013-03-06 06:13:47 +04:00
2013-03-08 07:48:09 +04:00
ret = allocate_trace_buffer ( tr , & tr - > trace_buffer , size ) ;
if ( ret )
return ret ;
2013-03-06 06:13:47 +04:00
2013-03-08 07:48:09 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
ret = allocate_trace_buffer ( tr , & tr - > max_buffer ,
allocate_snapshot ? size : 1 ) ;
if ( WARN_ON ( ret ) ) {
2013-03-06 06:13:47 +04:00
ring_buffer_free ( tr - > trace_buffer . buffer ) ;
2013-03-08 07:48:09 +04:00
free_percpu ( tr - > trace_buffer . data ) ;
return - ENOMEM ;
}
tr - > allocated_snapshot = allocate_snapshot ;
2013-03-06 06:13:47 +04:00
2013-03-08 07:48:09 +04:00
/*
* Only the top level trace array gets its snapshot allocated
* from the kernel command line .
*/
allocate_snapshot = false ;
2013-03-06 06:13:47 +04:00
# endif
2013-03-08 07:48:09 +04:00
return 0 ;
2013-03-06 06:13:47 +04:00
}
2014-06-10 20:06:30 +04:00
static void free_trace_buffer ( struct trace_buffer * buf )
{
if ( buf - > buffer ) {
ring_buffer_free ( buf - > buffer ) ;
buf - > buffer = NULL ;
free_percpu ( buf - > data ) ;
buf - > data = NULL ;
}
}
2014-06-06 08:01:46 +04:00
static void free_trace_buffers ( struct trace_array * tr )
{
if ( ! tr )
return ;
2014-06-10 20:06:30 +04:00
free_trace_buffer ( & tr - > trace_buffer ) ;
2014-06-06 08:01:46 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
2014-06-10 20:06:30 +04:00
free_trace_buffer ( & tr - > max_buffer ) ;
2014-06-06 08:01:46 +04:00
# endif
}
2015-01-21 18:01:39 +03:00
static int instance_mkdir ( const char * name )
2013-03-06 06:13:47 +04:00
{
2012-08-04 00:10:49 +04:00
struct trace_array * tr ;
int ret ;
mutex_lock ( & trace_types_lock ) ;
ret = - EEXIST ;
list_for_each_entry ( tr , & ftrace_trace_arrays , list ) {
if ( tr - > name & & strcmp ( tr - > name , name ) = = 0 )
goto out_unlock ;
}
ret = - ENOMEM ;
tr = kzalloc ( sizeof ( * tr ) , GFP_KERNEL ) ;
if ( ! tr )
goto out_unlock ;
tr - > name = kstrdup ( name , GFP_KERNEL ) ;
if ( ! tr - > name )
goto out_free_tr ;
2013-08-08 20:47:45 +04:00
if ( ! alloc_cpumask_var ( & tr - > tracing_cpumask , GFP_KERNEL ) )
goto out_free_tr ;
cpumask_copy ( tr - > tracing_cpumask , cpu_all_mask ) ;
2012-08-04 00:10:49 +04:00
raw_spin_lock_init ( & tr - > start_lock ) ;
2014-01-14 19:04:59 +04:00
tr - > max_lock = ( arch_spinlock_t ) __ARCH_SPIN_LOCK_UNLOCKED ;
2012-08-04 00:10:49 +04:00
tr - > current_trace = & nop_trace ;
INIT_LIST_HEAD ( & tr - > systems ) ;
INIT_LIST_HEAD ( & tr - > events ) ;
2013-03-06 06:13:47 +04:00
if ( allocate_trace_buffers ( tr , trace_buf_size ) < 0 )
2012-08-04 00:10:49 +04:00
goto out_free_tr ;
2015-01-20 20:13:40 +03:00
tr - > dir = tracefs_create_dir ( name , trace_instance_dir ) ;
2012-08-04 00:10:49 +04:00
if ( ! tr - > dir )
goto out_free_tr ;
ret = event_trace_add_tracer ( tr - > dir , tr ) ;
2013-07-11 04:34:34 +04:00
if ( ret ) {
2015-01-20 20:13:40 +03:00
tracefs_remove_recursive ( tr - > dir ) ;
2012-08-04 00:10:49 +04:00
goto out_free_tr ;
2013-07-11 04:34:34 +04:00
}
2012-08-04 00:10:49 +04:00
2015-01-20 20:13:40 +03:00
init_tracer_tracefs ( tr , tr - > dir ) ;
2012-08-04 00:10:49 +04:00
list_add ( & tr - > list , & ftrace_trace_arrays ) ;
mutex_unlock ( & trace_types_lock ) ;
return 0 ;
out_free_tr :
2014-06-06 08:01:46 +04:00
free_trace_buffers ( tr ) ;
2013-08-08 20:47:45 +04:00
free_cpumask_var ( tr - > tracing_cpumask ) ;
2012-08-04 00:10:49 +04:00
kfree ( tr - > name ) ;
kfree ( tr ) ;
out_unlock :
mutex_unlock ( & trace_types_lock ) ;
return ret ;
}
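instance_mkdir() above builds the instance in stages and unwinds through goto labels when a stage fails. The same shape in user space might look like the sketch below; struct instance_sketch, instance_create_sketch() and instance_destroy_sketch() are hypothetical, and the fail_buffer parameter exists only so a caller can exercise the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct instance_sketch {
	char *name;
	void *buffer;
};

/*
 * Allocate an instance in stages; on failure, jump to a label that
 * frees whatever was already set up.
 */
static int instance_create_sketch(const char *name, int fail_buffer,
				  struct instance_sketch **out)
{
	struct instance_sketch *inst;
	int ret = -ENOMEM;

	inst = calloc(1, sizeof(*inst));
	if (!inst)
		goto out;

	inst->name = malloc(strlen(name) + 1);
	if (!inst->name)
		goto out_free;
	strcpy(inst->name, name);

	inst->buffer = fail_buffer ? NULL : malloc(64);
	if (!inst->buffer)
		goto out_free;

	*out = inst;
	return 0;

out_free:
	/* free(NULL) is a no-op, so partially built instances are fine. */
	free(inst->buffer);
	free(inst->name);
	free(inst);
out:
	return ret;
}

/* Release everything a successful instance_create_sketch() allocated. */
static void instance_destroy_sketch(struct instance_sketch *inst)
{
	if (!inst)
		return;
	free(inst->buffer);
	free(inst->name);
	free(inst);
}
```

Zero-initialising the structure first is what makes a single cleanup label sufficient: every field that was never reached is still NULL when the label runs.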
2015-01-21 18:01:39 +03:00
static int instance_rmdir ( const char * name )
2012-08-08 00:14:16 +04:00
{
struct trace_array * tr ;
int found = 0 ;
int ret ;
mutex_lock ( & trace_types_lock ) ;
ret = - ENODEV ;
list_for_each_entry ( tr , & ftrace_trace_arrays , list ) {
if ( tr - > name & & strcmp ( tr - > name , name ) = = 0 ) {
found = 1 ;
break ;
}
}
if ( ! found )
goto out_unlock ;
2013-03-07 00:27:24 +04:00
ret = - EBUSY ;
2014-12-16 04:13:31 +03:00
if ( tr - > ref | | ( tr - > current_trace & & tr - > current_trace - > ref ) )
2013-03-07 00:27:24 +04:00
goto out_unlock ;
2012-08-08 00:14:16 +04:00
list_del ( & tr - > list ) ;
2014-01-14 17:43:01 +04:00
tracing_set_nop ( tr ) ;
2012-08-08 00:14:16 +04:00
event_trace_del_tracer ( tr ) ;
2014-01-11 01:17:45 +04:00
ftrace_destroy_function_files ( tr ) ;
2012-08-08 00:14:16 +04:00
debugfs_remove_recursive ( tr - > dir ) ;
2014-06-07 07:17:28 +04:00
free_trace_buffers ( tr ) ;
2012-08-08 00:14:16 +04:00
kfree ( tr - > name ) ;
kfree ( tr ) ;
ret = 0 ;
out_unlock :
mutex_unlock ( & trace_types_lock ) ;
return ret ;
}
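instance_rmdir() above distinguishes three outcomes: no instance with that name (-ENODEV), an instance that is still referenced (-EBUSY), and a successful unlink. A user-space sketch of that lookup-then-check order, using a plain singly linked list instead of the kernel list API and hypothetical *_sketch names:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct node_sketch {
	const char *name;
	int ref;			/* open-file references */
	struct node_sketch *next;
};

/* Unlink the named node unless it is missing (-ENODEV) or in use (-EBUSY). */
static int remove_by_name_sketch(struct node_sketch **head, const char *name)
{
	struct node_sketch **pp;

	for (pp = head; *pp; pp = &(*pp)->next) {
		if (strcmp((*pp)->name, name) == 0)
			break;
	}
	if (!*pp)
		return -ENODEV;
	if ((*pp)->ref)
		return -EBUSY;
	*pp = (*pp)->next;
	return 0;
}
```

The busy check happens before anything is unlinked, so a refused removal leaves the list exactly as it was.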
2012-08-04 00:10:49 +04:00
static __init void create_trace_instances ( struct dentry * d_tracer )
{
2015-01-21 18:01:39 +03:00
trace_instance_dir = tracefs_create_instance_dir ( " instances " , d_tracer ,
instance_mkdir ,
instance_rmdir ) ;
2012-08-04 00:10:49 +04:00
if ( WARN_ON ( ! trace_instance_dir ) )
return ;
}
2012-05-11 21:29:49 +04:00
static void
2015-01-20 20:13:40 +03:00
init_tracer_tracefs ( struct trace_array * tr , struct dentry * d_tracer )
2012-05-11 21:29:49 +04:00
{
2013-03-06 06:52:25 +04:00
int cpu ;
2012-05-11 21:29:49 +04:00
2013-11-07 07:42:48 +04:00
trace_create_file ( " available_tracers " , 0444 , d_tracer ,
tr , & show_traces_fops ) ;
trace_create_file ( " current_tracer " , 0644 , d_tracer ,
tr , & set_tracer_fops ) ;
2013-08-08 20:47:45 +04:00
trace_create_file ( " tracing_cpumask " , 0644 , d_tracer ,
tr , & tracing_cpumask_fops ) ;
2012-05-11 21:29:49 +04:00
trace_create_file ( " trace_options " , 0644 , d_tracer ,
tr , & tracing_iter_fops ) ;
trace_create_file ( " trace " , 0644 , d_tracer ,
2013-07-23 19:26:10 +04:00
tr , & tracing_fops ) ;
2012-05-11 21:29:49 +04:00
trace_create_file ( " trace_pipe " , 0444 , d_tracer ,
2013-07-23 19:25:57 +04:00
tr , & tracing_pipe_fops ) ;
2012-05-11 21:29:49 +04:00
trace_create_file ( " buffer_size_kb " , 0644 , d_tracer ,
2013-07-23 19:26:06 +04:00
tr , & tracing_entries_fops ) ;
2012-05-11 21:29:49 +04:00
trace_create_file ( " buffer_total_size_kb " , 0444 , d_tracer ,
tr , & tracing_total_entries_fops ) ;
2013-05-26 12:52:01 +04:00
trace_create_file ( " free_buffer " , 0200 , d_tracer ,
2012-05-11 21:29:49 +04:00
tr , & tracing_free_buffer_fops ) ;
trace_create_file ( " trace_marker " , 0220 , d_tracer ,
tr , & tracing_mark_fops ) ;
trace_create_file ( " trace_clock " , 0644 , d_tracer , tr ,
& trace_clock_fops ) ;
trace_create_file ( " tracing_on " , 0644 , d_tracer ,
2013-07-23 19:26:10 +04:00
tr , & rb_simple_fops ) ;
2013-03-06 06:23:55 +04:00
2014-01-14 20:28:38 +04:00
# ifdef CONFIG_TRACER_MAX_TRACE
trace_create_file ( " tracing_max_latency " , 0644 , d_tracer ,
& tr - > max_latency , & tracing_max_lat_fops ) ;
# endif
2014-01-11 01:17:45 +04:00
if ( ftrace_create_function_files ( tr , d_tracer ) )
WARN ( 1 , " Could not allocate function filter files " ) ;
2013-03-06 06:23:55 +04:00
# ifdef CONFIG_TRACER_SNAPSHOT
trace_create_file ( " snapshot " , 0644 , d_tracer ,
2013-07-23 19:26:10 +04:00
tr , & snapshot_fops ) ;
2013-03-06 06:23:55 +04:00
# endif
2013-03-06 06:52:25 +04:00
for_each_tracing_cpu ( cpu )
2015-01-20 20:13:40 +03:00
tracing_init_tracefs_percpu ( tr , cpu ) ;
2013-03-06 06:52:25 +04:00
2012-05-11 21:29:49 +04:00
}
2015-01-20 23:48:46 +03:00
static struct vfsmount * trace_automount ( void * ingore )
{
struct vfsmount * mnt ;
struct file_system_type * type ;
/*
* To maintain backward compatibility for tools that mount
* debugfs to get to the tracing facility , tracefs is automatically
* mounted to the debugfs / tracing directory .
*/
type = get_fs_type ( " tracefs " ) ;
if ( ! type )
return NULL ;
mnt = vfs_kern_mount ( type , 0 , " tracefs " , NULL ) ;
put_filesystem ( type ) ;
if ( IS_ERR ( mnt ) )
return NULL ;
mntget ( mnt ) ;
return mnt ;
}
2015-01-27 05:00:48 +03:00
/**
* tracing_init_dentry - initialize top level trace array
*
* This is called when creating files or directories in the tracing
* directory . It is called via fs_initcall ( ) by any of the boot up code
* and expects to return the dentry of the top level tracing directory .
*/
struct dentry * tracing_init_dentry ( void )
{
struct trace_array * tr = & global_trace ;
2015-01-20 23:48:46 +03:00
/* The top level trace array uses NULL as parent */
2015-01-27 05:00:48 +03:00
if ( tr - > dir )
2015-01-20 23:48:46 +03:00
return NULL ;
2015-01-27 05:00:48 +03:00
if ( WARN_ON ( ! debugfs_initialized ( ) ) )
return ERR_PTR ( - ENODEV ) ;
2015-01-20 23:48:46 +03:00
/*
* As there may still be users that expect the tracing
* files to exist in debugfs / tracing , we must automount
* the tracefs file system there , so older tools still
* work with the newer kernel .
*/
tr - > dir = debugfs_create_automount ( " tracing " , NULL ,
trace_automount , NULL ) ;
2015-01-27 05:00:48 +03:00
if ( ! tr - > dir ) {
pr_warn_once ( " Could not create debugfs directory 'tracing' \n " ) ;
return ERR_PTR ( - ENOMEM ) ;
}
2015-01-20 20:13:40 +03:00
return NULL ;
2015-01-27 05:00:48 +03:00
}
tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values
Several tracepoints use the helper functions __print_symbolic() or
__print_flags() and pass in enums that do the mapping between the
binary data stored and the value to print. This works well for reading
the ASCII trace files, but when the data is read via userspace tools
such as perf and trace-cmd, the conversion of the binary value to a
human string format is lost if an enum is used, as userspace does not
have access to what the ENUM is.
For example, the tracepoint trace_tlb_flush() has:
__print_symbolic(REC->reason,
{ TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
{ TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
{ TLB_LOCAL_SHOOTDOWN, "local shootdown" },
{ TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })
Which maps the enum values to the strings they represent. But perf and
trace-cmd do not know what value TLB_LOCAL_MM_SHOOTDOWN is, and would
not be able to map it.
With TRACE_DEFINE_ENUM(), developers can place these in the event header
files and ftrace will convert the enums to their values:
By adding:
TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);
$ cat /sys/kernel/debug/tracing/events/tlb/tlb_flush/format
[...]
__print_symbolic(REC->reason,
{ 0, "flush on task switch" },
{ 1, "remote shootdown" },
{ 2, "local shootdown" },
{ 3, "local mm shootdown" })
The above is what userspace expects to see, and tools do not need to
be modified to parse them.
Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org
Cc: Guilherme Cox <cox@computer.org>
Cc: Tony Luck <tony.luck@gmail.com>
Cc: Xie XiuQi <xiexiuqi@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
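The substitution described above can be sketched in userspace C. This is only an illustration of the idea, not the kernel's implementation: struct enum_map_sketch and resolve_enums() are hypothetical names, and the sketch copies into a second buffer rather than rewriting the format string in place. It replaces whole-word enum names that appear outside quoted strings with their numeric values.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical name -> value pair; not the kernel's struct trace_enum_map. */
struct enum_map_sketch {
	const char *name;
	long value;
};

static int is_word_char(char c)
{
	return isalnum((unsigned char)c) || c == '_';
}

/*
 * Copy src to dst, replacing every whole-word enum name found outside
 * of double quotes with its decimal value.
 * Returns 0 on success, -1 if dst is too small.
 */
static int resolve_enums(const char *src, char *dst, size_t size,
			 const struct enum_map_sketch *maps, size_t nmaps)
{
	size_t out = 0;
	int in_quote = 0;

	if (size == 0)
		return -1;

	while (*src) {
		char num[32];
		const char *text = src;
		size_t len = 1;
		size_t text_len;
		size_t i;
		int word = 0;

		if (*src == '"') {
			in_quote = !in_quote;
		} else if (in_quote) {
			/* Keep an escaped character together with its backslash. */
			if (*src == '\\' && src[1])
				len = 2;
		} else if (is_word_char(*src)) {
			/* Take the whole word so FOO never matches inside FOO_BAR. */
			word = 1;
			while (is_word_char(src[len]))
				len++;
		}
		text_len = len;

		for (i = 0; word && i < nmaps; i++) {
			if (strlen(maps[i].name) == len &&
			    memcmp(maps[i].name, src, len) == 0) {
				text_len = (size_t)snprintf(num, sizeof(num),
							    "%ld", maps[i].value);
				text = num;
				break;
			}
		}

		/* Always leave room for the terminating NUL. */
		if (out + text_len >= size)
			return -1;
		memcpy(dst + out, text, text_len);
		out += text_len;
		src += len;
	}
	dst[out] = '\0';
	return 0;
}
```

Scanning a whole word before comparing is what keeps a name such as TLB_REMOTE_SHOOTDOWN from matching inside a longer identifier, and skipping quoted text leaves the human readable strings untouched.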
2015-03-25 00:58:09 +03:00
extern struct trace_enum_map * __start_ftrace_enum_maps [ ] ;
extern struct trace_enum_map * __stop_ftrace_enum_maps [ ] ;
static void __init trace_enum_init ( void )
{
2015-03-25 22:44:21 +03:00
int len ;
len = __stop_ftrace_enum_maps - __start_ftrace_enum_maps ;
2015-04-01 00:23:45 +03:00
trace_insert_enum_map ( NULL , __start_ftrace_enum_maps , len ) ;
2015-03-25 22:44:21 +03:00
}
# ifdef CONFIG_MODULES
static void trace_module_add_enums ( struct module * mod )
{
if ( ! mod - > num_trace_enums )
return ;
/*
* Modules with bad taint do not have events created , do
* not bother with enums either .
*/
if ( trace_module_has_bad_taint ( mod ) )
return ;
2015-04-01 00:23:45 +03:00
trace_insert_enum_map ( mod , mod - > trace_enums , mod - > num_trace_enums ) ;
2015-03-25 22:44:21 +03:00
}
2015-04-01 00:23:45 +03:00
# ifdef CONFIG_TRACE_ENUM_MAP_FILE
static void trace_module_remove_enums ( struct module * mod )
{
union trace_enum_map_item * map ;
union trace_enum_map_item * * last = & trace_enum_maps ;
if ( ! mod - > num_trace_enums )
return ;
mutex_lock ( & trace_enum_mutex ) ;
map = trace_enum_maps ;
while ( map ) {
if ( map - > head . mod = = mod )
break ;
map = trace_enum_jmp_to_tail ( map ) ;
last = & map - > tail . next ;
map = map - > tail . next ;
}
if ( ! map )
goto out ;
* last = trace_enum_jmp_to_tail ( map ) - > tail . next ;
kfree ( map ) ;
out :
mutex_unlock ( & trace_enum_mutex ) ;
}
# else
static inline void trace_module_remove_enums ( struct module * mod ) { }
# endif /* CONFIG_TRACE_ENUM_MAP_FILE */
2015-03-25 22:44:21 +03:00
static int trace_module_notify ( struct notifier_block * self ,
unsigned long val , void * data )
{
struct module * mod = data ;
switch ( val ) {
case MODULE_STATE_COMING :
trace_module_add_enums ( mod ) ;
break ;
2015-04-01 00:23:45 +03:00
case MODULE_STATE_GOING :
trace_module_remove_enums ( mod ) ;
break ;
2015-03-25 22:44:21 +03:00
}
return 0 ;
2015-03-25 00:58:09 +03:00
}
2015-03-25 22:44:21 +03:00
static struct notifier_block trace_module_nb = {
. notifier_call = trace_module_notify ,
. priority = 0 ,
} ;
2015-04-01 00:23:45 +03:00
# endif /* CONFIG_MODULES */
2015-03-25 22:44:21 +03:00
2015-01-20 20:13:40 +03:00
static __init int tracer_init_tracefs ( void )
2008-05-12 23:20:42 +04:00
{
struct dentry * d_tracer ;
2010-01-06 15:08:50 +03:00
trace_access_lock_init ( ) ;
2008-05-12 23:20:42 +04:00
d_tracer = tracing_init_dentry ( ) ;
2015-01-20 19:14:16 +03:00
if ( IS_ERR ( d_tracer ) )
2013-04-10 04:18:12 +04:00
return 0 ;
2008-05-12 23:20:42 +04:00
2015-01-20 20:13:40 +03:00
init_tracer_tracefs ( & global_trace , d_tracer ) ;
2008-05-12 23:20:42 +04:00
2009-03-27 02:25:38 +03:00
trace_create_file ( " tracing_thresh " , 0644 , d_tracer ,
2014-07-18 15:17:27 +04:00
& global_trace , & tracing_thresh_fops ) ;
2009-02-27 06:19:12 +03:00
2009-04-17 06:34:30 +04:00
trace_create_file ( " README " , 0444 , d_tracer ,
2009-03-27 02:25:38 +03:00
NULL , & tracing_readme_fops ) ;
2009-04-11 00:04:48 +04:00
trace_create_file ( " saved_cmdlines " , 0444 , d_tracer ,
NULL , & tracing_saved_cmdlines_fops ) ;
2008-09-16 23:06:42 +04:00
2014-06-05 05:24:27 +04:00
trace_create_file ( " saved_cmdlines_size " , 0644 , d_tracer ,
NULL , & tracing_saved_cmdlines_size_fops ) ;
2015-03-25 00:58:09 +03:00
trace_enum_init ( ) ;
2015-04-01 00:23:45 +03:00
trace_create_enum_file ( d_tracer ) ;
2015-03-25 22:44:21 +03:00
# ifdef CONFIG_MODULES
register_module_notifier ( & trace_module_nb ) ;
# endif
2008-05-12 23:20:42 +04:00
# ifdef CONFIG_DYNAMIC_FTRACE
2009-03-27 02:25:38 +03:00
trace_create_file ( " dyn_ftrace_total_info " , 0444 , d_tracer ,
& ftrace_update_tot_cnt , & tracing_dyn_info_fops ) ;
2008-05-12 23:20:42 +04:00
# endif
tracing/core: introduce per cpu tracing files
Impact: split up tracing output per cpu
Currently, in the tracing debugfs directory, three files are
available to let the user extract the trace output:
- trace is an iterator through the ring-buffer. It's a reader
but not a consumer. It doesn't block when no more traces are
available.
- trace is pretty similar to the former, except that it adds more
information such as preempt count, irq flag, ...
- trace_pipe is a reader and a consumer, it will also block
waiting for traces if necessary (heh, yes it's a pipe).
The traces coming from different cpus are currently mixed up
inside these files. Sometimes this muddles the information,
sometimes it's useful, depending on what the tracer
captures.
The tracing_cpumask file is useful to filter the output and
select only the traces captured on a custom defined set of cpus.
But it is still not powerful enough to extract
one trace buffer per cpu at the same time.
So this patch creates a new directory: /debug/tracing/per_cpu/.
Inside this directory, you will now find one trace_pipe file and
one trace file per cpu.
Which means if you have two cpus, you will have:
trace0
trace1
trace_pipe0
trace_pipe1
And of course, reading these files will have the same effect
as with the usual tracing files, except that you will only see
the traces from the given cpu.
The original all-in-one cpu trace files are still available in
their original place.
Until now, only one consumer was allowed on trace_pipe to avoid
racy consuming on the ring-buffer. Now the approach has changed a
bit: you can have only one consumer per cpu.
Which means you are allowed to read trace_pipe0 and trace_pipe1
concurrently, but you can't have two readers on trace_pipe0 or
trace_pipe1.
Following the same logic, if there is one reader on the common
trace_pipe, you can not have another reader on trace_pipe0 or
trace_pipe1 at the same time, because trace_pipe is already
in essence a consumer of all cpu buffers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
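The one-consumer-per-cpu rule described above can be modelled with a bitmask of claimed cpus. The following is a userspace sketch using C11 atomics, not the kernel code: reader_mask, pipe_claim(), pipe_release(), NR_CPUS_SKETCH and ALL_CPUS_SKETCH are all hypothetical names chosen for illustration. A per-cpu reader claims one bit, while a reader of the common trace_pipe must claim every bit at once.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH	4	/* assumed cpu count for the sketch */
#define ALL_CPUS_SKETCH	(-1)	/* stands for the common trace_pipe */

/* One bit per cpu; a set bit means that cpu buffer has a consumer. */
static atomic_uint reader_mask;

static unsigned int cpu_to_bits(int cpu)
{
	if (cpu == ALL_CPUS_SKETCH)
		return (1u << NR_CPUS_SKETCH) - 1;
	return 1u << cpu;
}

/* Try to become the consumer of one cpu buffer, or of all of them. */
static bool pipe_claim(int cpu)
{
	unsigned int want = cpu_to_bits(cpu);
	unsigned int old = atomic_load(&reader_mask);

	do {
		/* Someone already consumes at least one buffer we need. */
		if (old & want)
			return false;
		/* On failure the compare-exchange reloads old and we retry. */
	} while (!atomic_compare_exchange_weak(&reader_mask, &old, old | want));

	return true;
}

/* Give the claimed buffer(s) back when the reader closes the file. */
static void pipe_release(int cpu)
{
	atomic_fetch_and(&reader_mask, ~cpu_to_bits(cpu));
}
```

Claiming all bits in a single compare-exchange is what prevents a common trace_pipe reader from coexisting with any per-cpu reader, and the other way around.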
2009-02-25 05:22:28 +03:00
2012-08-04 00:10:49 +04:00
create_trace_instances ( d_tracer ) ;
2009-03-27 02:25:38 +03:00
2012-05-11 21:29:49 +04:00
create_trace_options_dir ( & global_trace ) ;
2009-02-25 05:22:28 +03:00
2015-02-03 20:45:53 +03:00
/* If the tracer was started via cmdline, create options for it here */
if ( global_trace . current_trace ! = & nop_trace )
update_tracer_options ( & global_trace , global_trace . current_trace ) ;
2008-09-23 14:34:32 +04:00
return 0 ;
2008-05-12 23:20:42 +04:00
}
2008-07-31 06:36:46 +04:00
static int trace_panic_handler ( struct notifier_block * this ,
unsigned long event , void * unused )
{
2008-10-24 03:26:08 +04:00
if ( ftrace_dump_on_oops )
2010-04-18 21:08:41 +04:00
ftrace_dump ( ftrace_dump_on_oops ) ;
2008-07-31 06:36:46 +04:00
return NOTIFY_OK ;
}
static struct notifier_block trace_panic_notifier = {
. notifier_call = trace_panic_handler ,
. next = NULL ,
. priority = 150 /* priority: INT_MAX >= x >= 0 */
} ;
static int trace_die_handler ( struct notifier_block * self ,
unsigned long val ,
void * data )
{
switch ( val ) {
case DIE_OOPS :
2008-10-24 03:26:08 +04:00
if ( ftrace_dump_on_oops )
2010-04-18 21:08:41 +04:00
ftrace_dump ( ftrace_dump_on_oops ) ;
2008-07-31 06:36:46 +04:00
break ;
default :
break ;
}
return NOTIFY_OK ;
}
static struct notifier_block trace_die_notifier = {
. notifier_call = trace_die_handler ,
. priority = 200
} ;
/*
* printk is set to max of 1024 , we really don ' t need it that big .
* Nothing should be printing 1000 characters anyway .
*/
# define TRACE_MAX_PRINT 1000
/*
* Define here KERN_TRACE so that we have one place to modify
* it if we decide to change what log level the ftrace dump
* should be at .
*/
2009-01-14 20:24:42 +03:00
# define KERN_TRACE KERN_EMERG
2008-07-31 06:36:46 +04:00
2010-08-05 18:22:23 +04:00
void
2008-07-31 06:36:46 +04:00
trace_printk_seq ( struct trace_seq * s )
{
/* Probably should print a warning here. */
2014-06-25 23:54:42 +04:00
if ( s - > seq . len > = TRACE_MAX_PRINT )
s - > seq . len = TRACE_MAX_PRINT ;
2008-07-31 06:36:46 +04:00
2014-11-19 18:56:41 +03:00
/*
* More paranoid code . Although the buffer size is set to
* PAGE_SIZE , and TRACE_MAX_PRINT is 1000 , this is just
* an extra layer of protection .
*/
if ( WARN_ON_ONCE ( s - > seq . len > = s - > seq . size ) )
s - > seq . len = s - > seq . size - 1 ;
2008-07-31 06:36:46 +04:00
/* should be zero ended, but we are paranoid. */
2014-06-25 23:54:42 +04:00
s - > buffer [ s - > seq . len ] = 0 ;
2008-07-31 06:36:46 +04:00
printk ( KERN_TRACE " %s " , s - > buffer ) ;
2009-03-02 22:04:40 +03:00
trace_seq_init ( s ) ;
2008-07-31 06:36:46 +04:00
}
2010-08-05 18:22:23 +04:00
void trace_init_global_iter ( struct trace_iterator * iter )
{
iter - > tr = & global_trace ;
2012-05-11 21:29:49 +04:00
iter - > trace = iter - > tr - > current_trace ;
2013-01-24 00:22:59 +04:00
iter - > cpu_file = RING_BUFFER_ALL_CPUS ;
tracing: Consolidate max_tr into main trace_array structure
Currently, the way the latency tracers and snapshot feature work
is to have a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, the running buffer is
swapped with this snapshot buffer to save the current max
latency.
The only items needed for the max_tr are really just a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer, that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_trace, but the instances of traces can now use
their own snapshot feature, instead of only the top level global_trace having
the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 18:24:35 +04:00
iter - > trace_buffer = & global_trace . trace_buffer ;
2013-10-23 22:49:57 +04:00
if ( iter - > trace & & iter - > trace - > open )
iter - > trace - > open ( iter ) ;
/* Annotate start of buffers if we had overruns */
if ( ring_buffer_overruns ( iter - > trace_buffer - > buffer ) )
iter - > iter_flags | = TRACE_FILE_ANNOTATE ;
/* Output in nanoseconds only if we are using a clock in nanoseconds. */
if ( trace_clocks [ iter - > tr - > clock_id ] . in_ns )
iter - > iter_flags | = TRACE_FILE_TIME_IN_NS ;
2010-08-05 18:22:23 +04:00
}
tracing: Fix ftrace_dump()
ftrace_dump() had a lot of issues. What ftrace_dump() does is, when
ftrace_dump_on_oops is set (via a kernel parameter or sysctl), it
dumps the ftrace buffers to the console when an oops, a
panic, or a sysrq-z occurs.
This was written a long time ago when ftrace was fragile to recursion.
But it wasn't written well even for that.
There's a possible deadlock that can occur if a ftrace_dump() is happening
and an NMI triggers another dump. This is because it grabs a lock
before checking if the dump ran.
It also totally disables ftrace and tracing for no good reason.
As the ring_buffer now checks if it is read via an oops or NMI, where
there's a chance that the buffer gets corrupted, it will disable
itself. No need to have ftrace_dump() do the same.
ftrace_dump() is now cleaned up so that it uses an atomic counter to
make sure only one dump happens at a time. A simple atomic_inc_return()
is all that is needed to cover both other CPUs and NMIs. No need for
a spinlock: if one CPU is running the dump, no other CPU needs
to do it too.
The tracing_on variable is turned off and not turned back on. The original
code did this, but it wasn't pretty. By just disabling this variable
we get the result of not seeing traces that happen between crashes.
For sysrq-z, it doesn't get turned back on, but the user can always write
a '1' to the tracing_on file. If they are using sysrq-z, then they should
know about tracing_on.
The new code is much easier to read and less error prone. No more
deadlock possibility when an NMI triggers here.
Reported-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
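The atomic counter guard described above can be sketched in userspace with C11 atomics. This is an illustration under assumptions, not the kernel code: dump_once() and the test callbacks are hypothetical names, and atomic_fetch_add() plus one stands in for the kernel's atomic_inc_return(). A nested call, such as one made from an NMI while a dump is running, sees a count other than 1 and backs out without waiting on any lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int dump_running;
static int dumps_done;

/*
 * Run do_dump() unless a dump is already in progress.
 * Returns true if this caller performed the dump.
 */
static bool dump_once(void (*do_dump)(void))
{
	/* fetch_add returns the old value, so old + 1 is the new count. */
	if (atomic_fetch_add(&dump_running, 1) + 1 != 1) {
		/* Someone else is dumping: undo our increment and leave. */
		atomic_fetch_sub(&dump_running, 1);
		return false;
	}

	do_dump();

	/* Allow a later dump once this one has finished. */
	atomic_fetch_sub(&dump_running, 1);
	return true;
}

/* Callbacks used only to exercise the guard. */
static bool nested_result = true;

static void plain_dump(void)
{
	dumps_done++;
}

static void dump_with_nested_attempt(void)
{
	dumps_done++;
	/* Simulates an NMI requesting a second dump mid-dump. */
	nested_result = dump_once(plain_dump);
}
```

Because the losing caller never blocks, there is nothing for an NMI to deadlock on, which is the property the commit message relies on.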
2013-03-15 21:10:35 +04:00
void ftrace_dump ( enum ftrace_dump_mode oops_dump_mode )
2008-07-31 06:36:46 +04:00
{
/* use static because iter can be a bit big for the stack */
static struct trace_iterator iter ;
2013-03-15 21:10:35 +04:00
static atomic_t dump_running ;
2009-03-22 07:04:35 +03:00
unsigned int old_userobj ;
2008-10-01 08:29:53 +04:00
unsigned long flags ;
int cnt = 0 , cpu ;
2008-07-31 06:36:46 +04:00
2013-03-15 21:10:35 +04:00
/* Only allow one dump user at a time. */
if ( atomic_inc_return ( & dump_running ) ! = 1 ) {
atomic_dec ( & dump_running ) ;
return ;
}
2008-07-31 06:36:46 +04:00
2013-03-15 21:10:35 +04:00
/*
* Always turn off tracing when we dump .
* We don ' t need to show trace output of what happens
* between multiple crashes .
*
* If the user does a sysrq - z , then they can re - enable
* tracing with echo 1 > tracing_on .
*/
2009-01-14 22:50:19 +03:00
tracing_off ( ) ;
2009-03-22 07:04:35 +03:00
tracing: Fix ftrace_dump()
ftrace_dump() had a lot of issues. When ftrace_dump_on_oops is set
(via a kernel parameter or sysctl), ftrace_dump() dumps the ftrace
buffers to the console when an oops, a panic, or a sysrq-z occurs.
This was written a long time ago when ftrace was fragile to recursion.
But it wasn't written well even for that.
There's a possible deadlock that can occur if an ftrace_dump() is happening
and an NMI triggers another dump. This is because it grabs a lock
before checking if the dump ran.
It also totally disables ftrace and tracing for no good reason.
As the ring_buffer now checks whether it is being read via an oops or NMI,
where there's a chance that the buffer gets corrupted, it will disable
itself. No need to have ftrace_dump() do the same.
ftrace_dump() is now cleaned up so that it uses an atomic counter to
make sure only one dump happens at a time. A simple atomic_inc_return()
is all that is needed for both other CPUs and NMIs. No need for
a spinlock: if one CPU is running the dump, no other CPU needs
to do it too.
The tracing_on variable is turned off and not turned back on. The original
code did this, but it wasn't pretty. By just disabling this variable
we get the result of not seeing traces that happen between crashes.
For sysrq-z, it doesn't get turned back on, but the user can always write
a '1' to the tracing_on file. If they are using sysrq-z, then they should
know about tracing_on.
The new code is much easier to read and less error prone. No more
deadlock possibility when an NMI triggers here.
Reported-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
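The single-dumper guard described in the commit message can be sketched in user space as follows. This is only an illustration under stated assumptions: it uses C11 `<stdatomic.h>` instead of the kernel's `atomic_t` API, and `dump_try_enter()` and `dump_exit()` are hypothetical helper names that do not exist in the kernel source.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's dump_running counter. */
static atomic_int dump_running;

/*
 * Try to become the only dumper. Returns true if the caller may proceed.
 * atomic_fetch_add() returns the old value, so "old + 1" plays the role
 * of the value the kernel's atomic_inc_return() would hand back.
 */
static bool dump_try_enter(void)
{
	if (atomic_fetch_add(&dump_running, 1) + 1 != 1) {
		/* Another CPU (or an NMI) is already dumping: undo and back off. */
		atomic_fetch_sub(&dump_running, 1);
		return false;
	}
	return true;
}

/* Drop the guard once the dump is finished. */
static void dump_exit(void)
{
	atomic_fetch_sub(&dump_running, 1);
}
```

Because a losing caller never waits, an NMI that arrives while a dump is in progress simply returns, which is why no spinlock is needed for this pattern.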
2013-03-15 21:10:35 +04:00
local_irq_save ( flags ) ;
2008-07-31 06:36:46 +04:00
2013-01-25 14:03:07 +04:00
/* Simulate the iterator */
2010-08-05 18:22:23 +04:00
trace_init_global_iter ( & iter ) ;
2008-10-01 08:29:53 +04:00
for_each_tracing_cpu ( cpu ) {
2015-06-22 14:25:06 +03:00
atomic_inc ( & per_cpu_ptr ( iter . trace_buffer - > data , cpu ) - > disabled ) ;
2008-10-01 08:29:53 +04:00
}
2009-03-22 07:04:35 +03:00
old_userobj = trace_flags & TRACE_ITER_SYM_USEROBJ ;
2008-11-22 14:28:48 +03:00
/* don't look at user memory in panic mode */
trace_flags & = ~ TRACE_ITER_SYM_USEROBJ ;
2010-04-18 21:08:41 +04:00
switch ( oops_dump_mode ) {
case DUMP_ALL :
2013-01-24 00:22:59 +04:00
iter . cpu_file = RING_BUFFER_ALL_CPUS ;
2010-04-18 21:08:41 +04:00
break ;
case DUMP_ORIG :
iter . cpu_file = raw_smp_processor_id ( ) ;
break ;
case DUMP_NONE :
goto out_enable ;
default :
printk ( KERN_TRACE " Bad dumping mode, switching to all CPUs dump \n " ) ;
2013-01-24 00:22:59 +04:00
iter . cpu_file = RING_BUFFER_ALL_CPUS ;
2010-04-18 21:08:41 +04:00
}
printk ( KERN_TRACE " Dumping ftrace buffer: \n " ) ;
2008-07-31 06:36:46 +04:00
2013-03-15 21:10:35 +04:00
/* Did function tracer already get disabled? */
if ( ftrace_is_dead ( ) ) {
printk ( " # WARNING: FUNCTION TRACING IS CORRUPTED \n " ) ;
printk ( " # MAY BE MISSING FUNCTION EVENTS \n " ) ;
}
2008-07-31 06:36:46 +04:00
/*
* We need to stop all tracing on all CPUs to read
* the next buffer. This is a bit expensive, but is
* not done often. We fill in all that we can read,
* and then release the locks again.
*/
while ( ! trace_empty ( & iter ) ) {
if ( ! cnt )
printk ( KERN_TRACE " --------------------------------- \n " ) ;
cnt + + ;
/* reset all but tr, trace, and overruns */
memset ( & iter . seq , 0 ,
sizeof ( struct trace_iterator ) -
offsetof ( struct trace_iterator , seq ) ) ;
iter . iter_flags | = TRACE_FILE_LAT_FMT ;
iter . pos = - 1 ;
2010-08-05 18:22:23 +04:00
if ( trace_find_next_entry_inc ( & iter ) ! = NULL ) {
2009-07-28 16:17:22 +04:00
int ret ;
ret = print_trace_line ( & iter ) ;
if ( ret ! = TRACE_TYPE_NO_CONSUME )
trace_consume ( & iter ) ;
2008-07-31 06:36:46 +04:00
}
2012-03-02 07:06:48 +04:00
touch_nmi_watchdog ( ) ;
2008-07-31 06:36:46 +04:00
trace_printk_seq ( & iter . seq ) ;
}
if ( ! cnt )
printk ( KERN_TRACE " (ftrace buffer empty) \n " ) ;
else
printk ( KERN_TRACE " --------------------------------- \n " ) ;
2010-04-18 21:08:41 +04:00
out_enable :
2013-03-15 21:10:35 +04:00
trace_flags | = old_userobj ;
2009-03-22 07:04:35 +03:00
2013-03-15 21:10:35 +04:00
for_each_tracing_cpu ( cpu ) {
atomic_dec ( & per_cpu_ptr ( iter . trace_buffer - > data , cpu ) - > disabled ) ;
2009-03-22 07:04:35 +03:00
}
2013-03-15 21:10:35 +04:00
atomic_dec ( & dump_running ) ;
2009-04-28 19:39:34 +04:00
local_irq_restore ( flags ) ;
2008-07-31 06:36:46 +04:00
}
2011-10-02 22:01:15 +04:00
EXPORT_SYMBOL_GPL ( ftrace_dump ) ;
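The "reset all but tr, trace, and overruns" step in the dump loop relies on clearing only the tail of a structure, from one member to the end. A minimal user-space sketch of that idiom, assuming a made-up `struct demo_iter` (its layout is hypothetical and much smaller than the real `struct trace_iterator`):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical iterator: members declared before 'seq' survive a reset. */
struct demo_iter {
	void *tr;          /* kept across resets */
	long overruns;     /* kept across resets */
	char seq[64];      /* cleared, together with everything after it */
	int pos;
	unsigned int flags;
};

/* Zero every byte from 'seq' to the end of the structure. */
static void demo_iter_reset(struct demo_iter *it)
{
	size_t start = offsetof(struct demo_iter, seq);

	memset((char *)it + start, 0, sizeof(*it) - start);
}
```

The idiom depends on member order: any field that must persist has to be declared before the first cleared member.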
2009-03-22 07:04:35 +03:00
2008-09-30 07:02:41 +04:00
__init static int tracer_alloc_buffers ( void )
2008-05-12 23:20:42 +04:00
{
2009-03-11 20:42:01 +03:00
int ring_buf_size ;
2009-01-01 02:42:22 +03:00
int ret = - ENOMEM ;
2008-05-12 23:20:43 +04:00
2009-01-01 02:42:22 +03:00
if ( ! alloc_cpumask_var ( & tracing_buffer_mask , GFP_KERNEL ) )
goto out ;
2013-08-08 20:47:45 +04:00
if ( ! alloc_cpumask_var ( & global_trace . tracing_cpumask , GFP_KERNEL ) )
2009-01-01 02:42:22 +03:00
goto out_free_buffer_mask ;
2008-05-12 23:20:43 +04:00
tracing: Add percpu buffers for trace_printk()
Currently, trace_printk() uses a single buffer to write into
to calculate the size and format needed to save the trace. To
do this safely in an SMP environment, a spin_lock() is taken
to only allow one writer at a time to the buffer. But this could
also affect what is being traced, and add synchronization that
would not be there otherwise.
Ideally, using percpu buffers would be useful, but since trace_printk()
is only used in development, having per cpu buffers for something
never used is a waste of space. Thus, the use of the trace_bprintk()
format section is changed to be used for static fmts as well as dynamic ones.
Then at boot up, we can check if the section that holds the trace_printk
formats is non-empty, and if it does contain something, then we
know a trace_printk() has been added to the kernel. At this time
the trace_printk per cpu buffers are allocated. A check is also
done at module load time in case a module is added that contains a
trace_printk().
Once the buffers are allocated, they are never freed. If you use
a trace_printk() then you should know what you are doing.
A buffer is made for each type of context:
normal
softirq
irq
nmi
The context is checked and the appropriate buffer is used.
This allows for totally lockless usage of trace_printk(),
and they no longer even disable interrupts.
Requested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
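The per-context buffer selection described above can be sketched as a lookup indexed by CPU and context. This is an assumption-laden illustration, not the kernel's code: the `demo_*` names, the fixed CPU count, and the buffer size are all hypothetical, and the caller is assumed to already know which context it is running in.

```c
#include <assert.h>
#include <stddef.h>

/* One slot per context type named in the commit message. */
enum demo_ctx { CTX_NORMAL, CTX_SOFTIRQ, CTX_IRQ, CTX_NMI, CTX_MAX };

#define DEMO_NR_CPUS  4     /* hypothetical CPU count */
#define DEMO_BUF_SIZE 1024  /* hypothetical buffer size */

/* One scratch buffer per (cpu, context) pair. */
static char demo_bufs[DEMO_NR_CPUS][CTX_MAX][DEMO_BUF_SIZE];

/* Return the scratch buffer for this CPU and context, or NULL if out of range. */
static char *demo_get_buf(unsigned int cpu, enum demo_ctx ctx)
{
	if (cpu >= DEMO_NR_CPUS || (unsigned int)ctx >= CTX_MAX)
		return NULL;
	return demo_bufs[cpu][ctx];
}
```

The reasoning behind such a layout, as far as the commit message implies, is that code running in one context on a CPU can only be preempted by a different context, which writes to a different buffer, so no lock is required.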
2011-09-22 22:01:55 +04:00
/* Only allocate trace_printk buffers if a trace_printk exists */
if ( __stop___trace_bprintk_fmt ! = __start___trace_bprintk_fmt )
2012-10-11 18:15:05 +04:00
/* Must be called before global_trace.buffer is allocated */
2011-09-22 22:01:55 +04:00
trace_printk_init_buffers ( ) ;
2009-03-11 20:42:01 +03:00
/* To save memory, keep the ring buffer size to its minimum */
if ( ring_buffer_expanded )
ring_buf_size = trace_buf_size ;
else
ring_buf_size = 1 ;
2009-01-01 02:42:22 +03:00
cpumask_copy ( tracing_buffer_mask , cpu_possible_mask ) ;
2013-08-08 20:47:45 +04:00
cpumask_copy ( global_trace . tracing_cpumask , cpu_all_mask ) ;
2009-01-01 02:42:22 +03:00
2012-05-11 21:29:49 +04:00
raw_spin_lock_init ( & global_trace . start_lock ) ;
2014-03-26 07:39:41 +04:00
/* Used for event triggers */
temp_buffer = ring_buffer_alloc ( PAGE_SIZE , RB_FL_OVERWRITE ) ;
if ( ! temp_buffer )
goto out_free_cpumask ;
2014-06-05 05:24:27 +04:00
if ( trace_create_savedcmd ( ) < 0 )
goto out_free_temp_buffer ;
2009-01-01 02:42:22 +03:00
/* TODO: make the number of buffers hot pluggable with CPUS */
2013-03-06 06:13:47 +04:00
if ( allocate_trace_buffers ( & global_trace , ring_buf_size ) < 0 ) {
2008-09-30 07:02:41 +04:00
printk ( KERN_ERR " tracer: failed to allocate ring buffer! \n " ) ;
WARN_ON ( 1 ) ;
2014-06-05 05:24:27 +04:00
goto out_free_savedcmd ;
2008-05-12 23:20:43 +04:00
}
2012-08-07 00:24:11 +04:00
2012-02-23 00:50:28 +04:00
if ( global_trace . buffer_disabled )
tracing_off ( ) ;
2008-05-12 23:20:43 +04:00
2014-02-11 08:38:46 +04:00
if ( trace_boot_clock ) {
ret = tracing_set_clock ( & global_trace , trace_boot_clock ) ;
if ( ret < 0 )
pr_warning ( " Trace clock %s not defined, going back to default \n " ,
trace_boot_clock ) ;
}
2013-05-23 19:51:10 +04:00
/*
* register_tracer() might reference current_trace, so it
* needs to be set before we register anything. This is
* just a bootstrap of current_trace anyway.
*/
2012-05-11 21:29:49 +04:00
global_trace . current_trace = & nop_trace ;
2014-01-14 19:04:59 +04:00
global_trace . max_lock = ( arch_spinlock_t ) __ARCH_SPIN_LOCK_UNLOCKED ;
2014-01-11 02:01:58 +04:00
ftrace_init_global_array_ops ( & global_trace ) ;
2013-05-23 19:51:10 +04:00
register_tracer ( & nop_trace ) ;
2008-05-12 23:20:44 +04:00
/* All seems OK, enable tracing */
tracing_disabled = 0 ;
2008-09-30 07:02:41 +04:00
2008-07-31 06:36:46 +04:00
atomic_notifier_chain_register ( & panic_notifier_list ,
& trace_panic_notifier ) ;
register_die_notifier ( & trace_die_notifier ) ;
2009-03-16 03:45:03 +03:00
2012-05-04 07:09:03 +04:00
global_trace . flags = TRACE_ARRAY_FL_GLOBAL ;
INIT_LIST_HEAD ( & global_trace . systems ) ;
INIT_LIST_HEAD ( & global_trace . events ) ;
list_add ( & global_trace . list , & ftrace_trace_arrays ) ;
2012-11-02 06:56:07 +04:00
while ( trace_boot_options ) {
char * option ;
option = strsep ( & trace_boot_options , " , " ) ;
2012-05-11 21:29:49 +04:00
trace_set_options ( & global_trace , option ) ;
2012-11-02 06:56:07 +04:00
}
2013-03-12 19:49:18 +04:00
register_snapshot_cmd ( ) ;
2009-03-16 03:45:03 +03:00
return 0 ;
2008-07-31 06:36:46 +04:00
2014-06-05 05:24:27 +04:00
out_free_savedcmd :
free_saved_cmdlines_buffer ( savedcmd ) ;
2014-03-26 07:39:41 +04:00
out_free_temp_buffer :
ring_buffer_free ( temp_buffer ) ;
2009-01-01 02:42:22 +03:00
out_free_cpumask :
2013-08-08 20:47:45 +04:00
free_cpumask_var ( global_trace . tracing_cpumask ) ;
2009-01-01 02:42:22 +03:00
out_free_buffer_mask :
free_cpumask_var ( tracing_buffer_mask ) ;
out :
return ret ;
2008-05-12 23:20:42 +04:00
}
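tracer_alloc_buffers() unwinds partial initialization with a chain of goto labels, each one freeing what was allocated before the failing step. A minimal user-space sketch of the same pattern, assuming `malloc()` in place of the kernel allocators; `struct demo_state`, `demo_alloc()`, `demo_free()`, and the `fail_at` knob are hypothetical and exist only to make the error paths easy to exercise:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_state {
	void *mask;
	void *temp;
	void *main_buf;
};

/* fail_at: force step 1, 2 or 3 to fail; 0 lets every step succeed. */
static int demo_alloc(struct demo_state *s, int fail_at)
{
	int ret = -ENOMEM;

	s->mask = s->temp = s->main_buf = NULL;

	s->mask = (fail_at == 1) ? NULL : malloc(16);
	if (!s->mask)
		goto out;

	s->temp = (fail_at == 2) ? NULL : malloc(16);
	if (!s->temp)
		goto out_free_mask;

	s->main_buf = (fail_at == 3) ? NULL : malloc(16);
	if (!s->main_buf)
		goto out_free_temp;

	return 0;

	/* Labels run in reverse allocation order and fall through. */
out_free_temp:
	free(s->temp);
	s->temp = NULL;
out_free_mask:
	free(s->mask);
	s->mask = NULL;
out:
	return ret;
}

/* Release everything after a successful demo_alloc(). */
static void demo_free(struct demo_state *s)
{
	free(s->main_buf);
	free(s->temp);
	free(s->mask);
	s->mask = s->temp = s->main_buf = NULL;
}
```

Each new allocation adds one label above the existing ones, so the cleanup order stays the exact reverse of the setup order.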
2009-02-03 05:38:32 +03:00
2014-12-13 04:05:10 +03:00
void __init trace_init ( void )
{
2014-12-13 06:27:10 +03:00
if ( tracepoint_printk ) {
tracepoint_print_iter =
kmalloc ( sizeof ( * tracepoint_print_iter ) , GFP_KERNEL ) ;
if ( WARN_ON ( ! tracepoint_print_iter ) )
tracepoint_printk = 0 ;
}
2014-12-13 04:05:10 +03:00
tracer_alloc_buffers ( ) ;
tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values
Several tracepoints use the helper functions __print_symbolic() or
__print_flags() and pass in enums that do the mapping between the
binary data stored and the value to print. This works well for reading
the ASCII trace files, but when the data is read via userspace tools
such as perf and trace-cmd, the conversion of the binary value to a
human string format is lost if an enum is used, as userspace does not
have access to what the ENUM is.
For example, the tracepoint trace_tlb_flush() has:
__print_symbolic(REC->reason,
{ TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
{ TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
{ TLB_LOCAL_SHOOTDOWN, "local shootdown" },
{ TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })
Which maps the enum values to the strings they represent. But perf and
trace-cmd do not know what value TLB_LOCAL_MM_SHOOTDOWN is, and would
not be able to map it.
With TRACE_DEFINE_ENUM(), developers can place these in the event header
files and ftrace will convert the enums to their values.
By adding:
TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);
$ cat /sys/kernel/debug/tracing/events/tlb/tlb_flush/format
[...]
__print_symbolic(REC->reason,
{ 0, "flush on task switch" },
{ 1, "remote shootdown" },
{ 2, "local shootdown" },
{ 3, "local mm shootdown" })
The above is what userspace expects to see, and tools do not need to
be modified to parse them.
Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org
Cc: Guilherme Cox <cox@computer.org>
Cc: Tony Luck <tony.luck@gmail.com>
Cc: Xie XiuQi <xiexiuqi@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-03-25 00:58:09 +03:00
trace_event_init ( ) ;
2014-12-13 04:05:10 +03:00
}
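The effect described in the TRACE_DEFINE_ENUM() commit message, replacing enum names in a print format with their numeric values, can be illustrated in user space. This sketch is not how ftrace implements it: `DEMO_DEFINE_ENUM()`, `demo_maps`, and `demo_expand()` are hypothetical, and the enum below is a local stand-in that reuses the names from the commit message.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Local stand-in for the TLB flush reasons named in the commit message. */
enum demo_reason {
	TLB_FLUSH_ON_TASK_SWITCH,
	TLB_REMOTE_SHOOTDOWN,
	TLB_LOCAL_SHOOTDOWN,
	TLB_LOCAL_MM_SHOOTDOWN,
};

struct demo_enum_map {
	const char *name;
	int value;
};

/* Record both the spelling and the value of an enumerator. */
#define DEMO_DEFINE_ENUM(e) { #e, e }

static const struct demo_enum_map demo_maps[] = {
	DEMO_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH),
	DEMO_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN),
	DEMO_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN),
	DEMO_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN),
};

/*
 * Copy 'in' to 'out', replacing every identifier found in demo_maps
 * with its decimal value. Output is truncated to fit 'outsz' (> 0).
 * Limitation: identifiers inside quoted strings are not treated specially.
 */
static void demo_expand(const char *in, char *out, size_t outsz)
{
	size_t o = 0;

	while (*in && o + 1 < outsz) {
		if (isalpha((unsigned char)*in) || *in == '_') {
			const struct demo_enum_map *m = NULL;
			size_t len = 1, i;
			int n;

			while (isalnum((unsigned char)in[len]) || in[len] == '_')
				len++;
			for (i = 0; i < sizeof(demo_maps) / sizeof(demo_maps[0]); i++) {
				if (strlen(demo_maps[i].name) == len &&
				    strncmp(demo_maps[i].name, in, len) == 0)
					m = &demo_maps[i];
			}
			if (m)
				n = snprintf(out + o, outsz - o, "%d", m->value);
			else
				n = snprintf(out + o, outsz - o, "%.*s", (int)len, in);
			if (n < 0 || (size_t)n >= outsz - o) {
				o = outsz - 1;	/* output was truncated */
				break;
			}
			o += (size_t)n;
			in += len;
		} else {
			out[o++] = *in++;
		}
	}
	out[o] = '\0';
}
```

Once the names are replaced by numbers, a userspace parser needs no knowledge of the enum definition, which is the point the commit message makes about perf and trace-cmd.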
2009-02-03 05:38:32 +03:00
__init static int clear_boot_tracer ( void )
{
/*
* The buffer holding the default boot-up tracer name is in an
* init section. This function is called as a late initcall. If we
* did not find the boot tracer, then clear it out, to prevent
* later registration from accessing the buffer that is
* about to be freed.
*/
if ( ! default_bootup_tracer )
return 0 ;
printk ( KERN_INFO " ftrace bootup tracer '%s' not registered. \n " ,
default_bootup_tracer ) ;
default_bootup_tracer = NULL ;
return 0 ;
}
2015-01-20 20:13:40 +03:00
fs_initcall ( tracer_init_tracefs ) ;
2009-02-03 05:38:32 +03:00
late_initcall ( clear_boot_tracer ) ;