/*
 *  Kernel Probes (KProbes)
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 *
 * Copyright (C) IBM Corporation, 2002, 2004
 *
 * 2002-Oct	Created by Vamsi Krishna S <vamsi_krishna@in.ibm.com> Kernel
 *		Probes initial implementation (includes contributions from
 *		Rusty Russell).
 * 2004-July	Suparna Bhattacharya <suparna@in.ibm.com> added jumper probes
 *		interface to access function arguments.
 * 2004-Oct	Jim Keniston <jkenisto@us.ibm.com> and Prasanna S Panchamukhi
 *		<prasanna@in.ibm.com> adapted for x86_64 from i386.
 * 2005-Mar	Roland McGrath <roland@redhat.com>
 *		Fixed to handle %rip-relative addressing mode correctly.
 * 2005-May	Hien Nguyen <hien@us.ibm.com>, Jim Keniston
 *		<jkenisto@us.ibm.com> and Prasanna S Panchamukhi
 *		<prasanna@in.ibm.com> added function-return probes.
 * 2005-May	Rusty Lynch <rusty.lynch@intel.com>
 *		Added function return probes functionality
 * 2006-Feb	Masami Hiramatsu <hiramatu@sdl.hitachi.co.jp> added
 *		kprobe-booster and kretprobe-booster for i386.
 * 2007-Dec	Masami Hiramatsu <mhiramat@redhat.com> added kprobe-booster
 *		and kretprobe-booster for x86-64
 * 2007-Dec	Masami Hiramatsu <mhiramat@redhat.com>, Arjan van de Ven
 *		<arjan@infradead.org> and Jim Keniston <jkenisto@us.ibm.com>
 *		unified x86 kprobes code.
 */

#include <linux/kprobes.h>
#include <linux/ptrace.h>
#include <linux/string.h>
#include <linux/slab.h>
#include <linux/hardirq.h>
#include <linux/preempt.h>
#include <linux/module.h>
#include <linux/kdebug.h>
#include <linux/kallsyms.h>
#include <linux/ftrace.h>

#include <asm/cacheflush.h>
#include <asm/desc.h>
#include <asm/pgtable.h>
#include <asm/uaccess.h>
#include <asm/alternative.h>
#include <asm/insn.h>
#include <asm/debugreg.h>

void jprobe_return_end(void);

DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);

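/*
 * stack_addr() returns the probe-time kernel stack pointer as a pointer to
 * unsigned long; at a function-entry probe the first word it points at is
 * the probed function's return address.
 */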
#define stack_addr(regs) ((unsigned long *)kernel_stack_pointer(regs))

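/*
 * W() packs the 16 per-opcode "boostable" bits of one row of the two-byte
 * opcode map into the correct half of a 32-bit word (two rows per word).
 */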
#define W(row, b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, ba, bb, bc, bd, be, bf)\
	(((b0##UL << 0x0)|(b1##UL << 0x1)|(b2##UL << 0x2)|(b3##UL << 0x3) |   \
	  (b4##UL << 0x4)|(b5##UL << 0x5)|(b6##UL << 0x6)|(b7##UL << 0x7) |   \
	  (b8##UL << 0x8)|(b9##UL << 0x9)|(ba##UL << 0xa)|(bb##UL << 0xb) |   \
	  (bc##UL << 0xc)|(bd##UL << 0xd)|(be##UL << 0xe)|(bf##UL << 0xf))    \
	 << (row % 32))
	/*
	 * Undefined/reserved opcodes, conditional jump, Opcode Extension
	 * Groups, and some special opcodes can not boost.
	 */
static const u32 twobyte_is_boostable[256 / 32] = {
	/*      0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f          */
	/*      ----------------------------------------------          */
	W(0x00, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0) | /* 00 */
	W(0x10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 10 */
	W(0x20, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | /* 20 */
	W(0x30, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 30 */
	W(0x40, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 40 */
	W(0x50, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 50 */
	W(0x60, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1) | /* 60 */
	W(0x70, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1) , /* 70 */
	W(0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | /* 80 */
	W(0x90, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* 90 */
	W(0xa0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1) | /* a0 */
	W(0xb0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1) , /* b0 */
	W(0xc0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1) | /* c0 */
	W(0xd0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1) , /* d0 */
	W(0xe0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1) | /* e0 */
	W(0xf0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0)   /* f0 */
	/*      -----------------------------------------------         */
	/*      0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f          */
};
#undef W

struct kretprobe_blackpoint kretprobe_blacklist[] = {
	{"__switch_to", }, /* This function switches only current task, but
			      doesn't switch kernel stack.*/
	{NULL, NULL}	/* Terminator */
};
const int kretprobe_blacklist_size = ARRAY_SIZE(kretprobe_blacklist);

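/*
 * Write a one-byte opcode followed by a 32-bit displacement that is
 * relative to the end of the resulting 5-byte instruction (from + 5).
 */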
static void __kprobes __synthesize_relative_insn(void *from, void *to, u8 op)
{
	struct __arch_relative_insn {
		u8 op;
		s32 raddr;
	} __attribute__((packed)) *insn;

	insn = (struct __arch_relative_insn *)from;
	insn->raddr = (s32)((long)(to) - ((long)(from) + 5));
	insn->op = op;
}

/* Insert a jump instruction at address 'from', which jumps to address 'to'.*/
static void __kprobes synthesize_reljump(void *from, void *to)
{
	__synthesize_relative_insn(from, to, RELATIVEJUMP_OPCODE);
}

/*
 * Skip the prefixes of the instruction.
 */
static kprobe_opcode_t *__kprobes skip_prefixes(kprobe_opcode_t *insn)
{
	insn_attr_t attr;

	attr = inat_get_opcode_attribute((insn_byte_t)*insn);
	while (inat_is_legacy_prefix(attr)) {
		insn++;
		attr = inat_get_opcode_attribute((insn_byte_t)*insn);
	}
#ifdef CONFIG_X86_64
	if (inat_is_rex_prefix(attr))
		insn++;
#endif
	return insn;
}

/*
 * Returns non-zero if the opcode is boostable.
 * RIP-relative instructions are adjusted at copying time in 64-bit mode.
 */
static int __kprobes can_boost(kprobe_opcode_t *opcodes)
{
	kprobe_opcode_t opcode;
	kprobe_opcode_t *orig_opcodes = opcodes;

	if (search_exception_tables((unsigned long)opcodes))
		return 0;	/* Page fault may occur on this address. */

retry:
	if (opcodes - orig_opcodes > MAX_INSN_SIZE - 1)
		return 0;
	opcode = *(opcodes++);

	/* 2nd-byte opcode */
	if (opcode == 0x0f) {
		if (opcodes - orig_opcodes > MAX_INSN_SIZE - 1)
			return 0;
		return test_bit(*opcodes,
				(unsigned long *)twobyte_is_boostable);
	}

	switch (opcode & 0xf0) {
#ifdef CONFIG_X86_64
	case 0x40:
		goto retry; /* REX prefix is boostable */
#endif
	case 0x60:
		if (0x63 < opcode && opcode < 0x67)
			goto retry; /* prefixes */
		/* can't boost Address-size override and bound */
		return (opcode != 0x62 && opcode != 0x67);
	case 0x70:
		return 0; /* can't boost conditional jump */
	case 0xc0:
		/* can't boost software-interruptions */
		return (0xc1 < opcode && opcode < 0xcc) || opcode == 0xcf;
	case 0xd0:
		/* can boost AA* and XLAT */
		return (opcode == 0xd4 || opcode == 0xd5 || opcode == 0xd7);
	case 0xe0:
		/* can boost in/out and absolute jmps */
		return ((opcode & 0x04) || opcode == 0xea);
	case 0xf0:
		if ((opcode & 0x0c) == 0 && opcode != 0xf1)
			goto retry; /* lock/rep(ne) prefix */
		/* clear and set flags are boostable */
		return (opcode == 0xf5 || (0xf7 < opcode && opcode < 0xfe));
	default:
		/* segment override prefixes are boostable */
		if (opcode == 0x26 || opcode == 0x36 || opcode == 0x3e)
			goto retry; /* prefixes */
		/* CS override prefix and call are not boostable */
		return (opcode != 0x2e && opcode != 0x9a);
	}
}

/* Recover the probed instruction at addr for further analysis. */
static int recover_probed_instruction(kprobe_opcode_t *buf, unsigned long addr)
{
	struct kprobe *kp;

	kp = get_kprobe((void *)addr);
	if (!kp)
		return -EINVAL;

	/*
	 * Basically, kp->ainsn.insn has an original instruction.
	 * However, a RIP-relative instruction can not be single-stepped at a
	 * different place, so __copy_instruction() tweaks the displacement of
	 * that instruction. In that case, we can't recover the original
	 * instruction from kp->ainsn.insn.
	 *
	 * On the other hand, kp->opcode has a copy of the first byte of
	 * the probed instruction, which is overwritten by int3. And since
	 * the instruction at kp->addr is not modified by kprobes except
	 * for the first byte, we can recover the original instruction
	 * from it and kp->opcode.
	 */
	memcpy(buf, kp->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
	buf[0] = kp->opcode;
	return 0;
}

/* Dummy buffers for kallsyms_lookup */
static char __dummy_buf[KSYM_NAME_LEN];

/* Check if paddr is at an instruction boundary */
static int __kprobes can_probe(unsigned long paddr)
{
	int ret;
	unsigned long addr, offset = 0;
	struct insn insn;
	kprobe_opcode_t buf[MAX_INSN_SIZE];

	if (!kallsyms_lookup(paddr, NULL, &offset, NULL, __dummy_buf))
		return 0;

	/* Decode instructions */
	addr = paddr - offset;
	while (addr < paddr) {
		kernel_insn_init(&insn, (void *)addr);
		insn_get_opcode(&insn);

		/*
		 * Check if the instruction has been modified by another
		 * kprobe, in which case we replace the breakpoint by the
		 * original instruction in our buffer.
		 */
		if (insn.opcode.bytes[0] == BREAKPOINT_INSTRUCTION) {
			ret = recover_probed_instruction(buf, addr);
			if (ret)
				/*
				 * Another debugging subsystem might insert
				 * this breakpoint. In that case, we can't
				 * recover it.
				 */
				return 0;
			kernel_insn_init(&insn, buf);
		}
		insn_get_length(&insn);
		addr += insn.length;
	}

	return (addr == paddr);
}

/*
 * Returns non-zero if the opcode modifies the interrupt flag.
 */
static int __kprobes is_IF_modifier(kprobe_opcode_t *insn)
{
	/* Skip prefixes */
	insn = skip_prefixes(insn);

	switch (*insn) {
	case 0xfa:		/* cli */
	case 0xfb:		/* sti */
	case 0xcf:		/* iret/iretd */
	case 0x9d:		/* popf/popfd */
		return 1;
	}

	return 0;
}

/*
 * Copy an instruction and adjust the displacement if the instruction
 * uses the %rip-relative addressing mode (only possible on 64-bit x86).
 * Returns the length of the copied instruction, or 0 if the copy failed.
 */
static int __kprobes __copy_instruction(u8 *dest, u8 *src, int recover)
{
	struct insn insn;
	int ret;
	kprobe_opcode_t buf[MAX_INSN_SIZE];

	kernel_insn_init(&insn, src);
	if (recover) {
		insn_get_opcode(&insn);
		if (insn.opcode.bytes[0] == BREAKPOINT_INSTRUCTION) {
			ret = recover_probed_instruction(buf,
							 (unsigned long)src);
			if (ret)
				return 0;
			kernel_insn_init(&insn, buf);
		}
	}
	insn_get_length(&insn);
	memcpy(dest, insn.kaddr, insn.length);

#ifdef CONFIG_X86_64
	if (insn_rip_relative(&insn)) {
		s64 newdisp;
		u8 *disp;
		kernel_insn_init(&insn, dest);
		insn_get_displacement(&insn);
		/*
		 * The copied instruction uses the %rip-relative addressing
		 * mode.  Adjust the displacement for the difference between
		 * the original location of this instruction and the location
		 * of the copy that will actually be run.  The tricky bit here
		 * is making sure that the sign extension happens correctly in
		 * this calculation, since we need a signed 32-bit result to
		 * be sign-extended to 64 bits when it's added to the %rip
		 * value and yield the same 64-bit result that the sign-
		 * extension of the original signed 32-bit displacement would
		 * have given.
		 */
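		/*
		 * Worked example with made-up addresses: if src == 0x1000,
		 * dest == 0x9000 and the original displacement is 0x10, then
		 * newdisp = 0x1000 + 0x10 - 0x9000, so that
		 * (dest + insn.length) + newdisp equals
		 * (src + insn.length) + 0x10, i.e. the copy resolves to the
		 * same absolute target as the original instruction.
		 */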
		newdisp = (u8 *) src + (s64) insn.displacement.value -
			  (u8 *) dest;
		BUG_ON((s64) (s32) newdisp != newdisp); /* Sanity check.  */
		disp = (u8 *) dest + insn_offset_displacement(&insn);
		*(s32 *) disp = (s32) newdisp;
	}
#endif
	return insn.length;
}

static void __kprobes arch_copy_kprobe(struct kprobe *p)
{
	/*
	 * Copy an instruction without recovering int3, because it will be
	 * put by another subsystem.
	 */
	__copy_instruction(p->ainsn.insn, p->addr, 0);

	if (can_boost(p->addr))
		p->ainsn.boostable = 0;
	else
		p->ainsn.boostable = -1;

	p->opcode = *p->addr;
}

int __kprobes arch_prepare_kprobe(struct kprobe *p)
{
	if (alternatives_text_reserved(p->addr, p->addr))
		return -EINVAL;

	if (!can_probe((unsigned long)p->addr))
		return -EILSEQ;
	/* insn: must be on special executable page on x86. */
	p->ainsn.insn = get_insn_slot();
	if (!p->ainsn.insn)
		return -ENOMEM;
	arch_copy_kprobe(p);
	return 0;
}

void __kprobes arch_arm_kprobe(struct kprobe *p)
{
	text_poke(p->addr, ((unsigned char []){BREAKPOINT_INSTRUCTION}), 1);
}

void __kprobes arch_disarm_kprobe(struct kprobe *p)
{
	text_poke(p->addr, &p->opcode, 1);
}

void __kprobes arch_remove_kprobe(struct kprobe *p)
{
	if (p->ainsn.insn) {
		free_insn_slot(p->ainsn.insn, (p->ainsn.boostable == 1));
		p->ainsn.insn = NULL;
	}
}

static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)
{
	kcb->prev_kprobe.kp = kprobe_running();
	kcb->prev_kprobe.status = kcb->kprobe_status;
	kcb->prev_kprobe.old_flags = kcb->kprobe_old_flags;
	kcb->prev_kprobe.saved_flags = kcb->kprobe_saved_flags;
}

static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
{
	__get_cpu_var(current_kprobe) = kcb->prev_kprobe.kp;
	kcb->kprobe_status = kcb->prev_kprobe.status;
	kcb->kprobe_old_flags = kcb->prev_kprobe.old_flags;
	kcb->kprobe_saved_flags = kcb->prev_kprobe.saved_flags;
}

static void __kprobes set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
				struct kprobe_ctlblk *kcb)
{
	__get_cpu_var(current_kprobe) = p;
	kcb->kprobe_saved_flags = kcb->kprobe_old_flags
		= (regs->flags & (X86_EFLAGS_TF | X86_EFLAGS_IF));
	if (is_IF_modifier(p->ainsn.insn))
		kcb->kprobe_saved_flags &= ~X86_EFLAGS_IF;
}

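/*
 * While single-stepping the copied instruction we must not have the CPU's
 * branch-trap flag (BTF) set, or the debug trap would only fire on the next
 * branch; clear_btf()/restore_btf() toggle it around the step.
 */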
static void __kprobes clear_btf(void)
{
	if (test_thread_flag(TIF_BLOCKSTEP)) {
		unsigned long debugctl = get_debugctlmsr();

		debugctl &= ~DEBUGCTLMSR_BTF;
		update_debugctlmsr(debugctl);
	}
}

static void __kprobes restore_btf(void)
{
	if (test_thread_flag(TIF_BLOCKSTEP)) {
		unsigned long debugctl = get_debugctlmsr();

		debugctl |= DEBUGCTLMSR_BTF;
		update_debugctlmsr(debugctl);
	}
}

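/*
 * Invoked from the kretprobe's entry kprobe: remember the real return
 * address and redirect it to kretprobe_trampoline, so that the return
 * path lands in trampoline_handler().
 */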
void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
				      struct pt_regs *regs)
{
	unsigned long *sara = stack_addr(regs);

	ri->ret_addr = (kprobe_opcode_t *) *sara;

	/* Replace the return addr with trampoline addr */
	*sara = (unsigned long) &kretprobe_trampoline;
}

#ifdef CONFIG_OPTPROBES
static int  __kprobes setup_detour_execution(struct kprobe *p,
					     struct pt_regs *regs,
					     int reenter);
#else
#define setup_detour_execution(p, regs, reenter) (0)
#endif

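/*
 * Set up execution of the copied instruction: either hand it off to an
 * optimized-probe detour, jump straight into the instruction slot when the
 * probe is boostable and no post_handler is registered, or fall back to a
 * real single step with the TF flag.
 */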
static void __kprobes setup_singlestep(struct kprobe *p, struct pt_regs *regs,
				       struct kprobe_ctlblk *kcb, int reenter)
{
	if (setup_detour_execution(p, regs, reenter))
		return;

#if !defined(CONFIG_PREEMPT)
	if (p->ainsn.boostable == 1 && !p->post_handler) {
		/* Boost up -- we can execute copied instructions directly */
		if (!reenter)
			reset_current_kprobe();
		/*
		 * Reentering boosted probe doesn't reset current_kprobe,
		 * nor set current_kprobe, because it doesn't use single
		 * stepping.
		 */
		regs->ip = (unsigned long)p->ainsn.insn;
		preempt_enable_no_resched();
		return;
	}
#endif
	if (reenter) {
		save_previous_kprobe(kcb);
		set_current_kprobe(p, regs, kcb);
		kcb->kprobe_status = KPROBE_REENTER;
	} else
		kcb->kprobe_status = KPROBE_HIT_SS;
	/* Prepare real single stepping */
	clear_btf();
	regs->flags |= X86_EFLAGS_TF;
	regs->flags &= ~X86_EFLAGS_IF;
	/* single step inline if the instruction is an int3 */
	if (p->opcode == BREAKPOINT_INSTRUCTION)
		regs->ip = (unsigned long)p->addr;
	else
		regs->ip = (unsigned long)p->ainsn.insn;
}

/*
 * We have reentered the kprobe_handler(), since another probe was hit while
 * within the handler. We save the original kprobes variables and just single
 * step on the instruction of the new probe without calling any user handlers.
 */
static int __kprobes reenter_kprobe(struct kprobe *p, struct pt_regs *regs,
				    struct kprobe_ctlblk *kcb)
{
	switch (kcb->kprobe_status) {
	case KPROBE_HIT_SSDONE:
	case KPROBE_HIT_ACTIVE:
		kprobes_inc_nmissed_count(p);
		setup_singlestep(p, regs, kcb, 1);
		break;
	case KPROBE_HIT_SS:
		/* A probe has been hit in the codepath leading up to, or just
		 * after, single-stepping of a probed instruction. This entire
		 * codepath should strictly reside in .kprobes.text section.
		 * Raise a BUG or we'll continue in an endless reentering loop
		 * and eventually a stack overflow.
		 */
		printk(KERN_WARNING "Unrecoverable kprobe detected at %p.\n",
		       p->addr);
		dump_kprobe(p);
		BUG();
	default:
		/* impossible cases */
		WARN_ON(1);
		return 0;
	}

	return 1;
}

/*
 * Interrupts are disabled on entry as trap3 is an interrupt gate and they
 * remain disabled throughout this function.
 */
static int __kprobes kprobe_handler(struct pt_regs *regs)
{
	kprobe_opcode_t *addr;
	struct kprobe *p;
	struct kprobe_ctlblk *kcb;

	addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t));
	/*
	 * We don't want to be preempted for the entire
	 * duration of kprobe processing. We conditionally
	 * re-enable preemption at the end of this function,
	 * and also in reenter_kprobe() and setup_singlestep().
	 */
	preempt_disable();

	kcb = get_kprobe_ctlblk();
	p = get_kprobe(addr);

	if (p) {
		if (kprobe_running()) {
			if (reenter_kprobe(p, regs, kcb))
				return 1;
		} else {
			set_current_kprobe(p, regs, kcb);
			kcb->kprobe_status = KPROBE_HIT_ACTIVE;

			/*
			 * If we have no pre-handler or it returned 0, we
			 * continue with normal processing.  If we have a
			 * pre-handler and it returned non-zero, it prepped
			 * for calling the break_handler below on re-entry
			 * for jprobe processing, so get out doing nothing
			 * more here.
			 */
			if (!p->pre_handler || !p->pre_handler(p, regs))
				setup_singlestep(p, regs, kcb, 0);
			return 1;
		}
	} else if (*addr != BREAKPOINT_INSTRUCTION) {
		/*
		 * The breakpoint instruction was removed right
		 * after we hit it.  Another cpu has removed
		 * either a probepoint or a debugger breakpoint
		 * at this address.  In either case, no further
		 * handling of this interrupt is appropriate.
		 * Back up over the (now missing) int3 and run
		 * the original instruction.
		 */
		regs->ip = (unsigned long)addr;
		preempt_enable_no_resched();
		return 1;
	} else if (kprobe_running()) {
		p = __get_cpu_var(current_kprobe);
		if (p->break_handler && p->break_handler(p, regs)) {
			setup_singlestep(p, regs, kcb, 0);
			return 1;
		}
	} /* else: not a kprobe fault; let the kernel handle it */

	preempt_enable_no_resched();
	return 0;
}

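/*
 * SAVE_REGS_STRING/RESTORE_REGS_STRING build up and tear down a
 * struct pt_regs style register frame on the stack, so that
 * trampoline_handler() receives a register snapshot taken at the
 * retprobed function's return site.
 */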
#ifdef CONFIG_X86_64
#define SAVE_REGS_STRING			\
	/* Skip cs, ip, orig_ax. */		\
	"	subq $24, %rsp\n"		\
	"	pushq %rdi\n"			\
	"	pushq %rsi\n"			\
	"	pushq %rdx\n"			\
	"	pushq %rcx\n"			\
	"	pushq %rax\n"			\
	"	pushq %r8\n"			\
	"	pushq %r9\n"			\
	"	pushq %r10\n"			\
	"	pushq %r11\n"			\
	"	pushq %rbx\n"			\
	"	pushq %rbp\n"			\
	"	pushq %r12\n"			\
	"	pushq %r13\n"			\
	"	pushq %r14\n"			\
	"	pushq %r15\n"
#define RESTORE_REGS_STRING			\
	"	popq %r15\n"			\
	"	popq %r14\n"			\
	"	popq %r13\n"			\
	"	popq %r12\n"			\
	"	popq %rbp\n"			\
	"	popq %rbx\n"			\
	"	popq %r11\n"			\
	"	popq %r10\n"			\
	"	popq %r9\n"			\
	"	popq %r8\n"			\
	"	popq %rax\n"			\
	"	popq %rcx\n"			\
	"	popq %rdx\n"			\
	"	popq %rsi\n"			\
	"	popq %rdi\n"			\
	/* Skip orig_ax, ip, cs */		\
	"	addq $24, %rsp\n"
#else
#define SAVE_REGS_STRING			\
	/* Skip cs, ip, orig_ax and gs. */	\
	"	subl $16, %esp\n"		\
	"	pushl %fs\n"			\
	"	pushl %es\n"			\
	"	pushl %ds\n"			\
	"	pushl %eax\n"			\
	"	pushl %ebp\n"			\
	"	pushl %edi\n"			\
	"	pushl %esi\n"			\
	"	pushl %edx\n"			\
	"	pushl %ecx\n"			\
	"	pushl %ebx\n"
#define RESTORE_REGS_STRING			\
	"	popl %ebx\n"			\
	"	popl %ecx\n"			\
	"	popl %edx\n"			\
	"	popl %esi\n"			\
	"	popl %edi\n"			\
	"	popl %ebp\n"			\
	"	popl %eax\n"			\
	/* Skip ds, es, fs, gs, orig_ax, and ip. Note: don't pop cs here*/\
	"	addl $24, %esp\n"
#endif

/*
 * When a retprobed function returns, this code saves registers and
 * calls trampoline_handler(), which in turn runs the kretprobe's handler.
 */
static void __used __kprobes kretprobe_trampoline_holder(void)
{
	asm volatile (
			".global kretprobe_trampoline\n"
			"kretprobe_trampoline: \n"
#ifdef CONFIG_X86_64
			/* We don't bother saving the ss register */
			"	pushq %rsp\n"
			"	pushfq\n"
			SAVE_REGS_STRING
			"	movq %rsp, %rdi\n"
			"	call trampoline_handler\n"
			/* Replace saved sp with true return address. */
			"	movq %rax, 152(%rsp)\n"
			RESTORE_REGS_STRING
			"	popfq\n"
#else
			"	pushf\n"
			SAVE_REGS_STRING
			"	movl %esp, %eax\n"
			"	call trampoline_handler\n"
			/* Move flags to cs */
			"	movl 56(%esp), %edx\n"
			"	movl %edx, 52(%esp)\n"
			/* Replace saved flags with true return address. */
			"	movl %eax, 56(%esp)\n"
			RESTORE_REGS_STRING
			"	popf\n"
#endif
			"	ret\n");
}

/*
 * Called from kretprobe_trampoline
 */
static __used __kprobes void *trampoline_handler(struct pt_regs *regs)
{
2006-10-02 02:17:33 -07:00
	struct kretprobe_instance *ri = NULL;
2006-10-02 02:17:35 -07:00
	struct hlist_head *head, empty_rp;
2006-10-02 02:17:33 -07:00
	struct hlist_node *node, *tmp;
2005-11-07 01:00:14 -08:00
	unsigned long flags, orig_ret_address = 0;
2008-01-30 13:31:21 +01:00
	unsigned long trampoline_address = (unsigned long)&kretprobe_trampoline;
2010-08-15 15:18:04 +09:00
	kprobe_opcode_t *correct_ret_addr = NULL;
2006-10-02 02:17:35 -07:00
	INIT_HLIST_HEAD(&empty_rp);
2008-07-25 01:46:04 -07:00
	kretprobe_hash_lock(current, &head, &flags);
2008-01-30 13:31:21 +01:00
	/* fixup registers */
2008-01-30 13:31:21 +01:00
#ifdef CONFIG_X86_64
2008-01-30 13:31:21 +01:00
	regs->cs = __KERNEL_CS;
2008-01-30 13:31:21 +01:00
#else
	regs->cs = __KERNEL_CS | get_kernel_rpl();
2009-03-23 10:14:52 -04:00
	regs->gs = 0;
2008-01-30 13:31:21 +01:00
#endif
2008-01-30 13:31:21 +01:00
	regs->ip = trampoline_address;
2008-01-30 13:31:21 +01:00
	regs->orig_ax = ~0UL;
2005-06-27 15:17:10 -07:00
	/*
	 * It is possible to have multiple instances associated with a given
2008-01-30 13:31:21 +01:00
	 * task either because multiple functions in the call path have
2008-10-16 19:02:37 +02:00
	 * return probes installed on them, and/or more than one
2005-06-27 15:17:10 -07:00
	 * return probe was registered for a target function.
	 *
	 * We can handle this because:
2008-01-30 13:31:21 +01:00
	 *     - instances are always pushed into the head of the list
2005-06-27 15:17:10 -07:00
	 *     - when multiple return probes are registered for the same
2008-01-30 13:31:21 +01:00
	 *	 function, the (chronologically) first instance's ret_addr
	 *	 will be the real return address, and all the rest will
	 *	 point to kretprobe_trampoline.
2005-06-27 15:17:10 -07:00
	 */
	hlist_for_each_entry_safe(ri, node, tmp, head, hlist) {
2006-10-02 02:17:33 -07:00
		if (ri->task != current)
2005-06-27 15:17:10 -07:00
			/* another task is sharing our hash bucket */
2006-10-02 02:17:33 -07:00
			continue;
2005-06-27 15:17:10 -07:00
2010-08-15 15:18:04 +09:00
		orig_ret_address = (unsigned long)ri->ret_addr;
		if (orig_ret_address != trampoline_address)
			/*
			 * This is the real return address. Any other
			 * instances associated with this task are for
			 * other calls deeper on the call stack
			 */
			break;
	}
	kretprobe_assert(ri, orig_ret_address, trampoline_address);
	correct_ret_addr = ri->ret_addr;
	hlist_for_each_entry_safe(ri, node, tmp, head, hlist) {
		if (ri->task != current)
			/* another task is sharing our hash bucket */
			continue;
		orig_ret_address = (unsigned long)ri->ret_addr;
2008-01-30 13:31:21 +01:00
		if (ri->rp && ri->rp->handler) {
			__get_cpu_var(current_kprobe) = &ri->rp->kp;
			get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;
2010-08-15 15:18:04 +09:00
			ri->ret_addr = correct_ret_addr;
2005-06-27 15:17:10 -07:00
			ri->rp->handler(ri, regs);
2008-01-30 13:31:21 +01:00
			__get_cpu_var(current_kprobe) = NULL;
		}
2005-06-27 15:17:10 -07:00
2006-10-02 02:17:35 -07:00
		recycle_rp_inst(ri, &empty_rp);
2005-06-27 15:17:10 -07:00
		if (orig_ret_address != trampoline_address)
			/*
			 * This is the real return address. Any other
			 * instances associated with this task are for
			 * other calls deeper on the call stack
			 */
			break;
}
2005-06-27 15:17:10 -07:00
2008-07-25 01:46:04 -07:00
	kretprobe_hash_unlock(current, &flags);
2005-06-27 15:17:10 -07:00
2006-10-02 02:17:35 -07:00
	hlist_for_each_entry_safe(ri, node, tmp, &empty_rp, hlist) {
		hlist_del(&ri->hlist);
		kfree(ri);
	}
2008-01-30 13:31:21 +01:00
	return (void *)orig_ret_address;
}
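For reference, a minimal sketch of the kind of kretprobe handler that trampoline_handler() above ends up invoking. The probed symbol and the use of regs->ax to read the return value are illustrative assumptions, not something the code above mandates:

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/kprobes.h>

/* Invoked via trampoline_handler() when the probed function returns. */
static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	/* On x86 the return value is in regs->ax at this point. */
	printk(KERN_INFO "probed function returned %ld\n", (long)regs->ax);
	return 0;
}

static struct kretprobe my_kretprobe = {
	.handler	= ret_handler,
	.kp.symbol_name	= "do_fork",	/* assumed example target */
};

static int __init rp_init(void)
{
	return register_kretprobe(&my_kretprobe);
}

static void __exit rp_exit(void)
{
	unregister_kretprobe(&my_kretprobe);
}

module_init(rp_init);
module_exit(rp_exit);
MODULE_LICENSE("GPL");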
2005-04-16 15:20:36 -07:00
/*
 * Called after single-stepping.  p->addr is the address of the
 * instruction whose first byte has been replaced by the "int 3"
 * instruction.  To avoid the SMP problems that can occur when we
 * temporarily put back the original opcode to single-step, we
 * single-stepped a copy of the instruction.  The address of this
 * copy is p->ainsn.insn.
 *
 * This function prepares to return from the post-single-step
 * interrupt.  We have to fix up the stack as follows:
 *
 * 0) Except in the case of absolute or indirect jump or call instructions,
2008-01-30 13:30:56 +01:00
 * the new ip is relative to the copied instruction.  We need to make
2005-04-16 15:20:36 -07:00
 * it relative to the original instruction.
 *
 * 1) If the single-stepped instruction was pushfl, then the TF and IF
2008-01-30 13:30:56 +01:00
 * flags are set in the just-pushed flags, and may need to be cleared.
2005-04-16 15:20:36 -07:00
 *
 * 2) If the single-stepped instruction was a call, the return address
 * that is atop the stack is the address following the copied instruction.
 * We need to make it the address following the original instruction.
2008-01-30 13:31:21 +01:00
 *
 * If this is the first time we've single-stepped the instruction at
 * this probepoint, and the instruction is boostable, boost it: add a
 * jump instruction after the copied instruction, that jumps to the next
 * instruction after the probepoint.
2005-04-16 15:20:36 -07:00
 */
2005-11-07 01:00:12 -08:00
static void __kprobes resume_execution(struct kprobe *p,
		struct pt_regs *regs, struct kprobe_ctlblk *kcb)
2005-04-16 15:20:36 -07:00
{
2008-01-30 13:31:21 +01:00
	unsigned long *tos = stack_addr(regs);
	unsigned long copy_ip = (unsigned long)p->ainsn.insn;
	unsigned long orig_ip = (unsigned long)p->addr;
2005-04-16 15:20:36 -07:00
	kprobe_opcode_t *insn = p->ainsn.insn;
2010-06-29 14:53:50 +09:00
	/* Skip prefixes */
	insn = skip_prefixes(insn);
2005-04-16 15:20:36 -07:00
2008-01-30 13:31:27 +01:00
	regs->flags &= ~X86_EFLAGS_TF;
2005-04-16 15:20:36 -07:00
	switch (*insn) {
2007-12-18 18:05:58 +01:00
	case 0x9c:	/* pushfl */
2008-01-30 13:31:27 +01:00
		*tos &= ~(X86_EFLAGS_TF | X86_EFLAGS_IF);
2008-01-30 13:31:21 +01:00
		*tos |= kcb->kprobe_old_flags;
2005-04-16 15:20:36 -07:00
		break;
2007-12-18 18:05:58 +01:00
	case 0xc2:	/* iret/ret/lret */
	case 0xc3:
2005-05-05 16:15:40 -07:00
	case 0xca:
2007-12-18 18:05:58 +01:00
	case 0xcb:
	case 0xcf:
	case 0xea:	/* jmp absolute -- ip is correct */
		/* ip is already adjusted, no more changes required */
2008-01-30 13:31:21 +01:00
		p->ainsn.boostable = 1;
2007-12-18 18:05:58 +01:00
		goto no_change;
	case 0xe8:	/* call relative - Fix return addr */
2008-01-30 13:31:21 +01:00
		*tos = orig_ip + (*tos - copy_ip);
2005-04-16 15:20:36 -07:00
		break;
2008-01-30 13:31:43 +01:00
#ifdef CONFIG_X86_32
2008-01-30 13:31:21 +01:00
	case 0x9a:	/* call absolute -- same as call absolute, indirect */
		*tos = orig_ip + (*tos - copy_ip);
		goto no_change;
#endif
2005-04-16 15:20:36 -07:00
	case 0xff:
2006-05-20 15:00:21 -07:00
		if ((insn[1] & 0x30) == 0x10) {
2008-01-30 13:31:21 +01:00
			/*
			 * call absolute, indirect
			 * Fix return addr; ip is correct.
			 * But this is not boostable
			 */
			*tos = orig_ip + (*tos - copy_ip);
2007-12-18 18:05:58 +01:00
			goto no_change;
2008-01-30 13:31:21 +01:00
		} else if (((insn[1] & 0x31) == 0x20) ||
			   ((insn[1] & 0x31) == 0x21)) {
			/*
			 * jmp near and far, absolute indirect
			 * ip is correct. And this is boostable
			 */
2008-01-30 13:31:21 +01:00
			p->ainsn.boostable = 1;
2007-12-18 18:05:58 +01:00
			goto no_change;
2005-04-16 15:20:36 -07:00
		}
	default:
		break;
	}
2008-01-30 13:31:21 +01:00
	if (p->ainsn.boostable == 0) {
2008-01-30 13:31:21 +01:00
		if ((regs->ip > copy_ip) &&
		    (regs->ip - copy_ip) + 5 < MAX_INSN_SIZE) {
2008-01-30 13:31:21 +01:00
			/*
			 * These instructions can be executed directly if it
			 * jumps back to correct address.
			 */
2010-02-25 08:34:46 -05:00
			synthesize_reljump((void *)regs->ip,
				(void *)orig_ip + (regs->ip - copy_ip));
2008-01-30 13:31:21 +01:00
			p->ainsn.boostable = 1;
		} else {
			p->ainsn.boostable = -1;
		}
	}
2008-01-30 13:31:21 +01:00
	regs->ip += orig_ip - copy_ip;
2008-01-30 13:30:56 +01:00
2007-12-18 18:05:58 +01:00
no_change:
2008-01-30 13:30:54 +01:00
	restore_btf();
2005-04-16 15:20:36 -07:00
}
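To make the fixup arithmetic above concrete, here is a toy user-space illustration with made-up addresses (not kernel code): the probed instruction lives at orig_ip, its single-stepped copy at copy_ip, and any ip or pushed return address produced while stepping the copy is shifted back by (orig_ip - copy_ip):

#include <stdio.h>

int main(void)
{
	unsigned long orig_ip = 0xffffffff81001000UL; /* probed address (assumed) */
	unsigned long copy_ip = 0xffffffffa0002000UL; /* out-of-line copy (assumed) */
	unsigned long ip_after_step = copy_ip + 5;    /* e.g. after a 5-byte call */
	unsigned long ret_on_stack  = copy_ip + 5;    /* *tos pushed by that call */

	/* The same adjustments resume_execution() applies. */
	unsigned long fixed_ip  = ip_after_step + (orig_ip - copy_ip);
	unsigned long fixed_ret = orig_ip + (ret_on_stack - copy_ip);

	printf("ip:  %#lx -> %#lx\n", ip_after_step, fixed_ip);
	printf("ret: %#lx -> %#lx\n", ret_on_stack, fixed_ret);
	return 0;
}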
2008-01-30 13:31:21 +01:00
/*
* Interrupts are disabled on entry as trap1 is an interrupt gate and they
2009-11-14 13:09:05 -02:00
 * remain disabled throughout this function.
2008-01-30 13:31:21 +01:00
*/
static int __kprobes post_kprobe_handler(struct pt_regs *regs)
2005-04-16 15:20:36 -07:00
{
2005-11-07 01:00:12 -08:00
	struct kprobe *cur = kprobe_running();
	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
	if (!cur)
2005-04-16 15:20:36 -07:00
		return 0;
2008-03-16 03:21:21 -05:00
	resume_execution(cur, regs, kcb);
	regs->flags |= kcb->kprobe_saved_flags;
2005-11-07 01:00:12 -08:00
	if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
		kcb->kprobe_status = KPROBE_HIT_SSDONE;
		cur->post_handler(cur, regs, 0);
2005-06-23 00:09:37 -07:00
	}
2005-04-16 15:20:36 -07:00
2008-01-30 13:31:21 +01:00
	/* Restore back the original saved kprobes variables and continue. */
2005-11-07 01:00:12 -08:00
	if (kcb->kprobe_status == KPROBE_REENTER) {
		restore_previous_kprobe(kcb);
2005-06-23 00:09:37 -07:00
		goto out;
	}
2005-11-07 01:00:12 -08:00
	reset_current_kprobe();
2005-06-23 00:09:37 -07:00
out:
2005-04-16 15:20:36 -07:00
	preempt_enable_no_resched();
	/*
2008-01-30 13:30:56 +01:00
	 * if somebody else is singlestepping across a probe point, flags
2005-04-16 15:20:36 -07:00
	 * will have TF set, in which case, continue the remaining processing
	 * of do_debug, as if this is not a probe hit.
	 */
2008-01-30 13:31:27 +01:00
	if (regs->flags & X86_EFLAGS_TF)
2005-04-16 15:20:36 -07:00
		return 0;
	return 1;
}
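A minimal sketch of a probe whose post_handler ends up being called from post_kprobe_handler() above; the probed symbol is an assumed example, and the probe would be registered with register_kprobe(&kp) from module init and removed with unregister_kprobe(&kp):

#include <linux/kernel.h>
#include <linux/kprobes.h>

static int my_pre(struct kprobe *p, struct pt_regs *regs)
{
	pr_info("pre: about to single-step %p, ip=%lx\n", p->addr, regs->ip);
	return 0;	/* continue with normal single-step handling */
}

static void my_post(struct kprobe *p, struct pt_regs *regs, unsigned long flags)
{
	pr_info("post: single-step of %p completed\n", p->addr);
}

static struct kprobe kp = {
	.symbol_name	= "do_fork",	/* assumed example target */
	.pre_handler	= my_pre,
	.post_handler	= my_post,
};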
2005-09-06 15:19:28 -07:00
int __kprobes kprobe_fault_handler(struct pt_regs *regs, int trapnr)
2005-04-16 15:20:36 -07:00
{
2005-11-07 01:00:12 -08:00
	struct kprobe *cur = kprobe_running();
	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
2008-01-30 13:31:21 +01:00
	switch (kcb->kprobe_status) {
2006-03-26 01:38:23 -08:00
	case KPROBE_HIT_SS:
	case KPROBE_REENTER:
		/*
		 * We are here because the instruction being single
		 * stepped caused a page fault. We reset the current
2008-01-30 13:30:56 +01:00
		 * kprobe and the ip points back to the probe address
2006-03-26 01:38:23 -08:00
		 * and allow the page fault handler to continue as a
		 * normal page fault.
		 */
2008-01-30 13:30:56 +01:00
		regs->ip = (unsigned long)cur->addr;
2008-01-30 13:31:21 +01:00
		regs->flags |= kcb->kprobe_old_flags;
2006-03-26 01:38:23 -08:00
		if (kcb->kprobe_status == KPROBE_REENTER)
			restore_previous_kprobe(kcb);
		else
			reset_current_kprobe();
2005-04-16 15:20:36 -07:00
		preempt_enable_no_resched();
2006-03-26 01:38:23 -08:00
		break;
	case KPROBE_HIT_ACTIVE:
	case KPROBE_HIT_SSDONE:
		/*
		 * We increment the nmissed count for accounting,
2008-01-30 13:31:21 +01:00
		 * we can also use npre/npostfault count for accounting
2006-03-26 01:38:23 -08:00
		 * these specific fault cases.
		 */
		kprobes_inc_nmissed_count(cur);
		/*
		 * We come here because instructions in the pre/post
		 * handler caused the page_fault, this could happen
		 * if handler tries to access user space by
		 * copy_from_user(), get_user() etc. Let the
		 * user-specified handler try to fix it first.
		 */
		if (cur->fault_handler && cur->fault_handler(cur, regs, trapnr))
			return 1;
		/*
		 * In case the user-specified fault handler returned
		 * zero, try to fix up.
		 */
2008-01-30 13:31:21 +01:00
		if (fixup_exception(regs))
			return 1;
2008-01-30 13:31:41 +01:00
2006-03-26 01:38:23 -08:00
		/*
2008-01-30 13:31:21 +01:00
		 * fixup routine could not handle it,
2006-03-26 01:38:23 -08:00
		 * Let do_page_fault() fix it.
		 */
		break;
	default:
		break;
2005-04-16 15:20:36 -07:00
	}
	return 0;
}
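The user-specified handler consulted above is just the .fault_handler member of struct kprobe. A hedged sketch of such a handler (again with an assumed target symbol):

#include <linux/kernel.h>
#include <linux/kprobes.h>

/* Called if an instruction in our pre/post handler faults. */
static int my_fault(struct kprobe *p, struct pt_regs *regs, int trapnr)
{
	pr_info("kprobe handler for %p faulted, trap %d\n", p->addr, trapnr);
	return 0;	/* not handled: let fixup_exception()/do_page_fault() run */
}

static struct kprobe kp_fault = {
	.symbol_name	= "do_fork",	/* assumed example target */
	.fault_handler	= my_fault,
};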
/*
 * Wrapper routine for handling exceptions.
 */
2005-09-06 15:19:28 -07:00
int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
				       unsigned long val, void *data)
2005-04-16 15:20:36 -07:00
{
2008-01-30 13:33:23 +01:00
	struct die_args *args = data;
2005-11-07 01:00:07 -08:00
	int ret = NOTIFY_DONE;
2008-01-30 13:31:21 +01:00
	if (args->regs && user_mode_vm(args->regs))
2006-03-26 01:38:21 -08:00
		return ret;
2005-04-16 15:20:36 -07:00
	switch (val) {
	case DIE_INT3:
		if (kprobe_handler(args->regs))
2005-11-07 01:00:07 -08:00
			ret = NOTIFY_STOP;
2005-04-16 15:20:36 -07:00
		break;
	case DIE_DEBUG:
2009-06-01 23:47:06 +05:30
		if (post_kprobe_handler(args->regs)) {
			/*
			 * Reset the BS bit in dr6 (pointed by args->err) to
			 * denote completion of processing
			 */
			(*(unsigned long *)ERR_PTR(args->err)) &= ~DR_STEP;
2005-11-07 01:00:07 -08:00
			ret = NOTIFY_STOP;
2009-06-01 23:47:06 +05:30
		}
2005-04-16 15:20:36 -07:00
		break;
	case DIE_GPF:
2008-01-30 13:32:32 +01:00
		/*
		 * To be potentially processing a kprobe fault and to
		 * trust the result from kprobe_running(), we have to be
		 * non-preemptible.
		 */
		if (!preemptible() && kprobe_running() &&
2005-04-16 15:20:36 -07:00
		    kprobe_fault_handler(args->regs, args->trapnr))
2005-11-07 01:00:07 -08:00
			ret = NOTIFY_STOP;
2005-04-16 15:20:36 -07:00
		break;
	default:
		break;
	}
2005-11-07 01:00:07 -08:00
	return ret;
2005-04-16 15:20:36 -07:00
}
2005-09-06 15:19:28 -07:00
int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
2005-04-16 15:20:36 -07:00
{
	struct jprobe *jp = container_of(p, struct jprobe, kp);
	unsigned long addr;
2005-11-07 01:00:12 -08:00
	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
2005-04-16 15:20:36 -07:00
2005-11-07 01:00:12 -08:00
	kcb->jprobe_saved_regs = *regs;
2008-01-30 13:31:21 +01:00
	kcb->jprobe_saved_sp = stack_addr(regs);
	addr = (unsigned long)(kcb->jprobe_saved_sp);
2005-04-16 15:20:36 -07:00
	/*
	 * As Linus pointed out, gcc assumes that the callee
	 * owns the argument space and could overwrite it, e.g.
	 * tailcall optimization. So, to be absolutely safe
	 * we also save and restore enough stack bytes to cover
	 * the argument area.
	 */
2005-11-07 01:00:12 -08:00
	memcpy(kcb->jprobes_stack, (kprobe_opcode_t *)addr,
2008-01-30 13:31:21 +01:00
	       MIN_STACK_SIZE(addr));
2008-01-30 13:31:27 +01:00
	regs->flags &= ~X86_EFLAGS_IF;
2007-10-11 22:25:25 +02:00
	trace_hardirqs_off();
2008-01-30 13:30:56 +01:00
	regs->ip = (unsigned long)(jp->entry);
2005-04-16 15:20:36 -07:00
	return 1;
}
2005-09-06 15:19:28 -07:00
void __kprobes jprobe_return(void)
2005-04-16 15:20:36 -07:00
{
2005-11-07 01:00:12 -08:00
	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
2008-01-30 13:31:21 +01:00
	asm volatile (
#ifdef CONFIG_X86_64
			"	xchg	%%rbx,%%rsp	\n"
#else
			"	xchgl	%%ebx,%%esp	\n"
#endif
			"	int3			\n"
			"	.globl jprobe_return_end\n"
			"	jprobe_return_end:	\n"
			"	nop			\n" : : "b"
			(kcb->jprobe_saved_sp) : "memory");
2005-04-16 15:20:36 -07:00
}
2005-09-06 15:19:28 -07:00
int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
2005-04-16 15:20:36 -07:00
{
2005-11-07 01:00:12 -08:00
	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
2008-01-30 13:30:56 +01:00
	u8 *addr = (u8 *) (regs->ip - 1);
2005-04-16 15:20:36 -07:00
	struct jprobe *jp = container_of(p, struct jprobe, kp);
2008-01-30 13:31:21 +01:00
	if ((addr > (u8 *) jprobe_return) &&
	    (addr < (u8 *) jprobe_return_end)) {
2008-01-30 13:31:21 +01:00
		if (stack_addr(regs) != kcb->jprobe_saved_sp) {
2007-12-18 18:05:58 +01:00
			struct pt_regs *saved_regs = &kcb->jprobe_saved_regs;
2008-01-30 13:31:21 +01:00
			printk(KERN_ERR
			       "current sp %p does not match saved sp %p\n",
2008-01-30 13:31:21 +01:00
			       stack_addr(regs), kcb->jprobe_saved_sp);
2008-01-30 13:31:21 +01:00
			printk(KERN_ERR "Saved registers for jprobe %p\n", jp);
2005-04-16 15:20:36 -07:00
			show_registers(saved_regs);
2008-01-30 13:31:21 +01:00
			printk(KERN_ERR "Current registers\n");
2005-04-16 15:20:36 -07:00
			show_registers(regs);
			BUG();
		}
2005-11-07 01:00:12 -08:00
		*regs = kcb->jprobe_saved_regs;
2008-01-30 13:31:21 +01:00
		memcpy((kprobe_opcode_t *)(kcb->jprobe_saved_sp),
		       kcb->jprobes_stack,
		       MIN_STACK_SIZE(kcb->jprobe_saved_sp));
2005-11-07 01:00:14 -08:00
		preempt_enable_no_resched();
2005-04-16 15:20:36 -07:00
		return 1;
	}
	return 0;
}
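setjmp_pre_handler(), jprobe_return() and longjmp_break_handler() above together implement jprobes. A minimal usage sketch; the target function and its prototype are assumptions for illustration, and the entry handler must mirror the probed function's signature and end with jprobe_return():

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/kprobes.h>

/* Same prototype as the assumed target (do_fork of this era). */
static long jdo_fork(unsigned long clone_flags, unsigned long stack_start,
		     struct pt_regs *regs, unsigned long stack_size,
		     int __user *parent_tidptr, int __user *child_tidptr)
{
	pr_info("jprobe: clone_flags=0x%lx\n", clone_flags);
	jprobe_return();	/* required: jumps back via longjmp_break_handler() */
	return 0;		/* never reached */
}

static struct jprobe my_jprobe = {
	.entry		= jdo_fork,
	.kp.symbol_name	= "do_fork",
};

/* register_jprobe(&my_jprobe) in module init, unregister_jprobe() on exit. */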
2005-06-27 15:17:10 -07:00
2010-02-25 08:34:46 -05:00
# ifdef CONFIG_OPTPROBES
/* Insert a call instruction at address 'from', which calls address 'to'.*/
static void __kprobes synthesize_relcall(void *from, void *to)
{
	__synthesize_relative_insn(from, to, RELATIVECALL_OPCODE);
}
/* Insert a move instruction which sets a pointer to eax/rdi (1st arg). */
static void __kprobes synthesize_set_arg1(kprobe_opcode_t *addr,
					  unsigned long val)
{
#ifdef CONFIG_X86_64
	*addr++ = 0x48;		/* REX.W prefix */
	*addr++ = 0xbf;		/* mov $imm64, %rdi */
#else
	*addr++ = 0xb8;		/* mov $imm32, %eax */
#endif
	*(unsigned long *)addr = val;
}
void __kprobes kprobes_optinsn_template_holder ( void )
{
asm volatile (
" .global optprobe_template_entry \n "
" optprobe_template_entry: \n "
# ifdef CONFIG_X86_64
/* We don't bother saving the ss register */
" pushq %rsp \n "
" pushfq \n "
SAVE_REGS_STRING
" movq %rsp, %rsi \n "
" .global optprobe_template_val \n "
" optprobe_template_val: \n "
ASM_NOP5
ASM_NOP5
" .global optprobe_template_call \n "
" optprobe_template_call: \n "
ASM_NOP5
/* Move flags to rsp */
" movq 144(%rsp), %rdx \n "
" movq %rdx, 152(%rsp) \n "
RESTORE_REGS_STRING
/* Skip flags entry */
" addq $8, %rsp \n "
" popfq \n "
# else /* CONFIG_X86_32 */
" pushf \n "
SAVE_REGS_STRING
" movl %esp, %edx \n "
" .global optprobe_template_val \n "
" optprobe_template_val: \n "
ASM_NOP5
" .global optprobe_template_call \n "
" optprobe_template_call: \n "
ASM_NOP5
RESTORE_REGS_STRING
" addl $4, %esp \n " /* skip cs */
" popf \n "
# endif
" .global optprobe_template_end \n "
" optprobe_template_end: \n " ) ;
}
# define TMPL_MOVE_IDX \
( ( long ) & optprobe_template_val - ( long ) & optprobe_template_entry )
# define TMPL_CALL_IDX \
( ( long ) & optprobe_template_call - ( long ) & optprobe_template_entry )
# define TMPL_END_IDX \
( ( long ) & optprobe_template_end - ( long ) & optprobe_template_entry )
# define INT3_SIZE sizeof(kprobe_opcode_t)
/* Optimized kprobe call back function: called from optinsn */
static void __kprobes optimized_callback(struct optimized_kprobe *op,
					 struct pt_regs *regs)
{
	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
	preempt_disable();
	if (kprobe_running()) {
		kprobes_inc_nmissed_count(&op->kp);
	} else {
		/* Save skipped registers */
#ifdef CONFIG_X86_64
		regs->cs = __KERNEL_CS;
#else
		regs->cs = __KERNEL_CS | get_kernel_rpl();
		regs->gs = 0;
#endif
		regs->ip = (unsigned long)op->kp.addr + INT3_SIZE;
		regs->orig_ax = ~0UL;
		__get_cpu_var(current_kprobe) = &op->kp;
		kcb->kprobe_status = KPROBE_HIT_ACTIVE;
		opt_pre_handler(&op->kp, regs);
		__get_cpu_var(current_kprobe) = NULL;
	}
	preempt_enable_no_resched();
}
static int __kprobes copy_optimized_instructions ( u8 * dest , u8 * src )
{
int len = 0 , ret ;
while ( len < RELATIVEJUMP_SIZE ) {
ret = __copy_instruction ( dest + len , src + len , 1 ) ;
if ( ! ret | | ! can_boost ( dest + len ) )
return - EINVAL ;
len + = ret ;
}
/* Check whether the address range is reserved */
if ( ftrace_text_reserved ( src , src + len - 1 ) | |
alternatives_text_reserved ( src , src + len - 1 ) )
return - EBUSY ;
return len ;
}
/* Check whether insn is indirect jump */
static int __kprobes insn_is_indirect_jump(struct insn *insn)
{
	return ((insn->opcode.bytes[0] == 0xff &&
		(X86_MODRM_REG(insn->modrm.value) & 6) == 4) || /* Jump */
		insn->opcode.bytes[0] == 0xea);	/* Segment based jump */
}
/* Check whether insn jumps into specified address range */
static int insn_jump_into_range ( struct insn * insn , unsigned long start , int len )
{
unsigned long target = 0 ;
switch ( insn - > opcode . bytes [ 0 ] ) {
case 0xe0 : /* loopne */
case 0xe1 : /* loope */
case 0xe2 : /* loop */
case 0xe3 : /* jcxz */
case 0xe9 : /* near relative jump */
case 0xeb : /* short relative jump */
break ;
case 0x0f :
if ( ( insn - > opcode . bytes [ 1 ] & 0xf0 ) = = 0x80 ) /* jcc near */
break ;
return 0 ;
default :
if ( ( insn - > opcode . bytes [ 0 ] & 0xf0 ) = = 0x70 ) /* jcc short */
break ;
return 0 ;
}
target = ( unsigned long ) insn - > next_byte + insn - > immediate . value ;
return ( start < = target & & target < = start + len ) ;
}
/* Decode whole function to ensure any instructions don't jump into target */
static int __kprobes can_optimize ( unsigned long paddr )
{
int ret ;
unsigned long addr , size = 0 , offset = 0 ;
struct insn insn ;
kprobe_opcode_t buf [ MAX_INSN_SIZE ] ;
/* Dummy buffers for lookup_symbol_attrs */
static char __dummy_buf [ KSYM_NAME_LEN ] ;
/* Lookup symbol including addr */
if ( ! kallsyms_lookup ( paddr , & size , & offset , NULL , __dummy_buf ) )
return 0 ;
/* Check there is enough space for a relative jump. */
if ( size - offset < RELATIVEJUMP_SIZE )
return 0 ;
/* Decode instructions */
addr = paddr - offset ;
while ( addr < paddr - offset + size ) { /* Decode until function end */
if ( search_exception_tables ( addr ) )
/*
* Since some fixup code will jumps into this function ,
* we can ' t optimize kprobe in this function .
*/
return 0 ;
kernel_insn_init ( & insn , ( void * ) addr ) ;
insn_get_opcode ( & insn ) ;
if ( insn . opcode . bytes [ 0 ] = = BREAKPOINT_INSTRUCTION ) {
ret = recover_probed_instruction ( buf , addr ) ;
if ( ret )
return 0 ;
kernel_insn_init ( & insn , buf ) ;
}
insn_get_length ( & insn ) ;
/* Recover address */
insn . kaddr = ( void * ) addr ;
insn . next_byte = ( void * ) ( addr + insn . length ) ;
/* Check any instructions don't jump into target */
if ( insn_is_indirect_jump ( & insn ) | |
insn_jump_into_range ( & insn , paddr + INT3_SIZE ,
RELATIVE_ADDR_SIZE ) )
return 0 ;
addr + = insn . length ;
}
return 1 ;
}
/* Check optimized_kprobe can actually be optimized. */
int __kprobes arch_check_optimized_kprobe ( struct optimized_kprobe * op )
{
int i ;
struct kprobe * p ;
for ( i = 1 ; i < op - > optinsn . size ; i + + ) {
p = get_kprobe ( op - > kp . addr + i ) ;
if ( p & & ! kprobe_disabled ( p ) )
return - EEXIST ;
}
return 0 ;
}
/* Check the addr is within the optimized instructions. */
int __kprobes arch_within_optimized_kprobe ( struct optimized_kprobe * op ,
unsigned long addr )
{
return ( ( unsigned long ) op - > kp . addr < = addr & &
( unsigned long ) op - > kp . addr + op - > optinsn . size > addr ) ;
}
/* Free optimized instruction slot */
static __kprobes
void __arch_remove_optimized_kprobe ( struct optimized_kprobe * op , int dirty )
{
if ( op - > optinsn . insn ) {
free_optinsn_slot ( op - > optinsn . insn , dirty ) ;
op - > optinsn . insn = NULL ;
op - > optinsn . size = 0 ;
}
}
void __kprobes arch_remove_optimized_kprobe ( struct optimized_kprobe * op )
{
__arch_remove_optimized_kprobe ( op , 1 ) ;
}
/*
* Copy replacing target instructions
* Target instructions MUST be relocatable ( checked inside )
*/
int __kprobes arch_prepare_optimized_kprobe ( struct optimized_kprobe * op )
{
u8 * buf ;
int ret ;
long rel ;
if ( ! can_optimize ( ( unsigned long ) op - > kp . addr ) )
return - EILSEQ ;
op - > optinsn . insn = get_optinsn_slot ( ) ;
if ( ! op - > optinsn . insn )
return - ENOMEM ;
/*
* Verify if the address gap is in 2 GB range , because this uses
* a relative jump .
*/
rel = ( long ) op - > optinsn . insn - ( long ) op - > kp . addr + RELATIVEJUMP_SIZE ;
if ( abs ( rel ) > 0x7fffffff )
return - ERANGE ;
buf = ( u8 * ) op - > optinsn . insn ;
/* Copy instructions into the out-of-line buffer */
ret = copy_optimized_instructions ( buf + TMPL_END_IDX , op - > kp . addr ) ;
if ( ret < 0 ) {
__arch_remove_optimized_kprobe ( op , 0 ) ;
return ret ;
}
op - > optinsn . size = ret ;
/* Copy arch-dep-instance from template */
memcpy ( buf , & optprobe_template_entry , TMPL_END_IDX ) ;
/* Set probe information */
synthesize_set_arg1 ( buf + TMPL_MOVE_IDX , ( unsigned long ) op ) ;
/* Set probe function call */
synthesize_relcall ( buf + TMPL_CALL_IDX , optimized_callback ) ;
/* Set returning jmp instruction at the tail of out-of-line buffer */
synthesize_reljump ( buf + TMPL_END_IDX + op - > optinsn . size ,
( u8 * ) op - > kp . addr + op - > optinsn . size ) ;
flush_icache_range ( ( unsigned long ) buf ,
( unsigned long ) buf + TMPL_END_IDX +
op - > optinsn . size + RELATIVEJUMP_SIZE ) ;
return 0 ;
}
/* Replace a breakpoint (int3) with a relative jump. */
int __kprobes arch_optimize_kprobe ( struct optimized_kprobe * op )
{
unsigned char jmp_code [ RELATIVEJUMP_SIZE ] ;
s32 rel = ( s32 ) ( ( long ) op - > optinsn . insn -
( ( long ) op - > kp . addr + RELATIVEJUMP_SIZE ) ) ;
/* Backup instructions which will be replaced by jump address */
memcpy ( op - > optinsn . copied_insn , op - > kp . addr + INT3_SIZE ,
RELATIVE_ADDR_SIZE ) ;
jmp_code [ 0 ] = RELATIVEJUMP_OPCODE ;
* ( s32 * ) ( & jmp_code [ 1 ] ) = rel ;
/*
* text_poke_smp doesn ' t support NMI / MCE code modifying .
* However , since kprobes itself also doesn ' t support NMI / MCE
* code probing , it ' s not a problem .
*/
text_poke_smp ( op - > kp . addr , jmp_code , RELATIVEJUMP_SIZE ) ;
return 0 ;
}
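The displacement written into jmp_code above is an ordinary rel32: the target minus the address of the byte following the 5-byte jump. A toy user-space illustration with made-up addresses:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
	unsigned long probe_addr = 0xffffffff81001000UL; /* op->kp.addr (assumed) */
	unsigned long detour_buf = 0xffffffffa0002000UL; /* op->optinsn.insn (assumed) */
	int32_t rel = (int32_t)(detour_buf - (probe_addr + 5)); /* RELATIVEJUMP_SIZE == 5 */
	unsigned char jmp_code[5];

	jmp_code[0] = 0xe9;	/* RELATIVEJUMP_OPCODE: jmp rel32 */
	memcpy(&jmp_code[1], &rel, sizeof(rel));
	printf("jmp rel32 displacement = %#x\n", (unsigned int)rel);
	return 0;
}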
/* Replace a relative jump with a breakpoint (int3). */
void __kprobes arch_unoptimize_kprobe ( struct optimized_kprobe * op )
{
u8 buf [ RELATIVEJUMP_SIZE ] ;
/* Set int3 to first byte for kprobes */
buf [ 0 ] = BREAKPOINT_INSTRUCTION ;
memcpy ( buf + 1 , op - > optinsn . copied_insn , RELATIVE_ADDR_SIZE ) ;
text_poke_smp ( op - > kp . addr , buf , RELATIVEJUMP_SIZE ) ;
}
static int __kprobes setup_detour_execution ( struct kprobe * p ,
struct pt_regs * regs ,
int reenter )
{
struct optimized_kprobe * op ;
if ( p - > flags & KPROBE_FLAG_OPTIMIZED ) {
/* This kprobe is really able to run optimized path. */
op = container_of ( p , struct optimized_kprobe , kp ) ;
/* Detour through copied instructions */
regs - > ip = ( unsigned long ) op - > optinsn . insn + TMPL_END_IDX ;
if ( ! reenter )
reset_current_kprobe ( ) ;
preempt_enable_no_resched ( ) ;
return 1 ;
}
return 0 ;
}
# endif
2005-07-05 18:54:50 -07:00
int __init arch_init_kprobes(void)
2005-06-27 15:17:10 -07:00
{
2008-01-30 13:31:21 +01:00
	return 0;
2005-06-27 15:17:10 -07:00
}
2007-05-08 00:34:16 -07:00
int __kprobes arch_trampoline_kprobe(struct kprobe *p)
{
	return 0;
}