2371 Commits

Author SHA1 Message Date
Masatake YAMATO
1d78d22058 unwind: introduce markers specifying the needs of special care in unwinding
Some system calls require capturing the stack trace before they are
processed in kernel.  Typical one is execve.  Some system calls require
invalidating mmap cache after they are processed in kernel.

In current implementation these requirements are handled directly by
appropriate syscall handlers.  However, it is difficult to keep the
source code maintainable using this approach to cover all system calls
which have such requirements.

A more generic way to implement this is to flag all syscalls that
require special processing, and handle these flags right in
trace_syscall_entering instead of changing syscall handlers.

This patch just defines new flags: STACKTRACE_INVALIDATE_CACHE and
STACKTRACE_CAPTURE_ON_ENTER.

The names of macros are suggested by Dmitry Levin.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
2014-05-30 22:57:56 +00:00
Masatake YAMATO
a0b4ee7b38 unwind: enable dwarf cache of libunwind
Here is the benchmark of the dwarf cache.

Target program:

    #include <sched.h>
    int main(void)
    {
      unsigned int max = 0x6fff, i;
      for (i = 0; i < max; i++)
	sched_yield();
      return 0;
    }

Command line:

	./strace -o /dev/null -k a.out

With the dwarf cache:

    real	0m12.081s
    user	0m3.858s
    sys 	0m8.194s

Without the dwarf cache:

    real	0m22.326s
    user	0m5.218s
    sys		0m16.952s

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
2014-05-30 22:57:39 +00:00
Masatake YAMATO
b45b7faa1f unwind: report expected backtracing error
When a file mmap'ed to the target process is unlink'ed, backtracing the
stack would fail.  Current implementation reports it as
"backtracing_error".  To avoid confusion, the message is changed to
"expected_backtracing_error".

Here is the reproducer:

  $ cat ./p-deleted.c
  #include <unistd.h>

  int main(int argc, char **argv) {
    return unlink(argv[0]) < 0;
  }

  $ strace -e unlink -k ./p-deleted
  unlink("./p-deleted")                   = 0
   > /usr/lib64/libc-2.18.so(unlink+0x7) [0xe7f17]
   > /home/yamato/var/strace/t_unwind/p-deleted (deleted)(+0x0) [0x575]
   > /usr/lib64/libc-2.18.so(__libc_start_main+0xf5) [0x21d65]
   > backtracing_error [0x7ffff1365590]
  +++ exited with 0 +++

p-deleted is deleted therefore backtracing_error is reported.  This
patch records the deleted marker when making mmap cache and refers the
recorded information in the case "backtracing_error" to switch the
message.

Here is the output of this patch:

  $ strace -e unlink -k ./p-deleted
  unlink("./p-deleted")                   = 0
   > /usr/lib64/libc-2.18.so(unlink+0x7) [0xe7f17]
   > /home/yamato/var/strace/t_unwind/p-deleted (deleted)(+0x0) [0x575]
   > /usr/lib64/libc-2.18.so(__libc_start_main+0xf5) [0x21d65]
   > expected_backtracing_error [0x7ffff1365590]
  +++ exited with 0 +++

This solution is not perfect: if a file is unlink'ed after making the
mmap cache and before unwinding, strace cannot have a chance to record
the deleted marker.

In this version of patch, hardcoded magic number used in comparing "(delete)"
string is replaced with strlen as suggested by Dmitry Levin.

In old version of patch, the deleted entry was thrown away from mmap
cache to avoid to report "backtracing_error".  In this patch I keep it,
and just switch the error message.
Inspired by the review comment from Dmitry Levin.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
2014-05-30 22:57:19 +00:00
Masatake YAMATO
2b09df9731 unwind: call unwind_tcb_fin before printing detached message
captured stacktrace is printed in unwind_tcb_fin if tcp->queue is not
empty.  This should happen before printing detached message, so
unwind_tcb_fin is moved to the top of droptcb.

This is implicitly suggested by Dmitry Levin in patch review process.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
2014-05-30 22:56:38 +00:00
Masatake YAMATO
9bc6561588 unwind: implement automatic mmap cache invalidation
A mmap cache belonging to a tcb was updated when a system call which
changed the memory mapping was called.  This implementation was assumed
the mapping was changed only by the tcb.  However, this assumption is
incorrect if the target application is multi-threaded; more than two
tcbs can shared the same memory mapping and a tcb can modify it without
being noticed by the others.

This change introduces a global integer variable mmap_cache_generation,
and mmap_cache_generation field to struct tcb.  The variable
is incremented each time a process enters a syscall that can modify its
memory mapping.  Each tcb records the value of this variable at the
moment if  building its mmap cache.  Every mmap cache associated with
the given tcb can be validated by comparing its mmap_cache_generation
field with the variable mmap_cache_generation.

This implementation is inefficient.  If strace attaches two processes
which don't share the memory mapping, rebuilding mmap cache of a tcb
triggered by another tcb's mmap system call is not necessary.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
2014-05-30 22:56:14 +00:00
Masatake YAMATO
f8e39d7b7a unwind: introduce queue_t for capturing stacktrace
This is the second step for splitting capturing from printing.

New `queue' field is added to tcb.  Captured stacktrace is stored here.
The field is initialized/finalized at unwind_tcb_init/unwind_tcb_fin.

New API function unwind_capture_stacktrace is added.  This function
captures the currest stack using stracktrace_walker and records it in
tcb.  It's printing is delayed to the next call of
unwind_print_stacktrace.

unwind_print_stacktrace is extended.  Now it checks queue field of
the given tcb at the start of function.  If the function finds a
captured stack trace, the latter is printed using stracktrace_walker.

Currently unwind_capture_stacktrace invocations are added directly to
handlers of mmap, munmap, mprotect, and execve.

Here is the difference of output with/without patch:

(without patch)
  execve("./test-fork", ["./test-fork"], [/* 56 vars */]) = 0
   > /usr/lib64/ld-2.18.so(check_one_fd.part.0+0x82) [0x11f0]

(with patch)
  execve("./test-fork", ["./test-fork"], [/* 54 vars */]) = 0
   > /usr/lib64/libc-2.18.so(execve+0x7) [0xbcd27]
   > /home/yamato/var/strace/strace(exec_or_die+0x10c) [0x26ac]
   > /home/yamato/var/strace/strace(startup_child+0x346) [0x134f6]
   > /home/yamato/var/strace/strace(init+0x89f) [0x13dff]
   > /home/yamato/var/strace/strace(main+0xa) [0x26ca]
   > /usr/lib64/libc-2.18.so(__libc_start_main+0xf5) [0x21d65]
   > /home/yamato/var/strace/strace(_start+0x29) [0x2799]

In older version output lines of captured elements were built when
printing.  In this version they are built when capturing the stack.
As result, unneeded dynamic memory allocations are avoided.
Suggested by Luca Clementi.

In older version the combination of snprintf and realloc were used.
In this version they are replaced with asprintf.
Suggested by Dmitry Levin.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
2014-05-30 22:55:08 +00:00
Masatake YAMATO
4e121e5bb4 unwind: introduce own debug macro
* unwind.c (DPRINTF): New macro, to be utilized in debugging cache
management code.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
2014-05-30 22:54:07 +00:00
Masatake YAMATO
2d534daaa6 unwind: introduce stacktrace_walker
In current implementation, the stack trace is captured and printed at
the same time, in trace_syscall_exiting.  This approach cannot
provide user expected information when a system call changes the
memory mapping.  In such cases, the stack trace should be captured on
entering syscall and printed on exiting.

As the initial step for splitting capturing from printing, this change
introduces stacktrace_walker utility function.  It can be used both for
capturing in trace_syscall_entering and printing in
trace_syscall_exiting.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
2014-05-30 22:43:04 +00:00
Masatake YAMATO
6141392856 unwind: give all exported functions "unwind_" prefix
* unwind.c (init_unwind_addr_space): Rename to unwind_init.
(init_libunwind_ui): Rename to unwind_tcb_init.
(free_libunwind_ui): Rename to unwind_tcb_fin.
(delete_mmap_cache): Rename to unwind_cache_invalidate.
(print_stacktrace): Rename to unwind_print_stacktrace.
* defs.h: Update prototypes.
* mem.c: All callers updated.
* process.c: Likewise.
* strace.c: Likewise.
* syscall.c: Likewise.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
2014-05-30 22:40:22 +00:00
Masatake YAMATO
7721499fc7 unwind: delete mmap cache in free_libunwind_ui
free_libunwind_ui is expected to release all unwind related resources
attached to tcp.

* strace.c (droptcb): Move delete_mmap_cache call ...
* unwind.c (free_libunwind_ui): ... to here.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
2014-05-30 22:30:07 +00:00
Masatake YAMATO
b65042fbdb unwind: make alloc_mmap_cache function local
* defs.h (alloc_mmap_cache): Remove.
* unwind.c (alloc_mmap_cache): Add static qualifier.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
2014-05-30 22:28:15 +00:00
Masatake YAMATO
b4a2de8eff unwind: fix a bug in range updating of binary search
* unwind.c (print_stacktrace): Fix off-by-one error in binary search.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Luca Clementi <luca.clementi@gmail.com>
2014-05-30 22:26:42 +00:00
Luca Clementi
327064b637 Add -k option to print stack trace after each syscall
Print the stack trace of the traced process after each system call when
-k option is specified.  It is implemented using libunwind to unwind the
stack and to obtain the function name pointed by the IP.

Based on the code that was originally taken from strace-plus
of Philip J. Guo.

* configure.ac: Add --with-libunwind option.  Check libunwind support.
* Makefile.am: Add libunwind support.
* defs.h (struct tcb) [USE_LIBUNWIND]: Append libunwind specific fields.
[USE_LIBUNWIND] (stack_trace_enabled, alloc_mmap_cache,
delete_mmap_cache, print_stacktrace): New prototypes.
* mem.c (print_mmap, sys_munmap, sys_mprotect): Add libunwind support.
* process.c (sys_execve): Likewise.
* strace.c (usage, alloctcb, droptcb, init): Likewise.
* syscall.c (trace_syscall_exiting): Likewise.
* unwind.c: New file.
* strace.1: Document -k option.
2014-05-30 22:24:31 +00:00
6dbbe0737a sysctl: update CTL_*, KERN_*, NET_*, and VM_* constants
* configure.ac (AC_CHECK_DECLS): Add CTL_*, KERN_*, NET_*, and
VM_* constants.
* system.c (CTL_PROC, CTL_CPU): Remove definitions.
* xlat/sysctl_*.in: Update.
2014-05-30 22:10:21 +00:00
d8ad1ddc76 Check for constants used by waitid function
* configure.ac (AC_CHECK_DECLS): Add P_* constants.
2014-05-30 22:10:21 +00:00
baf60d92f1 Check for LO_FLAGS_READ_ONLY constant
* configure.ac (AC_CHECK_DECLS): Add LO_FLAGS_READ_ONLY.
2014-05-30 22:10:21 +00:00
d35bdcad13 Compress blank lines
Suppress empty lines left after automated xlat conversion.
2014-05-30 22:10:21 +00:00
63ebcfc559 xlat: cleanup the aftermath of automatic conversion 2014-05-30 22:10:00 +00:00
0ed617bd66 Generate xlat/*.in files
Automatically convert xlat structures from *.c files to xlat/*.in files
using "./generate_xlat_in.sh *.c" command.
2014-05-30 21:40:03 +00:00
297b59401c Rename several xlat structures to avoid collisions
* bjm.c (which): Rename to qm_which.
* ipc.c (msg_flags): Rename to ipc_msg_flags.
* time.c (which): Rename to itimer_which.
2014-05-30 21:39:04 +00:00
5153b5cd68 Enhance xlat generator
* xlat/gen.sh: Define all xlat structs not declared in defs.h as static.
Some symbolic constants are not macros, extend #ifdef check to cover
symbolic constants checked by AC_CHECK_DECLS.
Handle complex symbolic constants in SYMBOL|... form.
Handle symbolic constants in 1<<SYMBOL form.
Handle numeric constants.
Implement #unconditional directive that turns off preprocessor checks.
Implement #unterminated directive that turns off adding XLAT_END.
2014-05-30 21:33:02 +00:00
3e69bdf41a Use bootstrap script consistently
Now that ./xlat/gen.sh has to be run before autoreconf,
replace all autoreconf calls with ./bootstrap call.

* bootstrap: Forward arguments to autoreconf.
* build_static_example.sh: Replace autoreconf call with bootstrap call.
* make-dist: Likewise.
* qemu_multiarch_testing/README: Likewise.
2014-05-30 21:31:08 +00:00
Mike Frysinger
761ed9ba42 Implement xlat generator
* bootstrap: New file.
* xlat/gen.sh: Likewise.
* Makefile.am: Include xlat/Makemodule.am
(EXTRA_DIST): Add $(XLAT_INPUT_FILES), $(XLAT_HEADER_FILES), and
xlat/gen.sh.
2014-05-30 21:29:40 +00:00
e25fb4fd8e tests: fix SCM_RIGHTS test for big-endian systems
* tests/scm_rights.c (main): Send zero integer to avoid issues with
endianness.
* tests/scm_rights-fd.test: Update grep patterns.
2014-05-30 15:18:00 +00:00
f23b097fc5 Decode file descriptors passed via SCM_RIGHTS control messages
* net.c (printcmsghdr): Print descriptors from SCM_RIGHTS control
messages using printfd.
* tests/scm_rights.c: New file.
* tests/scm_rights-fd.test: New test.
* tests/Makefile.am (check_PROGRAMS): Add scm_rights.
(TESTS): Add scm_rights-fd.test.
* tests/.gitignore: Add scm_rights and uio.
2014-05-30 00:20:53 +00:00
772e32b67b tests: add a test for -c and -w options
* tests/count.test: New test.
* tests/Makefile.am (TESTS): Add it.
2014-05-30 00:20:44 +00:00
Mark Hills
e53bf23f1c Optionally produce stats on syscall latency
Time spent in system time is not useful where a syscall depends on some
non-CPU resource, eg. typically open() or stat() to a network drive.

This patch adds a new flag (-w) to produce a summary of the time
difference between beginning and end of the system call (ie. latency)

This functionality has been useful to profile slow processes that
are not CPU-bound.

Signed-off-by: Mark Hills <mark.hills@framestore.com>
2014-05-29 18:15:38 +00:00
ac5133d0cb Constify count_syscall function
* count.c (count_syscall): Add const qualifier to timeval argument and
rename it.  Store the wall clock time spent while in syscall in separate
timeval variable.
* defs.h (count_syscall): Update prototype.
* syscall.c (trace_syscall_exiting): Update count_syscall invocation.
2014-05-29 18:10:00 +00:00
447db45365 Constify tv_* functions
* defs.h (tv_nz, tv_cmp, tv_float, tv_add, tv_sub, tv_mul, tv_div): Add
const qualifier to read only arguments.
* util.c (tv_nz, tv_cmp, tv_float, tv_add, tv_sub, tv_mul, tv_div):
Likewise.
2014-05-29 17:59:01 +00:00
3a3b71c7d8 Use printstr for sethostname, setdomainname, and gethostname decoding
The argument passed to sethostname and setdomainname syscalls, as well
as the string returned by gethostname syscall, is not a pathname, so
printpathn is not the right method for its decoding.

* process.c (sys_sethostname, sys_setdomainname): Decode 1st argument
using printstr instead of printpathn.
[ALPHA] (sys_gethostname): Likewise.
2014-05-28 18:09:46 +00:00
James Hogan
3b09ebe724 Fix {get,set}rlimit decoding with unreliable SIZEOF_RLIM_T
When strace is built with large file support definitions in CFLAGS (as
may be provided by buildroot) the C library headers may expose a 64-bit
rlim_t even though the struct rlimit fields used by the system call
interface are only 32-bit.  The SIZEOF_RLIM_T will then be 8 which
results in bad decoding of the getrlimit and setrlimit syscalls.

This is fixed by replacing unreliable SIZEOF_RLIM_T based checks with
checks for current_wordsize.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
2014-05-21 00:22:07 +00:00
Masatake YAMATO
b2ede14797 Enhance setns syscall decoding
* process.c (sys_setns): New function.
Decode the 2nd syscall argument using clone_flags.
* linux/syscall.h (sys_setns): New prototype.
* linux/dummy.h (sys_setns): Remove.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
2014-05-13 23:22:47 +00:00
985425a30b mips: fix syscall entries that should have TP flag set 2014-05-12 20:37:48 +00:00
10b735a6f9 xtensa: fix unshare syscall entry 2014-05-12 20:37:48 +00:00
117d13d0de alpha, hppa, mips n64: fix waitid syscall entry 2014-05-12 20:37:48 +00:00
cd96f77ef8 Add TM flag to shmat and shmdt syscall entries 2014-05-12 20:37:20 +00:00
6556315493 Alias sys_vfork to sys_fork
* process.c (sys_vfork): Remove.
* linux/syscall.h (sys_vfork): Likewise.
* linux/dummy.h (sys_vfork): Alias to sys_fork.
* linux/alpha/syscallent.h: Fix vfork entry.
* util.c (setbpt): Do not check for sys_vfork.
* syscall.c (syscall_fixup_for_fork_exec): Likewise.
2014-05-12 20:26:24 +00:00
e51ce47b11 epoll_ctl: fix EPOLL_CTL_DEL argument decoding
* desc.c (sys_epoll_ctl): Do not parse the event structure for
EPOLL_CTL_DEL operation.

Reported-by: Марк Коренберг <socketpair@gmail.com>
2014-04-17 14:33:59 +00:00
fb7ae846c6 Update CLOCK_* constants
* time.c (clocknames): Add CLOCK_BOOTTIME, CLOCK_REALTIME_ALARM,
CLOCK_BOOTTIME_ALARM, CLOCK_SGI_CYCLE, and CLOCK_TAI.
Fixes RH#1088455.
2014-04-17 14:18:13 +00:00
7845a42b39 Fix preadv/pwritev offset decoding
* util.c (printllval): Add align argument.
* defs.h (printllval): Update prototype.
(printllval_aligned, printllval_unaligned): New macros.
* file.c (sys_readahead, sys_truncate64, sys_ftruncate64, sys_fadvise64,
sys_fadvise64_64, sys_sync_file_range, sys_sync_file_range2,
sys_fallocate): Replace printllval call with printllval_aligned.
* io.c (sys_pread, sys_pwrite): Likewise.
(sys_preadv, sys_pwritev): Replace printllval call with
printllval_unaligned.
* linux/arm/syscallent.h: Set the number of preadv and pwritev
arguments to 5.
* linux/mips/syscallent-o32.h: Likewise.
* linux/powerpc/syscallent.h: Likewise.
* linux/sh/syscallent.h: Likewise.
* linux/xtensa/syscallent.h: Likewise.

Reported-by: Dima Kogan <dima@secretsauce.net>
2014-04-17 13:39:49 +00:00
cc3d59199d tests: add a test for pread/pwrite and preadv/pwritev offset decoding
* tests/uio.c: New file.
* tests/uio.test: New test.
* tests/Makefile.am (check_PROGRAMS): Add uio.
(uio_CFLAGS): Define.
(TESTS): Add uio.test.
2014-04-16 23:49:33 +00:00
99a0544f01 Refactor LDT decoding
* configure.ac (AC_CHECK_TYPES): Remove struct user_desc.
* ldt.c: New file.
* Makefile.am (strace_SOURCES): Add ldt.c.
* mem.c: Do not include <asm/ldt.h>.
(print_ldt_entry): Remove.
(sys_modify_ldt, sys_set_thread_area, sys_get_thread_area): Move...
* ldt.c: ... here.
* process.c: Do not include <asm/ldt.h>.
(sys_clone) [I386 || X86_64 || X32]: Use print_user_desc.
2014-04-10 15:29:13 +00:00
Denys Vlasenko
329fa3919d Make int3 example in comments more cut-n-pastable
I found that I use it quite often. Lets make it so that
after cut-n-pasting it into a file, there is no need
to edit the result (e.g. no need to remove C comment
chars from every line.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
2014-04-10 09:57:17 +02:00
15bc281269 mips: enable decoding of set_thread_area
* linux/dummy.h [MIPS]: Do not redirect sys_set_thread_area to printargs.
* mem.c [MIPS] (sys_set_thread_area): Define.
2014-04-09 13:14:44 +00:00
662221cab3 x86_64, x32: enable decoding of modify_ldt, get_thread_area, and set_thread_area
* linux/dummy.h [X86_64 || X32]: Do not redirect sys_modify_ldt,
sys_get_thread_area, and sys_set_thread_area to printargs.
2014-04-09 12:46:05 +00:00
f94e84780e x32: decode clone LDT user_desc entries for x86 processes
* mem.c [X32]: Include asm/ldt.h.
[X32] (print_ldt_entry, sys_modify_ldt, sys_set_thread_area,
sys_get_thread_area): Define.
* process.c [X32]: Include asm/ldt.h.
(sys_clone) [X32]: Decode LDT entry if current_personality == 1.
2014-04-09 12:37:01 +00:00
Elliott Hughes
44655a451e x86-64: decode clone LDT user_desc entries for x86 processes
* mem.c [X86_64]: Include asm/ldt.h.
[X86_64] (print_ldt_entry, sys_modify_ldt, sys_set_thread_area,
sys_get_thread_area): Define.
* process.c [X86_64]: Include asm/ldt.h.
(sys_clone) [X86_64]: Decode LDT entry if current_personality == 1.

Signed-off-by: Elliott Hughes <enh@google.com>
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
2014-04-09 12:36:47 +00:00
2c4fb25766 x32: fix clone(2) argument order for x86 processes
Apply the same fix that was made for x86_64.

* process.c [X32] (ARG_CTID, ARG_TLS): Take current
personality into account.
2014-04-09 12:34:58 +00:00
Elliott Hughes
b563325f0a x86-64: fix clone(2) argument order for x86 processes
Without this patch, strace claims that parent_tidptr == tls, which is
clearly wrong.  It is expected that parent_tidptr == child_tidptr.

* process.c [X86_64] (ARG_CTID, ARG_TLS): Take current
personality into account.

Signed-off-by: Elliott Hughes <enh@google.com>
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
2014-04-09 12:33:12 +00:00
Elliott Hughes
391c0d8cc5 aarch64: Fix decoding of arm struct stat64
We need to handle this situation more like x86-64.  32-bit arm and i386
actually have a common struct stat64, except the arm one must not be
packed.  Additionally, on aarch64 the 32-bit personality is personality 0.

Signed-off-by: Elliott Hughes <enh@google.com>
2014-04-06 23:25:36 +00:00