IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Add support for Imagination Technologies Meta architecture (the
architecture/ABI is usually referred to as metag in code). The Meta
Linux kernel port is in the process of being upstreamed for v3.9 so it
uses generic system call numbers.
sys_lookup_dcookie writes a filename to buffer argument, so I've set
TF flag.
nfsservctl appears to be set to sys_ni_syscall in asm-generic/unistd.h
so I've left it blank.
truncate64/ftruncate64/pread64/pwrite64/readahead have unaligned 64bit
args which are packed tightly on metag, so less arguments on metag.
fchdir/llseek takes a file descriptor so s/TF/TD/
sync_file_range has 2 64bit args so uses 6 args, so s/4/6/
timerfd_create/msgget/msgctl/msgrcv/semget/segtimedop/semop/shmget/
shmctl/shmat/shmdt/recvmsg/migrate_pages have different number of args.
oldgetrlimit is just getrlimit for metag.
add TM flag to various memory syscalls.
metag doesn't directly use sys_mmap_pgoff for mmap2.
prlimit64/process_vm_readv/process_vm_writev take a pid so add TP flag.
fanotify_init doesn't appear to take a file descriptor so remove TD.
Add kcmp syscall.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Christian Svensson <blue@cmd.nu>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
By adding tcp->s_ent pointer tot syscall table entry,
we can replace sysent[tcp->scno] references by tcp->s_ent.
More importantly, we may ensure that tcp->s_ent is always valid,
regardless of tcp->scno value. This allows us to drop
SCNO_IS_VALID(tcp->scno) checks before we access syscall
table entry.
We can optimize (qual_flags[tcp->scno] & QUAL_foo) checks
with a similar technique.
Resulting code shrink:
text data bss dec hex filename
245975 700 19072 265747 40e13 strace.t3/strace
245703 700 19072 265475 40d03 strace.t4/strace
* count.c (count_syscall): Use cheaper SCNO_IN_RANGE() check.
* defs.h: Add "int qual_flg" and "const struct sysent *s_ent"
to struct tcb. Remove "int u_nargs" from it.
Add UNDEFINED_SCNO constant which will mark undefined scnos
in tcp->qual_flg.
* pathtrace.c (pathtrace_match): Drop SCNO_IS_VALID check.
Use tcp->s_ent instead of sysent[tcp->scno].
* process.c (sys_prctl): Use tcp->s_ent->nargs instead of tcp->u_nargs.
(sys_waitid): Likewise.
* strace.c (init): Add compile-time check that DEFAULT_QUAL_FLAGS
constant is consistent with init code.
* syscall.c (decode_socket_subcall): Use tcp->s_ent->nargs
instead of tcp->u_nargs. Set tcp->qual_flg and tcp->s_ent.
(decode_ipc_subcall): Likewise.
(printargs): Use tcp->s_ent->nargs instead of tcp->u_nargs.
(printargs_lu): Likewise.
(printargs_ld): Likewise.
(get_scno): [MIPS,ALPHA] Use cheaper SCNO_IN_RANGE() check.
If !SCNO_IS_VALID, set tcp->s_ent and tcp->qual_flg to default values.
(internal_fork): Use tcp->s_ent instead of sysent[tcp->scno].
(syscall_fixup_for_fork_exec): Remove SCNO_IS_VALID check.
Use tcp->s_ent instead of sysent[tcp->scno].
(get_syscall_args): Likewise.
(get_error): Drop SCNO_IS_VALID check where it is redundant.
(dumpio): Drop SCNO_IS_VALID check where it is redundant.
Use tcp->s_ent instead of sysent[tcp->scno].
(trace_syscall_entering): Use (tcp->qual_flg & UNDEFINED_SCNO) instead
of SCNO_IS_VALID check. Use tcp->s_ent instead of sysent[tcp->scno].
Drop SCNO_IS_VALID check where it is redundant.
Print undefined syscall name with undefined_scno_name(tcp).
(trace_syscall_exiting): Likewise.
* util.c (setbpt): Use tcp->s_ent instead of sysent[tcp->scno].
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
* defs.h: Declare new function printsiginfo_at(tcp, addr).
* process.c (sys_waitid): Use printsiginfo_at().
(sys_ptrace): Likewise.
* signal.c: (printsiginfo_at): Implement this new function.
(sys_rt_sigsuspend): Use printsiginfo_at().
(sys_rt_sigtimedwait): Likewise.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
* process.c: Add start_code and start_data members of struct user
in struct_user_offsets[], where appropriate.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
The maze of ifdefs/ifndefs was scaring new contributors.
Format it so that every arch has its own ifdef block.
* process.c: Deobfuscate definitions of struct user offsets.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
tilegx support has been in the kernel since 3.0.
In addition, fix some issues with the tilepro support already
present in strace, primarily the decision to use the
<asm/unistd.h> numbering space for system calls.
* defs.h [TILE]: Include <asm/ptrace.h> and provide an extern
struct pt_regs tile_regs for efficiency. Provide compat 32-bit
personality via SUPPORTED_PERSONALITIES, PERSONALITY0_WORDSIZE,
PERSONALITY1_WORDSIZE, and DEFAULT_PERSONALITY.
* linux/tile/errnoent1.h: New file, includes linux/errnoent.h.
* linux/tile/ioctlent1.h: New file, includes linux/ioctlent.h.
* linux/tile/signalent1.h: New file, includes linux/signalent.h.
* linux/tile/syscallent.h: Update with new asm-generic syscalls.
The version previously committed was the from the first tile patch
to LKML, which subsequently was changed to use <asm-generic/unistd.h>.
* linux/tile/syscallent1.h: Copy from linux/tile/syscallent.h.
* mem.c (addtileflags) [TILE]: use %ld properly for a "long" variable.
* process.c [TILE]: Choose clone arguments correctly and properly
suppress all "struct user" related offsets in user_struct_offsets.
* signal.c [TILE]: Use tile_regs not upeek.
* syscall.c (update_personality) [TILE]: Print mode.
(PT_FLAGS_COMPAT) [TILE]: Provide if not in system headers.
(tile_regs) [TILE]: Define 'struct pt_regs' variable to hold state.
(get_regs) [TILE]: use PTRACE_GETREGS to set tile_regs rather than using upeek.
(get_scno) [TILE]: Set personality.
(get_syscall_args) [TILE]: Use tile_regs.
(get_syscall_result) [TILE]: Update tile_regs.
(get_error) [TILE]: Use tile_regs.
(printcall) [TILE]: Print pc.
(arg0_offset, arg1_offset, restore_arg0, restore_arg1) [TILE]:
Properly handle tile call semantics and support tilegx.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
AArch64 has been included in linux from 3.7 onwards.
Add support for AArch64 in strace, tested on linux in a simulator.
* configure.ac: Support AArch64.
* defs.h [AARCH64]: Include <sys/ptrace.h>, define TCB_WAITEXECVE.
* ipc.c (indirect_ipccall): Support AArch64.
* process.c (struct_user_offsets): Likewise.
* syscall.c [AARCH64]: Include <asm/ptrace.h>, <sys/uio.h>, and
<elf.h>. Define struct user_pt_regs regs.
(get_scno, get_syscall_result): Support AArch64 using PTRACE_GETREGSET.
(get_syscall_args, get_error): Support AArch64.
* linux/aarch64/ioctlent.h.in: New file.
* linux/aarch64/syscallent.h: New file, based on linux 3.7 version of
asm-generic/unistd.h.
Signed-off-by: Steve McIntyre <steve.mcintyre@linaro.org>
X32 support is added to Linux kernel 3.4. In a nutshell, x32 is x86-64 with
32bit pointers. At system call level, x32 is also identical to x86-64,
as shown by many changes like "defined(X86_64) || defined(X32)". The
main differerence bewteen x32 and x86-64 is off_t in x32 is long long
instead of long.
This patch adds x32 support to strace. Tested on Linux/x32.
* configure.ac: Support X32.
* defs.h: Set SUPPORTED_PERSONALITIES to 3 for X86_64,
Set PERSONALITY2_WORDSIZE to 4 for X86_64.
Add tcb::ext_arg for X32.
* file.c (stat): New for X32.
(sys_lseek): Use 64-bit version for X32.
(printstat64): Check current_personality != 1 for X86_64.
* ipc.c (indirect_ipccall): Check current_personality == 1
for X86_64.
* mem.c (sys_mmap64): Also use tcp->u_arg for X32. Print NULL
for zero address. Call printllval for offset for X32.
* pathtrace.c (pathtrace_match): Don't check sys_old_mmap for
X32.
* process.c (ARG_FLAGS): Defined for X32.
(ARG_STACK): Likewise.
(ARG_PTID): Likewise.
(change_syscall): Handle X32.
(struct_user_offsets): Support X32.
(sys_arch_prctl): Likewise.
* signal.c: Include <asm/sigcontext.h> for X32.
(SA_RESTORER): Also define for X32.
* syscall.c (update_personality): Support X32 for X86_64.
(is_restart_error): Likewise.
(syscall_fixup_on_sysenter): Likewise.
(get_syscall_args): Likewise.
(get_syscall_result): Likewise.
(get_error): Likewise.
(__X32_SYSCALL_BIT): Define if not defined.
(__X32_SYSCALL_MASK): Likewise.
(get_scno): Check DS register value for X32. Use
__X32_SYSCALL_MASK on X32 system calls.
* util.c (printllval): Use ext_arg for X32.
(printcall): Support X32.
(change_syscall): Likewise.
(arg0_offset): Likewise.
(arg1_offset): Likewise.
* Makefile.am (EXTRA_DIST): Add linux/x32/errnoent.h,
linux/x32/ioctlent.h.in, linux/x32/signalent.h,
linux/x32/syscallent.h, linux/x86_64/errnoent2.h,
linux/x86_64/ioctlent2.h, linux/x86_64/signalent2.h and
linux/x86_64/syscallent2.h.
* linux/x32/errnoent.h: New.
* linux/x32/ioctlent.h.in: Likewise.
* linux/x32/signalent.h: Likewise.
* linux/x32/syscallent.h: Likewise.
* linux/x86_64/errnoent2.h: Likewise.
* linux/x86_64/ioctlent2.h: Likewise.
* linux/x86_64/signalent2.h: Likewise.
* linux/x86_64/syscallent2.h: Likewise.
Signed-off-by: H.J. Lu <hongjiu.lu@intel.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
text data bss dec hex filename
237917 672 18980 257569 3ee21 strace
237845 672 18980 257497 3edd9 strace_new
* defs.h: Remove declarations of internal_fork and internal_exec.
* process.c: Remove definitions of internal_fork and internal_exec.
* syscall.c: Move them here.
(internal_syscall): Return void instead of int. We were always
returning zero, and callers weren't checking it anyway.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
The files not mentioned in changelog below had only
copyright notices fixes and indentation fixes.
* defs.h: Include <stdint.h> and <inttypes.h>.
* file.c: Do not include <inttypes.h>.
Move struct kernel_dirent declaration below top include block.
* block.c: Do not include <stdint.h> and <inttypes.h>.
* quota.c: Likewise.
* desc.c: Likewise.
* signal.c: Likewise.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
* defs.h: Include <signal.h> unconditionally.
Other files were doing it unconditionally, so no harm done.
* bjm.c: Remove system includes which are already included by defs.h.
* pathtrace.c: Likewise.
* process.c: Likewise.
* signal.c: Likewise.
* strace.c: Likewise.
* stream.c: Likewise.
* syscall.c: Likewise.
* system.c: Likewise.
* util.c: Likewise.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Our logic which was deciding whether to print "<unfinished ...>"
thingy wasn't working properly for -ff case.
* defs.h: Group log generation-related declarations together.
Add a large comment which explains how it works.
Add declaration of line_ended() function.
* strace.c (line_ended): New function which sets up internal data
to indicate that previous line was finished.
(printleader): Change logic to fix log generation in -ff mode.
(newoutf): Make check for -ff mode consistent with other places.
(droptcb): Print "<detached ...>" if last line for this tcp wasn't finished.
(cleanup): Remove code to print "<unfinished ...>", printleader()
or detach() will do it instead.
(trace): Remove code to print "<unfinished ...>".
Add code which finishes threaded execve's incomplete line
with " <pid changed to PID ...>" message. Replace printing_tcp = NULL
followed by fflush() by line_ended() call.
* process.c (sys_exit): Call line_ended() to indicate that we finished priting.
* syscall.c (trace_syscall_exiting): Set printing_tcp to current tcp.
Call line_ended() to indicate that we finished priting.
Remove call to fflush(), it is done by line_ended() now.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
text data bss dec hex filename
237384 672 19044 257100 3ec4c strace.before
236448 672 19044 256164 3e8a4 strace
* defs.h: Declare new functions printargs_lu(), printargs_ld()
which simply print syscall all args as unsigned or signed longs.
* desc.c (sys_epoll_create): Call printargs_ld() instead of open-coding it.
* linux/syscall.h: Remove declarations of the following functions:
sys_alarm, sys_getresgid, sys_getsid, sys_nice, sys_setgid, sys_setpgid,
sys_setpgrp, sys_setregid, sys_setresgid.
* process.c (sys_setgid): Delete this function: now aliased to sys_setuid().
(sys_getresgid): Delete this function: now aliased to sys_getresuid().
(sys_setregid): Delete this function: now aliased to sys_setreuid().
(sys_setresgid): Delete this function: now aliased to sys_setresuid().
(sys_setpgrp): Delete this function: now aliased to printargs_lu().
(sys_getsid): Likewise.
(sys_setpgid): Likewise.
(sys_alarm): Likewise.
(sys_getpgrp): Delete this function: was unused - was already shadowed
by a define in linux/dummy.h.
(sys_setsid): Likewise.
(sys_getpgid): Likewise.
* resource.c (sys_nice): Delete this function: now aliased to printargs_ld().
* linux/dummy.h: Define new aliases (see above for the list).
* syscall.c (printargs_lu): New function.
(printargs_ld): New function.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Conditions such as defined(LINUX) are always true now,
defined(FREEBSD) etc are always false.
When if directive has them as subexpressions, it can be simplified.
Another trivial changes here are fixes for directive indentation.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
This change is generated by running every source through the following command:
unifdef -DLINUX -Dlinux -USUNOS4 -USVR4 -UUNIXWARE -UFREEBSD
-USUNOS4_KERNEL_ARCH_KLUDGE -UHAVE_MP_PROCFS
-UHAVE_POLLABLE_PROCFS -UHAVE_PR_SYSCALL -UUSE_PROCFS file.c
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
All new code is predicated on "ifdef USE_SEIZE". If it is not defined,
behavior is not changed.
If USE_SEIZE is enabled and run-time check shows that PTRACE_SEIZE works, then:
- All attaching is done with PTRACE_SEIZE + PTRACE_INTERRUPT.
This means that we no longer generate (and possibly race with) SIGSTOP.
- PTRACE_EVENT_STOP will be generated if tracee is group-stopped.
When we detect it, we issue PTRACE_LISTEN instead of PTRACE_SYSCALL.
This leaves tracee stopped. This fixes the inability to SIGSTOP or ^Z
a straced process.
* defs.h: Add commented-out "define USE_SEIZE 1" and define PTRACE_SEIZE
and related constants.
* strace.c: New variable post_attach_sigstop shows whether we age going
to expect SIGSTOP on attach (IOW: are we going to use PTRACE_SEIZE).
(ptrace_attach_or_seize): New function. Uses PTRACE_ATTACH or
PTRACE_SEIZE + PTRACE_INTERRUPT to attach to given pid.
(startup_attach): Use ptrace_attach_or_seize() instead of ptrace(PTRACE_ATTACH).
(startup_child): Conditionally use alternative attach method using PTRACE_SEIZE.
(test_ptrace_setoptions_followfork): More robust parameters to PTRACE_TRACEME.
(test_ptrace_seize): New function to test whether PTRACE_SEIZE works.
(main): Call test_ptrace_seize() while initializing.
(trace): If PTRACE_EVENT_STOP is seen, restart using PTRACE_LISTEN in order
to not let tracee run.
* process.c: Decode PTRACE_SEIZE, PTRACE_INTERRUPT, PTRACE_LISTEN.
* util.c (ptrace_restart): Add "LISTEN" to a possible error message.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Currently, we use PTRACE_PEEKDATA to read things like filenames and
data passed by I/O syscalls.
PTRACE_PEEKDATA gets one word per syscall. This is VERY expensive.
For example, in order to print fstat syscall, we need to perform
more than twenty trips into kernel to fetch one struct stat!
Kernel 3.2 got a new syscall, process_vm_readv(), which can be used to
copy data blocks out of process' address space.
This change uses it in umoven() and umovestr() functions if possible,
with fallback to old method if process_vm_readv() fails.
If it returns ENOSYS, we don't try to use it anymore, eliminating
overhead of trying it on older kernels.
Result of "time strace -oLOG ls -l /usr/lib >/dev/null":
before patch: 0.372s
After patch: 0.262s
* util.c (process_vm_readv): Wrapper to call process_vm_readv syscall.
(umoven): Use process_vm_readv for block reads of tracee memory.
(umovestr): Likewise.
* linux/syscall.h: Declare new function sys_process_vm_readv.
* process.c (sys_process_vm_readv): Decoder for new syscall.
* linux/i386/syscallent.h: Add process_vm_readv, process_vm_writev syscalls.
* linux/x86_64/syscallent.h: Likewise.
* linux/powerpc/syscallent.h: Likewise.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
* defs.h: Rename tcp_last to printing_tcp. Explain what it means.
Remove printtrailer() function.
* process.c (sys_exit): Convert printtrailer() call to "printing_tcp = NULL".
* strace.c: Add new variable printing_tcp.
(cleanup): Convert printtrailer() call to "printing_tcp = NULL".
(trace): Likewise.
(trace): Fix checks for incomplete line - it was working wrongly if last syscall was exit.
(printleader): Set printing_tcp.
(printtrailer): Remove this function.
* syscall.c: Remove tcp_last variable.
(trace_syscall_entering): Don't set printing_tcp, printleader call now does it.
(trace_syscall_exiting): Convert printtrailer() call to "printing_tcp = NULL".
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
We set ptrace options when we see post-attach SIGSTOP.
This is wrong: it's better to set them right away on the very first
stop (whichever it will be). It also will make adding SEIZE support easier,
since SEIZE has no post-attach SIGSTOP.
We do it by adding a new bit, TCB_IGNORE_ONE_SIGSTOP, and treating
TCB_STARTUP and TCB_IGNORE_ONE_SIGSTOP as two slightly different things.
* defs.h: Add a new flag bit, TCB_IGNORE_ONE_SIGSTOP.
* process.c (internal_fork): Set TCB_IGNORE_ONE_SIGSTOP on a newly added child.
* strace.c (startup_attach): Set TCB_IGNORE_ONE_SIGSTOP after attach.
Fix a case when "strace -p PID" found PID dead but sone other of its threads
still alive.
(startup_child): Set TCB_IGNORE_ONE_SIGSTOP after attach, _if needed_.
This fixes a bogus case where we can ignore a _real_ SIGSTOP on NOMMU.
(detach): Perform anti-SIGSTOP dance only if TCB_IGNORE_ONE_SIGSTOP is set,
not if TCB_STARTUP is set.
(trace): Set TCB_IGNORE_ONE_SIGSTOP after attach.
Clear TCB_STARTUP and initialize tracee on the very first tracee stop.
Clear TCB_IGNORE_ONE_SIGSTOP when SIGSTOP is seen.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
This fixes logic in detach() which thinks that TCB_STARTUP
means that we are already attached, but did not see SIGSTOP yet.
This also allows to get rid of TCB_ATTACH_DONE flag.
* process.c (internal_fork): Set TCB_STARTUP after attach.
* strace.c (startup_attach): Likewise.
(startup_child): Likewise.
(alloc_tcb): Do not set TCB_STARTUP on tcb allocation - we are
not attached yet.
(trace): Set TCB_STARTUP when we detech an auto-attached child.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
tabto is used in many lines of strace output.
On glibc, tprintf("%*s", col - curcol, "") is noticeably slow
compared to tprintf(" "). Use the latter.
Observed ~15% reduction of time spent in userspace.
* defs.h: Drop extern declaration of acolumn. Make tabto()
take no parameters.
* process.c (sys_exit): Call tabto() with no parameters.
* syscall.c (trace_syscall_exiting): Call tabto() with no parameters.
* strace.c: Make acolumn static, add static char *acolumn_spaces.
(main): Allocate acolumn_spaces as a string of spaces.
(printleader): Call tabto() with no parameters.
(tabto): Use simpler method to print lots of spaces.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
* syscall.c (internal_syscall): Call internal_exec only if
SUNOS4 || (LINUX && TCB_WAITEXECVE).
* process.c (internal_exec): Define this function only if
SUNOS4 || (LINUX && TCB_WAITEXECVE).
(printwaitn): Don't check wordsize if SUPPORTED_PERSONALITIES == 1.
* signal.c (sys_kill): Likewise.
* syscall.c (is_negated_errno): Likewise.
(trace_syscall_exiting): Fold a tprintf into tprintfs which follow it.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
tcp->parent is used for only two things:
(1) to send signal on detach via tgkill (need to know tgid).
Solution: use tkill, it needs only tid.
(2) to optimize out ptrace options setting for new tracees.
Not a big deal if we drop this optimization: "set options" op is fast,
doing it just one extra time once per each tracee is hardly measurable.
TCB_CLONE_THREAD is a misnomer. It used only to flag sibling we attached to
in startup_attach. This is used to prevent infinite recursive rescanning
of /proc/PID/task.
Despite the name, there is no guarantee it is set only on non-leader:
if one would run "strace -f -p THREAD_ID" and THREAD_ID is *not*
a thread leader, strace will happily attach to it and all siblings
and will think that THREAD_ID is the leader! Which is a bug, but
since we no longer detach when we think tracee is going to die,
this bug no longer matters, because we do not use the knowledge
about thread group leaders for anything. (We used it to delay
leader's exit).
IOW: after this patch strace has no need to know about threads, parents
and children, and so on. Therefore it does not track that information.
It treats all tracees as independent entities. Overall,
this simplifies code a lot.
* defs.h: Add TCB_ATTACH_DONE flag, remove TCB_CLONE_THREAD flag
and struct tcb::parent field.
* process.c (internal_fork): Don't set tcpchild->parent.
* strace.c (startup_attach): Use TCB_ATTACH_DONE flag instead of
TCB_CLONE_THREAD to avoid attach attempts on already-attached threads.
Unlike TCB_CLONE_THREAD, TCB_ATTACH_DONE bit is used only temporarily,
and only in this function. We clear it on every tcb before we return.
(detach): Use tkill instead of tgkill.
(trace): Set ptrace options on new tracees unconditionally,
not only when tcp->parent == NULL.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Since we no longer suspend waitpid'ing tracees, we have only one case when
we suspend tracee: when we pick up a new tracee created by clone/fork/vfork.
Background: on some other OSes, attach to child is done this way:
get fork's result (pid), loop ptrace(PTRACE_ATTACH) until you hook up
new process/thread. This is ugly and not safe, but what matters for us
is that it doesn't require suspending. Suspending is required
on Linux only, because on Linux attach to child is done differently.
On Linux, we use two methods of catching new tracee:
adding CLONE_THREAD bit to syscall (if needed, we change
[v]fork into clone before that), or using ptrace options.
In both cases, it may be so that new tracee appears before one which
created it returns from syscall. In this case, current code
suspends new tracee until its creator returns. Only then
strace can determine who is its parent (it needs child's pid for this,
which is visible in parent's [v]fork/clone result).
This is inherently racy. For example, what if SIGKILL kills
creator after it succeeded creating child, but before it returns?
Looks like we will have child suspended forever.
But after previous commit, we DO NOT NEED parent<->child link for anything.
Therefore we do not need suspending too. Bingo!
This patch removes suspending code. Now new tracees will be continued
right away. Next patch will remove tcp->parent member.
* defs.h: Remove TCB_SUSPENDED constant
* process.c (handle_new_child): Delete this function.
(internal_fork): Do not call handle_new_child on syscall exit.
* strace.c (handle_ptrace_event): Delete this function.
(trace): Do not suspend new child; remove all handling
of now impossible TCB_SUSPENDED condition.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Current code plays some ungodly tricks, trying to not detach
thread group leader until all threads exit.
Also, it detaches from a tracee when signal delivery is detected
which will cause tracee to exit.
This operation is racy (not to mention the determination
whether signal is set to SIG_DFL is a horrible hack):
after we determined that this signal is indeed fatal
but before we detach and let process die,
*other thread* may set a handler to this signal, and
we will leak the process, falsely displaying it as killed!
I need to look in the past to figure out why we even do it.
First guess is that it's a workaround for old kernel bugs:
kernel used to deliver exit notifications to the tracer,
not to real parent. These workarounds are ancient
(internal_exit is from 1995).
The patch deletes the hacks. We no longer need tcp->nclone_threads,
TCB_EXITING and TCB_GROUP_EXITING. We also lose a few rather
ugly functions.
I also added a new message: "+++ exited with EXITCODE +++"
which shows exact moment strace got exit notification.
It is analogous to existing "+++ killed by SIG +++" message.
* defs.h: Delete struct tcb::nclone_threads field,
TCB_EXITING and TCB_GROUP_EXITING constants,
declarations of sigishandled() and internal_exit().
* process.c (internal_exit): Delete this function.
(handle_new_child): Don't ++tcp->nclone_threads.
* signal.c (parse_sigset_t): Delete this function.
(sigishandled): Delete this function.
* strace.c (startup_attach): Don't tcbtab[tcbi]->nclone_threads++.
(droptcb): Don't delay dropping if tcp->nclone_threads > 0,
don't drop parent if its nclone_threads reached 0:
just drop (only) this tcb unconditionally.
(detach): don't drop parent.
(handle_group_exit): Delete this function.
(handle_ptrace_event): Instead of handle_group_exit, just drop tcb;
do not panic if we see WIFEXITED from an attached pid;
print "+++ exited with EXITCODE +++" for every WIFEXITED pid.
* syscall.c (internal_syscall): Do not treat sys_exit specially -
don't call internal_exit on it.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>