David Drysdale 51f39a1f0c syscalls: implement execveat() system call
This patchset adds execveat(2) for x86, and is derived from Meredydd
Luff's patch from Sept 2012 (https://lkml.org/lkml/2012/9/11/528).

The primary aim of adding an execveat syscall is to allow an
implementation of fexecve(3) that does not rely on the /proc filesystem,
at least for executables (rather than scripts).  The current glibc version
of fexecve(3) is implemented via /proc, which causes problems in sandboxed
or otherwise restricted environments.

Given the desire for a /proc-free fexecve() implementation, HPA suggested
(https://lkml.org/lkml/2006/7/11/556) that an execveat(2) syscall would be
an appropriate generalization.

Also, having a new syscall means that it can take a flags argument without
back-compatibility concerns.  The current implementation just defines the
AT_EMPTY_PATH and AT_SYMLINK_NOFOLLOW flags, but other flags could be
added in future -- for example, flags for new namespaces (as suggested at
https://lkml.org/lkml/2006/7/11/474).

Related history:
 - https://lkml.org/lkml/2006/12/27/123 is an example of someone
   realizing that fexecve() is likely to fail in a chroot environment.
 - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=514043 covered
   documenting the /proc requirement of fexecve(3) in its manpage, to
   "prevent other people from wasting their time".
 - https://bugzilla.redhat.com/show_bug.cgi?id=241609 described a
   problem where a process that did setuid() could not fexecve()
   because it no longer had access to /proc/self/fd; this has since
   been fixed.

This patch (of 4):

Add a new execveat(2) system call.  execveat() is to execve() as openat()
is to open(): it takes a file descriptor that refers to a directory, and
resolves the filename relative to that.

In addition, if the filename is empty and AT_EMPTY_PATH is specified,
execveat() executes the file to which the file descriptor refers.  This
replicates the functionality of fexecve(), which is a system call in other
UNIXen, but in Linux glibc it depends on opening "/proc/self/fd/<fd>" (and
so relies on /proc being mounted).

The filename fed to the executed program as argv[0] (or the name of the
script fed to a script interpreter) will be of the form "/dev/fd/<fd>"
(for an empty filename) or "/dev/fd/<fd>/<filename>", effectively
reflecting how the executable was found.  This does however mean that
execution of a script in a /proc-less environment won't work; also, script
execution via an O_CLOEXEC file descriptor fails (as the file will not be
accessible after exec).

Based on patches by Meredydd Luff.

Signed-off-by: David Drysdale <drysdale@google.com>
Cc: Meredydd Luff <meredydd@senatehouse.org>
Cc: Shuah Khan <shuah.kh@samsung.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rich Felker <dalias@aerifal.cx>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13 12:42:51 -08:00

85 lines
1.7 KiB
C

#include <linux/init.h>
#include <linux/types.h>
#include <linux/audit.h>
#include <asm/unistd.h>
static unsigned dir_class[] = {
#include <asm-generic/audit_dir_write.h>
~0U
};
static unsigned read_class[] = {
#include <asm-generic/audit_read.h>
~0U
};
static unsigned write_class[] = {
#include <asm-generic/audit_write.h>
~0U
};
static unsigned chattr_class[] = {
#include <asm-generic/audit_change_attr.h>
~0U
};
static unsigned signal_class[] = {
#include <asm-generic/audit_signal.h>
~0U
};
int audit_classify_arch(int arch)
{
if (audit_is_compat(arch))
return 1;
else
return 0;
}
int audit_classify_syscall(int abi, unsigned syscall)
{
if (audit_is_compat(abi))
return audit_classify_compat_syscall(abi, syscall);
switch(syscall) {
#ifdef __NR_open
case __NR_open:
return 2;
#endif
#ifdef __NR_openat
case __NR_openat:
return 3;
#endif
#ifdef __NR_socketcall
case __NR_socketcall:
return 4;
#endif
#ifdef __NR_execveat
case __NR_execveat:
#endif
case __NR_execve:
return 5;
default:
return 0;
}
}
static int __init audit_classes_init(void)
{
#ifdef CONFIG_AUDIT_COMPAT_GENERIC
audit_register_class(AUDIT_CLASS_WRITE_32, compat_write_class);
audit_register_class(AUDIT_CLASS_READ_32, compat_read_class);
audit_register_class(AUDIT_CLASS_DIR_WRITE_32, compat_dir_class);
audit_register_class(AUDIT_CLASS_CHATTR_32, compat_chattr_class);
audit_register_class(AUDIT_CLASS_SIGNAL_32, compat_signal_class);
#endif
audit_register_class(AUDIT_CLASS_WRITE, write_class);
audit_register_class(AUDIT_CLASS_READ, read_class);
audit_register_class(AUDIT_CLASS_DIR_WRITE, dir_class);
audit_register_class(AUDIT_CLASS_CHATTR, chattr_class);
audit_register_class(AUDIT_CLASS_SIGNAL, signal_class);
return 0;
}
__initcall(audit_classes_init);