2005-04-17 02:20:36 +04:00
/*
 * linux/fs/compat.c
 *
 * Kernel compatibility routines for e.g. 32 bit syscall support
 * on 64 bit kernels.
 *
 * Copyright (C) 2002 Stephen Rothwell, IBM Corporation
 * Copyright (C) 1997-2000 Jakub Jelinek (jakub@redhat.com)
 * Copyright (C) 1998 Eddie C. Dost (ecd@skynet.be)
 * Copyright (C) 2001,2002 Andi Kleen, SuSE Labs
2010-07-18 16:27:13 +04:00
 * Copyright (C) 2003 Pavel Machek (pavel@ucw.cz)
2005-04-17 02:20:36 +04:00
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */
#include <linux/compat.h>
#include <linux/ncp_mount.h>
2005-04-18 21:54:51 +04:00
#include <linux/nfs4_mount.h>
2005-04-17 02:20:36 +04:00
#include <linux/syscalls.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
The percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surroundings. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers, which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 11:04:11 +03:00
#include <linux/slab.h>
2016-12-24 22:46:01 +03:00
#include <linux/uaccess.h>
2006-09-30 22:52:18 +04:00
#include "internal.h"
[PATCH] Add pselect/ppoll system call implementation
The following implementation of ppoll() and pselect() system calls
depends on the architecture providing a TIF_RESTORE_SIGMASK flag in the
thread_info.
These system calls have to change the signal mask during their
operation, and signal handlers must be invoked using the new, temporary
signal mask. The old signal mask must be restored either upon successful
exit from the system call, or upon returning from the invoked signal
handler if the system call is interrupted. We can't simply restore the
original signal mask and return to userspace, since the restored signal
mask may actually block the signal which interrupted the system call.
The TIF_RESTORE_SIGMASK flag deals with this by causing the syscall exit
path to trap into do_signal() just as TIF_SIGPENDING does, and by
causing do_signal() to use the saved signal mask instead of the current
signal mask when setting up the stack frame for the signal handler -- or
by causing do_signal() to simply restore the saved signal mask in the
case where there is no handler to be invoked.
The first patch implements the sys_pselect() and sys_ppoll() system
calls, which are present only if TIF_RESTORE_SIGMASK is defined. That
#ifdef should go away in time when all architectures have implemented
it. The second patch implements TIF_RESTORE_SIGMASK for the PowerPC
kernel (in the -mm tree), and the third patch then removes the
arch-specific implementations of sys_rt_sigsuspend() and replaces them
with generic versions using the same trick.
The fourth and fifth patches, provided by David Howells, implement
TIF_RESTORE_SIGMASK for FR-V and i386 respectively, and the sixth patch
adds the syscalls to the i386 syscall table.
This patch:
Add the pselect() and ppoll() system calls, providing core routines usable by
the original select() and poll() system calls and also the new calls (with
their semantics w.r.t timeouts).
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-19 04:44:05 +03:00
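The commit message above describes pselect() swapping in a temporary signal mask, running handlers under it, and restoring the old mask afterwards. A minimal userspace sketch of that behaviour follows. It is not part of fs/compat.c; the helper names `on_sigusr1` and `pselect_mask_demo` are hypothetical, and the sketch assumes a single-threaded POSIX process.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t got_signal;

/* Hypothetical handler: only records that it ran. */
static void on_sigusr1(int signo)
{
	(void)signo;
	got_signal = 1;
}

/*
 * Hypothetical demo: returns 1 if the pending SIGUSR1 was delivered
 * inside pselect() and the signal is blocked again once it returns.
 */
int pselect_mask_demo(void)
{
	struct sigaction sa = { 0 }, old_sa;
	sigset_t block, orig, during, after;
	struct timespec timeout = { .tv_sec = 1, .tv_nsec = 0 };
	int ret, saved_errno, ok;

	got_signal = 0;
	sa.sa_handler = on_sigusr1;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, &old_sa) != 0)
		return 0;

	/* Block SIGUSR1 so that raising it only marks it pending. */
	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	if (sigprocmask(SIG_BLOCK, &block, &orig) != 0)
		return 0;
	raise(SIGUSR1);
	assert(!got_signal);

	/* Temporary mask for the duration of the call: SIGUSR1 unblocked. */
	during = orig;
	sigdelset(&during, SIGUSR1);
	ret = pselect(0, NULL, NULL, NULL, &timeout, &during);
	saved_errno = errno;

	/* Read back the mask that is in effect after pselect() returned. */
	sigprocmask(SIG_SETMASK, NULL, &after);
	ok = ret == -1 && saved_errno == EINTR && got_signal &&
	     sigismember(&after, SIGUSR1) == 1;

	/* Undo the demo's changes to process state. */
	sigprocmask(SIG_SETMASK, &orig, NULL);
	sigaction(SIGUSR1, &old_sa, NULL);
	return ok;
}
```

Calling `pselect_mask_demo()` from a test program should return 1 on a system where pselect() applies the mask atomically, which is the property the kernel-side TIF_RESTORE_SIGMASK mechanism is meant to provide.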
2005-04-17 02:20:36 +04:00
struct compat_ncp_mount_data {
	compat_int_t version;
	compat_uint_t ncp_fd;
2005-09-07 02:16:40 +04:00
	__compat_uid_t mounted_uid;
2005-04-17 02:20:36 +04:00
	compat_pid_t wdog_pid;
	unsigned char mounted_vol[NCP_VOLNAME_LEN + 1];
	compat_uint_t time_out;
	compat_uint_t retry_count;
	compat_uint_t flags;
2005-09-07 02:16:40 +04:00
	__compat_uid_t uid;
	__compat_gid_t gid;
2005-04-17 02:20:36 +04:00
	compat_mode_t file_mode;
	compat_mode_t dir_mode;
};
struct compat_ncp_mount_data_v4 {
	compat_int_t version;
	compat_ulong_t flags;
	compat_ulong_t mounted_uid;
	compat_long_t wdog_pid;
	compat_uint_t ncp_fd;
	compat_uint_t time_out;
	compat_uint_t retry_count;
	compat_ulong_t uid;
	compat_ulong_t gid;
	compat_ulong_t file_mode;
	compat_ulong_t dir_mode;
};
static void *do_ncp_super_data_conv(void *raw_data)
{
	int version = *(unsigned int *)raw_data;
	if (version == 3) {
		struct compat_ncp_mount_data *c_n = raw_data;
		struct ncp_mount_data *n = raw_data;
		n->dir_mode = c_n->dir_mode;
		n->file_mode = c_n->file_mode;
		n->gid = c_n->gid;
		n->uid = c_n->uid;
		memmove(n->mounted_vol, c_n->mounted_vol, (sizeof(c_n->mounted_vol) + 3 * sizeof(unsigned int)));
		n->wdog_pid = c_n->wdog_pid;
		n->mounted_uid = c_n->mounted_uid;
	} else if (version == 4) {
		struct compat_ncp_mount_data_v4 *c_n = raw_data;
		struct ncp_mount_data_v4 *n = raw_data;
		n->dir_mode = c_n->dir_mode;
		n->file_mode = c_n->file_mode;
		n->gid = c_n->gid;
		n->uid = c_n->uid;
		n->retry_count = c_n->retry_count;
		n->time_out = c_n->time_out;
		n->ncp_fd = c_n->ncp_fd;
		n->wdog_pid = c_n->wdog_pid;
		n->mounted_uid = c_n->mounted_uid;
		n->flags = c_n->flags;
	} else if (version != 5) {
		return NULL;
	}
	return raw_data;
}
2005-04-18 21:54:51 +04:00
struct compat_nfs_string {
	compat_uint_t len;
2005-04-28 02:39:03 +04:00
	compat_uptr_t data;
2005-04-18 21:54:51 +04:00
};
static inline void compat_nfs_string(struct nfs_string *dst,
				     struct compat_nfs_string *src)
{
	dst->data = compat_ptr(src->data);
	dst->len = src->len;
}
struct compat_nfs4_mount_data_v1 {
	compat_int_t version;
	compat_int_t flags;
	compat_int_t rsize;
	compat_int_t wsize;
	compat_int_t timeo;
	compat_int_t retrans;
	compat_int_t acregmin;
	compat_int_t acregmax;
	compat_int_t acdirmin;
	compat_int_t acdirmax;
	struct compat_nfs_string client_addr;
	struct compat_nfs_string mnt_path;
	struct compat_nfs_string hostname;
	compat_uint_t host_addrlen;
2005-04-28 02:39:03 +04:00
	compat_uptr_t host_addr;
2005-04-18 21:54:51 +04:00
	compat_int_t proto;
	compat_int_t auth_flavourlen;
2005-04-28 02:39:03 +04:00
	compat_uptr_t auth_flavours;
2005-04-18 21:54:51 +04:00
};
static int do_nfs4_super_data_conv(void *raw_data)
{
	int version = *(compat_uint_t *)raw_data;
	if (version == 1) {
		struct compat_nfs4_mount_data_v1 *raw = raw_data;
		struct nfs4_mount_data *real = raw_data;
		/* copy the fields backwards */
		real->auth_flavours = compat_ptr(raw->auth_flavours);
		real->auth_flavourlen = raw->auth_flavourlen;
		real->proto = raw->proto;
		real->host_addr = compat_ptr(raw->host_addr);
		real->host_addrlen = raw->host_addrlen;
		compat_nfs_string(&real->hostname, &raw->hostname);
		compat_nfs_string(&real->mnt_path, &raw->mnt_path);
		compat_nfs_string(&real->client_addr, &raw->client_addr);
		real->acdirmax = raw->acdirmax;
		real->acdirmin = raw->acdirmin;
		real->acregmax = raw->acregmax;
		real->acregmin = raw->acregmin;
		real->retrans = raw->retrans;
		real->timeo = raw->timeo;
		real->wsize = raw->wsize;
		real->rsize = raw->rsize;
		real->flags = raw->flags;
		real->version = raw->version;
	}
	return 0;
}
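The "copy the fields backwards" comment in do_nfs4_super_data_conv() matters because the 32-bit and native layouts share one buffer: each native field is at least as wide as its compat counterpart, so writing from the last field to the first avoids overwriting compat fields that have not been read yet. A minimal userspace sketch of the same idea, using a plain array instead of the kernel structures (the function name `widen_in_place` is hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: the buffer starts out holding `count` packed
 * 32-bit values and must be large enough for `count` 64-bit values.
 * Converting from the last element to the first means the 8-byte
 * destination of element i only covers narrow slots that were
 * already consumed (or slot i itself, which is read before the write).
 */
void widen_in_place(void *buf, size_t count)
{
	unsigned char *bytes = buf;
	size_t i;

	assert(buf != NULL);
	for (i = count; i-- > 0; ) {
		uint32_t narrow;
		uint64_t wide;

		/* memcpy avoids alignment and aliasing assumptions. */
		memcpy(&narrow, bytes + i * sizeof(narrow), sizeof(narrow));
		wide = narrow;
		memcpy(bytes + i * sizeof(wide), &wide, sizeof(wide));
	}
}
```

Iterating forwards instead would clobber element 1 while writing the widened element 0, which is the failure mode the backwards field order in the kernel function appears to guard against.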
2005-04-17 02:20:36 +04:00
#define NCPFS_NAME "ncpfs"
2005-04-18 21:54:51 +04:00
#define NFS4_NAME "nfs4"
2005-04-17 02:20:36 +04:00
2014-03-04 19:07:52 +04:00
COMPAT_SYSCALL_DEFINE5(mount, const char __user *, dev_name,
		       const char __user *, dir_name,
		       const char __user *, type, compat_ulong_t, flags,
		       const void __user *, data)
2005-04-17 02:20:36 +04:00
{
fs: fix overflow in sys_mount() for in-kernel calls
sys_mount() reads/copies a whole page for its "type" parameter. When
do_mount_root() passes a kernel address that points to an object which is
smaller than a whole page, copy_mount_options() will happily go past this
memory object, possibly dereferencing "wild" pointers that could be in any
state (hence the kmemcheck warning, which shows that parts of the next
page are not even allocated).
(The likelihood of something going wrong here is pretty low -- first of
all this only applies to kernel calls to sys_mount(), which are mostly
found in the boot code. Secondly, I guess if the page was not mapped,
exact_copy_from_user() _would_ in fact handle it correctly because of its
access_ok(), etc. checks.)
But it is much nicer to avoid the dubious reads altogether, by stopping as
soon as we find a NUL byte. Is there a good reason why we can't do
something like this, using the already existing strndup_from_user()?
[akpm@linux-foundation.org: make copy_mount_string() static]
[AV: fix compat mount breakage, which involves undoing akpm's change above]
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: al <al@dizzy.pdmi.ras.ru>
2009-09-19 00:05:45 +04:00
	char *kernel_type;
2015-12-15 02:44:44 +03:00
	void *options;
2009-09-19 00:05:45 +04:00
	char *kernel_dev;
2005-04-17 02:20:36 +04:00
	int retval;
2014-08-28 21:26:03 +04:00
	kernel_type = copy_mount_string(type);
	retval = PTR_ERR(kernel_type);
	if (IS_ERR(kernel_type))
2005-04-17 02:20:36 +04:00
		goto out;
2014-08-28 21:26:03 +04:00
	kernel_dev = copy_mount_string(dev_name);
	retval = PTR_ERR(kernel_dev);
	if (IS_ERR(kernel_dev))
2014-09-14 17:15:10 +04:00
		goto out1;
2005-04-17 02:20:36 +04:00
2015-12-15 02:44:44 +03:00
	options = copy_mount_options(data);
	retval = PTR_ERR(options);
	if (IS_ERR(options))
2014-09-14 17:15:10 +04:00
		goto out2;
2005-04-17 02:20:36 +04:00
2015-12-15 02:44:44 +03:00
	if (kernel_type && options) {
2010-10-05 00:55:57 +04:00
		if (!strcmp(kernel_type, NCPFS_NAME)) {
2015-12-15 02:44:44 +03:00
			do_ncp_super_data_conv(options);
2009-09-19 00:05:45 +04:00
		} else if (!strcmp(kernel_type, NFS4_NAME)) {
2015-12-15 02:44:44 +03:00
			retval = -EINVAL;
			if (do_nfs4_super_data_conv(options))
2014-09-14 17:15:10 +04:00
				goto out3;
2005-04-17 02:20:36 +04:00
		}
	}
2015-12-15 02:44:44 +03:00
	retval = do_mount(kernel_dev, dir_name, kernel_type, flags, options);
2005-04-17 02:20:36 +04:00
out3:
2015-12-15 02:44:44 +03:00
	kfree(options);
2005-04-17 02:20:36 +04:00
out2:
2014-09-14 17:15:10 +04:00
	kfree(kernel_dev);
2005-04-17 02:20:36 +04:00
out1:
2009-09-19 00:05:45 +04:00
	kfree(kernel_type);
2005-04-17 02:20:36 +04:00
out:
	return retval;
}
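The "fs: fix overflow in sys_mount()" commit message quoted in the mount handler argues for stopping at the first NUL byte instead of copying a whole page. A minimal userspace sketch of that bounded-copy idea follows; it only mirrors the approach the message attributes to strndup_from_user(), is not the kernel implementation, and the name `dup_bounded_string` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: duplicate a string whose terminating NUL must
 * appear within the first `max_len` bytes. Only the string and its
 * terminator are copied, never a fixed-size block past the end.
 * Returns NULL with errno set to EINVAL if no NUL is found in range.
 */
char *dup_bounded_string(const char *src, size_t max_len)
{
	size_t len;
	char *copy;

	assert(src != NULL);
	/* strnlen() stops scanning at the first NUL or at max_len. */
	len = strnlen(src, max_len);
	if (len == max_len) {
		errno = EINVAL;
		return NULL;
	}

	copy = malloc(len + 1);
	if (!copy)
		return NULL;
	memcpy(copy, src, len + 1);	/* includes the terminating NUL */
	return copy;
}
```

For example, `dup_bounded_string("nfs4", 16)` yields a five-byte heap copy that the caller frees, while `dup_bounded_string("ncpfs", 3)` fails with EINVAL because no terminator is found within the limit.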