2019-05-27 09:55:05 +03:00
// SPDX-License-Identifier: GPL-2.0-or-later
2006-06-28 15:26:45 +04:00
/*
2005-04-17 02:20:36 +04:00
Copyright (C) 2002 Richard Henderson
2010-08-05 22:59:13 +04:00
Copyright (C) 2001 Rusty Russell, 2002, 2010 Rusty Russell IBM.
2005-04-17 02:20:36 +04:00
*/
2020-04-19 18:55:06 +03:00
# define INCLUDE_VERMAGIC
2011-05-23 22:51:41 +04:00
# include <linux/export.h>
2016-07-23 21:01:45 +03:00
# include <linux/extable.h>
2005-04-17 02:20:36 +04:00
# include <linux/moduleloader.h>
2019-07-04 21:57:34 +03:00
# include <linux/module_signature.h>
2015-04-29 21:36:05 +03:00
# include <linux/trace_events.h>
2005-04-17 02:20:36 +04:00
# include <linux/init.h>
2007-05-08 11:28:38 +04:00
# include <linux/kallsyms.h>
module: add syscall to load module from fd
As part of the effort to create a stronger boundary between root and
kernel, Chrome OS wants to be able to enforce that kernel modules are
being loaded only from our read-only crypto-hash verified (dm_verity)
root filesystem. Since the init_module syscall hands the kernel a module
as a memory blob, no reasoning about the origin of the blob can be made.
Earlier proposals for appending signatures to kernel modules would not be
useful in Chrome OS, since it would involve adding an additional set of
keys to our kernel and builds for no good reason: we already trust the
contents of our root filesystem. We don't need to verify those kernel
modules a second time. Having to do signature checking on module loading
would slow us down and be redundant. All we need to know is where a
module is coming from so we can say yes/no to loading it.
If a file descriptor is used as the source of a kernel module, many more
things can be reasoned about. In Chrome OS's case, we could enforce that
the module lives on the filesystem we expect it to live on. In the case
of IMA (or other LSMs), it would be possible, for example, to examine
extended attributes that may contain signatures over the contents of
the module.
This introduces a new syscall (on x86), similar to init_module, that has
only two arguments. The first argument is used as a file descriptor to
the module and the second argument is a pointer to the NULL terminated
string of module arguments.
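A minimal userspace sketch of how such a descriptor-based load might be invoked follows. The helper name load_module_from_path() is hypothetical. Note that the text above describes two arguments, while the finit_module() syscall in mainline takes a third flags argument; the sketch passes 0 for it. The call needs CAP_SYS_MODULE to succeed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a module file and ask the kernel to load it
 * from the descriptor. Returns 0 on success, -1 with errno set on failure.
 */
static int load_module_from_path(const char *path, const char *params)
{
	int fd, ret, saved;

	assert(path != NULL && params != NULL);

	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	/* Mainline finit_module() takes (fd, params, flags); 0 means no flags. */
	ret = (int)syscall(SYS_finit_module, fd, params, 0);

	/* Preserve the syscall's errno across close(). */
	saved = errno;
	close(fd);
	errno = saved;
	return ret;
}
```

Because the kernel receives a descriptor rather than a memory blob, it (or an LSM) can inspect which filesystem the file lives on before deciding whether to load it.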
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (merge fixes)
2012-10-16 01:01:07 +04:00
# include <linux/file.h>
2008-10-06 13:19:27 +04:00
# include <linux/fs.h>
2007-10-17 10:26:40 +04:00
# include <linux/sysfs.h>
2005-09-13 12:25:16 +04:00
# include <linux/kernel.h>
2005-04-17 02:20:36 +04:00
# include <linux/slab.h>
# include <linux/vmalloc.h>
# include <linux/elf.h>
2008-10-06 13:19:27 +04:00
# include <linux/proc_fs.h>
2012-10-16 01:02:07 +04:00
# include <linux/security.h>
2005-04-17 02:20:36 +04:00
# include <linux/seq_file.h>
# include <linux/syscalls.h>
# include <linux/fcntl.h>
# include <linux/rcupdate.h>
2006-01-11 23:17:46 +03:00
# include <linux/capability.h>
2005-04-17 02:20:36 +04:00
# include <linux/cpu.h>
# include <linux/moduleparam.h>
# include <linux/errno.h>
# include <linux/err.h>
# include <linux/vermagic.h>
# include <linux/notifier.h>
2006-10-18 09:47:25 +04:00
# include <linux/sched.h>
2005-04-17 02:20:36 +04:00
# include <linux/device.h>
[PATCH] modules: add version and srcversion to sysfs
This patch adds version and srcversion files to
/sys/module/${modulename} containing the version and srcversion fields
of the module's modinfo section (if present).
/sys/module/e1000
|-- srcversion
`-- version
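As a rough illustration of how a userspace tool could consume these files, the sketch below builds the sysfs path and reads the first line of an attribute. The helper names module_attr_path() and read_module_attr() are hypothetical; only the /sys/module/${modulename}/version layout comes from the description above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "/sys/module/<module>/<attr>" into buf; -1 if it would not fit. */
static int module_attr_path(char *buf, size_t len,
			    const char *module, const char *attr)
{
	int n = snprintf(buf, len, "/sys/module/%s/%s", module, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read the first line of a module attribute into out (newline stripped).
 * Returns 0 on success, -1 if the module or attribute is not present.
 */
static int read_module_attr(const char *module, const char *attr,
			    char *out, size_t len)
{
	char path[256];
	FILE *f;

	assert(len > 0);
	if (module_attr_path(path, sizeof(path), module, attr) < 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(out, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

A caller would typically treat a -1 return as "version unknown", since the files exist only when the module's modinfo section carries the field.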
This patch differs slightly from the version posted in January, as it
now uses the new kstrdup() call in -mm.
Why put this in sysfs?
a) Tools like DKMS, which deal with changing out individual kernel
modules without replacing the whole kernel, can behave smarter if they
can tell the version of a given module. The autoinstaller feature, for
example, determines whether your system has a "good" version of a
driver (i.e. whether the one provided by DKMS has a newer version than that
provided by the installed kernel package), and automatically compiles
and installs a newer version if DKMS has it but your kernel doesn't yet
have that version.
b) Because sysadmins manually, or with tools like DKMS, can switch out
modules on the file system, you can't count on 'modinfo foo.ko', which
looks at /lib/modules/${kernelver}/... actually matching what is loaded
into the kernel already. Hence asking sysfs for this.
c) as the unbind-driver-from-device work takes shape, it will be
possible to rebind a driver that's built-in (no .ko to modinfo for the
version) to a newly loaded module. sysfs will have the
currently-built-in version info, for comparison.
d) tech support scripts can then easily grab the version info for what's
running presently - a question I get often.
There has been renewed interest in this patch on linux-scsi by driver
authors.
As the idea originated from GregKH, I leave his Signed-off-by: intact,
though the implementation is nearly completely new. Compiled and run on
x86 and x86_64.
From: Matthew Dobson <colpatch@us.ibm.com>
build fix
From: Thierry Vignaud <tvignaud@mandriva.com>
build fix
From: Matthew Dobson <colpatch@us.ibm.com>
warning fix
Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 09:05:15 +04:00
# include <linux/string.h>
2006-03-23 14:00:24 +03:00
# include <linux/mutex.h>
2008-08-30 12:09:00 +04:00
# include <linux/rculist.h>
2016-12-24 22:46:01 +03:00
# include <linux/uaccess.h>
2005-04-17 02:20:36 +04:00
# include <asm/cacheflush.h>
2017-07-07 01:35:58 +03:00
# include <linux/set_memory.h>
2009-09-22 04:03:57 +04:00
# include <asm/mmu_context.h>
2006-06-09 23:53:55 +04:00
# include <linux/license.h>
2008-02-08 15:18:42 +03:00
# include <asm/sections.h>
tracing: Kernel Tracepoints
Implementation of kernel tracepoints. Inspired from the Linux Kernel
Markers. Allows complete typing verification by declaring both tracing
statement inline functions and probe registration/unregistration static
inline functions within the same macro "DEFINE_TRACE". No format string
is required. See the tracepoint Documentation and Samples patches for
usage examples.
Taken from the documentation patch :
"A tracepoint placed in code provides a hook to call a function (probe)
that you can provide at runtime. A tracepoint can be "on" (a probe is
connected to it) or "off" (no probe is attached). When a tracepoint is
"off" it has no effect, except for adding a tiny time penalty (checking
a condition for a branch) and space penalty (adding a few bytes for the
function call at the end of the instrumented function and adds a data
structure in a separate section). When a tracepoint is "on", the
function you provide is called each time the tracepoint is executed, in
the execution context of the caller. When the function provided ends its
execution, it returns to the caller (continuing from the tracepoint
site).
You can put tracepoints at important locations in the code. They are
lightweight hooks that can pass an arbitrary number of parameters, which
prototypes are described in a tracepoint declaration placed in a header
file."
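The on/off behaviour described in the quoted documentation can be sketched in plain userspace C. This is only an illustration of the idea, not the kernel implementation: the struct layout, MAX_PROBES, and all function names are assumptions, and the RCU synchronization between attach and call is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PROBES 4	/* arbitrary limit for the sketch */

typedef void (*probe_fn)(void *data, int value);

/* Hypothetical tracepoint: "on" when at least one probe is attached. */
struct tracepoint {
	int nr_probes;
	probe_fn probes[MAX_PROBES];
	void *data[MAX_PROBES];
};

/* Attach a probe; returns 0 on success, -1 if the table is full. */
static int tracepoint_attach(struct tracepoint *tp, probe_fn fn, void *data)
{
	assert(fn != NULL);
	if (tp->nr_probes == MAX_PROBES)
		return -1;
	tp->probes[tp->nr_probes] = fn;
	tp->data[tp->nr_probes] = data;
	tp->nr_probes++;
	return 0;
}

/* Instrumentation site: a single test and branch when the tracepoint is off. */
static inline void trace_event(struct tracepoint *tp, int value)
{
	int i;

	if (tp->nr_probes == 0)		/* cheap "off" path */
		return;
	for (i = 0; i < tp->nr_probes; i++)
		tp->probes[i](tp->data[i], value);
}

/* Example probe: add each traced value to an int counter passed as data. */
static void sum_probe(void *data, int value)
{
	*(int *)data += value;
}
```

The probe runs in the caller's context and control returns to the instrumented code afterwards, which matches the behaviour the quoted text describes for an "on" tracepoint.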
Addition and removal of tracepoints is synchronized by RCU using the
scheduler (and preempt_disable) as guarantees to find a quiescent state
(this is really RCU "classic"). The update side uses rcu_barrier_sched()
with call_rcu_sched() and the read/execute side uses
"preempt_disable()/preempt_enable()".
We make sure the previous array containing probes, which has been
scheduled for deletion by the rcu callback, is indeed freed before we
proceed to the next update. It therefore limits the rate of modification
of a single tracepoint to one update per RCU period. The objective here
is to permit fast batch add/removal of probes on _different_
tracepoints.
Changelog :
- Use #name ":" #proto as string to identify the tracepoint in the
tracepoint table. This makes sure no type mismatch happens due to
connection of a probe with the wrong type to a tracepoint declared with
the same name in a different header.
- Add tracepoint_entry_free_old.
- Change __TO_TRACE to get rid of the 'i' iterator.
Masami Hiramatsu <mhiramat@redhat.com> :
Tested on x86-64.
Performance impact of a tracepoint : same as markers, except that it
adds about 70 bytes of instructions in an unlikely branch of each
instrumented function (the for loop, the stack setup and the function
call). It currently adds a memory read, a test and a conditional branch
at the instrumentation site (in the hot path). Immediate values will
eventually change this into a load immediate, test and branch, which
removes the memory read which will make the i-cache impact smaller
(changing the memory read for a load immediate removes 3-4 bytes per
site on x86_32 (depending on mov prefixes), or 7-8 bytes on x86_64, it
also saves the d-cache hit).
About the performance impact of tracepoints (which is comparable to
markers), even without immediate values optimizations, tests done by
Hideo Aoki on ia64 show no regression. His test case was using hackbench
on a kernel where scheduler instrumentation (about 5 events in the
scheduler code) was added.
Quoting Hideo Aoki about Markers :
I evaluated overhead of kernel marker using linux-2.6-sched-fixes git
tree, which includes several markers for LTTng, using an ia64 server.
While the immediate trace mark feature isn't implemented on ia64, there
is no major performance regression. So, I think that we don't have any
issues to propose merging marker point patches into Linus's tree from
the viewpoint of performance impact.
I prepared two kernels to evaluate. The first one was compiled without
CONFIG_MARKERS. The second one was compiled with CONFIG_MARKERS enabled.
I downloaded the original hackbench from the following URL:
http://devresources.linux-foundation.org/craiger/hackbench/src/hackbench.c
I ran hackbench 5 times in each condition and calculated the average and
difference between the kernels.
The parameter of hackbench: every 50 from 50 to 800
The number of CPUs of the server: 2, 4, and 8
Below are the results. As you can see, no major performance regression
was found in any case. Even if the number of processes increases, the
differences between the marker-enabled kernel and the marker-disabled
kernel don't increase. Moreover, if the number of CPUs increases, the
differences don't increase either.
Curiously, the marker-enabled kernel is better than the marker-disabled
kernel in more than half of the cases, although I guess it comes from the
difference in memory access pattern.
* 2 CPUs
Number of | without | with | diff | diff |
processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] |
--------------------------------------------------------------
50 | 4.811 | 4.872 | +0.061 | +1.27 |
100 | 9.854 | 10.309 | +0.454 | +4.61 |
150 | 15.602 | 15.040 | -0.562 | -3.6 |
200 | 20.489 | 20.380 | -0.109 | -0.53 |
250 | 25.798 | 25.652 | -0.146 | -0.56 |
300 | 31.260 | 30.797 | -0.463 | -1.48 |
350 | 36.121 | 35.770 | -0.351 | -0.97 |
400 | 42.288 | 42.102 | -0.186 | -0.44 |
450 | 47.778 | 47.253 | -0.526 | -1.1 |
500 | 51.953 | 52.278 | +0.325 | +0.63 |
550 | 58.401 | 57.700 | -0.701 | -1.2 |
600 | 63.334 | 63.222 | -0.112 | -0.18 |
650 | 68.816 | 68.511 | -0.306 | -0.44 |
700 | 74.667 | 74.088 | -0.579 | -0.78 |
750 | 78.612 | 79.582 | +0.970 | +1.23 |
800 | 85.431 | 85.263 | -0.168 | -0.2 |
--------------------------------------------------------------
* 4 CPUs
Number of | without | with | diff | diff |
processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] |
--------------------------------------------------------------
50 | 2.586 | 2.584 | -0.003 | -0.1 |
100 | 5.254 | 5.283 | +0.030 | +0.56 |
150 | 8.012 | 8.074 | +0.061 | +0.76 |
200 | 11.172 | 11.000 | -0.172 | -1.54 |
250 | 13.917 | 14.036 | +0.119 | +0.86 |
300 | 16.905 | 16.543 | -0.362 | -2.14 |
350 | 19.901 | 20.036 | +0.135 | +0.68 |
400 | 22.908 | 23.094 | +0.186 | +0.81 |
450 | 26.273 | 26.101 | -0.172 | -0.66 |
500 | 29.554 | 29.092 | -0.461 | -1.56 |
550 | 32.377 | 32.274 | -0.103 | -0.32 |
600 | 35.855 | 35.322 | -0.533 | -1.49 |
650 | 39.192 | 38.388 | -0.804 | -2.05 |
700 | 41.744 | 41.719 | -0.025 | -0.06 |
750 | 45.016 | 44.496 | -0.520 | -1.16 |
800 | 48.212 | 47.603 | -0.609 | -1.26 |
--------------------------------------------------------------
* 8 CPUs
Number of | without | with | diff | diff |
processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] |
--------------------------------------------------------------
50 | 2.094 | 2.072 | -0.022 | -1.07 |
100 | 4.162 | 4.273 | +0.111 | +2.66 |
150 | 6.485 | 6.540 | +0.055 | +0.84 |
200 | 8.556 | 8.478 | -0.078 | -0.91 |
250 | 10.458 | 10.258 | -0.200 | -1.91 |
300 | 12.425 | 12.750 | +0.325 | +2.62 |
350 | 14.807 | 14.839 | +0.032 | +0.22 |
400 | 16.801 | 16.959 | +0.158 | +0.94 |
450 | 19.478 | 19.009 | -0.470 | -2.41 |
500 | 21.296 | 21.504 | +0.208 | +0.98 |
550 | 23.842 | 23.979 | +0.137 | +0.57 |
600 | 26.309 | 26.111 | -0.198 | -0.75 |
650 | 28.705 | 28.446 | -0.259 | -0.9 |
700 | 31.233 | 31.394 | +0.161 | +0.52 |
750 | 34.064 | 33.720 | -0.344 | -1.01 |
800 | 36.320 | 36.114 | -0.206 | -0.57 |
--------------------------------------------------------------
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: 'Peter Zijlstra' <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 20:16:16 +04:00
# include <linux/tracepoint.h>
2008-08-14 23:45:09 +04:00
# include <linux/ftrace.h>
2016-03-17 03:55:39 +03:00
# include <linux/livepatch.h>
2009-01-07 19:45:46 +03:00
# include <linux/async.h>
2009-02-20 10:29:08 +03:00
# include <linux/percpu.h>
2009-06-11 16:23:20 +04:00
# include <linux/kmemleak.h>
2010-09-17 19:09:00 +04:00
# include <linux/jump_label.h>
2010-11-17 00:35:16 +03:00
# include <linux/pfn.h>
2011-04-20 13:10:52 +04:00
# include <linux/bsearch.h>
2016-08-03 00:03:47 +03:00
# include <linux/dynamic_debug.h>
2017-02-04 21:10:38 +03:00
# include <linux/audit.h>
2012-10-22 11:39:41 +04:00
# include <uapi/linux/module.h>
2012-09-26 13:09:40 +04:00
# include "module-internal.h"
2005-04-17 02:20:36 +04:00
2009-08-17 12:56:28 +04:00
# define CREATE_TRACE_POINTS
# include <trace/events/module.h>
2005-04-17 02:20:36 +04:00
# ifndef ARCH_SHF_SMALL
# define ARCH_SHF_SMALL 0
# endif
2010-11-17 00:35:16 +03:00
/*
 * Modules' sections will be aligned on page boundaries
2019-08-20 17:53:10 +03:00
 * to ensure complete separation of code and data, but
 * only when CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
2010-11-17 00:35:16 +03:00
 */
2019-08-20 17:53:10 +03:00
# ifdef CONFIG_ARCH_HAS_STRICT_MODULE_RWX
2010-11-17 00:35:16 +03:00
# define debug_align(X) ALIGN(X, PAGE_SIZE)
2019-08-20 17:53:10 +03:00
# else
# define debug_align(X) (X)
# endif
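For readers unfamiliar with the rounding that debug_align() performs when strict module RWX is enabled, here is a small userspace sketch of the usual power-of-two round-up. SKETCH_PAGE_SIZE and align_up() are names made up for this illustration; the kernel's own ALIGN() and PAGE_SIZE are defined elsewhere.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size, for illustration only */

/* Round x up to the next multiple of a, where a must be a power of two. */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}
```

With page-sized alignment, a section that ends one byte into a page pushes the next section to the following page, which is what allows code and data to get different page protections.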
2010-11-17 00:35:16 +03:00
2005-04-17 02:20:36 +04:00
/* If this is set, the section belongs in the init part of the module */
# define INIT_OFFSET_MASK (1UL << (BITS_PER_LONG-1))
2010-06-05 21:17:36 +04:00
/*
 * Mutex protects:
 * 1) List of modules (also safely readable with preempt_disable),
 * 2) module_use links,
 * 3) module_addr_min/module_addr_max.
2014-11-10 02:00:29 +03:00
 * (delete and add uses RCU list operations). */
2008-12-06 03:03:59 +03:00
DEFINE_MUTEX ( module_mutex ) ;
EXPORT_SYMBOL_GPL ( module_mutex ) ;
2005-04-17 02:20:36 +04:00
static LIST_HEAD ( modules ) ;
2010-05-21 06:04:21 +04:00
2019-04-26 03:11:37 +03:00
/* Work queue for freeing init sections in success case */
static struct work_struct init_free_wq ;
static struct llist_head init_free_list ;
2015-05-27 04:39:37 +03:00
# ifdef CONFIG_MODULES_TREE_LOOKUP
2012-09-26 13:09:40 +04:00
2015-05-27 04:39:37 +03:00
/*
 * Use a latched RB-tree for __module_address(); this allows us to use
 * RCU-sched lookups of the address from any context.
 *
2015-05-27 04:39:37 +03:00
 * This is conditional on PERF_EVENTS || TRACING because those can really hit
 * __module_address() hard by doing a lot of stack unwinding; potentially from
 * NMI context.
2015-05-27 04:39:37 +03:00
 */
static __always_inline unsigned long __mod_tree_val(struct latch_tree_node *n)
2012-09-26 13:09:40 +04:00
{
2015-11-26 02:14:08 +03:00
	struct module_layout *layout = container_of(n, struct module_layout, mtn.node);
2012-09-26 13:09:40 +04:00
2015-11-26 02:14:08 +03:00
	return (unsigned long)layout->base;
2015-05-27 04:39:37 +03:00
}
static __always_inline unsigned long __mod_tree_size(struct latch_tree_node *n)
{
2015-11-26 02:14:08 +03:00
	struct module_layout *layout = container_of(n, struct module_layout, mtn.node);
2015-05-27 04:39:37 +03:00
2015-11-26 02:14:08 +03:00
	return (unsigned long)layout->size;
2015-05-27 04:39:37 +03:00
}
static __always_inline bool
mod_tree_less(struct latch_tree_node *a, struct latch_tree_node *b)
{
	return __mod_tree_val(a) < __mod_tree_val(b);
}
static __always_inline int
mod_tree_comp(void *key, struct latch_tree_node *n)
{
	unsigned long val = (unsigned long)key;
	unsigned long start, end;
	start = __mod_tree_val(n);
	if (val < start)
		return -1;
	end = start + __mod_tree_size(n);
	if (val >= end)
		return 1;
2012-09-26 13:09:40 +04:00
	return 0;
}
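The three-way comparison in mod_tree_comp() treats each node as a half-open address range [start, end). The same idea can be tried out in userspace with bsearch() over a sorted array of ranges. This is a sketch under assumptions: struct range, range_comp() and range_find() are hypothetical names, and a sorted array stands in for the latched RB-tree.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a module layout: [base, base + size). */
struct range {
	unsigned long base;
	unsigned long size;
};

/* -1 if the address is below the range, 1 if at or past its end, 0 if inside. */
static int range_comp(const void *key, const void *elem)
{
	unsigned long val = *(const unsigned long *)key;
	const struct range *r = elem;

	if (val < r->base)
		return -1;
	if (val >= r->base + r->size)
		return 1;
	return 0;
}

/* Find the range containing addr in a table sorted by base, or NULL. */
static const struct range *range_find(const struct range *tbl, size_t n,
				      unsigned long addr)
{
	assert(tbl != NULL);
	return bsearch(&addr, tbl, n, sizeof(*tbl), range_comp);
}
```

Because the ranges do not overlap, at most one element compares equal to any address, so the search is well defined.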
2015-05-27 04:39:37 +03:00
static const struct latch_tree_ops mod_tree_ops = {
	.less = mod_tree_less,
	.comp = mod_tree_comp,
};
2015-05-27 04:39:38 +03:00
static struct mod_tree_root {
	struct latch_tree_root root;
	unsigned long addr_min;
	unsigned long addr_max;
} mod_tree __cacheline_aligned = {
	.addr_min = -1UL,
2012-09-26 13:09:40 +04:00
};
2015-05-27 04:39:38 +03:00
# define module_addr_min mod_tree.addr_min
# define module_addr_max mod_tree.addr_max
static noinline void __mod_tree_insert(struct mod_tree_node *node)
{
	latch_tree_insert(&node->node, &mod_tree.root, &mod_tree_ops);
}
static void __mod_tree_remove(struct mod_tree_node *node)
{
	latch_tree_erase(&node->node, &mod_tree.root, &mod_tree_ops);
}
2015-05-27 04:39:37 +03:00
/*
 * These modifications: insert, remove_init and remove; are serialized by the
 * module_mutex.
 */
static void mod_tree_insert(struct module *mod)
{
2015-11-26 02:14:08 +03:00
	mod->core_layout.mtn.mod = mod;
	mod->init_layout.mtn.mod = mod;
2015-05-27 04:39:37 +03:00
2015-11-26 02:14:08 +03:00
	__mod_tree_insert(&mod->core_layout.mtn);
	if (mod->init_layout.size)
		__mod_tree_insert(&mod->init_layout.mtn);
2015-05-27 04:39:37 +03:00
}
static void mod_tree_remove_init(struct module *mod)
{
2015-11-26 02:14:08 +03:00
	if (mod->init_layout.size)
		__mod_tree_remove(&mod->init_layout.mtn);
2015-05-27 04:39:37 +03:00
}
static void mod_tree_remove(struct module *mod)
{
2015-11-26 02:14:08 +03:00
	__mod_tree_remove(&mod->core_layout.mtn);
2015-05-27 04:39:37 +03:00
	mod_tree_remove_init(mod);
}
2015-05-27 04:39:37 +03:00
static struct module *mod_find(unsigned long addr)
2015-05-27 04:39:37 +03:00
{
	struct latch_tree_node *ltn;
2015-05-27 04:39:38 +03:00
	ltn = latch_tree_find((void *)addr, &mod_tree.root, &mod_tree_ops);
2015-05-27 04:39:37 +03:00
	if (!ltn)
		return NULL;
	return container_of(ltn, struct mod_tree_node, node)->mod;
}
2015-05-27 04:39:37 +03:00
# else /* MODULES_TREE_LOOKUP */
2015-05-27 04:39:38 +03:00
static unsigned long module_addr_min = - 1UL , module_addr_max = 0 ;
2015-05-27 04:39:37 +03:00
static void mod_tree_insert ( struct module * mod ) { }
static void mod_tree_remove_init ( struct module * mod ) { }
static void mod_tree_remove ( struct module * mod ) { }
static struct module *mod_find(unsigned long addr)
{
	struct module *mod;
2019-12-03 09:14:04 +03:00
	list_for_each_entry_rcu(mod, &modules, list,
				lockdep_is_held(&module_mutex)) {
2015-05-27 04:39:37 +03:00
		if (within_module(addr, mod))
			return mod;
	}
	return NULL;
}
# endif /* MODULES_TREE_LOOKUP */
2015-05-27 04:39:38 +03:00
/*
 * Bounds of module text, for speeding up __module_address.
 * Protected by module_mutex.
 */
static void __mod_update_bounds(void *base, unsigned int size)
{
	unsigned long min = (unsigned long)base;
	unsigned long max = min + size;
	if (min < module_addr_min)
		module_addr_min = min;
	if (max > module_addr_max)
		module_addr_max = max;
}
static void mod_update_bounds(struct module *mod)
{
2015-11-26 02:14:08 +03:00
	__mod_update_bounds(mod->core_layout.base, mod->core_layout.size);
	if (mod->init_layout.size)
		__mod_update_bounds(mod->init_layout.base, mod->init_layout.size);
2015-05-27 04:39:38 +03:00
}
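The bounds kept here only ever widen, so they work as a cheap pre-filter: an address outside [min, max] cannot belong to any module, while an address inside still needs the full lookup. The userspace sketch below illustrates that, with hypothetical names (addr_min, addr_max, update_bounds(), maybe_module_addr()); how the kernel consumes the bounds is assumed from the "speeding up __module_address" comment above.

```c
#include <assert.h>
#include <stdbool.h>

/* Start with an empty interval: min at the top of the address space. */
static unsigned long addr_min = -1UL, addr_max = 0;

/* Widen the global bounds to cover [base, base + size]. */
static void update_bounds(unsigned long base, unsigned long size)
{
	unsigned long max = base + size;

	assert(max >= base);	/* no wraparound in this sketch */
	if (base < addr_min)
		addr_min = base;
	if (max > addr_max)
		addr_max = max;
}

/* Fast reject: false means "definitely not in any tracked region". */
static bool maybe_module_addr(unsigned long addr)
{
	return addr >= addr_min && addr <= addr_max;
}
```

Note that the filter can report true for an address in a gap between two regions; it only rules addresses out, it never confirms them.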
2010-05-21 06:04:21 +04:00
# ifdef CONFIG_KGDB_KDB
struct list_head * kdb_modules = & modules ; /* kdb needs the list of modules */
# endif /* CONFIG_KGDB_KDB */
2015-05-27 04:39:35 +03:00
static void module_assert_mutex ( void )
{
lockdep_assert_held ( & module_mutex ) ;
}
static void module_assert_mutex_or_preempt(void)
{
#ifdef CONFIG_LOCKDEP
	if (unlikely(!debug_locks))
		return;
2016-07-18 23:29:24 +03:00
	WARN_ON_ONCE(!rcu_read_lock_sched_held() &&
2015-05-27 04:39:35 +03:00
		!lockdep_is_held(&module_mutex));
#endif
}
2015-05-27 04:39:39 +03:00
static bool sig_enforce = IS_ENABLED ( CONFIG_MODULE_SIG_FORCE ) ;
2012-09-26 13:09:40 +04:00
module_param ( sig_enforce , bool_enable_only , 0644 ) ;
2005-04-17 02:20:36 +04:00
2017-10-24 20:37:00 +03:00
/*
 * Export the sig_enforce kernel cmdline parameter so that other subsystems can
 * rely on it instead of checking CONFIG_MODULE_SIG_FORCE directly.
 */
bool is_module_sig_enforced ( void )
{
return sig_enforce ;
}
EXPORT_SYMBOL ( is_module_sig_enforced ) ;
2019-01-28 03:03:45 +03:00
void set_module_sig_enforced ( void )
{
sig_enforce = true ;
}
2009-04-14 11:27:18 +04:00
/* Block module loading/unloading? */
int modules_disabled = 0 ;
2012-02-01 06:33:14 +04:00
core_param ( nomodule , modules_disabled , bint , 0 ) ;
2009-04-14 11:27:18 +04:00
2008-01-30 01:13:18 +03:00
/* Waiting for a module to finish initializing? */
static DECLARE_WAIT_QUEUE_HEAD ( module_wq ) ;
[PATCH] Notifier chain update: API changes
The kernel's implementation of notifier chains is unsafe. There is no
protection against entries being added to or removed from a chain while the
chain is in use. The issues were discussed in this thread:
http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2
We noticed that notifier chains in the kernel fall into two basic usage
classes:
"Blocking" chains are always called from a process context
and the callout routines are allowed to sleep;
"Atomic" chains can be called from an atomic context and
the callout routines are not allowed to sleep.
We decided to codify this distinction and make it part of the API. Therefore
this set of patches introduces three new, parallel APIs: one for blocking
notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
really just the old API under a new name). New kinds of data structures are
used for the heads of the chains, and new routines are defined for
registration, unregistration, and calling a chain. The three APIs are
explained in include/linux/notifier.h and their implementation is in
kernel/sys.c.
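To make the "blocking" flavour concrete, here is a userspace sketch of a chain whose registration, unregistration, and calls are all serialized by one lock. It is an illustration under stated assumptions, not the kernel API: struct notifier, struct blocking_chain and the chain_*() helpers are hypothetical names, and a pthread mutex stands in for the kernel's locking.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct notifier {
	int (*call)(struct notifier *nb, unsigned long event, void *data);
	struct notifier *next;
	int priority;		/* higher priority is called first */
};

struct blocking_chain {
	pthread_mutex_t lock;
	struct notifier *head;
};

#define BLOCKING_CHAIN_INIT { PTHREAD_MUTEX_INITIALIZER, NULL }

/* Insert nb in descending priority order; equal priorities keep arrival order. */
static void chain_register(struct blocking_chain *c, struct notifier *nb)
{
	struct notifier **p;

	assert(nb->call != NULL);
	pthread_mutex_lock(&c->lock);
	for (p = &c->head; *p && (*p)->priority >= nb->priority; p = &(*p)->next)
		;
	nb->next = *p;
	*p = nb;
	pthread_mutex_unlock(&c->lock);
}

/* Remove nb from the chain; returns 0 if found, -1 otherwise. */
static int chain_unregister(struct blocking_chain *c, struct notifier *nb)
{
	struct notifier **p;
	int ret = -1;

	pthread_mutex_lock(&c->lock);
	for (p = &c->head; *p; p = &(*p)->next) {
		if (*p == nb) {
			*p = nb->next;
			ret = 0;
			break;
		}
	}
	pthread_mutex_unlock(&c->lock);
	return ret;
}

/* Call every entry with the lock held; returns the number of callbacks run. */
static int chain_call(struct blocking_chain *c, unsigned long event, void *data)
{
	struct notifier *nb;
	int calls = 0;

	pthread_mutex_lock(&c->lock);
	for (nb = c->head; nb; nb = nb->next) {
		nb->call(nb, event, data);
		calls++;
	}
	pthread_mutex_unlock(&c->lock);
	return calls;
}

/* Example callback: count how many events were delivered. */
static int count_events(struct notifier *nb, unsigned long event, void *data)
{
	(void)nb;
	(void)event;
	++*(int *)data;
	return 0;
}
```

Since chain_call() holds the lock while callbacks run, a callback that tried to register or unregister on its own chain would deadlock in this sketch, which mirrors the restriction on non-raw chains stated in the commit message.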
With atomic and blocking chains, the implementation guarantees that the chain
links will not be corrupted and that chain callers will not get messed up by
entries being added or removed. For raw chains the implementation provides no
guarantees at all; users of this API must provide their own protections. (The
idea was that situations may come up where the assumptions of the atomic and
blocking APIs are not appropriate, so it should be possible for users to
handle these things in their own way.)
There are some limitations, which should not be too hard to live with. For
atomic/blocking chains, registration and unregistration must always be done in
a process context since the chain is protected by a mutex/rwsem. Also, a
callout routine for a non-raw chain must not try to register or unregister
entries on its own chain. (This did happen in a couple of places and the code
had to be changed to avoid it.)
Since atomic chains may be called from within an NMI handler, they cannot use
spinlocks for synchronization. Instead we use RCU. The overhead falls almost
entirely in the unregister routine, which is okay since unregistration is much
less frequent than calling a chain.
Here is the list of chains that we adjusted and their classifications. None
of them use the raw API, so for the moment it is only a placeholder.
ATOMIC CHAINS
-------------
arch/i386/kernel/traps.c: i386die_chain
arch/ia64/kernel/traps.c: ia64die_chain
arch/powerpc/kernel/traps.c: powerpc_die_chain
arch/sparc64/kernel/traps.c: sparc64die_chain
arch/x86_64/kernel/traps.c: die_chain
drivers/char/ipmi/ipmi_si_intf.c: xaction_notifier_list
kernel/panic.c: panic_notifier_list
kernel/profile.c: task_free_notifier
net/bluetooth/hci_core.c: hci_notifier
net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_chain
net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_expect_chain
net/ipv6/addrconf.c: inet6addr_chain
net/netfilter/nf_conntrack_core.c: nf_conntrack_chain
net/netfilter/nf_conntrack_core.c: nf_conntrack_expect_chain
net/netlink/af_netlink.c: netlink_chain
BLOCKING CHAINS
---------------
arch/powerpc/platforms/pseries/reconfig.c: pSeries_reconfig_chain
arch/s390/kernel/process.c: idle_chain
arch/x86_64/kernel/process.c idle_notifier
drivers/base/memory.c: memory_chain
drivers/cpufreq/cpufreq.c cpufreq_policy_notifier_list
drivers/cpufreq/cpufreq.c cpufreq_transition_notifier_list
drivers/macintosh/adb.c: adb_client_list
drivers/macintosh/via-pmu.c sleep_notifier_list
drivers/macintosh/via-pmu68k.c sleep_notifier_list
drivers/macintosh/windfarm_core.c wf_client_list
drivers/usb/core/notify.c usb_notifier_list
drivers/video/fbmem.c fb_notifier_list
kernel/cpu.c cpu_chain
kernel/module.c module_notify_list
kernel/profile.c munmap_notifier
kernel/profile.c task_exit_notifier
kernel/sys.c reboot_notifier_list
net/core/dev.c netdev_chain
net/decnet/dn_dev.c: dnaddr_chain
net/ipv4/devinet.c: inetaddr_chain
It's possible that some of these classifications are wrong. If they are,
please let us know or submit a patch to fix them. Note that any chain that
gets called very frequently should be atomic, because the rwsem read-locking
used for blocking chains is very likely to incur cache misses on SMP systems.
(However, if the chain's callout routines may sleep then the chain cannot be
atomic.)
The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
material written by Keith Owens and suggestions from Paul McKenney and Andrew
Morton.
[jes@sgi.com: restructure the notifier chain initialization macros]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 13:16:30 +04:00
static BLOCKING_NOTIFIER_HEAD ( module_notify_list ) ;
2005-04-17 02:20:36 +04:00
2014-11-10 02:01:29 +03:00
int register_module_notifier ( struct notifier_block * nb )
2005-04-17 02:20:36 +04:00
{
2006-03-27 13:16:30 +04:00
return blocking_notifier_chain_register ( & module_notify_list , nb ) ;
2005-04-17 02:20:36 +04:00
}
EXPORT_SYMBOL ( register_module_notifier ) ;
2014-11-10 02:01:29 +03:00
int unregister_module_notifier ( struct notifier_block * nb )
2005-04-17 02:20:36 +04:00
{
2006-03-27 13:16:30 +04:00
return blocking_notifier_chain_unregister ( & module_notify_list , nb ) ;
2005-04-17 02:20:36 +04:00
}
EXPORT_SYMBOL ( unregister_module_notifier ) ;
2016-11-16 18:45:48 +03:00
/*
* We require a truly strong try_module_get ( ) : 0 means success .
* Otherwise an error is returned due to ongoing or failed
* initialization etc .
*/
2005-04-17 02:20:36 +04:00
static inline int strong_try_module_get ( struct module * mod )
{
2013-01-12 05:08:44 +04:00
BUG_ON ( mod & & mod - > state = = MODULE_STATE_UNFORMED ) ;
2005-04-17 02:20:36 +04:00
if ( mod & & mod - > state = = MODULE_STATE_COMING )
2008-01-30 01:13:18 +03:00
return - EBUSY ;
if ( try_module_get ( mod ) )
2005-04-17 02:20:36 +04:00
return 0 ;
2008-01-30 01:13:18 +03:00
else
return - ENOENT ;
2005-04-17 02:20:36 +04:00
}
2013-01-21 10:47:39 +04:00
static inline void add_taint_module ( struct module * mod , unsigned flag ,
enum lockdep_ok lockdep_ok )
2006-10-11 12:21:48 +04:00
{
2013-01-21 10:47:39 +04:00
add_taint ( flag , lockdep_ok ) ;
2016-09-21 14:47:22 +03:00
set_bit ( flag , & mod - > taints ) ;
2006-10-11 12:21:48 +04:00
}
2007-05-09 09:26:28 +04:00
/*
* A thread that wants to hold a reference to a module only while it
* is running can call this to safely exit . nfsd and lockd use this .
2005-04-17 02:20:36 +04:00
*/
2016-04-11 22:32:09 +03:00
void __noreturn __module_put_and_exit ( struct module * mod , long code )
2005-04-17 02:20:36 +04:00
{
module_put ( mod ) ;
do_exit ( code ) ;
}
EXPORT_SYMBOL ( __module_put_and_exit ) ;
2007-10-18 14:06:07 +04:00
2005-04-17 02:20:36 +04:00
/* Find a module section: 0 means not found. */
2010-08-05 22:59:10 +04:00
static unsigned int find_sec ( const struct load_info * info , const char * name )
2005-04-17 02:20:36 +04:00
{
unsigned int i ;
2010-08-05 22:59:10 +04:00
for ( i = 1 ; i < info - > hdr - > e_shnum ; i + + ) {
Elf_Shdr * shdr = & info - > sechdrs [ i ] ;
2005-04-17 02:20:36 +04:00
/* Alloc bit cleared means "ignore it." */
2010-08-05 22:59:10 +04:00
if ( ( shdr - > sh_flags & SHF_ALLOC )
& & strcmp ( info - > secstrings + shdr - > sh_name , name ) = = 0 )
2005-04-17 02:20:36 +04:00
return i ;
2010-08-05 22:59:10 +04:00
}
2005-04-17 02:20:36 +04:00
return 0 ;
}
2008-10-22 19:00:13 +04:00
/* Find a module section, or NULL. */
2010-08-05 22:59:10 +04:00
static void * section_addr ( const struct load_info * info , const char * name )
2008-10-22 19:00:13 +04:00
{
/* Section 0 has sh_addr 0. */
2010-08-05 22:59:10 +04:00
return ( void * ) info - > sechdrs [ find_sec ( info , name ) ] . sh_addr ;
2008-10-22 19:00:13 +04:00
}
/* Find a module section, or NULL. Fill in number of "objects" in section. */
2010-08-05 22:59:10 +04:00
static void * section_objs ( const struct load_info * info ,
2008-10-22 19:00:13 +04:00
const char * name ,
size_t object_size ,
unsigned int * num )
{
2010-08-05 22:59:10 +04:00
unsigned int sec = find_sec ( info , name ) ;
2008-10-22 19:00:13 +04:00
/* Section 0 has sh_addr 0 and sh_size 0. */
2010-08-05 22:59:10 +04:00
* num = info - > sechdrs [ sec ] . sh_size / object_size ;
return ( void * ) info - > sechdrs [ sec ] . sh_addr ;
2008-10-22 19:00:13 +04:00
}
2005-04-17 02:20:36 +04:00
/* Provided by the linker */
extern const struct kernel_symbol __start___ksymtab [ ] ;
extern const struct kernel_symbol __stop___ksymtab [ ] ;
extern const struct kernel_symbol __start___ksymtab_gpl [ ] ;
extern const struct kernel_symbol __stop___ksymtab_gpl [ ] ;
2006-03-21 00:17:13 +03:00
extern const struct kernel_symbol __start___ksymtab_gpl_future [ ] ;
extern const struct kernel_symbol __stop___ksymtab_gpl_future [ ] ;
modversions: treat symbol CRCs as 32 bit quantities
The modversion symbol CRCs are emitted as ELF symbols, which allows us
to easily populate the kcrctab sections by relying on the linker to
associate each kcrctab slot with the correct value.
This has a couple of downsides:
- Given that the CRCs are treated as memory addresses, we waste 4 bytes
for each CRC on 64 bit architectures,
- On architectures that support runtime relocation, a R_<arch>_RELATIVE
relocation entry is emitted for each CRC value, which identifies it
as a quantity that requires fixing up based on the actual runtime
load offset of the kernel. This results in corrupted CRCs unless we
explicitly undo the fixup (and this is currently being handled in the
core module code)
- Such runtime relocation entries take up 24 bytes of __init space
each, resulting in a x8 overhead in [uncompressed] kernel size for
CRCs.
Switching to explicit 32 bit values on 64 bit architectures fixes most
of these issues, given that 32 bit values are not treated as quantities
that require fixing up based on the actual runtime load offset. Note
that on some ELF64 architectures [such as PPC64], these 32-bit values
are still emitted as [absolute] runtime relocatable quantities, even if
the value resolves to a build time constant. Since relative relocations
are always resolved at build time, this patch enables MODULE_REL_CRCS on
powerpc when CONFIG_RELOCATABLE=y, which turns the absolute CRC
references into relative references into .rodata where the actual CRC
value is stored.
So redefine all CRC fields and variables as u32, and redefine the
__CRC_SYMBOL() macro for 64 bit builds to emit the CRC reference using
inline assembler (which is necessary since 64-bit C code cannot use
32-bit types to hold memory addresses, even if they are ultimately
resolved using values that do not exceed 0xffffffff). To avoid
potential problems with legacy 32-bit architectures using legacy
toolchains, the equivalent C definition of the kcrctab entry is retained
for 32-bit architectures.
Note that this mostly reverts commit d4703aefdbc8 ("module: handle ppc64
relocating kcrctabs when CONFIG_RELOCATABLE=y")
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
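As a rough check of the figures quoted in the message above, the sketch below works through the per-CRC cost on a 64-bit build. It assumes the standard ELF64 RELA layout (three 64-bit fields, 24 bytes) and takes one possible reading of the "x8" figure: an 8-byte pointer-sized slot plus a 24-byte relocation, compared with a bare 4-byte value. The struct and function names are illustrative only and do not come from the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for an ELF64 RELA entry: offset, info, addend. */
struct rela64_sketch {
	uint64_t r_offset;
	uint64_t r_info;
	int64_t  r_addend;
};

/* Bytes consumed by one CRC stored as a pointer-sized slot plus its relocation. */
static size_t crc_cost_as_address(void)
{
	return sizeof(uint64_t) + sizeof(struct rela64_sketch);
}

/* Bytes consumed by one CRC stored as a plain 32-bit value, no relocation. */
static size_t crc_cost_as_u32(void)
{
	return sizeof(uint32_t);
}

/* Ratio between the two representations. */
static size_t crc_overhead_factor(void)
{
	return crc_cost_as_address() / crc_cost_as_u32();
}
```

Under these assumptions the address-based form costs 32 bytes per CRC against 4 bytes for the 32-bit form, which is where a factor of 8 would come from.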
2017-02-03 12:54:06 +03:00
extern const s32 __start___kcrctab [ ] ;
extern const s32 __start___kcrctab_gpl [ ] ;
extern const s32 __start___kcrctab_gpl_future [ ] ;
2008-07-23 04:24:26 +04:00
# ifdef CONFIG_UNUSED_SYMBOLS
extern const struct kernel_symbol __start___ksymtab_unused [ ] ;
extern const struct kernel_symbol __stop___ksymtab_unused [ ] ;
extern const struct kernel_symbol __start___ksymtab_unused_gpl [ ] ;
extern const struct kernel_symbol __stop___ksymtab_unused_gpl [ ] ;
2017-02-03 12:54:06 +03:00
extern const s32 __start___kcrctab_unused [ ] ;
extern const s32 __start___kcrctab_unused_gpl [ ] ;
2008-07-23 04:24:26 +04:00
# endif
2005-04-17 02:20:36 +04:00
# ifndef CONFIG_MODVERSIONS
# define symversion(base, idx) NULL
# else
2006-03-28 13:56:20 +04:00
# define symversion(base, idx) ((base != NULL) ? ((base) + (idx)) : NULL)
2005-04-17 02:20:36 +04:00
# endif
2008-07-23 04:24:25 +04:00
static bool each_symbol_in_section ( const struct symsearch * arr ,
unsigned int arrsize ,
struct module * owner ,
bool ( * fn ) ( const struct symsearch * syms ,
struct module * owner ,
2011-04-19 23:49:58 +04:00
void * data ) ,
2008-07-23 04:24:25 +04:00
void * data )
2008-05-02 06:14:59 +04:00
{
2011-04-19 23:49:58 +04:00
unsigned int j ;
2008-05-02 06:14:59 +04:00
2008-07-23 04:24:25 +04:00
for ( j = 0 ; j < arrsize ; j + + ) {
2011-04-19 23:49:58 +04:00
if ( fn ( & arr [ j ] , owner , data ) )
return true ;
2006-06-28 15:26:45 +04:00
}
2008-07-23 04:24:25 +04:00
return false ;
2008-05-02 06:14:59 +04:00
}
2008-07-23 04:24:25 +04:00
/* Returns true as soon as fn returns true, otherwise false. */
2020-07-30 09:10:22 +03:00
static bool each_symbol_section ( bool ( * fn ) ( const struct symsearch * arr ,
2011-04-19 23:49:58 +04:00
struct module * owner ,
void * data ) ,
void * data )
2008-05-02 06:14:59 +04:00
{
struct module * mod ;
module: reduce stack usage for each_symbol()
And now that I'm looking at that call-chain (to see if it would make sense
to use some other more specific lock - doesn't look like it: all the
readers are using RCU and this is the only writer), I also give you this
trivial one-liner. It changes each_symbol() to not put that constant array
on the stack, resulting in changing
movq $C.388.31095, %rsi #, tmp85
subq $376, %rsp #,
movq %rdi, %rbx # fn, fn
leaq -208(%rbp), %rdi #, tmp84
movq %rbx, %rdx # fn,
rep movsl
xorl %esi, %esi #
leaq -208(%rbp), %rdi #, tmp87
movq %r12, %rcx # data,
call each_symbol_in_section.clone.0 #
into
xorl %esi, %esi #
subq $216, %rsp #,
movq %rdi, %rbx # fn, fn
movq $arr.31078, %rdi #,
call each_symbol_in_section.clone.0 #
which is not only obviously shorter and simpler, because we no longer copy
that constant array onto the stack, but also has a much smaller stack
footprint (376 vs 216 bytes - see the update of 'rsp').
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-08-05 22:59:05 +04:00
static const struct symsearch arr [ ] = {
2008-05-02 06:14:59 +04:00
{ __start___ksymtab , __stop___ksymtab , __start___kcrctab ,
2008-07-23 04:24:25 +04:00
NOT_GPL_ONLY , false } ,
2008-05-02 06:14:59 +04:00
{ __start___ksymtab_gpl , __stop___ksymtab_gpl ,
2008-07-23 04:24:25 +04:00
__start___kcrctab_gpl ,
GPL_ONLY , false } ,
2008-05-02 06:14:59 +04:00
{ __start___ksymtab_gpl_future , __stop___ksymtab_gpl_future ,
2008-07-23 04:24:25 +04:00
__start___kcrctab_gpl_future ,
WILL_BE_GPL_ONLY , false } ,
2008-07-23 04:24:26 +04:00
# ifdef CONFIG_UNUSED_SYMBOLS
2008-05-02 06:14:59 +04:00
{ __start___ksymtab_unused , __stop___ksymtab_unused ,
2008-07-23 04:24:25 +04:00
__start___kcrctab_unused ,
NOT_GPL_ONLY , true } ,
2008-05-02 06:14:59 +04:00
{ __start___ksymtab_unused_gpl , __stop___ksymtab_unused_gpl ,
2008-07-23 04:24:25 +04:00
__start___kcrctab_unused_gpl ,
GPL_ONLY , true } ,
2008-07-23 04:24:26 +04:00
# endif
2008-05-02 06:14:59 +04:00
} ;
2006-06-28 15:26:45 +04:00
2015-05-27 04:39:35 +03:00
module_assert_mutex_or_preempt ( ) ;
2008-07-23 04:24:25 +04:00
if ( each_symbol_in_section ( arr , ARRAY_SIZE ( arr ) , NULL , fn , data ) )
return true ;
2006-06-28 15:26:45 +04:00
2019-12-03 09:14:04 +03:00
list_for_each_entry_rcu ( mod , & modules , list ,
lockdep_is_held ( & module_mutex ) ) {
2008-05-02 06:14:59 +04:00
struct symsearch arr [ ] = {
{ mod - > syms , mod - > syms + mod - > num_syms , mod - > crcs ,
2008-07-23 04:24:25 +04:00
NOT_GPL_ONLY , false } ,
2008-05-02 06:14:59 +04:00
{ mod - > gpl_syms , mod - > gpl_syms + mod - > num_gpl_syms ,
2008-07-23 04:24:25 +04:00
mod - > gpl_crcs ,
GPL_ONLY , false } ,
2008-05-02 06:14:59 +04:00
{ mod - > gpl_future_syms ,
mod - > gpl_future_syms + mod - > num_gpl_future_syms ,
2008-07-23 04:24:25 +04:00
mod - > gpl_future_crcs ,
WILL_BE_GPL_ONLY , false } ,
2008-07-23 04:24:26 +04:00
# ifdef CONFIG_UNUSED_SYMBOLS
2008-05-02 06:14:59 +04:00
{ mod - > unused_syms ,
mod - > unused_syms + mod - > num_unused_syms ,
2008-07-23 04:24:25 +04:00
mod - > unused_crcs ,
NOT_GPL_ONLY , true } ,
2008-05-02 06:14:59 +04:00
{ mod - > unused_gpl_syms ,
mod - > unused_gpl_syms + mod - > num_unused_gpl_syms ,
2008-07-23 04:24:25 +04:00
mod - > unused_gpl_crcs ,
GPL_ONLY , true } ,
2008-07-23 04:24:26 +04:00
# endif
2008-05-02 06:14:59 +04:00
} ;
2013-01-12 05:08:44 +04:00
if ( mod - > state = = MODULE_STATE_UNFORMED )
continue ;
2008-07-23 04:24:25 +04:00
if ( each_symbol_in_section ( arr , ARRAY_SIZE ( arr ) , mod , fn , data ) )
return true ;
}
return false ;
}
struct find_symbol_arg {
/* Input */
const char * name ;
bool gplok ;
bool warn ;
/* Output */
struct module * owner ;
2017-02-03 12:54:06 +03:00
const s32 * crc ;
2008-12-06 03:03:56 +03:00
const struct kernel_symbol * sym ;
2020-07-30 09:10:26 +03:00
enum mod_license license ;
2008-07-23 04:24:25 +04:00
} ;
2018-11-19 19:43:58 +03:00
static bool check_exported_symbol ( const struct symsearch * syms ,
struct module * owner ,
unsigned int symnum , void * data )
2008-07-23 04:24:25 +04:00
{
struct find_symbol_arg * fsa = data ;
if ( ! fsa - > gplok ) {
2020-07-30 09:10:25 +03:00
if ( syms - > license = = GPL_ONLY )
2008-07-23 04:24:25 +04:00
return false ;
2020-07-30 09:10:25 +03:00
if ( syms - > license = = WILL_BE_GPL_ONLY & & fsa - > warn ) {
2013-11-13 03:11:28 +04:00
pr_warn ( " Symbol %s is being used by a non-GPL module, "
" which will not be allowed in the future \n " ,
fsa - > name ) ;
2006-03-21 00:17:13 +03:00
}
2005-04-17 02:20:36 +04:00
}
2008-05-02 06:14:59 +04:00
2008-07-23 04:24:26 +04:00
# ifdef CONFIG_UNUSED_SYMBOLS
2008-07-23 04:24:25 +04:00
if ( syms - > unused & & fsa - > warn ) {
2013-11-13 03:11:28 +04:00
pr_warn ( " Symbol %s is marked as UNUSED, however this module is "
" using it. \n " , fsa - > name ) ;
pr_warn ( " This symbol will go away in the future. \n " ) ;
2015-03-24 05:01:40 +03:00
pr_warn ( " Please evaluate if this is the right api to use and "
" if it really is, submit a report to the linux kernel "
" mailing list together with submitting your code for "
2013-11-13 03:11:28 +04:00
" inclusion. \n " ) ;
2008-07-23 04:24:25 +04:00
}
2008-07-23 04:24:26 +04:00
# endif
2008-07-23 04:24:25 +04:00
fsa - > owner = owner ;
fsa - > crc = symversion ( syms - > crcs , symnum ) ;
2008-12-06 03:03:56 +03:00
fsa - > sym = & syms - > start [ symnum ] ;
2020-07-30 09:10:26 +03:00
fsa - > license = syms - > license ;
2008-07-23 04:24:25 +04:00
return true ;
}
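The license gate applied by check_exported_symbol() above can be summarised in a small userspace sketch. This is a simplification under stated assumptions: the enum and function names are hypothetical, the `warn` input flag and the CONFIG_UNUSED_SYMBOLS branch are left out, and the warning is reported through an output flag rather than pr_warn().

```c
#include <stdbool.h>

/* Hypothetical mirror of the three license classes used in the symbol tables. */
enum license_sketch {
	SK_NOT_GPL_ONLY,
	SK_GPL_ONLY,
	SK_WILL_BE_GPL_ONLY,
};

/*
 * Decide whether a symbol with the given license class may be used.
 * gplok:   the importing module is GPL-compatible.
 * *warned: set to true when a non-GPL module uses a symbol that is
 *          scheduled to become GPL-only.
 */
static bool symbol_usable_sketch(enum license_sketch license, bool gplok,
				 bool *warned)
{
	*warned = false;

	/* GPL-compatible modules may use every class of export. */
	if (gplok)
		return true;

	/* Non-GPL modules are refused GPL-only symbols outright. */
	if (license == SK_GPL_ONLY)
		return false;

	/* Future-GPL symbols are still allowed, but flagged. */
	if (license == SK_WILL_BE_GPL_ONLY)
		*warned = true;

	return true;
}
```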
module: use relative references for __ksymtab entries
An ordinary arm64 defconfig build has ~64 KB worth of __ksymtab entries,
each consisting of two 64-bit fields containing absolute references, to
the symbol itself and to a char array containing its name, respectively.
When we build the same configuration with KASLR enabled, we end up with an
additional ~192 KB of relocations in the .init section, i.e., one 24 byte
entry for each absolute reference, which all need to be processed at boot
time.
Given how the struct kernel_symbol that describes each entry is completely
local to module.c (except for the references emitted by EXPORT_SYMBOL()
itself), we can easily modify it to contain two 32-bit relative references
instead. This reduces the size of the __ksymtab section by 50% for all
64-bit architectures, and gets rid of the runtime relocations entirely for
architectures implementing KASLR, either via standard PIE linking (arm64)
or using custom host tools (x86).
Note that the binary search involving __ksymtab contents relies on each
section being sorted by symbol name. This is implemented based on the
input section names, not the names in the ksymtab entries, so this patch
does not interfere with that.
Given that the use of place-relative relocations requires support both in
the toolchain and in the module loader, we cannot enable this feature for
all architectures. So make it dependent on whether
CONFIG_HAVE_ARCH_PREL32_RELOCATIONS is defined.
Link: http://lkml.kernel.org/r/20180704083651.24360-4-ard.biesheuvel@linaro.org
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morris <james.morris@microsoft.com>
Cc: James Morris <jmorris@namei.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
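The place-relative scheme described in the message above can be illustrated in userspace. The sketch below is not the kernel's code: the struct, helper names and demo container are hypothetical, and the entry and its targets are kept inside one object so that the distance between them is guaranteed to fit in 32 bits. The arithmetic in the message is also self-consistent: 64 KB of 16-byte entries is 4096 entries, hence 8192 absolute references, and 8192 x 24 bytes is 192 KB of relocations.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: two 32-bit place-relative references, 8 bytes total. */
struct rel_symbol_sketch {
	int32_t value_offset;
	int32_t name_offset;
};

/* Resolve a reference: target = address of the field + offset stored in it. */
static const void *rel_to_ptr(const int32_t *off)
{
	return (const void *)((uintptr_t)off + (intptr_t)*off);
}

/* Store the distance from the field to the target (must fit in 32 bits). */
static void rel_set(int32_t *off, const void *target)
{
	*off = (int32_t)((intptr_t)target - (intptr_t)off);
}

/* Entry and targets live in one object, so they are always close together. */
struct rel_demo {
	struct rel_symbol_sketch sym;
	int value;
	char name[16];
};
```

Because each field stores a distance rather than an address, the entry stays valid wherever the containing object is loaded, which is why no runtime relocation is needed.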
2018-08-22 07:56:09 +03:00
static unsigned long kernel_symbol_value ( const struct kernel_symbol * sym )
{
# ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
return ( unsigned long ) offset_to_ptr ( & sym - > value_offset ) ;
# else
return sym - > value ;
# endif
}
static const char * kernel_symbol_name ( const struct kernel_symbol * sym )
{
# ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
return offset_to_ptr ( & sym - > name_offset ) ;
# else
return sym - > name ;
# endif
}
module: add support for symbol namespaces.
The EXPORT_SYMBOL_NS() and EXPORT_SYMBOL_NS_GPL() macros can be used to
export a symbol to a specific namespace. There are no _GPL_FUTURE and
_UNUSED variants because these are currently unused, and I'm not sure
they are necessary.
I didn't add EXPORT_SYMBOL_NS() for ASM exports; this patch sets the
namespace of ASM exports to NULL by default. In case of relative
references, it will be relocatable to NULL. If there's a need, this
should be pretty easy to add.
A module that wants to use a symbol exported to a namespace must add a
MODULE_IMPORT_NS() statement to their module code; otherwise, modpost
will complain when building the module, and the kernel module loader
will emit an error and fail when loading the module.
MODULE_IMPORT_NS() adds a modinfo tag 'import_ns' to the module. That
tag can be observed by the modinfo command, modpost and kernel/module.c
at the time of loading the module.
The ELF symbols are renamed to include the namespace with an asm label;
for example, symbol 'usb_stor_suspend' in namespace USB_STORAGE becomes
'usb_stor_suspend.USB_STORAGE'. This allows modpost to do namespace
checking, without having to go through all the effort of parsing ELF and
relocation records just to get to the struct kernel_symbols.
On x86_64 I saw no difference in binary size (compression), but at
runtime this will require a word of memory per export to hold the
namespace. An alternative could be to store namespaced symbols in their
own section and use a separate 'struct namespaced_kernel_symbol' for
that section, at the cost of making the module loader more complex.
Co-developed-by: Martijn Coenen <maco@android.com>
Signed-off-by: Martijn Coenen <maco@android.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
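The load-time rule described in the message above (a module must import a namespace before using symbols exported into it) might be sketched as follows. This is a userspace illustration only: the function name and the flat array of imported namespace strings are assumptions, not the loader's actual data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if a symbol exported into namespace `ns` may be used by a
 * module whose import_ns tags are listed in `imports`.
 * A NULL namespace means the symbol is not namespaced, so no import is
 * required.
 */
static bool namespace_allowed_sketch(const char *ns,
				     const char *const *imports,
				     size_t n_imports)
{
	size_t i;

	if (!ns)
		return true;

	for (i = 0; i < n_imports; i++) {
		if (strcmp(ns, imports[i]) == 0)
			return true;
	}

	/* Namespaced symbol without a matching import: the load would fail. */
	return false;
}
```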
2019-09-06 13:32:27 +03:00
static const char * kernel_symbol_namespace ( const struct kernel_symbol * sym )
{
# ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
2019-09-11 15:26:46 +03:00
if ( ! sym - > namespace_offset )
return NULL ;
2019-09-06 13:32:27 +03:00
return offset_to_ptr ( & sym - > namespace_offset ) ;
# else
return sym - > namespace ;
# endif
}
2019-09-09 14:39:02 +03:00
static int cmp_name ( const void * name , const void * sym )
2011-04-20 13:10:52 +04:00
{
2019-09-09 14:39:02 +03:00
return strcmp ( name , kernel_symbol_name ( sym ) ) ;
2011-04-20 13:10:52 +04:00
}
2018-11-19 19:43:58 +03:00
static bool find_exported_symbol_in_section ( const struct symsearch * syms ,
struct module * owner ,
void * data )
2011-04-19 23:49:58 +04:00
{
struct find_symbol_arg * fsa = data ;
2011-04-20 13:10:52 +04:00
struct kernel_symbol * sym ;
sym = bsearch ( fsa - > name , syms - > start , syms - > stop - syms - > start ,
sizeof ( struct kernel_symbol ) , cmp_name ) ;
2018-11-19 19:43:58 +03:00
if ( sym ! = NULL & & check_exported_symbol ( syms , owner ,
sym - syms - > start , data ) )
2011-04-20 13:10:52 +04:00
return true ;
2011-04-19 23:49:58 +04:00
return false ;
}
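The lookup above depends on each symbol table being sorted by name so that bsearch() can be used. A minimal userspace sketch of the same pattern, using the C library's bsearch() and a hypothetical entry type that stores the name as a plain pointer:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry; the real struct kernel_symbol differs. */
struct sym_sketch {
	const char *name;
	unsigned long value;
};

/* bsearch() comparator: key is the name looked up, elem is a table entry. */
static int cmp_name_sketch(const void *key, const void *elem)
{
	const struct sym_sketch *sym = elem;

	return strcmp(key, sym->name);
}

/* Return the matching entry or NULL. [start, stop) must be sorted by name. */
static const struct sym_sketch *lookup_sketch(const struct sym_sketch *start,
					      const struct sym_sketch *stop,
					      const char *name)
{
	return bsearch(name, start, stop - start, sizeof(*start),
		       cmp_name_sketch);
}

/* Demo table, already in strcmp() order. */
static const struct sym_sketch demo_table[] = {
	{ "alpha",   0x1000 },
	{ "bravo",   0x2000 },
	{ "charlie", 0x3000 },
};
```

As in the function above, subtracting the table start from the returned pointer gives the entry's index, which can then be used to pick the matching slot in a parallel array such as the CRC table.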
2018-11-19 19:43:58 +03:00
/* Find an exported symbol and return it, along with its (optional) crc and
2010-06-05 21:17:36 +04:00
* ( optional ) module which owns it . Needs preempt disabled or module_mutex . */
2020-07-30 09:10:21 +03:00
static const struct kernel_symbol * find_symbol ( const char * name ,
2008-12-06 03:03:59 +03:00
struct module * * owner ,
modversions: treat symbol CRCs as 32 bit quantities
The modversion symbol CRCs are emitted as ELF symbols, which allows us
to easily populate the kcrctab sections by relying on the linker to
associate each kcrctab slot with the correct value.
This has a couple of downsides:
- Given that the CRCs are treated as memory addresses, we waste 4 bytes
for each CRC on 64 bit architectures,
- On architectures that support runtime relocation, a R_<arch>_RELATIVE
relocation entry is emitted for each CRC value, which identifies it
as a quantity that requires fixing up based on the actual runtime
load offset of the kernel. This results in corrupted CRCs unless we
explicitly undo the fixup (and this is currently being handled in the
core module code)
- Such runtime relocation entries take up 24 bytes of __init space
each, resulting in a x8 overhead in [uncompressed] kernel size for
CRCs.
Switching to explicit 32 bit values on 64 bit architectures fixes most
of these issues, given that 32 bit values are not treated as quantities
that require fixing up based on the actual runtime load offset. Note
that on some ELF64 architectures [such as PPC64], these 32-bit values
are still emitted as [absolute] runtime relocatable quantities, even if
the value resolves to a build time constant. Since relative relocations
are always resolved at build time, this patch enables MODULE_REL_CRCS on
powerpc when CONFIG_RELOCATABLE=y, which turns the absolute CRC
references into relative references into .rodata where the actual CRC
value is stored.
So redefine all CRC fields and variables as u32, and redefine the
__CRC_SYMBOL() macro for 64 bit builds to emit the CRC reference using
inline assembler (which is necessary since 64-bit C code cannot use
32-bit types to hold memory addresses, even if they are ultimately
resolved using values that do not exceed 0xffffffff). To avoid
potential problems with legacy 32-bit architectures using legacy
toolchains, the equivalent C definition of the kcrctab entry is retained
for 32-bit architectures.
Note that this mostly reverts commit d4703aefdbc8 ("module: handle ppc64
relocating kcrctabs when CONFIG_RELOCATABLE=y")
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03 12:54:06 +03:00
const s32 * * crc ,
2020-07-30 09:10:26 +03:00
enum mod_license * license ,
2008-12-06 03:03:59 +03:00
bool gplok ,
bool warn )
2008-07-23 04:24:25 +04:00
{
struct find_symbol_arg fsa ;
fsa . name = name ;
fsa . gplok = gplok ;
fsa . warn = warn ;
2018-11-19 19:43:58 +03:00
if ( each_symbol_section ( find_exported_symbol_in_section , & fsa ) ) {
2008-07-23 04:24:25 +04:00
if ( owner )
* owner = fsa . owner ;
if ( crc )
* crc = fsa . crc ;
2020-07-30 09:10:26 +03:00
if ( license )
* license = fsa . license ;
2008-12-06 03:03:56 +03:00
return fsa . sym ;
2008-07-23 04:24:25 +04:00
}
2011-12-06 23:11:31 +04:00
pr_debug("Failed to find symbol %s\n", name);
2008-12-06 03:03:56 +03:00
return NULL ;
2005-04-17 02:20:36 +04:00
}
2015-07-28 23:22:14 +03:00
/*
 * Search for module by name: must hold module_mutex (or preempt disabled
 * for read-only access).
 */
2013-07-02 10:05:11 +04:00
static struct module * find_module_all ( const char * name , size_t len ,
2013-01-12 05:08:44 +04:00
bool even_unformed )
2005-04-17 02:20:36 +04:00
{
struct module * mod ;
2015-07-28 23:22:14 +03:00
module_assert_mutex_or_preempt ( ) ;
2015-05-27 04:39:35 +03:00
2019-12-03 09:14:04 +03:00
list_for_each_entry_rcu ( mod , & modules , list ,
lockdep_is_held ( & module_mutex ) ) {
2013-01-12 05:08:44 +04:00
if ( ! even_unformed & & mod - > state = = MODULE_STATE_UNFORMED )
continue ;
2013-07-02 10:05:11 +04:00
if ( strlen ( mod - > name ) = = len & & ! memcmp ( mod - > name , name , len ) )
2005-04-17 02:20:36 +04:00
return mod ;
}
return NULL ;
}
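The comparison in find_module_all() checks the stored name's length before calling memcmp(), which lets a caller pass a name that is not NUL-terminated at `len`. A small userspace sketch of that check, assuming a hypothetical helper `demo_name_matches`:

```c
#include <stdbool.h>
#include <string.h>

/*
 * True only if the stored name is exactly len bytes long and those bytes
 * equal the first len bytes of name. name may continue past len.
 */
static bool demo_name_matches(const char *stored, const char *name, size_t len)
{
	return strlen(stored) == len && memcmp(stored, name, len) == 0;
}
```

Without the strlen() test, a stored name such as "ext4" would wrongly match a three-byte query "ext".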
2013-01-12 05:08:44 +04:00
struct module * find_module ( const char * name )
{
2015-07-28 23:22:14 +03:00
module_assert_mutex ( ) ;
2013-07-02 10:05:11 +04:00
return find_module_all ( name , strlen ( name ) , false ) ;
2013-01-12 05:08:44 +04:00
}
2008-12-06 03:03:59 +03:00
EXPORT_SYMBOL_GPL ( find_module ) ;
2005-04-17 02:20:36 +04:00
# ifdef CONFIG_SMP
2009-02-20 10:29:08 +03:00
2010-03-10 12:56:10 +03:00
static inline void __percpu * mod_percpu ( struct module * mod )
2009-02-20 10:29:08 +03:00
{
2010-03-10 12:56:10 +03:00
return mod - > percpu ;
}
2009-02-20 10:29:08 +03:00
2013-07-03 04:36:29 +04:00
static int percpu_modalloc ( struct module * mod , struct load_info * info )
2010-03-10 12:56:10 +03:00
{
2013-07-03 04:36:29 +04:00
Elf_Shdr * pcpusec = & info - > sechdrs [ info - > index . pcpu ] ;
unsigned long align = pcpusec - > sh_addralign ;
if ( ! pcpusec - > sh_size )
return 0 ;
2009-02-20 10:29:08 +03:00
if ( align > PAGE_SIZE ) {
2013-11-13 03:11:28 +04:00
pr_warn ( " %s: per-cpu alignment %li > %li \n " ,
mod - > name , align , PAGE_SIZE ) ;
2009-02-20 10:29:08 +03:00
align = PAGE_SIZE ;
}
2013-07-03 04:36:29 +04:00
mod - > percpu = __alloc_reserved_percpu ( pcpusec - > sh_size , align ) ;
2010-03-10 12:56:10 +03:00
if ( ! mod - > percpu ) {
2013-11-13 03:11:28 +04:00
pr_warn ( " %s: Could not allocate %lu bytes percpu data \n " ,
mod - > name , ( unsigned long ) pcpusec - > sh_size ) ;
2010-03-10 12:56:10 +03:00
return - ENOMEM ;
}
2013-07-03 04:36:29 +04:00
mod - > percpu_size = pcpusec - > sh_size ;
2010-03-10 12:56:10 +03:00
return 0 ;
2009-02-20 10:29:08 +03:00
}
2010-03-10 12:56:10 +03:00
static void percpu_modfree ( struct module * mod )
2009-02-20 10:29:08 +03:00
{
2010-03-10 12:56:10 +03:00
free_percpu ( mod - > percpu ) ;
2009-02-20 10:29:08 +03:00
}
2010-08-05 22:59:10 +04:00
static unsigned int find_pcpusec ( struct load_info * info )
2009-02-20 10:29:07 +03:00
{
2010-08-05 22:59:10 +04:00
return find_sec(info, ".data..percpu");
2009-02-20 10:29:07 +03:00
}
2010-03-10 12:56:10 +03:00
static void percpu_modcopy ( struct module * mod ,
const void * from , unsigned long size )
2009-02-20 10:29:07 +03:00
{
int cpu ;
for_each_possible_cpu ( cpu )
2010-03-10 12:56:10 +03:00
memcpy ( per_cpu_ptr ( mod - > percpu , cpu ) , from , size ) ;
2009-02-20 10:29:07 +03:00
}
2017-02-27 17:37:36 +03:00
bool __is_module_percpu_address ( unsigned long addr , unsigned long * can_addr )
2010-03-10 12:57:54 +03:00
{
struct module * mod ;
unsigned int cpu ;
preempt_disable ( ) ;
list_for_each_entry_rcu ( mod , & modules , list ) {
2013-01-12 05:08:44 +04:00
if ( mod - > state = = MODULE_STATE_UNFORMED )
continue ;
2010-03-10 12:57:54 +03:00
if ( ! mod - > percpu_size )
continue ;
for_each_possible_cpu ( cpu ) {
void * start = per_cpu_ptr ( mod - > percpu , cpu ) ;
2017-02-27 17:37:36 +03:00
void * va = ( void * ) addr ;
2010-03-10 12:57:54 +03:00
2017-02-27 17:37:36 +03:00
if ( va > = start & & va < start + mod - > percpu_size ) {
2017-03-20 14:26:55 +03:00
if ( can_addr ) {
2017-02-27 17:37:36 +03:00
* can_addr = ( unsigned long ) ( va - start ) ;
2017-03-20 14:26:55 +03:00
* can_addr + = ( unsigned long )
per_cpu_ptr ( mod - > percpu ,
get_boot_cpu_id ( ) ) ;
}
2010-03-10 12:57:54 +03:00
preempt_enable ( ) ;
return true ;
}
}
}
preempt_enable ( ) ;
return false ;
2009-02-20 10:29:07 +03:00
}
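The function above tests whether an address falls inside any CPU's copy of a module's static per-cpu area and, if asked, rewrites it to the equivalent address in the boot CPU's copy. A simplified userspace sketch of that arithmetic follows. It assumes a fixed hypothetical `DEMO_NR_CPUS`, plain integer base addresses in place of per_cpu_ptr(), and that index 0 plays the role of the boot CPU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_CPUS 4	/* hypothetical CPU count for the sketch */

/*
 * base[cpu] is the start of that CPU's copy of the area, size its length.
 * On a hit, *can_addr (if non-NULL) receives the same offset applied to
 * the copy at index 0, which stands in for the boot CPU here.
 */
static bool demo_percpu_canonical(const uintptr_t base[DEMO_NR_CPUS],
				  size_t size, uintptr_t addr,
				  uintptr_t *can_addr)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		uintptr_t start = base[cpu];

		if (addr >= start && addr < start + size) {
			if (can_addr)
				*can_addr = (addr - start) + base[0];
			return true;
		}
	}
	return false;
}
```

Two addresses that name the same variable on different CPUs therefore map to one canonical value, which is what makes the result usable as a stable key.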
2017-02-27 17:37:36 +03:00
/**
 * is_module_percpu_address - test whether address is from module static percpu
 * @addr: address to test
 *
 * Test whether @addr belongs to module static percpu area.
 *
 * RETURNS:
 * %true if @addr is from module static percpu area
 */
bool is_module_percpu_address ( unsigned long addr )
{
return __is_module_percpu_address ( addr , NULL ) ;
}
2005-04-17 02:20:36 +04:00
# else /* ... !CONFIG_SMP */
2009-02-20 10:29:07 +03:00
2010-03-10 12:56:10 +03:00
static inline void __percpu * mod_percpu ( struct module * mod )
2005-04-17 02:20:36 +04:00
{
return NULL ;
}
2013-07-03 04:36:29 +04:00
static int percpu_modalloc ( struct module * mod , struct load_info * info )
2010-03-10 12:56:10 +03:00
{
2013-07-03 04:36:29 +04:00
/* UP modules shouldn't have this section: ENOMEM isn't quite right */
if ( info - > sechdrs [ info - > index . pcpu ] . sh_size ! = 0 )
return - ENOMEM ;
return 0 ;
2010-03-10 12:56:10 +03:00
}
static inline void percpu_modfree ( struct module * mod )
2005-04-17 02:20:36 +04:00
{
}
2010-08-05 22:59:10 +04:00
static unsigned int find_pcpusec ( struct load_info * info )
2005-04-17 02:20:36 +04:00
{
return 0 ;
}
2010-03-10 12:56:10 +03:00
static inline void percpu_modcopy ( struct module * mod ,
const void * from , unsigned long size )
2005-04-17 02:20:36 +04:00
{
/* pcpusec should be 0, and size of that section should be 0. */
BUG_ON ( size ! = 0 ) ;
}
2010-03-10 12:57:54 +03:00
bool is_module_percpu_address ( unsigned long addr )
{
return false ;
}
2009-02-20 10:29:07 +03:00
2017-02-27 17:37:36 +03:00
bool __is_module_percpu_address ( unsigned long addr , unsigned long * can_addr )
{
return false ;
}
2005-04-17 02:20:36 +04:00
# endif /* CONFIG_SMP */
[PATCH] modules: add version and srcversion to sysfs
This patch adds version and srcversion files to
/sys/module/${modulename} containing the version and srcversion fields
of the module's modinfo section (if present).
/sys/module/e1000
|-- srcversion
`-- version
This patch differs slightly from the version posted in January, as it
now uses the new kstrdup() call in -mm.
Why put this in sysfs?
a) Tools like DKMS, which deal with changing out individual kernel
modules without replacing the whole kernel, can behave smarter if they
can tell the version of a given module. The autoinstaller feature, for
example, which determines if your system has a "good" version of a
driver (i.e. if the one provided by DKMS has a newer version than that
provided by the kernel package installed), and to automatically compile
and install a newer version if DKMS has it but your kernel doesn't yet
have that version.
b) Because sysadmins manually, or with tools like DKMS, can switch out
modules on the file system, you can't count on 'modinfo foo.ko', which
looks at /lib/modules/${kernelver}/... actually matching what is loaded
into the kernel already. Hence asking sysfs for this.
c) as the unbind-driver-from-device work takes shape, it will be
possible to rebind a driver that's built-in (no .ko to modinfo for the
version) to a newly loaded module. sysfs will have the
currently-built-in version info, for comparison.
d) tech support scripts can then easily grab the version info for what's
running presently - a question I get often.
There has been renewed interest in this patch on linux-scsi by driver
authors.
As the idea originated from GregKH, I leave his Signed-off-by: intact,
though the implementation is nearly completely new. Compiled and run on
x86 and x86_64.
From: Matthew Dobson <colpatch@us.ibm.com>
build fix
From: Thierry Vignaud <tvignaud@mandriva.com>
build fix
From: Matthew Dobson <colpatch@us.ibm.com>
warning fix
Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
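As an illustration of how a userspace tool might consume the files described above, here is a minimal sketch that builds the /sys/module/<name>/<attr> path and reads its first line. The helper names `demo_modinfo_path` and `demo_read_modinfo` are hypothetical, and the file is only present when the module is loaded and carries that modinfo field.

```c
#include <stdio.h>
#include <string.h>

/* Build "/sys/module/<module>/<attr>"; 0 on success, -1 if it does not fit. */
static int demo_modinfo_path(char *buf, size_t size,
			     const char *module, const char *attr)
{
	int n = snprintf(buf, size, "/sys/module/%s/%s", module, attr);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Read the attribute's first line into out, without the trailing newline. */
static int demo_read_modinfo(const char *module, const char *attr,
			     char *out, size_t size)
{
	char path[256];
	FILE *f;

	if (demo_modinfo_path(path, sizeof(path), module, attr) != 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;	/* module not loaded, or field absent */
	if (!fgets(out, (int)size, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

A caller would use demo_read_modinfo("e1000", "version", buf, sizeof(buf)) and treat a -1 return as "no version information available".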
2005-06-24 09:05:15 +04:00
# define MODINFO_ATTR(field) \
static void setup_modinfo_ # # field ( struct module * mod , const char * s ) \
{ \
mod - > field = kstrdup ( s , GFP_KERNEL ) ; \
} \
static ssize_t show_modinfo_ # # field ( struct module_attribute * mattr , \
2011-07-24 16:36:04 +04:00
struct module_kobject * mk , char * buffer ) \
2005-06-24 09:05:15 +04:00
{ \
2013-08-20 10:04:21 +04:00
return scnprintf ( buffer , PAGE_SIZE , " %s \n " , mk - > mod - > field ) ; \
2005-06-24 09:05:15 +04:00
} \
static int modinfo_ # # field # # _exists ( struct module * mod ) \
{ \
return mod - > field ! = NULL ; \
} \
static void free_modinfo_ # # field ( struct module * mod ) \
{ \
2007-10-18 14:06:07 +04:00
kfree ( mod - > field ) ; \
mod - > field = NULL ; \
2005-06-24 09:05:15 +04:00
} \
static struct module_attribute modinfo_ # # field = { \
2007-06-13 22:45:17 +04:00
. attr = { . name = __stringify ( field ) , . mode = 0444 } , \
2005-06-24 09:05:15 +04:00
. show = show_modinfo_ # # field , \
. setup = setup_modinfo_ # # field , \
. test = modinfo_ # # field # # _exists , \
. free = free_modinfo_ # # field , \
} ;
MODINFO_ATTR ( version ) ;
MODINFO_ATTR ( srcversion ) ;
2008-01-25 23:08:33 +03:00
static char last_unloaded_module [ MODULE_NAME_LEN + 1 ] ;
2006-02-17 00:50:23 +03:00
# ifdef CONFIG_MODULE_UNLOAD
2010-03-29 22:25:18 +04:00
EXPORT_TRACEPOINT_SYMBOL ( module_get ) ;
2014-11-10 02:00:29 +03:00
/* MODULE_REF_BASE is the base reference count held by the module loader. */
# define MODULE_REF_BASE 1
2005-04-17 02:20:36 +04:00
/* Init the unload section of the module. */
2010-08-05 22:59:04 +04:00
static int module_unload_init ( struct module * mod )
2005-04-17 02:20:36 +04:00
{
2014-11-10 02:00:29 +03:00
/*
 * Initialize reference counter to MODULE_REF_BASE.
 * refcnt == 0 means module is going.
 */
atomic_set ( & mod - > refcnt , MODULE_REF_BASE ) ;
2010-08-05 22:59:04 +04:00
2010-05-31 23:19:37 +04:00
INIT_LIST_HEAD ( & mod - > source_list ) ;
INIT_LIST_HEAD ( & mod - > target_list ) ;
2010-01-05 09:34:50 +03:00
2005-04-17 02:20:36 +04:00
/* Hold reference count during initialization. */
2014-11-10 02:00:29 +03:00
atomic_inc ( & mod - > refcnt ) ;
2010-08-05 22:59:04 +04:00
return 0 ;
2005-04-17 02:20:36 +04:00
}
/* Does a already use b? */
static int already_uses ( struct module * a , struct module * b )
{
struct module_use * use ;
2010-05-31 23:19:37 +04:00
list_for_each_entry ( use , & b - > source_list , source_list ) {
if ( use - > source = = a ) {
2011-12-06 23:11:31 +04:00
pr_debug ( " %s uses %s! \n " , a - > name , b - > name ) ;
2005-04-17 02:20:36 +04:00
return 1 ;
}
}
2011-12-06 23:11:31 +04:00
pr_debug ( " %s does not use %s! \n " , a - > name , b - > name ) ;
2005-04-17 02:20:36 +04:00
return 0 ;
}
2010-05-31 23:19:37 +04:00
/*
 * Module a uses b
 *  - we add 'a' as a "source", 'b' as a "target" of module use
 *  - the module_use is added to the list of 'b' sources (so
 *    'b' can walk the list to see who sourced them), and of 'a'
 *    targets (so 'a' can see what modules it targets).
 */
static int add_module_usage ( struct module * a , struct module * b )
{
struct module_use * use ;
2011-12-06 23:11:31 +04:00
pr_debug ( " Allocating new usage for %s. \n " , a - > name ) ;
2010-05-31 23:19:37 +04:00
use = kmalloc ( sizeof ( * use ) , GFP_ATOMIC ) ;
2017-10-06 17:27:26 +03:00
if ( ! use )
2010-05-31 23:19:37 +04:00
return - ENOMEM ;
use - > source = a ;
use - > target = b ;
list_add ( & use - > source_list , & b - > source_list ) ;
list_add ( & use - > target_list , & a - > target_list ) ;
return 0 ;
}
2010-06-05 21:17:36 +04:00
/* Module a uses b: caller must hold module_mutex */
2020-07-30 09:10:20 +03:00
static int ref_module ( struct module * a , struct module * b )
2005-04-17 02:20:36 +04:00
{
2010-06-05 21:17:35 +04:00
int err ;
2007-01-18 15:26:15 +03:00
2010-06-05 21:17:37 +04:00
if ( b = = NULL | | already_uses ( a , b ) )
2010-05-26 03:48:30 +04:00
return 0 ;
2010-06-05 21:17:37 +04:00
/* If module isn't available, we fail. */
err = strong_try_module_get ( b ) ;
2008-01-30 01:13:18 +03:00
if ( err )
2010-06-05 21:17:37 +04:00
return err ;
2005-04-17 02:20:36 +04:00
2010-05-31 23:19:37 +04:00
err = add_module_usage ( a , b ) ;
if ( err ) {
2005-04-17 02:20:36 +04:00
module_put ( b ) ;
2010-06-05 21:17:37 +04:00
return err ;
2005-04-17 02:20:36 +04:00
}
2010-06-05 21:17:37 +04:00
return 0 ;
2005-04-17 02:20:36 +04:00
}
/* Clear the unload stuff of the module. */
static void module_unload_free ( struct module * mod )
{
2010-05-31 23:19:37 +04:00
struct module_use * use , * tmp ;
2005-04-17 02:20:36 +04:00
2010-06-05 21:17:36 +04:00
mutex_lock ( & module_mutex ) ;
2010-05-31 23:19:37 +04:00
list_for_each_entry_safe ( use , tmp , & mod - > target_list , target_list ) {
struct module * i = use - > target ;
2011-12-06 23:11:31 +04:00
pr_debug ( " %s unusing %s \n " , mod - > name , i - > name ) ;
2010-05-31 23:19:37 +04:00
module_put ( i ) ;
list_del ( & use - > source_list ) ;
list_del ( & use - > target_list ) ;
kfree ( use ) ;
2005-04-17 02:20:36 +04:00
}
2010-06-05 21:17:36 +04:00
mutex_unlock ( & module_mutex ) ;
2005-04-17 02:20:36 +04:00
}
# ifdef CONFIG_MODULE_FORCE_UNLOAD
2006-01-08 12:04:29 +03:00
static inline int try_force_unload ( unsigned int flags )
2005-04-17 02:20:36 +04:00
{
int ret = ( flags & O_TRUNC ) ;
if ( ret )
2013-01-21 10:47:39 +04:00
add_taint ( TAINT_FORCED_RMMOD , LOCKDEP_NOW_UNRELIABLE ) ;
2005-04-17 02:20:36 +04:00
return ret ;
}
# else
2006-01-08 12:04:29 +03:00
static inline int try_force_unload ( unsigned int flags )
2005-04-17 02:20:36 +04:00
{
return 0 ;
}
# endif /* CONFIG_MODULE_FORCE_UNLOAD */
2014-11-10 02:00:29 +03:00
/* Try to release refcount of module, 0 means success. */
static int try_release_module_ref ( struct module * mod )
2005-04-17 02:20:36 +04:00
{
2014-11-10 02:00:29 +03:00
int ret ;
2005-04-17 02:20:36 +04:00
2014-11-10 02:00:29 +03:00
/* Try to decrement refcnt which we set at loading */
ret = atomic_sub_return ( MODULE_REF_BASE , & mod - > refcnt ) ;
BUG_ON ( ret < 0 ) ;
if ( ret )
/* Someone can put this right now, recover with checking */
ret = atomic_add_unless ( & mod - > refcnt , MODULE_REF_BASE , 0 ) ;
2005-04-17 02:20:36 +04:00
2014-11-10 02:00:29 +03:00
return ret ;
}
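The release logic above drops the loader's base reference and, if other references remain, tries to put it back unless the count has meanwhile reached zero. A userspace sketch of that sequence using C11 atomics follows. `demo_add_unless` is a hand-written approximation of the kernel's atomic_add_unless(), and all `demo_` names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_REF_BASE 1	/* mirrors MODULE_REF_BASE in the code above */

/* Add a to *v unless *v == u; returns true if the add happened. */
static bool demo_add_unless(atomic_int *v, int a, int u)
{
	int cur = atomic_load(v);

	while (cur != u) {
		/* On failure, cur is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &cur, cur + a))
			return true;
	}
	return false;
}

/* Returns 0 if the base reference was released, nonzero if still in use. */
static int demo_try_release(atomic_int *refcnt)
{
	int ret = atomic_fetch_sub(refcnt, DEMO_REF_BASE) - DEMO_REF_BASE;

	if (ret)
		/* Still referenced: restore the base unless it hit zero. */
		ret = demo_add_unless(refcnt, DEMO_REF_BASE, 0);
	return ret;
}
```

If a concurrent put drops the count to zero between the two steps, the restore is skipped and the release is reported as successful, which is the race the comment in the original code refers to.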
2005-04-17 02:20:36 +04:00
2014-11-10 02:00:29 +03:00
static int try_stop_module ( struct module * mod , int flags , int * forced )
{
2008-07-23 04:24:25 +04:00
/* If it's not unused, quit unless we're forcing. */
2014-11-10 02:00:29 +03:00
if ( try_release_module_ref ( mod ) ! = 0 ) {
* forced = try_force_unload ( flags ) ;
if ( ! ( * forced ) )
2005-04-17 02:20:36 +04:00
return - EWOULDBLOCK ;
}
/* Mark it as dying. */
2014-11-10 02:00:29 +03:00
mod - > state = MODULE_STATE_GOING ;
2005-04-17 02:20:36 +04:00
2014-11-10 02:00:29 +03:00
return 0 ;
2005-04-17 02:20:36 +04:00
}
2015-01-22 03:43:14 +03:00
/**
 * module_refcount - return the refcount or -1 if unloading
 *
 * @mod: the module we're checking
 *
 * Returns:
 *	-1 if the module is in the process of unloading
 *	otherwise the number of references in the kernel to the module
 */
int module_refcount ( struct module * mod )
2005-04-17 02:20:36 +04:00
{
2015-01-22 03:43:14 +03:00
return atomic_read ( & mod - > refcnt ) - MODULE_REF_BASE ;
2005-04-17 02:20:36 +04:00
}
EXPORT_SYMBOL ( module_refcount ) ;
/* This exists whether we can unload or not */
static void free_module ( struct module * mod ) ;
2009-01-14 16:14:10 +03:00
SYSCALL_DEFINE2 ( delete_module , const char __user * , name_user ,
unsigned int , flags )
2005-04-17 02:20:36 +04:00
{
struct module * mod ;
2007-02-24 01:54:57 +03:00
char name [ MODULE_NAME_LEN ] ;
2005-04-17 02:20:36 +04:00
int ret , forced = 0 ;
2009-04-03 02:49:29 +04:00
if ( ! capable ( CAP_SYS_MODULE ) | | modules_disabled )
2007-02-24 01:54:57 +03:00
return - EPERM ;
if ( strncpy_from_user ( name , name_user , MODULE_NAME_LEN - 1 ) < 0 )
return - EFAULT ;
name[MODULE_NAME_LEN - 1] = '\0';
2017-05-02 17:16:04 +03:00
audit_log_kern_module ( name ) ;
2010-05-06 20:49:20 +04:00
if ( mutex_lock_interruptible ( & module_mutex ) ! = 0 )
return - EINTR ;
2005-04-17 02:20:36 +04:00
mod = find_module ( name ) ;
if ( ! mod ) {
ret = - ENOENT ;
goto out ;
}
2010-05-31 23:19:37 +04:00
if ( ! list_empty ( & mod - > source_list ) ) {
2005-04-17 02:20:36 +04:00
/* Other modules depend on us: get rid of them first. */
ret = - EWOULDBLOCK ;
goto out ;
}
/* Doing init or already dying? */
if ( mod - > state ! = MODULE_STATE_LIVE ) {
2013-09-17 00:18:51 +04:00
/* FIXME: if (force), slam module count damn the torpedoes */
2011-12-06 23:11:31 +04:00
pr_debug ( " %s already dying \n " , mod - > name ) ;
2005-04-17 02:20:36 +04:00
ret = - EBUSY ;
goto out ;
}
/* If it has an init func, it must have an exit func to unload */
2007-10-17 10:26:27 +04:00
if ( mod - > init & & ! mod - > exit ) {
2006-01-08 12:04:29 +03:00
forced = try_force_unload ( flags ) ;
2005-04-17 02:20:36 +04:00
if ( ! forced ) {
/* This module can't be removed */
ret = - EBUSY ;
goto out ;
}
}
/* Stop the machine so refcounts can't move and disable module. */
ret = try_stop_module ( mod , flags , & forced ) ;
if ( ret ! = 0 )
goto out ;
2008-04-21 16:34:31 +04:00
mutex_unlock ( & module_mutex ) ;
2011-03-31 05:57:33 +04:00
/* Final destruction now no one is using it. */
2008-04-21 16:34:31 +04:00
if ( mod - > exit ! = NULL )
2005-04-17 02:20:36 +04:00
mod - > exit ( ) ;
2008-04-21 16:34:31 +04:00
blocking_notifier_call_chain ( & module_notify_list ,
MODULE_STATE_GOING , mod ) ;
2016-03-17 03:55:39 +03:00
klp_module_going ( mod ) ;
2016-02-17 01:32:33 +03:00
ftrace_release_mod ( mod ) ;
2009-01-07 19:45:46 +03:00
async_synchronize_full ( ) ;
2010-06-05 21:17:36 +04:00
2008-01-25 23:08:33 +03:00
/* Store the name of the last unloaded module for diagnostic purposes */
2008-01-30 01:13:20 +03:00
strlcpy ( last_unloaded_module , mod - > name , sizeof ( last_unloaded_module ) ) ;
2005-04-17 02:20:36 +04:00
2010-06-05 21:17:36 +04:00
free_module ( mod ) ;
2019-11-13 12:29:50 +03:00
/* someone could wait for the module in add_unformed_module() */
wake_up_all ( & module_wq ) ;
2010-06-05 21:17:36 +04:00
return 0 ;
out :
2006-03-23 14:00:46 +03:00
mutex_unlock ( & module_mutex ) ;
2005-04-17 02:20:36 +04:00
return ret ;
}
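From userspace, the system call defined above is normally reached through rmmod or a raw syscall(2) invocation. The following sketch shows the raw form, assuming a Linux system whose headers define SYS_delete_module; the wrapper name `demo_unload_module` is hypothetical. Per the code above, O_TRUNC in the flags requests a forced unload, which only has an effect when CONFIG_MODULE_FORCE_UNLOAD is enabled.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Ask the kernel to unload a module by name.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int demo_unload_module(const char *name, int force)
{
	unsigned int flags = O_NONBLOCK;

	if (force)
		flags |= O_TRUNC;	/* checked by try_force_unload() */
	return (int)syscall(SYS_delete_module, name, flags);
}
```

Reading the function above, a caller should expect EPERM without CAP_SYS_MODULE or when module loading is disabled, ENOENT for an unknown name, EWOULDBLOCK when other modules depend on it or it is still referenced, and EBUSY when it is not live or has an init function but no exit function.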
2008-12-08 09:26:29 +03:00
static inline void print_unload_info ( struct seq_file * m , struct module * mod )
2005-04-17 02:20:36 +04:00
{
struct module_use * use ;
int printed_something = 0 ;
2015-01-22 03:43:14 +03:00
seq_printf ( m , " %i " , module_refcount ( mod ) ) ;
2005-04-17 02:20:36 +04:00
2014-11-10 02:01:29 +03:00
/*
 * Always include a trailing , so userspace can differentiate
 * between this and the old multi-field proc format.
 */
2010-05-31 23:19:37 +04:00
list_for_each_entry ( use , & mod - > source_list , source_list ) {
2005-04-17 02:20:36 +04:00
printed_something = 1 ;
2010-05-31 23:19:37 +04:00
seq_printf ( m , " %s, " , use - > source - > name ) ;
2005-04-17 02:20:36 +04:00
}
if ( mod - > init ! = NULL & & mod - > exit = = NULL ) {
printed_something = 1 ;
2014-11-10 02:01:29 +03:00
seq_puts ( m , " [permanent], " ) ;
2005-04-17 02:20:36 +04:00
}
if ( ! printed_something )
2014-11-10 02:01:29 +03:00
seq_puts ( m , " - " ) ;
2005-04-17 02:20:36 +04:00
}
void __symbol_put ( const char * symbol )
{
struct module * owner ;
2007-07-16 10:41:46 +04:00
preempt_disable ( ) ;
2020-07-30 09:10:26 +03:00
if ( ! find_symbol ( symbol , & owner , NULL , NULL , true , false ) )
2005-04-17 02:20:36 +04:00
BUG ( ) ;
module_put ( owner ) ;
2007-07-16 10:41:46 +04:00
preempt_enable ( ) ;
2005-04-17 02:20:36 +04:00
}
EXPORT_SYMBOL ( __symbol_put ) ;
2009-08-26 16:32:54 +04:00
/* Note this assumes addr is a function, which it currently always is. */
2005-04-17 02:20:36 +04:00
void symbol_put_addr ( void * addr )
{
2006-05-15 20:44:06 +04:00
struct module * modaddr ;
2009-08-26 16:32:54 +04:00
unsigned long a = ( unsigned long ) dereference_function_descriptor ( addr ) ;
2005-04-17 02:20:36 +04:00
2009-08-26 16:32:54 +04:00
if ( core_kernel_text ( a ) )
2006-05-15 20:44:06 +04:00
return ;
2005-04-17 02:20:36 +04:00
2015-08-20 04:04:59 +03:00
/*
 * Even though we hold a reference on the module, we still need to
 * disable preemption in order to safely traverse the data structure.
 */
preempt_disable ( ) ;
2009-08-26 16:32:54 +04:00
modaddr = __module_text_address ( a ) ;
2009-03-31 23:05:31 +04:00
BUG_ON ( ! modaddr ) ;
2006-05-15 20:44:06 +04:00
module_put ( modaddr ) ;
2015-08-20 04:04:59 +03:00
preempt_enable ( ) ;
2005-04-17 02:20:36 +04:00
}
EXPORT_SYMBOL_GPL ( symbol_put_addr ) ;
static ssize_t show_refcnt ( struct module_attribute * mattr ,
2011-07-24 16:36:04 +04:00
struct module_kobject * mk , char * buffer )
2005-04-17 02:20:36 +04:00
{
2015-01-22 03:43:14 +03:00
return sprintf ( buffer , " %i \n " , module_refcount ( mk - > mod ) ) ;
2005-04-17 02:20:36 +04:00
}
2012-01-13 03:02:15 +04:00
static struct module_attribute modinfo_refcnt =
__ATTR ( refcnt , 0444 , show_refcnt , NULL ) ;
2005-04-17 02:20:36 +04:00
2012-03-26 06:20:52 +04:00
void __module_get ( struct module * module )
{
if ( module ) {
preempt_disable ( ) ;
2014-11-10 01:59:29 +03:00
atomic_inc ( & module - > refcnt ) ;
2012-03-26 06:20:52 +04:00
trace_module_get ( module , _RET_IP_ ) ;
preempt_enable ( ) ;
}
}
EXPORT_SYMBOL ( __module_get ) ;
bool try_module_get ( struct module * module )
{
bool ret = true ;
if ( module ) {
preempt_disable ( ) ;
2014-11-10 02:00:29 +03:00
/* Note: here, we can fail to get a reference */
if ( likely ( module_is_live ( module ) & &
atomic_inc_not_zero ( & module - > refcnt ) ! = 0 ) )
2012-03-26 06:20:52 +04:00
trace_module_get ( module , _RET_IP_ ) ;
2014-11-10 02:00:29 +03:00
else
2012-03-26 06:20:52 +04:00
ret = false ;
preempt_enable ( ) ;
}
return ret ;
}
EXPORT_SYMBOL ( try_module_get ) ;
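The get/put pattern above can be sketched in userspace with C11 atomics. This is a minimal illustration, not the kernel implementation: `struct fake_module`, `fake_module_init()`, `inc_not_zero()`, `fake_try_get()` and `fake_put()` are hypothetical names, and the compare-and-swap loops only approximate what `atomic_inc_not_zero()` and `atomic_dec_if_positive()` are assumed to do.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct module: only what the sketch needs. */
struct fake_module {
	atomic_int refcnt;
	bool live;
};

static void fake_module_init(struct fake_module *m, int refs, bool live)
{
	assert(m != NULL);
	atomic_init(&m->refcnt, refs);
	m->live = live;
}

/* Increment *v unless it is zero; returns true if the increment happened. */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is refreshed with the current value; retry. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Mirrors the shape of try_module_get(): a NULL module always succeeds. */
static bool fake_try_get(struct fake_module *m)
{
	if (!m)
		return true;
	return m->live && inc_not_zero(&m->refcnt);
}

/* Decrement only while positive; returns the new count, or -1 on underflow. */
static int fake_put(struct fake_module *m)
{
	int old;

	assert(m != NULL);
	old = atomic_load(&m->refcnt);
	while (old > 0) {
		if (atomic_compare_exchange_weak(&m->refcnt, &old, old - 1))
			return old - 1;
	}
	return -1;
}
```

The point of the "not zero" check is that a count which has already dropped to zero is never revived, so a caller racing with teardown gets a failure instead of a reference to an object that is going away.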
2006-10-18 09:47:25 +04:00
void module_put ( struct module * module )
{
2014-11-10 02:00:29 +03:00
int ret ;
2006-10-18 09:47:25 +04:00
if ( module ) {
2010-01-05 09:34:50 +03:00
preempt_disable ( ) ;
2014-11-10 02:00:29 +03:00
ret = atomic_dec_if_positive ( & module - > refcnt ) ;
WARN_ON ( ret < 0 ) ; /* Failed to put refcount */
2010-03-24 05:57:43 +03:00
trace_module_put ( module , _RET_IP_ ) ;
2010-01-05 09:34:50 +03:00
preempt_enable ( ) ;
2006-10-18 09:47:25 +04:00
}
}
EXPORT_SYMBOL ( module_put ) ;
2005-04-17 02:20:36 +04:00
# else /* !CONFIG_MODULE_UNLOAD */
2008-12-08 09:26:29 +03:00
static inline void print_unload_info ( struct seq_file * m , struct module * mod )
2005-04-17 02:20:36 +04:00
{
/* We don't know the usage count, or what modules are using. */
2014-11-10 02:01:29 +03:00
seq_puts ( m , " - - " ) ;
2005-04-17 02:20:36 +04:00
}
static inline void module_unload_free ( struct module * mod )
{
}
2020-07-30 09:10:20 +03:00
static int ref_module ( struct module * a , struct module * b )
2005-04-17 02:20:36 +04:00
{
2010-06-05 21:17:37 +04:00
return strong_try_module_get ( b ) ;
2005-04-17 02:20:36 +04:00
}
2010-08-05 22:59:04 +04:00
static inline int module_unload_init ( struct module * mod )
2005-04-17 02:20:36 +04:00
{
2010-08-05 22:59:04 +04:00
return 0 ;
2005-04-17 02:20:36 +04:00
}
# endif /* CONFIG_MODULE_UNLOAD */
2012-01-16 03:32:55 +04:00
static size_t module_flags_taint ( struct module * mod , char * buf )
{
size_t l = 0 ;
2016-09-21 14:47:22 +03:00
int i ;
for ( i = 0 ; i < TAINT_FLAGS_COUNT ; i + + ) {
if ( taint_flags [ i ] . module & & test_bit ( i , & mod - > taints ) )
2017-01-02 05:25:25 +03:00
buf [ l + + ] = taint_flags [ i ] . c_true ;
2016-09-21 14:47:22 +03:00
}
2012-01-16 03:32:55 +04:00
return l ;
}
2006-11-24 14:15:25 +03:00
static ssize_t show_initstate ( struct module_attribute * mattr ,
2011-07-24 16:36:04 +04:00
struct module_kobject * mk , char * buffer )
2006-11-24 14:15:25 +03:00
{
const char * state = " unknown " ;
2011-07-24 16:36:04 +04:00
switch ( mk - > mod - > state ) {
2006-11-24 14:15:25 +03:00
case MODULE_STATE_LIVE :
state = " live " ;
break ;
case MODULE_STATE_COMING :
state = " coming " ;
break ;
case MODULE_STATE_GOING :
state = " going " ;
break ;
2013-01-12 05:08:44 +04:00
default :
BUG ( ) ;
2006-11-24 14:15:25 +03:00
}
return sprintf ( buffer , " %s \n " , state ) ;
}
2012-01-13 03:02:15 +04:00
static struct module_attribute modinfo_initstate =
__ATTR ( initstate , 0444 , show_initstate , NULL ) ;
2006-11-24 14:15:25 +03:00
2011-07-24 16:36:04 +04:00
static ssize_t store_uevent ( struct module_attribute * mattr ,
struct module_kobject * mk ,
const char * buffer , size_t count )
{
2018-12-05 14:27:44 +03:00
int rc ;
rc = kobject_synth_uevent ( & mk - > kobj , buffer , count ) ;
return rc ? rc : count ;
2011-07-24 16:36:04 +04:00
}
2012-01-13 03:02:15 +04:00
struct module_attribute module_uevent =
__ATTR ( uevent , 0200 , NULL , store_uevent ) ;
static ssize_t show_coresize ( struct module_attribute * mattr ,
struct module_kobject * mk , char * buffer )
{
2015-11-26 02:14:08 +03:00
return sprintf ( buffer , " %u \n " , mk - > mod - > core_layout . size ) ;
2012-01-13 03:02:15 +04:00
}
static struct module_attribute modinfo_coresize =
__ATTR ( coresize , 0444 , show_coresize , NULL ) ;
static ssize_t show_initsize ( struct module_attribute * mattr ,
struct module_kobject * mk , char * buffer )
{
2015-11-26 02:14:08 +03:00
return sprintf ( buffer , " %u \n " , mk - > mod - > init_layout . size ) ;
2012-01-13 03:02:15 +04:00
}
static struct module_attribute modinfo_initsize =
__ATTR ( initsize , 0444 , show_initsize , NULL ) ;
static ssize_t show_taint ( struct module_attribute * mattr ,
struct module_kobject * mk , char * buffer )
{
size_t l ;
l = module_flags_taint ( mk - > mod , buffer ) ;
buffer [ l + + ] = ' \n ' ;
return l ;
}
static struct module_attribute modinfo_taint =
__ATTR ( taint , 0444 , show_taint , NULL ) ;
2011-07-24 16:36:04 +04:00
2006-02-17 00:50:23 +03:00
static struct module_attribute * modinfo_attrs [ ] = {
2012-01-13 03:02:15 +04:00
& module_uevent ,
2006-02-17 00:50:23 +03:00
& modinfo_version ,
& modinfo_srcversion ,
2012-01-13 03:02:15 +04:00
& modinfo_initstate ,
& modinfo_coresize ,
& modinfo_initsize ,
& modinfo_taint ,
2006-02-17 00:50:23 +03:00
# ifdef CONFIG_MODULE_UNLOAD
2012-01-13 03:02:15 +04:00
& modinfo_refcnt ,
2006-02-17 00:50:23 +03:00
# endif
NULL ,
} ;
2005-04-17 02:20:36 +04:00
static const char vermagic [ ] = VERMAGIC_STRING ;
2009-03-31 23:05:33 +04:00
static int try_to_force_load ( struct module * mod , const char * reason )
2008-05-05 04:04:16 +04:00
{
# ifdef CONFIG_MODULE_FORCE_LOAD
2008-10-16 09:01:41 +04:00
if ( ! test_taint ( TAINT_FORCED_MODULE ) )
2013-11-13 03:11:28 +04:00
pr_warn ( " %s: %s: kernel tainted. \n " , mod - > name , reason ) ;
2013-01-21 10:47:39 +04:00
add_taint_module ( mod , TAINT_FORCED_MODULE , LOCKDEP_NOW_UNRELIABLE ) ;
2008-05-05 04:04:16 +04:00
return 0 ;
# else
return - ENOEXEC ;
# endif
}
2005-04-17 02:20:36 +04:00
# ifdef CONFIG_MODVERSIONS
modversions: treat symbol CRCs as 32 bit quantities
The modversion symbol CRCs are emitted as ELF symbols, which allows us
to easily populate the kcrctab sections by relying on the linker to
associate each kcrctab slot with the correct value.
This has a couple of downsides:
- Given that the CRCs are treated as memory addresses, we waste 4 bytes
for each CRC on 64 bit architectures,
- On architectures that support runtime relocation, a R_<arch>_RELATIVE
relocation entry is emitted for each CRC value, which identifies it
as a quantity that requires fixing up based on the actual runtime
load offset of the kernel. This results in corrupted CRCs unless we
explicitly undo the fixup (and this is currently being handled in the
core module code)
- Such runtime relocation entries take up 24 bytes of __init space
each, resulting in a x8 overhead in [uncompressed] kernel size for
CRCs.
Switching to explicit 32 bit values on 64 bit architectures fixes most
of these issues, given that 32 bit values are not treated as quantities
that require fixing up based on the actual runtime load offset. Note
that on some ELF64 architectures [such as PPC64], these 32-bit values
are still emitted as [absolute] runtime relocatable quantities, even if
the value resolves to a build time constant. Since relative relocations
are always resolved at build time, this patch enables MODULE_REL_CRCS on
powerpc when CONFIG_RELOCATABLE=y, which turns the absolute CRC
references into relative references into .rodata where the actual CRC
value is stored.
So redefine all CRC fields and variables as u32, and redefine the
__CRC_SYMBOL() macro for 64 bit builds to emit the CRC reference using
inline assembler (which is necessary since 64-bit C code cannot use
32-bit types to hold memory addresses, even if they are ultimately
resolved using values that do not exceed 0xffffffff). To avoid
potential problems with legacy 32-bit architectures using legacy
toolchains, the equivalent C definition of the kcrctab entry is retained
for 32-bit architectures.
Note that this mostly reverts commit d4703aefdbc8 ("module: handle ppc64
relocating kcrctabs when CONFIG_RELOCATABLE=y")
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03 12:54:06 +03:00
static u32 resolve_rel_crc ( const s32 * crc )
2009-12-16 01:28:32 +03:00
{
2017-02-03 12:54:06 +03:00
return * ( u32 * ) ( ( void * ) crc + * crc ) ;
2009-12-16 01:28:32 +03:00
}
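The self-relative lookup done by resolve_rel_crc() can be tried out in userspace. The sketch below is an illustration under stated assumptions: `resolve_rel()`, `struct crc_demo` and `crc_demo_init()` are hypothetical names, and the two-field struct merely stands in for a reference slot that points at a CRC stored elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Resolve a self-relative 32-bit reference: the slot holds the signed byte
 * distance from the slot itself to the value it refers to.
 */
static uint32_t resolve_rel(const int32_t *slot)
{
	uint32_t val;

	assert(slot != NULL);
	/* memcpy avoids assuming the target is suitably aligned for a u32 load. */
	memcpy(&val, (const char *)slot + *slot, sizeof(val));
	return val;
}

/* Hypothetical layout: a reference slot followed by the CRC it points at. */
struct crc_demo {
	int32_t ref;	/* byte offset from &ref to &crc */
	uint32_t crc;
};

static void crc_demo_init(struct crc_demo *d, uint32_t crc)
{
	assert(d != NULL);
	d->crc = crc;
	d->ref = (int32_t)((const char *)&d->crc - (const char *)&d->ref);
}
```

Because the stored quantity is a distance rather than an address, it stays valid wherever the image is loaded, which is why no runtime fixup is needed for it.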
2017-04-22 01:35:26 +03:00
static int check_version ( const struct load_info * info ,
2005-04-17 02:20:36 +04:00
const char * symname ,
2014-11-10 02:01:29 +03:00
struct module * mod ,
2017-02-03 12:54:06 +03:00
const s32 * crc )
2005-04-17 02:20:36 +04:00
{
2017-04-22 01:35:26 +03:00
Elf_Shdr * sechdrs = info - > sechdrs ;
unsigned int versindex = info - > index . vers ;
2005-04-17 02:20:36 +04:00
unsigned int i , num_versions ;
struct modversion_info * versions ;
/* Exporting module didn't supply crcs? OK, we're already tainted. */
if ( ! crc )
return 1 ;
2008-05-09 10:24:21 +04:00
/* No versions at all? modprobe --force does this. */
if ( versindex = = 0 )
return try_to_force_load ( mod , symname ) = = 0 ;
2005-04-17 02:20:36 +04:00
versions = ( void * ) sechdrs [ versindex ] . sh_addr ;
num_versions = sechdrs [ versindex ] . sh_size
/ sizeof ( struct modversion_info ) ;
for ( i = 0 ; i < num_versions ; i + + ) {
2017-02-03 12:54:06 +03:00
u32 crcval ;
2005-04-17 02:20:36 +04:00
if ( strcmp ( versions [ i ] . name , symname ) ! = 0 )
continue ;
2017-02-03 12:54:06 +03:00
if ( IS_ENABLED ( CONFIG_MODULE_REL_CRCS ) )
crcval = resolve_rel_crc ( crc ) ;
else
crcval = * crc ;
if ( versions [ i ] . crc = = crcval )
2005-04-17 02:20:36 +04:00
return 1 ;
2017-02-03 12:54:06 +03:00
pr_debug ( " Found checksum %X vs module %lX \n " ,
crcval , versions [ i ] . crc ) ;
2008-05-05 04:04:16 +04:00
goto bad_version ;
2005-04-17 02:20:36 +04:00
}
2008-05-05 04:04:16 +04:00
2016-11-30 02:20:14 +03:00
/* Broken toolchain. Warn once, then let it go.. */
2017-04-22 01:35:27 +03:00
pr_warn_once ( " %s: no symbol version for %s \n " , info - > name , symname ) ;
2016-11-30 02:20:14 +03:00
return 1 ;
2008-05-05 04:04:16 +04:00
bad_version :
2014-11-10 02:01:29 +03:00
pr_warn ( " %s: disagrees about version of symbol %s \n " ,
2017-04-22 01:35:27 +03:00
info - > name , symname ) ;
2008-05-05 04:04:16 +04:00
return 0 ;
2005-04-17 02:20:36 +04:00
}
2017-04-22 01:35:26 +03:00
static inline int check_modstruct_version ( const struct load_info * info ,
2005-04-17 02:20:36 +04:00
struct module * mod )
{
2017-02-03 12:54:06 +03:00
const s32 * crc ;
2005-04-17 02:20:36 +04:00
2015-05-27 04:39:35 +03:00
/*
* Since this should be found in the kernel (which can't be removed), no
* locking is necessary -- use preempt_disable() to placate lockdep.
*/
preempt_disable ( ) ;
2020-07-30 09:10:26 +03:00
if ( ! find_symbol ( " module_layout " , NULL , & crc , NULL , true , false ) ) {
2015-05-27 04:39:35 +03:00
preempt_enable ( ) ;
2005-04-17 02:20:36 +04:00
BUG ( ) ;
2015-05-27 04:39:35 +03:00
}
preempt_enable ( ) ;
2018-06-23 18:37:44 +03:00
return check_version ( info , " module_layout " , mod , crc ) ;
2005-04-17 02:20:36 +04:00
}
2008-05-09 10:25:28 +04:00
/* First part is kernel version, which we ignore if module has crcs. */
static inline int same_magic ( const char * amagic , const char * bmagic ,
bool has_crcs )
2005-04-17 02:20:36 +04:00
{
2008-05-09 10:25:28 +04:00
if ( has_crcs ) {
amagic + = strcspn ( amagic , " " ) ;
bmagic + = strcspn ( bmagic , " " ) ;
}
2005-04-17 02:20:36 +04:00
return strcmp ( amagic , bmagic ) = = 0 ;
}
# else
2017-04-22 01:35:26 +03:00
static inline int check_version ( const struct load_info * info ,
2005-04-17 02:20:36 +04:00
const char * symname ,
2014-11-10 02:01:29 +03:00
struct module * mod ,
2017-02-03 12:54:06 +03:00
const s32 * crc )
2005-04-17 02:20:36 +04:00
{
return 1 ;
}
2017-04-22 01:35:26 +03:00
static inline int check_modstruct_version ( const struct load_info * info ,
2005-04-17 02:20:36 +04:00
struct module * mod )
{
return 1 ;
}
2008-05-09 10:25:28 +04:00
static inline int same_magic ( const char * amagic , const char * bmagic ,
bool has_crcs )
2005-04-17 02:20:36 +04:00
{
return strcmp ( amagic , bmagic ) = = 0 ;
}
# endif /* CONFIG_MODVERSIONS */
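The effect of the has_crcs flag in same_magic() is easiest to see on concrete strings. The following userspace sketch copies the comparison logic under a hypothetical name, `same_magic_sketch()`; the version-magic strings used with it are made-up examples, not values taken from a real build.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Compare two version-magic strings. When has_crcs is true, everything up to
 * the first space (the kernel version) is skipped on both sides.
 */
static bool same_magic_sketch(const char *amagic, const char *bmagic,
			      bool has_crcs)
{
	assert(amagic != NULL && bmagic != NULL);
	if (has_crcs) {
		/* strcspn() gives the length of the leading run without a space. */
		amagic += strcspn(amagic, " ");
		bmagic += strcspn(bmagic, " ");
	}
	return strcmp(amagic, bmagic) == 0;
}
```

With has_crcs set, "5.8.0 SMP mod_unload" and "5.8.1 SMP mod_unload" compare equal because only the part after the version is checked; without it, the differing version makes the comparison fail.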
module: add support for symbol namespaces.
The EXPORT_SYMBOL_NS() and EXPORT_SYMBOL_NS_GPL() macros can be used to
export a symbol to a specific namespace. There are no _GPL_FUTURE and
_UNUSED variants because these are currently unused, and I'm not sure
they are necessary.
I didn't add EXPORT_SYMBOL_NS() for ASM exports; this patch sets the
namespace of ASM exports to NULL by default. In case of relative
references, it will be relocatable to NULL. If there's a need, this
should be pretty easy to add.
A module that wants to use a symbol exported to a namespace must add a
MODULE_IMPORT_NS() statement to their module code; otherwise, modpost
will complain when building the module, and the kernel module loader
will emit an error and fail when loading the module.
MODULE_IMPORT_NS() adds a modinfo tag 'import_ns' to the module. That
tag can be observed by the modinfo command, modpost and kernel/module.c
at the time of loading the module.
The ELF symbols are renamed to include the namespace with an asm label;
for example, symbol 'usb_stor_suspend' in namespace USB_STORAGE becomes
'usb_stor_suspend.USB_STORAGE'. This allows modpost to do namespace
checking, without having to go through all the effort of parsing ELF and
relocation records just to get to the struct kernel_symbols.
On x86_64 I saw no difference in binary size (compression), but at
runtime this will require a word of memory per export to hold the
namespace. An alternative could be to store namespaced symbols in their
own section and use a separate 'struct namespaced_kernel_symbol' for
that section, at the cost of making the module loader more complex.
Co-developed-by: Martijn Coenen <maco@android.com>
Signed-off-by: Martijn Coenen <maco@android.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
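The import check described in the commit message above can be sketched in userspace as a walk over "tag=value" strings. This is an illustration only: `next_tag()` and `imports_namespace()` are hypothetical helpers, and the buffer of NUL-separated entries is an assumed stand-in for the module's modinfo data rather than the loader's actual parsing code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the value of the next "tag=value" entry after prev (or the first
 * one when prev is NULL) in a buffer of NUL-terminated strings, else NULL.
 */
static const char *next_tag(const char *buf, size_t size, const char *tag,
			    const char *prev)
{
	size_t taglen = strlen(tag);
	const char *end = buf + size;
	const char *p = buf;

	if (prev)
		p = prev + strlen(prev) + 1;	/* step past the previous value */

	while (p < end) {
		const char *nul = memchr(p, '\0', (size_t)(end - p));

		if (!nul)
			break;	/* unterminated tail: stop scanning */
		if ((size_t)(nul - p) > taglen &&
		    strncmp(p, tag, taglen) == 0 && p[taglen] == '=')
			return p + taglen + 1;
		p = nul + 1;
	}
	return NULL;
}

/* True when ns is empty or appears as an "import_ns" value in the buffer. */
static bool imports_namespace(const char *buf, size_t size, const char *ns)
{
	const char *v;

	assert(buf != NULL && ns != NULL);
	if (!ns[0])
		return true;	/* symbols without a namespace need no import */
	for (v = next_tag(buf, size, "import_ns", NULL); v;
	     v = next_tag(buf, size, "import_ns", v))
		if (strcmp(v, ns) == 0)
			return true;
	return false;
}
```

A module may carry several import_ns entries, which is why the lookup has to continue from the previous match instead of stopping at the first one.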
2019-09-06 13:32:27 +03:00
static char * get_modinfo ( const struct load_info * info , const char * tag ) ;
static char * get_next_modinfo ( const struct load_info * info , const char * tag ,
char * prev ) ;
static int verify_namespace_is_imported ( const struct load_info * info ,
const struct kernel_symbol * sym ,
struct module * mod )
{
const char * namespace ;
char * imported_namespace ;
namespace = kernel_symbol_namespace ( sym ) ;
2019-10-18 12:31:43 +03:00
if ( namespace & & namespace [ 0 ] ) {
2019-09-06 13:32:27 +03:00
imported_namespace = get_modinfo ( info , " import_ns " ) ;
while ( imported_namespace ) {
if ( strcmp ( namespace , imported_namespace ) = = 0 )
return 0 ;
imported_namespace = get_next_modinfo (
info , " import_ns " , imported_namespace ) ;
}
2019-09-06 13:32:29 +03:00
# ifdef CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS
pr_warn (
# else
pr_err (
# endif
" %s: module uses symbol (%s) from namespace %s, but does not import it. \n " ,
mod - > name , kernel_symbol_name ( sym ) , namespace ) ;
# ifndef CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS
2019-09-06 13:32:27 +03:00
return - EINVAL ;
2019-09-06 13:32:29 +03:00
# endif
2019-09-06 13:32:27 +03:00
}
return 0 ;
}
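The namespace check above walks every 'import_ns' modinfo entry and compares it with the namespace of the symbol. A minimal userspace sketch of that walk follows, assuming the modinfo data is a run of NUL-separated "tag=value" strings. The names modinfo_blob, next_modinfo, namespace_is_imported and the demo data are made up for illustration and are not the kernel's own helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the modinfo data: NUL-separated "tag=value" strings. */
struct modinfo_blob {
	const char *data;
	size_t size;	/* includes the final NUL terminator */
};

/* Demo data (made up): one license tag and two imported namespaces. */
static const char demo_modinfo[] =
	"license=GPL\0import_ns=USB_STORAGE\0import_ns=CRYPTO_INTERNAL";
static const struct modinfo_blob demo_blob = {
	demo_modinfo, sizeof(demo_modinfo)
};

/*
 * Return the value of the next "tag=" entry after prev, or of the first
 * matching entry when prev is NULL. Returns NULL when none is left.
 */
static const char *next_modinfo(const struct modinfo_blob *blob,
				const char *tag, const char *prev)
{
	size_t taglen = strlen(tag);
	const char *p = blob->data;
	const char *end = blob->data + blob->size;

	assert(blob != NULL && tag != NULL);
	if (prev)
		p = prev + strlen(prev) + 1;	/* step past the previous value */

	while (p < end) {
		if (*p == '\0') {		/* skip padding between entries */
			p++;
			continue;
		}
		if (strncmp(p, tag, taglen) == 0 && p[taglen] == '=')
			return p + taglen + 1;
		p += strlen(p) + 1;
	}
	return NULL;
}

/* Return 1 if ns is empty or listed under "import_ns", else 0. */
static int namespace_is_imported(const struct modinfo_blob *blob,
				 const char *ns)
{
	const char *imported;

	if (!ns || !ns[0])
		return 1;	/* symbol is not namespaced, nothing to check */

	for (imported = next_modinfo(blob, "import_ns", NULL); imported;
	     imported = next_modinfo(blob, "import_ns", imported))
		if (strcmp(ns, imported) == 0)
			return 1;
	return 0;
}
```

With the demo data, namespace_is_imported(&demo_blob, "USB_STORAGE") reports a match, while a namespace that is only a prefix of an imported one (such as "USB") does not, because the comparison is a full strcmp().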
2020-07-29 00:33:33 +03:00
static bool inherit_taint ( struct module * mod , struct module * owner )
{
if ( ! owner | | ! test_bit ( TAINT_PROPRIETARY_MODULE , & owner - > taints ) )
return true ;
if ( mod - > using_gplonly_symbols ) {
pr_err ( " %s: module using GPL-only symbols uses symbols from proprietary module %s. \n " ,
mod - > name , owner - > name ) ;
return false ;
}
if ( ! test_bit ( TAINT_PROPRIETARY_MODULE , & mod - > taints ) ) {
pr_warn ( " %s: module uses symbols from proprietary module %s, inheriting taint. \n " ,
mod - > name , owner - > name ) ;
set_bit ( TAINT_PROPRIETARY_MODULE , & mod - > taints ) ;
}
return true ;
}
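The decision made by inherit_taint() can be read as a small truth table over two flags. The sketch below models it in userspace, with a plain boolean standing in for the TAINT_PROPRIETARY_MODULE bit. The struct mod_sketch type and the function name are assumptions made for illustration, and the log messages are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a module for this sketch only. */
struct mod_sketch {
	const char *name;
	bool proprietary;		/* models the TAINT_PROPRIETARY_MODULE bit */
	bool using_gplonly_symbols;
};

/*
 * Returns false when the symbol must be refused, true otherwise.
 * May set mod->proprietary as a side effect (taint inheritance).
 */
static bool inherit_taint_sketch(struct mod_sketch *mod,
				 const struct mod_sketch *owner)
{
	assert(mod != NULL);

	/* Symbols from the core kernel or a non-proprietary module: fine. */
	if (!owner || !owner->proprietary)
		return true;

	/* A module that already uses GPL-only symbols may not mix. */
	if (mod->using_gplonly_symbols)
		return false;

	/* Otherwise the user inherits the proprietary taint. */
	mod->proprietary = true;
	return true;
}
```

Note that the order of symbol resolution matters in this model: the check only refuses a module whose using_gplonly_symbols flag is already set at the time the proprietary symbol is resolved.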
2019-09-06 13:32:27 +03:00
2010-06-05 21:17:36 +04:00
/* Resolve a symbol for this module. I.e. if we find one, record usage. */
2010-08-05 22:59:10 +04:00
static const struct kernel_symbol * resolve_symbol ( struct module * mod ,
const struct load_info * info ,
2008-12-06 03:03:56 +03:00
const char * name ,
2010-06-05 21:17:37 +04:00
char ownername [ ] )
2005-04-17 02:20:36 +04:00
{
struct module * owner ;
2008-12-06 03:03:56 +03:00
const struct kernel_symbol * sym ;
modversions: treat symbol CRCs as 32 bit quantities
The modversion symbol CRCs are emitted as ELF symbols, which allows us
to easily populate the kcrctab sections by relying on the linker to
associate each kcrctab slot with the correct value.
This has a couple of downsides:
- Given that the CRCs are treated as memory addresses, we waste 4 bytes
for each CRC on 64 bit architectures,
- On architectures that support runtime relocation, a R_<arch>_RELATIVE
relocation entry is emitted for each CRC value, which identifies it
as a quantity that requires fixing up based on the actual runtime
load offset of the kernel. This results in corrupted CRCs unless we
explicitly undo the fixup (and this is currently being handled in the
core module code)
- Such runtime relocation entries take up 24 bytes of __init space
each, resulting in a x8 overhead in [uncompressed] kernel size for
CRCs.
Switching to explicit 32 bit values on 64 bit architectures fixes most
of these issues, given that 32 bit values are not treated as quantities
that require fixing up based on the actual runtime load offset. Note
that on some ELF64 architectures [such as PPC64], these 32-bit values
are still emitted as [absolute] runtime relocatable quantities, even if
the value resolves to a build time constant. Since relative relocations
are always resolved at build time, this patch enables MODULE_REL_CRCS on
powerpc when CONFIG_RELOCATABLE=y, which turns the absolute CRC
references into relative references into .rodata where the actual CRC
value is stored.
So redefine all CRC fields and variables as u32, and redefine the
__CRC_SYMBOL() macro for 64 bit builds to emit the CRC reference using
inline assembler (which is necessary since 64-bit C code cannot use
32-bit types to hold memory addresses, even if they are ultimately
resolved using values that do not exceed 0xffffffff). To avoid
potential problems with legacy 32-bit architectures using legacy
toolchains, the equivalent C definition of the kcrctab entry is retained
for 32-bit architectures.
Note that this mostly reverts commit d4703aefdbc8 ("module: handle ppc64
relocating kcrctabs when CONFIG_RELOCATABLE=y")
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03 12:54:06 +03:00
const s32 * crc ;
2020-07-30 09:10:26 +03:00
enum mod_license license ;
2010-06-05 21:17:37 +04:00
int err ;
2005-04-17 02:20:36 +04:00
2015-02-11 07:31:13 +03:00
/*
* The module_mutex should not be a heavily contended lock ;
* if we get the occasional sleep here , we ' ll go an extra iteration
* in the wait_event_interruptible ( ) , which is harmless .
*/
sched_annotate_sleep ( ) ;
2010-06-05 21:17:36 +04:00
mutex_lock ( & module_mutex ) ;
2020-07-30 09:10:26 +03:00
sym = find_symbol ( name , & owner , & crc , & license ,
2008-10-16 09:01:41 +04:00
! ( mod - > taints & ( 1 < < TAINT_PROPRIETARY_MODULE ) ) , true ) ;
2010-06-05 21:17:37 +04:00
if ( ! sym )
goto unlock ;
2020-07-29 00:33:33 +03:00
if ( license = = GPL_ONLY )
mod - > using_gplonly_symbols = true ;
if ( ! inherit_taint ( mod , owner ) ) {
sym = NULL ;
goto getname ;
}
2017-04-22 01:35:26 +03:00
if ( ! check_version ( info , name , mod , crc ) ) {
2010-06-05 21:17:37 +04:00
sym = ERR_PTR ( - EINVAL ) ;
goto getname ;
2005-04-17 02:20:36 +04:00
}
2010-06-05 21:17:37 +04:00
2019-09-06 13:32:27 +03:00
err = verify_namespace_is_imported ( info , sym , mod ) ;
if ( err ) {
sym = ERR_PTR ( err ) ;
goto getname ;
}
2010-06-05 21:17:37 +04:00
err = ref_module ( mod , owner ) ;
if ( err ) {
sym = ERR_PTR ( err ) ;
goto getname ;
}
getname :
/* We must make copy under the lock if we failed to get ref. */
strncpy ( ownername , module_name ( owner ) , MODULE_NAME_LEN ) ;
unlock :
2010-06-05 21:17:36 +04:00
mutex_unlock ( & module_mutex ) ;
2010-05-26 03:48:30 +04:00
return sym ;
2005-04-17 02:20:36 +04:00
}
2010-08-05 22:59:10 +04:00
static const struct kernel_symbol *
resolve_symbol_wait ( struct module * mod ,
const struct load_info * info ,
const char * name )
2010-06-05 21:17:37 +04:00
{
const struct kernel_symbol * ksym ;
2010-08-05 22:59:10 +04:00
char owner [ MODULE_NAME_LEN ] ;
2010-06-05 21:17:37 +04:00
if ( wait_event_interruptible_timeout ( module_wq ,
2010-08-05 22:59:10 +04:00
! IS_ERR ( ksym = resolve_symbol ( mod , info , name , owner ) )
| | PTR_ERR ( ksym ) ! = - EBUSY ,
2010-06-05 21:17:37 +04:00
30 * HZ ) < = 0 ) {
2013-11-13 03:11:28 +04:00
pr_warn ( " %s: gave up waiting for init of module %s. \n " ,
mod - > name , owner ) ;
2010-06-05 21:17:37 +04:00
}
return ksym ;
}
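resolve_symbol() returns three kinds of values through one pointer: a valid symbol, NULL for "not found", and an errno encoded with ERR_PTR(). The wait condition in resolve_symbol_wait() keeps sleeping only while the result is -EBUSY. The following is a userspace sketch of that encoding and of the wake-up condition. The helper names are lowercase stand-ins written for this example, and the 4095 limit is assumed to mirror the kernel's MAX_ERRNO.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed to mirror the kernel's MAX_ERRNO. */
#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno in a pointer value. */
static void *err_ptr(long err)
{
	assert(err < 0 && err >= -MAX_ERRNO_SKETCH);
	return (void *)(intptr_t)err;
}

/* Decode the errno again. */
static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the last MAX_ERRNO_SKETCH values of the address space. */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO_SKETCH);
}

/*
 * Wake-up condition: stop waiting for a valid symbol, for NULL,
 * and for every error except -EBUSY.
 */
static int resolve_done(const void *sym)
{
	return !is_err(sym) || ptr_err(sym) != -EBUSY;
}
```

Under this condition a NULL result (symbol not found) ends the wait immediately, so only a busy owner keeps the caller waiting until the timeout.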
2005-04-17 02:20:36 +04:00
/*
* / sys / module / foo / sections stuff
* J . Corbet < corbet @ lwn . net >
*/
2010-08-05 22:59:09 +04:00
# ifdef CONFIG_SYSFS
2009-12-19 17:43:01 +03:00
2010-08-05 22:59:09 +04:00
# ifdef CONFIG_KALLSYMS
2009-12-19 17:43:01 +03:00
static inline bool sect_empty ( const Elf_Shdr * sect )
{
return ! ( sect - > sh_flags & SHF_ALLOC ) | | sect - > sh_size = = 0 ;
}
2014-11-10 02:01:29 +03:00
struct module_sect_attr {
2020-07-02 23:47:20 +03:00
struct bin_attribute battr ;
2008-03-13 12:03:44 +03:00
unsigned long address ;
} ;
2014-11-10 02:01:29 +03:00
struct module_sect_attrs {
2008-03-13 12:03:44 +03:00
struct attribute_group grp ;
unsigned int nsections ;
2020-02-13 18:14:09 +03:00
struct module_sect_attr attrs [ ] ;
2008-03-13 12:03:44 +03:00
} ;
2020-08-07 00:15:23 +03:00
# define MODULE_SECT_READ_SIZE (3 /* "0x", "\n" */ + (BITS_PER_LONG / 4))
2020-07-02 23:47:20 +03:00
static ssize_t module_sect_read ( struct file * file , struct kobject * kobj ,
struct bin_attribute * battr ,
char * buf , loff_t pos , size_t count )
2005-04-17 02:20:36 +04:00
{
struct module_sect_attr * sattr =
2020-07-02 23:47:20 +03:00
container_of ( battr , struct module_sect_attr , battr ) ;
2020-08-07 00:15:23 +03:00
char bounce [ MODULE_SECT_READ_SIZE + 1 ] ;
size_t wrote ;
2020-07-02 23:47:20 +03:00
if ( pos ! = 0 )
return - EINVAL ;
2020-08-07 00:15:23 +03:00
/*
* Since we ' re a binary read handler , we must account for the
* trailing NUL byte that sprintf will write : if " buf " is
* too small to hold the NUL , or the NUL is exactly the last
* byte , the read will look like it got truncated by one byte .
* Since there is no way to ask sprintf nicely to not write
* the NUL , we have to use a bounce buffer .
*/
wrote = scnprintf ( bounce , sizeof ( bounce ) , " 0x%px \n " ,
kallsyms_show_value ( file - > f_cred )
? ( void * ) sattr - > address : NULL ) ;
count = min ( count , wrote ) ;
memcpy ( buf , bounce , count ) ;
return count ;
2005-04-17 02:20:36 +04:00
}
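The bounce-buffer approach in module_sect_read() can be tried out in userspace. The sketch below assumes snprintf() in place of scnprintf() (so the return value is clamped by hand) and a zero-padded "%lx" in place of the kernel-only "%px". HEX_DIGITS, SECT_READ_SIZE and sect_read_sketch are names invented for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hex digits needed for an unsigned long (BITS_PER_LONG / 4 in the kernel). */
#define HEX_DIGITS	(sizeof(unsigned long) * CHAR_BIT / 4)
#define SECT_READ_SIZE	(3 /* "0x", "\n" */ + HEX_DIGITS)

/* Copy at most count bytes of "0x<address>\n" into buf, never the NUL. */
static size_t sect_read_sketch(char *buf, size_t count, unsigned long address)
{
	char bounce[SECT_READ_SIZE + 1];	/* +1 for the NUL snprintf writes */
	size_t wrote;
	int n;

	n = snprintf(bounce, sizeof(bounce), "0x%0*lx\n",
		     (int)HEX_DIGITS, address);
	assert(n > 0);

	/* snprintf reports the would-be length; clamp like scnprintf does. */
	wrote = (size_t)n < sizeof(bounce) ? (size_t)n : sizeof(bounce) - 1;

	if (count > wrote)
		count = wrote;
	memcpy(buf, bounce, count);
	return count;
}
```

Because the text is formatted into the bounce buffer first, a caller whose buffer is exactly SECT_READ_SIZE bytes long receives the full line including the newline, and the terminating NUL never reaches the destination.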
2006-09-29 13:01:31 +04:00
static void free_sect_attrs ( struct module_sect_attrs * sect_attrs )
{
2008-03-13 12:03:44 +03:00
unsigned int section ;
2006-09-29 13:01:31 +04:00
for ( section = 0 ; section < sect_attrs - > nsections ; section + + )
2020-07-02 23:47:20 +03:00
kfree ( sect_attrs - > attrs [ section ] . battr . attr . name ) ;
2006-09-29 13:01:31 +04:00
kfree ( sect_attrs ) ;
}
2010-08-05 22:59:09 +04:00
static void add_sect_attrs ( struct module * mod , const struct load_info * info )
2005-04-17 02:20:36 +04:00
{
unsigned int nloaded = 0 , i , size [ 2 ] ;
struct module_sect_attrs * sect_attrs ;
struct module_sect_attr * sattr ;
2020-07-02 23:47:20 +03:00
struct bin_attribute * * gattr ;
2007-10-18 14:06:07 +04:00
2005-04-17 02:20:36 +04:00
/* Count loaded sections and allocate structures */
2010-08-05 22:59:09 +04:00
for ( i = 0 ; i < info - > hdr - > e_shnum ; i + + )
if ( ! sect_empty ( & info - > sechdrs [ i ] ) )
2005-04-17 02:20:36 +04:00
nloaded + + ;
2019-06-06 21:18:53 +03:00
size [ 0 ] = ALIGN ( struct_size ( sect_attrs , attrs , nloaded ) ,
2020-07-02 23:47:20 +03:00
sizeof ( sect_attrs - > grp . bin_attrs [ 0 ] ) ) ;
size [ 1 ] = ( nloaded + 1 ) * sizeof ( sect_attrs - > grp . bin_attrs [ 0 ] ) ;
2006-09-29 13:01:31 +04:00
sect_attrs = kzalloc ( size [ 0 ] + size [ 1 ] , GFP_KERNEL ) ;
if ( sect_attrs = = NULL )
2005-04-17 02:20:36 +04:00
return ;
/* Setup section attributes. */
sect_attrs - > grp . name = " sections " ;
2020-07-02 23:47:20 +03:00
sect_attrs - > grp . bin_attrs = ( void * ) sect_attrs + size [ 0 ] ;
2005-04-17 02:20:36 +04:00
2006-09-29 13:01:31 +04:00
sect_attrs - > nsections = 0 ;
2005-04-17 02:20:36 +04:00
sattr = & sect_attrs - > attrs [ 0 ] ;
2020-07-02 23:47:20 +03:00
gattr = & sect_attrs - > grp . bin_attrs [ 0 ] ;
2010-08-05 22:59:09 +04:00
for ( i = 0 ; i < info - > hdr - > e_shnum ; i + + ) {
Elf_Shdr * sec = & info - > sechdrs [ i ] ;
if ( sect_empty ( sec ) )
modules: don't export section names of empty sections via sysfs
On the parisc architecture, every loaded kernel module triggers
this kernel "badness warning":
sysfs: cannot create duplicate filename '/module/ac97_bus/sections/.text'
Badness at fs/sysfs/dir.c:487
The reason is that on parisc all kernel modules have multiple
.text sections due to the use of the -ffunction-sections compiler flag,
which is needed to reach all jump targets on this platform.
Running objdump on such a kernel module gives:
Sections:
Idx Name                  Size      VMA       LMA       File off  Algn
  0 .note.gnu.build-id    00000024  00000000  00000000  00000034  2**2
                          CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text                 00000000  00000000  00000000  00000058  2**0
                          CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .text.ac97_bus_match  0000001c  00000000  00000000  00000058  2**2
                          CONTENTS, ALLOC, LOAD, READONLY, CODE
  3 .text                 00000000  00000000  00000000  000000d4  2**0
                          CONTENTS, ALLOC, LOAD, READONLY, CODE
...
Since the .text sections are empty (size of 0 bytes) and won't be
loaded by the kernel module loader anyway, I don't see a reason
why such sections need to be listed under
/sys/module/<module_name>/sections/<section_name> either.
The attached patch solves this issue by not exporting the names of
empty sections.
This fixes bugzilla http://bugzilla.kernel.org/show_bug.cgi?id=14703
Signed-off-by: Helge Deller <deller@gmx.de>
CC: rusty@rustcorp.com.au
CC: akpm@linux-foundation.org
CC: James.Bottomley@HansenPartnership.com
CC: roland@redhat.com
CC: dave@hiauly1.hia.nrc.ca
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-03 02:29:15 +03:00
continue ;
2020-07-02 23:47:20 +03:00
sysfs_bin_attr_init ( & sattr - > battr ) ;
2010-08-05 22:59:09 +04:00
sattr - > address = sec - > sh_addr ;
2020-07-02 23:47:20 +03:00
sattr - > battr . attr . name =
kstrdup ( info - > secstrings + sec - > sh_name , GFP_KERNEL ) ;
if ( sattr - > battr . attr . name = = NULL )
2006-09-29 13:01:31 +04:00
goto out ;
sect_attrs - > nsections + + ;
2020-07-02 23:47:20 +03:00
sattr - > battr . read = module_sect_read ;
2020-08-07 00:15:23 +03:00
sattr - > battr . size = MODULE_SECT_READ_SIZE ;
2020-07-02 23:47:20 +03:00
sattr - > battr . attr . mode = 0400 ;
* ( gattr + + ) = & ( sattr + + ) - > battr ;
2005-04-17 02:20:36 +04:00
}
* gattr = NULL ;
if ( sysfs_create_group ( & mod - > mkobj . kobj , & sect_attrs - > grp ) )
goto out ;
mod - > sect_attrs = sect_attrs ;
return ;
out :
2006-09-29 13:01:31 +04:00
free_sect_attrs ( sect_attrs ) ;
2005-04-17 02:20:36 +04:00
}
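add_sect_attrs() places two arrays in a single allocation: the flexible array of attributes, padded up to pointer alignment (size[0]), followed by a NULL-terminated array of pointers to those attributes (size[1]). Here is a userspace sketch of that layout, assuming calloc() in place of kzalloc() and open-coded size arithmetic in place of struct_size(). The sketch does not check for arithmetic overflow, and the types item and group and the ALIGN_UP macro are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element type standing in for struct module_sect_attr. */
struct item {
	unsigned long address;
};

/* Hypothetical container standing in for struct module_sect_attrs. */
struct group {
	struct item **ptrs;	/* NULL-terminated; points into this same allocation */
	unsigned int nitems;
	struct item items[];	/* flexible array member */
};

/* Round x up to a multiple of a. */
#define ALIGN_UP(x, a)	((((x) + (a) - 1) / (a)) * (a))

static struct group *group_alloc(unsigned int n)
{
	size_t size0, size1;
	struct group *g;
	unsigned int i;

	/* Header plus n items, padded so the pointer array is aligned. */
	size0 = ALIGN_UP(sizeof(*g) + n * sizeof(g->items[0]),
			 sizeof(g->ptrs[0]));
	/* n pointers plus the terminating NULL. */
	size1 = (n + 1) * sizeof(g->ptrs[0]);

	g = calloc(1, size0 + size1);
	if (!g)
		return NULL;

	assert(size0 % sizeof(g->ptrs[0]) == 0);
	g->ptrs = (struct item **)((char *)g + size0);
	for (i = 0; i < n; i++)
		g->ptrs[i] = &g->items[i];
	g->ptrs[n] = NULL;
	g->nitems = n;
	return g;
}
```

A single free(g) releases both arrays, which matches how free_sect_attrs() needs only one kfree() for the container once the duplicated names are released.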
static void remove_sect_attrs ( struct module * mod )
{
if ( mod - > sect_attrs ) {
sysfs_remove_group ( & mod - > mkobj . kobj ,
& mod - > sect_attrs - > grp ) ;
/* We are positive that no one is using any sect attrs
* at this point . Deallocate immediately . */
2006-09-29 13:01:31 +04:00
free_sect_attrs ( mod - > sect_attrs ) ;
2005-04-17 02:20:36 +04:00
mod - > sect_attrs = NULL ;
}
}
2007-10-17 10:26:40 +04:00
/*
* / sys / module / foo / notes / . section . name gives contents of SHT_NOTE sections .
*/
struct module_notes_attrs {
struct kobject * dir ;
unsigned int notes ;
2020-02-13 18:14:09 +03:00
struct bin_attribute attrs [ ] ;
2007-10-17 10:26:40 +04:00
} ;
2010-05-13 05:28:57 +04:00
static ssize_t module_notes_read ( struct file * filp , struct kobject * kobj ,
2007-10-17 10:26:40 +04:00
struct bin_attribute * bin_attr ,
char * buf , loff_t pos , size_t count )
{
/*
* The caller checked the pos and count against our size .
*/
memcpy ( buf , bin_attr - > private + pos , count ) ;
return count ;
}
static void free_notes_attrs ( struct module_notes_attrs * notes_attrs ,
unsigned int i )
{
if ( notes_attrs - > dir ) {
while ( i - - > 0 )
sysfs_remove_bin_file ( notes_attrs - > dir ,
& notes_attrs - > attrs [ i ] ) ;
2008-09-23 23:51:11 +04:00
kobject_put ( notes_attrs - > dir ) ;
2007-10-17 10:26:40 +04:00
}
kfree ( notes_attrs ) ;
}
2010-08-05 22:59:09 +04:00
static void add_notes_attrs ( struct module * mod , const struct load_info * info )
2007-10-17 10:26:40 +04:00
{
unsigned int notes , loaded , i ;
struct module_notes_attrs * notes_attrs ;
struct bin_attribute * nattr ;
2009-08-28 12:44:56 +04:00
/* failed to create section attributes, so can't create notes */
if ( ! mod - > sect_attrs )
return ;
2007-10-17 10:26:40 +04:00
/* Count notes sections and allocate structures. */
notes = 0 ;
2010-08-05 22:59:09 +04:00
for ( i = 0 ; i < info - > hdr - > e_shnum ; i + + )
if ( ! sect_empty ( & info - > sechdrs [ i ] ) & &
( info - > sechdrs [ i ] . sh_type = = SHT_NOTE ) )
2007-10-17 10:26:40 +04:00
+ + notes ;
if ( notes = = 0 )
return ;
treewide: Use struct_size() for kmalloc()-family
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct foo {
int stuff;
void *entry[];
};
instance = kmalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL);
This patch makes the changes for kmalloc()-family (and kvmalloc()-family)
uses. It was done via automatic conversion with manual review for the
"CHECKME" non-standard cases noted below, using the following Coccinelle
script:
// pkey_cache = kmalloc(sizeof *pkey_cache + tprops->pkey_tbl_len *
// sizeof *pkey_cache->table, GFP_KERNEL);
@@
identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
expression GFP;
identifier VAR, ELEMENT;
expression COUNT;
@@
- alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT), GFP)
+ alloc(struct_size(VAR, ELEMENT, COUNT), GFP)
// mr = kzalloc(sizeof(*mr) + m * sizeof(mr->map[0]), GFP_KERNEL);
@@
identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
expression GFP;
identifier VAR, ELEMENT;
expression COUNT;
@@
- alloc(sizeof(*VAR) + COUNT * sizeof(VAR->ELEMENT[0]), GFP)
+ alloc(struct_size(VAR, ELEMENT, COUNT), GFP)
// Same pattern, but can't trivially locate the trailing element name,
// or variable name.
@@
identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
expression GFP;
expression SOMETHING, COUNT, ELEMENT;
@@
- alloc(sizeof(SOMETHING) + COUNT * sizeof(ELEMENT), GFP)
+ alloc(CHECKME_struct_size(&SOMETHING, ELEMENT, COUNT), GFP)
Signed-off-by: Kees Cook <keescook@chromium.org>
2018-05-08 23:45:50 +03:00
notes_attrs = kzalloc ( struct_size ( notes_attrs , attrs , notes ) ,
2007-10-17 10:26:40 +04:00
GFP_KERNEL ) ;
if ( notes_attrs = = NULL )
return ;
notes_attrs - > notes = notes ;
nattr = & notes_attrs - > attrs [ 0 ] ;
2010-08-05 22:59:09 +04:00
for ( loaded = i = 0 ; i < info - > hdr - > e_shnum ; + + i ) {
if ( sect_empty ( & info - > sechdrs [ i ] ) )
2007-10-17 10:26:40 +04:00
continue ;
2010-08-05 22:59:09 +04:00
if ( info - > sechdrs [ i ] . sh_type = = SHT_NOTE ) {
2010-02-13 00:41:56 +03:00
sysfs_bin_attr_init ( nattr ) ;
2020-07-02 23:47:20 +03:00
nattr - > attr . name = mod - > sect_attrs - > attrs [ loaded ] . battr . attr . name ;
2007-10-17 10:26:40 +04:00
nattr - > attr . mode = S_IRUGO ;
2010-08-05 22:59:09 +04:00
nattr - > size = info - > sechdrs [ i ] . sh_size ;
nattr - > private = ( void * ) info - > sechdrs [ i ] . sh_addr ;
2007-10-17 10:26:40 +04:00
nattr - > read = module_notes_read ;
+ + nattr ;
}
+ + loaded ;
}
2007-11-06 09:24:43 +03:00
notes_attrs - > dir = kobject_create_and_add ( " notes " , & mod - > mkobj . kobj ) ;
2007-10-17 10:26:40 +04:00
if ( ! notes_attrs - > dir )
goto out ;
for ( i = 0 ; i < notes ; + + i )
if ( sysfs_create_bin_file ( notes_attrs - > dir ,
& notes_attrs - > attrs [ i ] ) )
goto out ;
mod - > notes_attrs = notes_attrs ;
return ;
out :
free_notes_attrs ( notes_attrs , i ) ;
}
static void remove_notes_attrs ( struct module * mod )
{
if ( mod - > notes_attrs )
free_notes_attrs ( mod - > notes_attrs , mod - > notes_attrs - > notes ) ;
}
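free_notes_attrs() takes the number of files that were created so far and removes exactly those, newest first, using the 'while (i-- > 0)' idiom. Below is a small userspace sketch of that partial-unwind pattern. The file table and the create_file(), remove_file() and create_all() helpers are hypothetical and only simulate creation failures.

```c
#include <assert.h>

#define MAX_FILES 8

static int created[MAX_FILES];	/* 1 while the hypothetical file exists */

/* Hypothetical creator: fails for index fail_at, succeeds otherwise. */
static int create_file(unsigned int idx, int fail_at)
{
	if ((int)idx == fail_at)
		return -1;
	created[idx] = 1;
	return 0;
}

static void remove_file(unsigned int idx)
{
	created[idx] = 0;
}

/* Remove the first i entries, newest first. */
static void unwind(unsigned int i)
{
	assert(i <= MAX_FILES);
	while (i-- > 0)
		remove_file(i);
}

/* Create n entries; on failure undo only the ones already created. */
static int create_all(unsigned int n, int fail_at)
{
	unsigned int i;

	assert(n <= MAX_FILES);
	for (i = 0; i < n; ++i) {
		if (create_file(i, fail_at)) {
			unwind(i);	/* i entries exist: indices 0..i-1 */
			return -1;
		}
	}
	return 0;
}

static unsigned int count_created(void)
{
	unsigned int i, total = 0;

	for (i = 0; i < MAX_FILES; ++i)
		total += created[i];
	return total;
}
```

Passing the loop index as the count means the same cleanup routine serves both the error path (a partial count) and normal removal (the full count).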
2005-04-17 02:20:36 +04:00
# else
2006-09-29 13:01:31 +04:00
2010-08-05 22:59:09 +04:00
static inline void add_sect_attrs ( struct module * mod ,
const struct load_info * info )
2005-04-17 02:20:36 +04:00
{
}
static inline void remove_sect_attrs ( struct module * mod )
{
}
2007-10-17 10:26:40 +04:00
2010-08-05 22:59:09 +04:00
static inline void add_notes_attrs ( struct module * mod ,
const struct load_info * info )
2007-10-17 10:26:40 +04:00
{
}
static inline void remove_notes_attrs ( struct module * mod )
{
}
2010-08-05 22:59:09 +04:00
# endif /* CONFIG_KALLSYMS */
2005-04-17 02:20:36 +04:00
2017-06-06 15:17:39 +03:00
static void del_usage_links ( struct module * mod )
2010-06-05 21:17:36 +04:00
{
# ifdef CONFIG_MODULE_UNLOAD
struct module_use * use ;
2010-06-05 21:17:36 +04:00
mutex_lock ( & module_mutex ) ;
2017-06-06 15:17:39 +03:00
list_for_each_entry ( use , & mod - > target_list , target_list )
sysfs_remove_link ( use - > target - > holders_dir , mod - > name ) ;
2010-06-05 21:17:36 +04:00
mutex_unlock ( & module_mutex ) ;
2010-06-05 21:17:36 +04:00
# endif
}
2017-06-06 15:17:39 +03:00
static int add_usage_links ( struct module * mod )
2010-06-05 21:17:36 +04:00
{
2017-06-06 15:17:39 +03:00
int ret = 0 ;
2010-06-05 21:17:36 +04:00
# ifdef CONFIG_MODULE_UNLOAD
struct module_use * use ;
2010-06-05 21:17:36 +04:00
mutex_lock ( & module_mutex ) ;
2017-06-06 15:17:39 +03:00
list_for_each_entry ( use , & mod - > target_list , target_list ) {
ret = sysfs_create_link ( use - > target - > holders_dir ,
& mod - > mkobj . kobj , mod - > name ) ;
if ( ret )
break ;
}
2010-06-05 21:17:36 +04:00
mutex_unlock ( & module_mutex ) ;
2017-06-06 15:17:39 +03:00
if ( ret )
del_usage_links ( mod ) ;
2010-06-05 21:17:36 +04:00
# endif
2017-06-06 15:17:39 +03:00
return ret ;
2010-06-05 21:17:36 +04:00
}
2019-06-11 18:00:07 +03:00
static void module_remove_modinfo_attrs ( struct module * mod , int end ) ;
2010-06-05 21:17:36 +04:00
static int module_add_modinfo_attrs ( struct module * mod )
[PATCH] modules: add version and srcversion to sysfs
This patch adds version and srcversion files to
/sys/module/${modulename} containing the version and srcversion fields
of the module's modinfo section (if present).
/sys/module/e1000
|-- srcversion
`-- version
This patch differs slightly from the version posted in January, as it
now uses the new kstrdup() call in -mm.
Why put this in sysfs?
a) Tools like DKMS, which deal with changing out individual kernel
modules without replacing the whole kernel, can behave smarter if they
can tell the version of a given module. The autoinstaller feature, for
example, determines whether your system has a "good" version of a
driver (i.e. whether the one provided by DKMS has a newer version than that
provided by the installed kernel package), and automatically compiles
and installs a newer version if DKMS has it but your kernel doesn't yet
have that version.
b) Because sysadmins manually, or with tools like DKMS, can switch out
modules on the file system, you can't count on 'modinfo foo.ko', which
looks at /lib/modules/${kernelver}/... actually matching what is loaded
into the kernel already. Hence asking sysfs for this.
c) as the unbind-driver-from-device work takes shape, it will be
possible to rebind a driver that's built-in (no .ko to modinfo for the
version) to a newly loaded module. sysfs will have the
currently-built-in version info, for comparison.
d) tech support scripts can then easily grab the version info for what's
running presently - a question I get often.
There has been renewed interest in this patch on linux-scsi by driver
authors.
As the idea originated from GregKH, I leave his Signed-off-by: intact,
though the implementation is nearly completely new. Compiled and run on
x86 and x86_64.
From: Matthew Dobson <colpatch@us.ibm.com>
build fix
From: Thierry Vignaud <tvignaud@mandriva.com>
build fix
From: Matthew Dobson <colpatch@us.ibm.com>
warning fix
Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 09:05:15 +04:00
{
struct module_attribute * attr ;
2006-02-17 00:50:23 +03:00
struct module_attribute * temp_attr ;
2005-06-24 09:05:15 +04:00
int error = 0 ;
int i ;
2006-02-17 00:50:23 +03:00
mod - > modinfo_attrs = kzalloc ( ( sizeof ( struct module_attribute ) *
( ARRAY_SIZE ( modinfo_attrs ) + 1 ) ) ,
GFP_KERNEL ) ;
if ( ! mod - > modinfo_attrs )
return - ENOMEM ;
temp_attr = mod - > modinfo_attrs ;
2019-06-11 18:00:07 +03:00
for ( i = 0 ; ( attr = modinfo_attrs [ i ] ) ; i + + ) {
2016-04-11 22:33:09 +03:00
if ( ! attr - > test | | attr - > test ( mod ) ) {
2006-02-17 00:50:23 +03:00
memcpy ( temp_attr , attr , sizeof ( * temp_attr ) ) ;
2010-02-13 00:41:56 +03:00
sysfs_attr_init ( & temp_attr - > attr ) ;
2014-11-10 02:01:29 +03:00
error = sysfs_create_file ( & mod - > mkobj . kobj ,
& temp_attr - > attr ) ;
2019-06-11 18:00:07 +03:00
if ( error )
goto error_out ;
2006-02-17 00:50:23 +03:00
+ + temp_attr ;
}
2005-06-24 09:05:15 +04:00
}
2019-06-11 18:00:07 +03:00
return 0 ;
error_out :
if ( i > 0 )
module_remove_modinfo_attrs ( mod , - - i ) ;
2019-12-28 14:54:55 +03:00
else
kfree ( mod - > modinfo_attrs ) ;
2005-06-24 09:05:15 +04:00
return error ;
}
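The commit message above describes version and srcversion files under /sys/module/${modulename} and suggests that support scripts read them instead of trusting 'modinfo foo.ko'. A minimal userspace sketch of that lookup follows. The helper names module_attr_path() and read_module_attr() are hypothetical and are not part of the kernel or of any library; only the /sys/module/<name>/<attr> path shape comes from the commit message.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "/sys/module/<module>/<attr>" into buf.
 * Returns 0 on success, -1 if the path does not fit in len bytes.
 * (Hypothetical helper, for illustration only.)
 */
static int module_attr_path(char *buf, size_t len,
			    const char *module, const char *attr)
{
	int n = snprintf(buf, len, "/sys/module/%s/%s", module, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read the first line of a module attribute into out and strip the
 * trailing newline. Returns 0 on success, -1 if the file is missing
 * or unreadable (for example, the module is not loaded or declares
 * no version).
 */
static int read_module_attr(const char *module, const char *attr,
			    char *out, size_t len)
{
	char path[256];
	FILE *f;

	if (module_attr_path(path, sizeof(path), module, attr) != 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(out, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

A caller might try read_module_attr("e1000", "version", buf, sizeof(buf)) and treat -1 as "no version exported", since the files only exist when the module's modinfo section carries those fields.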
2019-06-11 18:00:07 +03:00
static void module_remove_modinfo_attrs ( struct module * mod , int end )
2005-06-24 09:05:15 +04:00
{
struct module_attribute * attr ;
int i ;
2006-02-17 00:50:23 +03:00
for ( i = 0 ; ( attr = & mod - > modinfo_attrs [ i ] ) ; i + + ) {
2019-06-11 18:00:07 +03:00
if ( end > = 0 & & i > end )
break ;
2006-02-17 00:50:23 +03:00
/* pick a field to test for end of list */
if ( ! attr - > attr . name )
break ;
2014-11-10 02:01:29 +03:00
sysfs_remove_file ( & mod - > mkobj . kobj , & attr - > attr ) ;
2006-02-17 00:50:23 +03:00
if ( attr - > free )
attr - > free ( mod ) ;
2005-06-24 09:05:15 +04:00
}
2006-02-17 00:50:23 +03:00
kfree ( mod - > modinfo_attrs ) ;
2005-06-24 09:05:15 +04:00
}
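The `end` argument of module_remove_modinfo_attrs() lets the same function serve both full teardown (end < 0) and rollback of a partially completed setup (stop after index `end`). The following is a simplified userspace sketch of that convention, not the kernel code: struct registry, add_items() and remove_items() are hypothetical names, a bool array stands in for the sysfs files, and the attribute filtering done via attr->test is left out.

```c
#include <stdbool.h>

#define NUM_ITEMS 4

/* Stand-in for the set of sysfs files created for a module. */
struct registry {
	bool created[NUM_ITEMS];
};

/*
 * Undo entries 0..end inclusive. A negative end means "undo all",
 * mirroring the end >= 0 && i > end check in the kernel function.
 */
static void remove_items(struct registry *r, int end)
{
	for (int i = 0; i < NUM_ITEMS; i++) {
		if (end >= 0 && i > end)
			break;
		r->created[i] = false;
	}
}

/*
 * Create all entries in order. fail_at is the index whose creation
 * is made to fail (-1 for no failure). On failure, only the entries
 * created so far (0..i-1) are rolled back.
 */
static int add_items(struct registry *r, int fail_at)
{
	int i;

	for (i = 0; i < NUM_ITEMS; i++) {
		if (i == fail_at)
			goto error_out;
		r->created[i] = true;
	}
	return 0;

error_out:
	if (i > 0)
		remove_items(r, i - 1);
	return -1;
}
```

Passing the last successfully created index rather than a count keeps the rollback call and the normal teardown call (with -1) on a single code path.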
2005-04-17 02:20:36 +04:00
2013-09-03 11:03:57 +04:00
static void mod_kobject_put ( struct module * mod )
{
DECLARE_COMPLETION_ONSTACK ( c ) ;
mod - > mkobj . kobj_completion = & c ;
kobject_put ( & mod - > mkobj . kobj ) ;
wait_for_completion ( & c ) ;
}
2010-06-05 21:17:36 +04:00
static int mod_sysfs_init ( struct module * mod )
2005-04-17 02:20:36 +04:00
{
int err ;
2008-01-28 02:38:40 +03:00
struct kobject * kobj ;
2005-04-17 02:20:36 +04:00
2007-04-14 00:15:19 +04:00
if ( ! module_sysfs_initialized ) {
2013-11-13 03:11:28 +04:00
pr_err ( " %s: module sysfs not initialized \n " , mod - > name ) ;
2006-09-26 03:25:36 +04:00
err = - EINVAL ;
goto out ;
}
2008-01-28 02:38:40 +03:00
kobj = kset_find_obj ( module_kset , mod - > name ) ;
if ( kobj ) {
2013-11-13 03:11:28 +04:00
pr_err ( " %s: module is already loaded \n " , mod - > name ) ;
2008-01-28 02:38:40 +03:00
kobject_put ( kobj ) ;
err = - EINVAL ;
goto out ;
}
2005-04-17 02:20:36 +04:00
mod - > mkobj . mod = mod ;
2006-11-24 14:15:25 +03:00
2007-12-18 09:05:35 +03:00
memset ( & mod - > mkobj . kobj , 0 , sizeof ( mod - > mkobj . kobj ) ) ;
mod - > mkobj . kobj . kset = module_kset ;
err = kobject_init_and_add ( & mod - > mkobj . kobj , & module_ktype , NULL ,
" %s " , mod - > name ) ;
if ( err )
2013-09-03 11:03:57 +04:00
mod_kobject_put ( mod ) ;
2007-01-18 15:26:15 +03:00
2007-11-30 01:46:11 +03:00
/* delay uevent until full sysfs population */
2007-01-18 15:26:15 +03:00
out :
return err ;
}
2010-06-05 21:17:36 +04:00
static int mod_sysfs_setup ( struct module * mod ,
2010-08-05 22:59:09 +04:00
const struct load_info * info ,
2007-01-18 15:26:15 +03:00
struct kernel_param * kparam ,
unsigned int num_params )
{
int err ;
2010-06-05 21:17:36 +04:00
err = mod_sysfs_init ( mod ) ;
if ( err )
goto out ;
2007-11-06 09:24:43 +03:00
mod - > holders_dir = kobject_create_and_add ( " holders " , & mod - > mkobj . kobj ) ;
2007-04-26 11:12:09 +04:00
if ( ! mod - > holders_dir ) {
err = - ENOMEM ;
2007-01-18 15:26:15 +03:00
goto out_unreg ;
2007-04-26 11:12:09 +04:00
}
2007-01-18 15:26:15 +03:00
2005-04-17 02:20:36 +04:00
err = module_param_sysfs_setup ( mod , kparam , num_params ) ;
if ( err )
2007-01-18 15:26:15 +03:00
goto out_unreg_holders ;
2005-04-17 02:20:36 +04:00
2005-06-24 09:05:15 +04:00
err = module_add_modinfo_attrs ( mod ) ;
if ( err )
2006-11-24 14:15:25 +03:00
goto out_unreg_param ;
2005-06-24 09:05:15 +04:00
2017-06-06 15:17:39 +03:00
err = add_usage_links ( mod ) ;
if ( err )
goto out_unreg_modinfo_attrs ;
2010-08-05 22:59:09 +04:00
add_sect_attrs ( mod , info ) ;
add_notes_attrs ( mod , info ) ;
2010-06-05 21:17:36 +04:00
2006-11-24 14:15:25 +03:00
kobject_uevent ( & mod - > mkobj . kobj , KOBJ_ADD ) ;
2005-04-17 02:20:36 +04:00
return 0 ;
2017-06-06 15:17:39 +03:00
out_unreg_modinfo_attrs :
2019-06-11 18:00:07 +03:00
module_remove_modinfo_attrs ( mod , - 1 ) ;
2006-11-24 14:15:25 +03:00
out_unreg_param :
module_param_sysfs_remove ( mod ) ;
2007-01-18 15:26:15 +03:00
out_unreg_holders :
2007-12-20 19:13:05 +03:00
kobject_put ( mod - > holders_dir ) ;
2007-01-18 15:26:15 +03:00
out_unreg :
2013-09-03 11:03:57 +04:00
mod_kobject_put ( mod ) ;
2010-06-05 21:17:36 +04:00
out :
2005-04-17 02:20:36 +04:00
return err ;
}
2008-05-20 13:59:48 +04:00
static void mod_sysfs_fini ( struct module * mod )
{
2010-08-05 22:59:09 +04:00
remove_notes_attrs ( mod ) ;
remove_sect_attrs ( mod ) ;
2013-09-03 11:03:57 +04:00
mod_kobject_put ( mod ) ;
2008-05-20 13:59:48 +04:00
}
2015-06-26 00:14:38 +03:00
static void init_param_lock ( struct module * mod )
{
mutex_init ( & mod - > param_lock ) ;
}
2010-08-05 22:59:09 +04:00
# else /* !CONFIG_SYSFS */
2008-05-20 13:59:48 +04:00
2010-08-05 22:59:09 +04:00
static int mod_sysfs_setup ( struct module * mod ,
const struct load_info * info ,
2010-06-05 21:17:36 +04:00
struct kernel_param * kparam ,
unsigned int num_params )
{
return 0 ;
}
2008-05-20 13:59:48 +04:00
static void mod_sysfs_fini ( struct module * mod )
{
}
2019-06-11 18:00:07 +03:00
static void module_remove_modinfo_attrs ( struct module * mod , int end )
2010-08-05 22:59:09 +04:00
{
}
2010-06-05 21:17:36 +04:00
static void del_usage_links ( struct module * mod )
{
}
2015-06-26 00:14:38 +03:00
static void init_param_lock ( struct module * mod )
{
}
2008-05-20 13:59:48 +04:00
# endif /* CONFIG_SYSFS */
2005-04-17 02:20:36 +04:00
2010-08-05 22:59:09 +04:00
static void mod_sysfs_teardown ( struct module * mod )
2005-04-17 02:20:36 +04:00
{
2010-06-05 21:17:36 +04:00
del_usage_links ( mod ) ;
2019-06-11 18:00:07 +03:00
module_remove_modinfo_attrs ( mod , - 1 ) ;
2005-04-17 02:20:36 +04:00
module_param_sysfs_remove ( mod ) ;
2007-12-20 19:13:05 +03:00
kobject_put ( mod - > mkobj . drivers_dir ) ;
kobject_put ( mod - > holders_dir ) ;
2008-05-20 13:59:48 +04:00
mod_sysfs_fini ( mod ) ;
2005-04-17 02:20:36 +04:00
}
2010-11-17 00:35:16 +03:00
/*
* LKM RO / NX protection : protect module ' s text / ro - data
* from modification and any data from execution .
2015-11-26 02:15:08 +03:00
*
* General layout of module is :
2016-07-27 05:36:21 +03:00
*          [text] [read-only-data] [ro-after-init] [writable data]
* text_size -----^                ^               ^               ^
* ro_size ------------------------|               |               |
* ro_after_init_size -----------------------------|               |
* size -----------------------------------------------------------|
2015-11-26 02:15:08 +03:00
*
* These values are always page - aligned ( as is base )
2010-11-17 00:35:16 +03:00
*/
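The layout comment above treats text_size, ro_size, ro_after_init_size and size as cumulative, page-aligned end offsets from base, so each region is the gap between two consecutive offsets. A small userspace sketch of that arithmetic follows. struct layout_sketch, region_range() and the SKETCH_PAGE_SHIFT value of 12 (4 KiB pages) are assumptions made for illustration; the real struct module_layout and PAGE_SHIFT are defined by the kernel and the architecture.

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* Hypothetical mirror of the offsets named in the layout comment. */
struct layout_sketch {
	unsigned long base;
	unsigned long text_size;		/* end of text */
	unsigned long ro_size;			/* end of read-only data */
	unsigned long ro_after_init_size;	/* end of ro-after-init */
	unsigned long size;			/* end of writable data */
};

struct page_range {
	unsigned long start;
	unsigned long num_pages;
};

enum region {
	REGION_TEXT,
	REGION_RODATA,
	REGION_RO_AFTER_INIT,
	REGION_DATA,
};

/*
 * Return the start address and page count of one region, computed
 * the same way the frob_*() helpers derive their set_memory()
 * arguments: start = base + previous offset, length = difference
 * between this offset and the previous one.
 */
static struct page_range region_range(const struct layout_sketch *l,
				      enum region which)
{
	unsigned long lo = 0, hi = 0;
	struct page_range r;

	switch (which) {
	case REGION_TEXT:
		lo = 0;
		hi = l->text_size;
		break;
	case REGION_RODATA:
		lo = l->text_size;
		hi = l->ro_size;
		break;
	case REGION_RO_AFTER_INIT:
		lo = l->ro_size;
		hi = l->ro_after_init_size;
		break;
	case REGION_DATA:
		lo = l->ro_after_init_size;
		hi = l->size;
		break;
	}

	r.start = l->base + lo;
	r.num_pages = (hi - lo) >> SKETCH_PAGE_SHIFT;
	return r;
}
```

A region may be empty: when ro_after_init_size equals ro_size, the ro-after-init range has zero pages and the writable data starts at the same address.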
2020-04-08 17:31:06 +03:00
/*
* Since some arches are moving towards PAGE_KERNEL module allocations instead
* of PAGE_KERNEL_EXEC , keep frob_text ( ) and module_enable_x ( ) outside of the
* CONFIG_STRICT_MODULE_RWX block below because they are needed regardless of
* whether we are strict .
*/
# ifdef CONFIG_ARCH_HAS_STRICT_MODULE_RWX
2015-11-26 02:15:08 +03:00
static void frob_text ( const struct module_layout * layout ,
int ( * set_memory ) ( unsigned long start , int num_pages ) )
2010-11-17 00:35:16 +03:00
{
2015-11-26 02:15:08 +03:00
BUG_ON ( ( unsigned long ) layout - > base & ( PAGE_SIZE - 1 ) ) ;
BUG_ON ( ( unsigned long ) layout - > text_size & ( PAGE_SIZE - 1 ) ) ;
set_memory ( ( unsigned long ) layout - > base ,
layout - > text_size > > PAGE_SHIFT ) ;
2010-11-17 00:35:16 +03:00
}
2020-04-08 17:31:06 +03:00
static void module_enable_x ( const struct module * mod )
{
frob_text ( & mod - > core_layout , set_memory_x ) ;
frob_text ( & mod - > init_layout , set_memory_x ) ;
}
# else /* !CONFIG_ARCH_HAS_STRICT_MODULE_RWX */
static void module_enable_x ( const struct module * mod ) { }
# endif /* CONFIG_ARCH_HAS_STRICT_MODULE_RWX */
2019-06-25 12:40:28 +03:00
# ifdef CONFIG_STRICT_MODULE_RWX
2015-11-26 02:15:08 +03:00
static void frob_rodata ( const struct module_layout * layout ,
int ( * set_memory ) ( unsigned long start , int num_pages ) )
2010-11-17 00:35:16 +03:00
{
2015-11-26 02:15:08 +03:00
BUG_ON ( ( unsigned long ) layout - > base & ( PAGE_SIZE - 1 ) ) ;
BUG_ON ( ( unsigned long ) layout - > text_size & ( PAGE_SIZE - 1 ) ) ;
BUG_ON ( ( unsigned long ) layout - > ro_size & ( PAGE_SIZE - 1 ) ) ;
set_memory ( ( unsigned long ) layout - > base + layout - > text_size ,
( layout - > ro_size - layout - > text_size ) > > PAGE_SHIFT ) ;
2010-11-17 00:35:16 +03:00
}
2016-07-27 05:36:21 +03:00
static void frob_ro_after_init ( const struct module_layout * layout ,
int ( * set_memory ) ( unsigned long start , int num_pages ) )
{
BUG_ON ( ( unsigned long ) layout - > base & ( PAGE_SIZE - 1 ) ) ;
BUG_ON ( ( unsigned long ) layout - > ro_size & ( PAGE_SIZE - 1 ) ) ;
BUG_ON ( ( unsigned long ) layout - > ro_after_init_size & ( PAGE_SIZE - 1 ) ) ;
set_memory ( ( unsigned long ) layout - > base + layout - > ro_size ,
( layout - > ro_after_init_size - layout - > ro_size ) > > PAGE_SHIFT ) ;
}
2015-11-26 02:15:08 +03:00
static void frob_writable_data ( const struct module_layout * layout ,
int ( * set_memory ) ( unsigned long start , int num_pages ) )
2010-11-17 00:35:16 +03:00
{
2015-11-26 02:15:08 +03:00
BUG_ON ( ( unsigned long ) layout - > base & ( PAGE_SIZE - 1 ) ) ;
2016-07-27 05:36:21 +03:00
BUG_ON ( ( unsigned long ) layout - > ro_after_init_size & ( PAGE_SIZE - 1 ) ) ;
2015-11-26 02:15:08 +03:00
BUG_ON ( ( unsigned long ) layout - > size & ( PAGE_SIZE - 1 ) ) ;
2016-07-27 05:36:21 +03:00
set_memory ( ( unsigned long ) layout - > base + layout - > ro_after_init_size ,
( layout - > size - layout - > ro_after_init_size ) > > PAGE_SHIFT ) ;
2010-11-17 00:35:16 +03:00
}
2020-04-29 18:24:53 +03:00
static void module_enable_ro ( const struct module * mod , bool after_init )
2011-05-20 02:55:26 +04:00
{
2016-11-14 09:15:05 +03:00
if ( ! rodata_enabled )
return ;
2019-04-26 03:11:37 +03:00
set_vm_flush_reset_perms ( mod - > core_layout . base ) ;
set_vm_flush_reset_perms ( mod - > init_layout . base ) ;
2015-11-26 02:15:08 +03:00
frob_text ( & mod - > core_layout , set_memory_ro ) ;
2019-04-26 03:11:31 +03:00
2015-11-26 02:15:08 +03:00
frob_rodata ( & mod - > core_layout , set_memory_ro ) ;
frob_text ( & mod - > init_layout , set_memory_ro ) ;
frob_rodata ( & mod - > init_layout , set_memory_ro ) ;
2016-07-27 05:36:21 +03:00
if ( after_init )
frob_ro_after_init ( & mod - > core_layout , set_memory_ro ) ;
2010-11-17 00:35:16 +03:00
}
2015-11-26 02:15:08 +03:00
static void module_enable_nx ( const struct module * mod )
2011-05-20 02:55:26 +04:00
{
2015-11-26 02:15:08 +03:00
frob_rodata ( & mod - > core_layout , set_memory_nx ) ;
2016-07-27 05:36:21 +03:00
frob_ro_after_init ( & mod - > core_layout , set_memory_nx ) ;
2015-11-26 02:15:08 +03:00
frob_writable_data ( & mod - > core_layout , set_memory_nx ) ;
frob_rodata ( & mod - > init_layout , set_memory_nx ) ;
frob_writable_data ( & mod - > init_layout , set_memory_nx ) ;
2011-05-20 02:55:26 +04:00
}
2020-04-03 20:13:03 +03:00
static int module_enforce_rwx_sections ( Elf_Ehdr * hdr , Elf_Shdr * sechdrs ,
char * secstrings , struct module * mod )
{
const unsigned long shf_wx = SHF_WRITE | SHF_EXECINSTR ;
int i ;
for ( i = 0 ; i < hdr - > e_shnum ; i + + ) {
if ( ( sechdrs [ i ] . sh_flags & shf_wx ) = = shf_wx )
return - ENOEXEC ;
}
return 0 ;
}
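module_enforce_rwx_sections() above rejects any module that has a section flagged both writable and executable. The check can be sketched in userspace as below. enforce_wx() is a hypothetical name, the flags are passed as a plain array instead of Elf_Shdr entries, and the SKETCH_SHF_* constants are local copies of the ELF SHF_WRITE (0x1) and SHF_EXECINSTR (0x4) values so the block does not depend on a system ELF header.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the ELF section flag values (assumed names). */
#define SKETCH_SHF_WRITE	0x1
#define SKETCH_SHF_EXECINSTR	0x4

/*
 * Return -ENOEXEC if any section has both the write and the
 * exec-instruction flags set, 0 otherwise. Both bits must be
 * present, so the mask is compared for equality rather than
 * tested for a non-zero result.
 */
static int enforce_wx(const uint64_t *sh_flags, size_t count)
{
	const uint64_t shf_wx = SKETCH_SHF_WRITE | SKETCH_SHF_EXECINSTR;

	for (size_t i = 0; i < count; i++) {
		if ((sh_flags[i] & shf_wx) == shf_wx)
			return -ENOEXEC;
	}
	return 0;
}
```

Testing `(flags & mask) == mask` matters here: a plain `flags & mask` would also reject ordinary writable data sections and ordinary executable text sections.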
2019-06-25 12:40:28 +03:00
# else /* !CONFIG_STRICT_MODULE_RWX */
2015-11-26 02:15:08 +03:00
static void module_enable_nx ( const struct module * mod ) { }
2020-04-29 18:24:53 +03:00
static void module_enable_ro ( const struct module * mod , bool after_init ) { }
2020-04-03 20:13:03 +03:00
static int module_enforce_rwx_sections ( Elf_Ehdr * hdr , Elf_Shdr * sechdrs ,
char * secstrings , struct module * mod )
2019-06-20 05:18:14 +03:00
{
2020-04-03 20:13:03 +03:00
return 0 ;
2019-06-20 05:18:14 +03:00
}
2019-06-25 12:40:28 +03:00
# endif /* CONFIG_STRICT_MODULE_RWX */
2010-11-17 00:35:16 +03:00
2016-03-23 03:03:16 +03:00
# ifdef CONFIG_LIVEPATCH
/*
* Persist Elf information about a module . Copy the Elf header ,
* section header table , section string table , and symtab section
* index from info to mod - > klp_info .
*/
static int copy_module_elf ( struct module * mod , struct load_info * info )
{
unsigned int size , symndx ;
int ret ;
size = sizeof ( * mod - > klp_info ) ;
mod - > klp_info = kmalloc ( size , GFP_KERNEL ) ;
if ( mod - > klp_info = = NULL )
return - ENOMEM ;
/* Elf header */
size = sizeof ( mod - > klp_info - > hdr ) ;
memcpy ( & mod - > klp_info - > hdr , info - > hdr , size ) ;
/* Elf section header table */
size = sizeof ( * info - > sechdrs ) * info - > hdr - > e_shnum ;
2018-07-31 19:56:17 +03:00
mod - > klp_info - > sechdrs = kmemdup ( info - > sechdrs , size , GFP_KERNEL ) ;
2016-03-23 03:03:16 +03:00
if ( mod - > klp_info - > sechdrs = = NULL ) {
ret = - ENOMEM ;
goto free_info ;
}
/* Elf section name string table */
size = info - > sechdrs [ info - > hdr - > e_shstrndx ] . sh_size ;
2018-07-31 19:56:17 +03:00
mod - > klp_info - > secstrings = kmemdup ( info - > secstrings , size , GFP_KERNEL ) ;
2016-03-23 03:03:16 +03:00
if ( mod - > klp_info - > secstrings = = NULL ) {
ret = - ENOMEM ;
goto free_sechdrs ;
}
/* Elf symbol section index */
symndx = info - > index . sym ;
mod - > klp_info - > symndx = symndx ;
/*
* For livepatch modules , core_kallsyms . symtab is a complete
* copy of the original symbol table . Adjust sh_addr to point
* to core_kallsyms . symtab since the copy of the symtab in module
* init memory is freed at the end of do_init_module ( ) .
*/
mod - > klp_info - > sechdrs [ symndx ] . sh_addr = \
( unsigned long ) mod - > core_kallsyms . symtab ;
return 0 ;
free_sechdrs :
kfree ( mod - > klp_info - > sechdrs ) ;
free_info :
kfree ( mod - > klp_info ) ;
return ret ;
}
static void free_module_elf ( struct module * mod )
{
kfree ( mod - > klp_info - > sechdrs ) ;
kfree ( mod - > klp_info - > secstrings ) ;
kfree ( mod - > klp_info ) ;
}
# else /* !CONFIG_LIVEPATCH */
static int copy_module_elf ( struct module * mod , struct load_info * info )
{
return 0 ;
}
static void free_module_elf ( struct module * mod )
{
}
# endif /* CONFIG_LIVEPATCH */
2015-01-20 01:37:05 +03:00
void __weak module_memfree ( void * module_region )
2011-06-30 23:22:11 +04:00
{
2019-04-26 03:11:37 +03:00
/*
* This memory may be RO , and freeing RO memory in an interrupt is not
* supported by vmalloc .
*/
WARN_ON ( in_interrupt ( ) ) ;
2011-06-30 23:22:11 +04:00
vfree ( module_region ) ;
}
void __weak module_arch_cleanup ( struct module * mod )
{
}
2015-01-20 01:37:04 +03:00
void __weak module_arch_freeing_init ( struct module * mod )
{
}
2010-06-05 21:17:36 +04:00
/* Free a module, remove from lists, etc. */
2005-04-17 02:20:36 +04:00
static void free_module ( struct module * mod )
{
2009-08-17 12:56:28 +04:00
trace_module_free ( mod ) ;
2010-08-05 22:59:09 +04:00
mod_sysfs_teardown ( mod ) ;
2005-04-17 02:20:36 +04:00
2013-04-17 07:50:03 +04:00
/* We leave it in list to prevent duplicate loads, but make sure
* that no one uses it while it ' s being deconstructed . */
modules, lock around setting of MODULE_STATE_UNFORMED
A panic was seen in the following situation.
There are two threads running on the system. The first thread is a system
monitoring thread that is reading /proc/modules. The second thread is
loading and unloading a module (in this example I'm using my simple
dummy-module.ko). Note, in the "real world" this occurred with the qlogic
driver module.
When doing this, the following panic occurred:
------------[ cut here ]------------
kernel BUG at kernel/module.c:3739!
invalid opcode: 0000 [#1] SMP
Modules linked in: binfmt_misc sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw igb gf128mul glue_helper iTCO_wdt iTCO_vendor_support ablk_helper ptp sb_edac cryptd pps_core edac_core shpchp i2c_i801 pcspkr wmi lpc_ich ioatdma mfd_core dca ipmi_si nfsd ipmi_msghandler auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dummy_module]
CPU: 37 PID: 186343 Comm: cat Tainted: GF O-------------- 3.10.0+ #7
Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
task: ffff8807fd2d8000 ti: ffff88080fa7c000 task.ti: ffff88080fa7c000
RIP: 0010:[<ffffffff810d64c5>] [<ffffffff810d64c5>] module_flags+0xb5/0xc0
RSP: 0018:ffff88080fa7fe18 EFLAGS: 00010246
RAX: 0000000000000003 RBX: ffffffffa03b5200 RCX: 0000000000000000
RDX: 0000000000001000 RSI: ffff88080fa7fe38 RDI: ffffffffa03b5000
RBP: ffff88080fa7fe28 R08: 0000000000000010 R09: 0000000000000000
R10: 0000000000000000 R11: 000000000000000f R12: ffffffffa03b5000
R13: ffffffffa03b5008 R14: ffffffffa03b5200 R15: ffffffffa03b5000
FS: 00007f6ae57ef740(0000) GS:ffff88101e7a0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000404f70 CR3: 0000000ffed48000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffffffffa03b5200 ffff8810101e4800 ffff88080fa7fe70 ffffffff810d666c
ffff88081e807300 000000002e0f2fbf 0000000000000000 ffff88100f257b00
ffffffffa03b5008 ffff88080fa7ff48 ffff8810101e4800 ffff88080fa7fee0
Call Trace:
[<ffffffff810d666c>] m_show+0x19c/0x1e0
[<ffffffff811e4d7e>] seq_read+0x16e/0x3b0
[<ffffffff812281ed>] proc_reg_read+0x3d/0x80
[<ffffffff811c0f2c>] vfs_read+0x9c/0x170
[<ffffffff811c1a58>] SyS_read+0x58/0xb0
[<ffffffff81605829>] system_call_fastpath+0x16/0x1b
Code: 48 63 c2 83 c2 01 c6 04 03 29 48 63 d2 eb d9 0f 1f 80 00 00 00 00 48 63 d2 c6 04 13 2d 41 8b 0c 24 8d 50 02 83 f9 01 75 b2 eb cb <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
RIP [<ffffffff810d64c5>] module_flags+0xb5/0xc0
RSP <ffff88080fa7fe18>
Consider the two processes running on the system.
CPU 0 (/proc/modules reader)
CPU 1 (loading/unloading module)
CPU 0 opens /proc/modules, and starts displaying data for each module by
traversing the modules list via fs/seq_file.c:seq_open() and
fs/seq_file.c:seq_read(). For each module in the modules list, seq_read
does
op->start() <-- this is a pointer to m_start()
op->show() <- this is a pointer to m_show()
op->stop() <-- this is a pointer to m_stop()
The m_start(), m_show(), and m_stop() module functions are defined in
kernel/module.c. The m_start() and m_stop() functions acquire and release
the module_mutex respectively.
That is, when reading /proc/modules, the module_mutex is acquired and
released once for each module.
m_show() is called with the module_mutex held. It accesses the module
struct data and attempts to write out module data. It is in this code
path that the above BUG_ON() warning is encountered, specifically m_show()
calls
static char *module_flags(struct module *mod, char *buf)
{
int bx = 0;
BUG_ON(mod->state == MODULE_STATE_UNFORMED);
...
The other thread, CPU 1, in unloading the module calls the syscall
delete_module() defined in kernel/module.c. The module_mutex is acquired
for a short time, and then released. free_module() is called without the
module_mutex. free_module() then sets mod->state = MODULE_STATE_UNFORMED,
also without the module_mutex. Some additional code is called and then the
module_mutex is reacquired to remove the module from the modules list:
/* Now we can delete it from the lists */
mutex_lock(&module_mutex);
stop_machine(__unlink_module, mod, NULL);
mutex_unlock(&module_mutex);
This is the sequence of events that leads to the panic.
CPU 1 is removing dummy_module via delete_module(). It acquires the
module_mutex, and then releases it. CPU 1 has NOT set dummy_module->state to
MODULE_STATE_UNFORMED yet.
CPU 0, which is reading the /proc/modules, acquires the module_mutex and
acquires a pointer to the dummy_module which is still in the modules list.
CPU 0 calls m_show for dummy_module. The check in m_show() for
MODULE_STATE_UNFORMED passed for dummy_module even though it is being
torn down.
Meanwhile CPU 1, which has been continuing to remove dummy_module without
holding the module_mutex, now calls free_module() and sets
dummy_module->state to MODULE_STATE_UNFORMED.
CPU 0 now calls module_flags() with dummy_module and ...
static char *module_flags(struct module *mod, char *buf)
{
int bx = 0;
BUG_ON(mod->state == MODULE_STATE_UNFORMED);
and BOOM.
Acquire and release the module_mutex lock around the setting of
MODULE_STATE_UNFORMED in the teardown path, which should resolve the
problem.
Testing: In the unpatched kernel I can panic the system within 1 minute by
doing
while (true) do insmod dummy_module.ko; rmmod dummy_module.ko; done
and
while (true) do cat /proc/modules; done
in separate terminals.
In the patched kernel I was able to run just over one hour without seeing
any issues. I also verified the output of panic via sysrq-c and the output
of /proc/modules looks correct for all three states for the dummy_module.
dummy_module 12661 0 - Unloading 0xffffffffa03a5000 (OE-)
dummy_module 12661 0 - Live 0xffffffffa03bb000 (OE)
dummy_module 14015 1 - Loading 0xffffffffa03a5000 (OE+)
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
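The locking rule described in the commit message above can be illustrated with a small userspace sketch. It is not the kernel code: `demo_module`, `demo_mutex`, `demo_mark_unformed()`, `demo_flag()` and `demo_show()` are hypothetical names, a pthread mutex stands in for module_mutex, and the output format is simplified. The point it shows is that when the UNFORMED state is only ever written under the same mutex the reader holds, the state cannot change between the reader's check and its later use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-ins for the module states named in the commit message. */
enum demo_state { DEMO_LIVE, DEMO_COMING, DEMO_GOING, DEMO_UNFORMED };

struct demo_module {
	const char *name;
	enum demo_state state;
};

/* Plays the role of module_mutex in this sketch. */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Teardown side: publish UNFORMED only while holding the mutex. */
static void demo_mark_unformed(struct demo_module *mod)
{
	pthread_mutex_lock(&demo_mutex);
	mod->state = DEMO_UNFORMED;
	pthread_mutex_unlock(&demo_mutex);
}

/* Caller must hold demo_mutex; the assert mirrors the BUG_ON() quoted above. */
static char demo_flag(const struct demo_module *mod)
{
	assert(mod->state != DEMO_UNFORMED);
	switch (mod->state) {
	case DEMO_GOING:
		return '-';
	case DEMO_COMING:
		return '+';
	default:
		return ' ';
	}
}

/*
 * Reader side: check and use happen under one critical section.
 * Returns the number of characters formatted, or 0 if the module was skipped.
 */
static int demo_show(const struct demo_module *mod, char *buf, size_t len)
{
	int written = 0;

	pthread_mutex_lock(&demo_mutex);
	if (mod->state != DEMO_UNFORMED)
		written = snprintf(buf, len, "%s (%c)", mod->name, demo_flag(mod));
	pthread_mutex_unlock(&demo_mutex);
	return written;
}
```

A reader that calls `demo_show()` on a module in the GOING state gets a formatted line; once `demo_mark_unformed()` has run, the same call skips the module instead of tripping the assertion.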
2014-10-13 20:21:39 +04:00
mutex_lock ( & module_mutex ) ;
2013-04-17 07:50:03 +04:00
mod - > state = MODULE_STATE_UNFORMED ;
2014-10-13 20:21:39 +04:00
mutex_unlock ( & module_mutex ) ;
2013-04-17 07:50:03 +04:00
2010-07-28 00:18:01 +04:00
/* Remove dynamic debug info */
ddebug_remove_module ( mod - > name ) ;
2005-04-17 02:20:36 +04:00
/* Arch-specific cleanup. */
module_arch_cleanup ( mod ) ;
/* Module unload stuff */
module_unload_free ( mod ) ;
2009-03-31 23:05:29 +04:00
/* Free any allocated parameters. */
destroy_params ( mod - > kp , mod - > num_kp ) ;
2016-03-23 03:03:16 +03:00
if ( is_livepatch_module ( mod ) )
free_module_elf ( mod ) ;
2013-04-17 07:50:03 +04:00
/* Now we can delete it from the lists */
mutex_lock ( & module_mutex ) ;
2014-11-10 01:57:29 +03:00
/* Unlink carefully: kallsyms could be walking list. */
list_del_rcu ( & mod - > list ) ;
2015-05-27 04:39:37 +03:00
mod_tree_remove ( mod ) ;
2014-11-10 01:58:29 +03:00
/* Remove this module from bug list, this uses list_del_rcu */
2014-11-10 01:57:29 +03:00
module_bug_cleanup ( mod ) ;
2015-05-27 04:39:35 +03:00
/* Wait for an RCU grace period before releasing mod->list and buglist. */
2018-11-07 06:17:01 +03:00
synchronize_rcu ( ) ;
2013-04-17 07:50:03 +04:00
mutex_unlock ( & module_mutex ) ;
2015-11-26 02:15:08 +03:00
/* This may be empty, but that's OK */
2015-01-20 01:37:04 +03:00
module_arch_freeing_init ( mod ) ;
2015-11-26 02:14:08 +03:00
module_memfree ( mod - > init_layout . base ) ;
2005-04-17 02:20:36 +04:00
kfree ( mod - > args ) ;
2010-03-10 12:56:10 +03:00
percpu_modfree ( mod ) ;
2010-08-05 22:59:04 +04:00
2015-02-26 18:23:11 +03:00
/* Free lock-classes; relies on the preceding sync_rcu(). */
2015-11-26 02:14:08 +03:00
lockdep_free_key_range ( mod - > core_layout . base , mod - > core_layout . size ) ;
[PATCH] lockdep: core
Do 'make oldconfig' and accept all the defaults for new config options -
reboot into the kernel and if everything goes well it should boot up fine and
you should have /proc/lockdep and /proc/lockdep_stats files.
Typically if the lock validator finds some problem it will print out
voluminous debug output that begins with "BUG: ..." and which syslog output
can be used by kernel developers to figure out the precise locking scenario.
What does the lock validator do? It "observes" and maps all locking rules as
they occur dynamically (as triggered by the kernel's natural use of spinlocks,
rwlocks, mutexes and rwsems). Whenever the lock validator subsystem detects a
new locking scenario, it validates this new rule against the existing set of
rules. If this new rule is consistent with the existing set of rules then the
new rule is added transparently and the kernel continues as normal. If the
new rule could create a deadlock scenario then this condition is printed out.
When determining validity of locking, all possible "deadlock scenarios" are
considered: assuming arbitrary number of CPUs, arbitrary irq context and task
context constellations, running arbitrary combinations of all the existing
locking scenarios. In a typical system this means millions of separate
scenarios. This is why we call it a "locking correctness" validator - for all
rules that are observed the lock validator proves it with mathematical
certainty that a deadlock could not occur (assuming that the lock validator
implementation itself is correct and its internal data structures are not
corrupted by some other kernel subsystem). [see more details and conditionals
of this statement in include/linux/lockdep.h and
Documentation/lockdep-design.txt]
Furthermore, this "all possible scenarios" property of the validator also
enables the finding of complex, highly unlikely multi-CPU multi-context races
via single-context rules, increasing the likelihood of finding bugs
drastically. In practical terms: the lock validator already found a bug in
the upstream kernel that could only occur on systems with 3 or more CPUs, and
which needed 3 very unlikely code sequences to occur at once on the 3 CPUs.
That bug was found and reported on a single-CPU system (!). So in essence a
race will be found "piecemeal", triggering all the necessary components
for the race, without having to reproduce the race scenario itself! In its
short existence the lock validator found and reported many bugs before they
actually caused a real deadlock.
To further increase the efficiency of the validator, the mapping is not per
"lock instance", but per "lock-class". For example, all struct inode objects
in the kernel have inode->inotify_mutex. If there are 10,000 inodes cached,
then there are 10,000 lock objects. But ->inotify_mutex is a single "lock
type", and all locking activities that occur against ->inotify_mutex are
"unified" into this single lock-class. The advantage of the lock-class
approach is that all historical ->inotify_mutex uses are mapped into a single
(and as narrow as possible) set of locking rules - regardless of how many
different tasks or inode structures it took to build this set of rules. The
set of rules persist during the lifetime of the kernel.
To see the rough magnitude of checking that the lock validator does, here's a
portion of /proc/lockdep_stats, fresh after bootup:
lock-classes: 694 [max: 2048]
direct dependencies: 1598 [max: 8192]
indirect dependencies: 17896
all direct dependencies: 16206
dependency chains: 1910 [max: 8192]
in-hardirq chains: 17
in-softirq chains: 105
in-process chains: 1065
stack-trace entries: 38761 [max: 131072]
combined max dependencies: 2033928
hardirq-safe locks: 24
hardirq-unsafe locks: 176
softirq-safe locks: 53
softirq-unsafe locks: 137
irq-safe locks: 59
irq-unsafe locks: 176
The lock validator has observed 1598 actual single-thread locking patterns,
and has validated all possible 2033928 distinct locking scenarios.
More details about the design of the lock validator can be found in
Documentation/lockdep-design.txt, which can also be found at:
http://redhat.com/~mingo/lockdep-patches/lockdep-design.txt
[bunk@stusta.de: cleanups]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
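The per-class rule validation described in the commit message above can be sketched as a tiny dependency graph. This is only an illustration under simplifying assumptions, not the lockdep implementation: lock classes are small integers, irq contexts are ignored, and `demo_dep`, `demo_reachable()` and `demo_add_dependency()` are hypothetical names. A new "B acquired while A held" rule is accepted only if A is not already reachable from B, since that would close a cycle and describe a possible deadlock.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_CLASSES 8

/* demo_dep[a][b] != 0 means class b was acquired while class a was held. */
static unsigned char demo_dep[DEMO_MAX_CLASSES][DEMO_MAX_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' via recorded rules? */
static int demo_reachable(int from, int to, unsigned char *seen)
{
	int next;

	if (from == to)
		return 1;
	seen[from] = 1;
	for (next = 0; next < DEMO_MAX_CLASSES; next++) {
		if (demo_dep[from][next] && !seen[next] &&
		    demo_reachable(next, to, seen))
			return 1;
	}
	return 0;
}

/*
 * Record the rule "acquired is taken while held is held".
 * Returns 0 if the rule is consistent with the existing set, or -1 if it
 * would create a cycle (a potential deadlock); in that case it is not added.
 */
static int demo_add_dependency(int held, int acquired)
{
	unsigned char seen[DEMO_MAX_CLASSES];

	assert(held >= 0 && held < DEMO_MAX_CLASSES);
	assert(acquired >= 0 && acquired < DEMO_MAX_CLASSES);

	memset(seen, 0, sizeof(seen));
	if (demo_reachable(acquired, held, seen))
		return -1;
	demo_dep[held][acquired] = 1;
	return 0;
}
```

Because the check runs on rules rather than on live lock instances, the three orderings 0 then 1, 1 then 2, and 2 then 0 are flagged as soon as the third one is observed, even if they never overlapped in time.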
2006-07-03 11:24:50 +04:00
2005-04-17 02:20:36 +04:00
/* Finally, free the core (containing the module structure) */
2015-11-26 02:14:08 +03:00
module_memfree ( mod - > core_layout . base ) ;
2005-04-17 02:20:36 +04:00
}
void * __symbol_get ( const char * symbol )
{
struct module * owner ;
2008-12-06 03:03:56 +03:00
const struct kernel_symbol * sym ;
2005-04-17 02:20:36 +04:00
2007-07-16 10:41:46 +04:00
preempt_disable ( ) ;
2020-07-30 09:10:26 +03:00
sym = find_symbol ( symbol , & owner , NULL , NULL , true , true ) ;
2008-12-06 03:03:56 +03:00
if ( sym & & strong_try_module_get ( owner ) )
sym = NULL ;
2007-07-16 10:41:46 +04:00
preempt_enable ( ) ;
2005-04-17 02:20:36 +04:00
module: use relative references for __ksymtab entries
An ordinary arm64 defconfig build has ~64 KB worth of __ksymtab entries,
each consisting of two 64-bit fields containing absolute references, to
the symbol itself and to a char array containing its name, respectively.
When we build the same configuration with KASLR enabled, we end up with an
additional ~192 KB of relocations in the .init section, i.e., one 24 byte
entry for each absolute reference, which all need to be processed at boot
time.
Given how the struct kernel_symbol that describes each entry is completely
local to module.c (except for the references emitted by EXPORT_SYMBOL()
itself), we can easily modify it to contain two 32-bit relative references
instead. This reduces the size of the __ksymtab section by 50% for all
64-bit architectures, and gets rid of the runtime relocations entirely for
architectures implementing KASLR, either via standard PIE linking (arm64)
or using custom host tools (x86).
Note that the binary search involving __ksymtab contents relies on each
section being sorted by symbol name. This is implemented based on the
input section names, not the names in the ksymtab entries, so this patch
does not interfere with that.
Given that the use of place-relative relocations requires support both in
the toolchain and in the module loader, we cannot enable this feature for
all architectures. So make it dependent on whether
CONFIG_HAVE_ARCH_PREL32_RELOCATIONS is defined.
Link: http://lkml.kernel.org/r/20180704083651.24360-4-ard.biesheuvel@linaro.org
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morris <james.morris@microsoft.com>
Cc: James Morris <jmorris@namei.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
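The place-relative encoding described in the commit message above can be sketched in userspace as follows. This is a minimal illustration, not the kernel's definition: `demo_rel_symbol`, `demo_set_offset()`, `demo_offset_to_ptr()` and `demo_table` are hypothetical names, and the offsets are filled in at run time here, whereas the commit message describes them being emitted by the toolchain. Each 32-bit field stores the signed distance from the field's own address to its target, so no absolute address (and hence no runtime relocation) is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Two 32-bit self-relative references instead of two 64-bit pointers. */
struct demo_rel_symbol {
	int32_t value_offset;
	int32_t name_offset;
};

/* Resolve a reference: address of the field plus the signed offset in it. */
static void *demo_offset_to_ptr(const int32_t *off)
{
	return (void *)((uintptr_t)off + (intptr_t)*off);
}

/* Encode a reference; the target must lie within +/-2 GiB of the field. */
static void demo_set_offset(int32_t *field, const void *target)
{
	intptr_t delta = (intptr_t)target - (intptr_t)field;

	assert(delta >= INT32_MIN && delta <= INT32_MAX);
	*field = (int32_t)delta;
}

/* Entry and targets share one object so the offsets are guaranteed small. */
struct demo_table {
	struct demo_rel_symbol entry;
	int value;
	char name[16];
};

static struct demo_table demo = { { 0, 0 }, 42, "demo_symbol" };

static void demo_init(void)
{
	demo_set_offset(&demo.entry.value_offset, &demo.value);
	demo_set_offset(&demo.entry.name_offset, demo.name);
}

static int *demo_symbol_value(const struct demo_rel_symbol *sym)
{
	return demo_offset_to_ptr(&sym->value_offset);
}

static const char *demo_symbol_name(const struct demo_rel_symbol *sym)
{
	return demo_offset_to_ptr(&sym->name_offset);
}
```

On a 64-bit target the entry is 8 bytes rather than 16, which matches the 50% size reduction the commit message claims for the __ksymtab section.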
2018-08-22 07:56:09 +03:00
return sym ? ( void * ) kernel_symbol_value ( sym ) : NULL ;
2005-04-17 02:20:36 +04:00
}
EXPORT_SYMBOL_GPL ( __symbol_get ) ;
2006-01-08 12:04:25 +03:00
/*
* Ensure that an exported symbol [ global namespace ] does not already exist
2007-05-09 09:26:28 +04:00
* in the kernel or in some other module ' s exported symbol table .
2010-06-05 21:17:37 +04:00
*
* You must hold the module_mutex .
2006-01-08 12:04:25 +03:00
*/
2018-11-19 19:43:58 +03:00
static int verify_exported_symbols ( struct module * mod )
2006-01-08 12:04:25 +03:00
{
2008-05-02 06:15:00 +04:00
unsigned int i ;
2006-01-08 12:04:25 +03:00
struct module * owner ;
2008-05-02 06:15:00 +04:00
const struct kernel_symbol * s ;
struct {
const struct kernel_symbol * sym ;
unsigned int num ;
} arr [ ] = {
{ mod - > syms , mod - > num_syms } ,
{ mod - > gpl_syms , mod - > num_gpl_syms } ,
{ mod - > gpl_future_syms , mod - > num_gpl_future_syms } ,
2008-07-23 04:24:26 +04:00
# ifdef CONFIG_UNUSED_SYMBOLS
2008-05-02 06:15:00 +04:00
{ mod - > unused_syms , mod - > num_unused_syms } ,
{ mod - > unused_gpl_syms , mod - > num_unused_gpl_syms } ,
2008-07-23 04:24:26 +04:00
# endif
2008-05-02 06:15:00 +04:00
} ;
2006-01-08 12:04:25 +03:00
2008-05-02 06:15:00 +04:00
for ( i = 0 ; i < ARRAY_SIZE ( arr ) ; i + + ) {
for ( s = arr [ i ] . sym ; s < arr [ i ] . sym + arr [ i ] . num ; s + + ) {
2018-08-22 07:56:09 +03:00
if ( find_symbol ( kernel_symbol_name ( s ) , & owner , NULL ,
2020-07-30 09:10:26 +03:00
NULL , true , false ) ) {
2013-11-13 03:11:28 +04:00
pr_err ( " %s: exports duplicate symbol %s "
2008-05-02 06:15:00 +04:00
" (owned by %s) \n " ,
2018-08-22 07:56:09 +03:00
mod - > name , kernel_symbol_name ( s ) ,
module_name ( owner ) ) ;
2008-05-02 06:15:00 +04:00
return - ENOEXEC ;
}
2006-01-08 12:04:25 +03:00
}
2008-05-02 06:15:00 +04:00
}
return 0 ;
2006-01-08 12:04:25 +03:00
}
2007-11-08 19:37:38 +03:00
/* Change all symbols so that st_value encodes the pointer directly. */
2010-08-05 22:59:10 +04:00
static int simplify_symbols ( struct module * mod , const struct load_info * info )
{
Elf_Shdr * symsec = & info - > sechdrs [ info - > index . sym ] ;
Elf_Sym * sym = ( void * ) symsec - > sh_addr ;
2005-04-17 02:20:36 +04:00
unsigned long secbase ;
2010-08-05 22:59:10 +04:00
unsigned int i ;
2005-04-17 02:20:36 +04:00
int ret = 0 ;
2008-12-06 03:03:56 +03:00
const struct kernel_symbol * ksym ;
2005-04-17 02:20:36 +04:00
2010-08-05 22:59:10 +04:00
for ( i = 1 ; i < symsec - > sh_size / sizeof ( Elf_Sym ) ; i + + ) {
const char * name = info - > strtab + sym [ i ] . st_name ;
2005-04-17 02:20:36 +04:00
switch ( sym [ i ] . st_shndx ) {
case SHN_COMMON :
2014-02-08 12:01:09 +04:00
/* Ignore common symbols */
if ( ! strncmp ( name , " __gnu_lto " , 9 ) )
break ;
2005-04-17 02:20:36 +04:00
/* We compiled with -fno-common. These are not
supposed to happen . */
2011-12-06 23:11:31 +04:00
pr_debug ( " Common symbol: %s \n " , name ) ;
2014-11-10 02:01:29 +03:00
pr_warn ( " %s: please compile with -fno-common \n " ,
2005-04-17 02:20:36 +04:00
mod - > name ) ;
ret = - ENOEXEC ;
break ;
case SHN_ABS :
/* Don't need to do anything */
2011-12-06 23:11:31 +04:00
pr_debug ( " Absolute symbol: 0x%08lx \n " ,
2005-04-17 02:20:36 +04:00
( long ) sym [ i ] . st_value ) ;
break ;
2016-03-23 03:03:16 +03:00
case SHN_LIVEPATCH :
/* Livepatch symbols are resolved by livepatch */
break ;
2005-04-17 02:20:36 +04:00
case SHN_UNDEF :
2010-08-05 22:59:10 +04:00
ksym = resolve_symbol_wait ( mod , info , name ) ;
2005-04-17 02:20:36 +04:00
/* Ok if resolved. */
2010-06-05 21:17:37 +04:00
if ( ksym & & ! IS_ERR ( ksym ) ) {
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 07:56:09 +03:00
sym [ i ] . st_value = kernel_symbol_value ( ksym ) ;
2005-04-17 02:20:36 +04:00
break ;
2008-12-06 03:03:56 +03:00
}
2005-04-17 02:20:36 +04:00
/* Ok if weak. */
2010-06-05 21:17:37 +04:00
if ( ! ksym & & ELF_ST_BIND ( sym [ i ] . st_info ) = = STB_WEAK )
2005-04-17 02:20:36 +04:00
break ;
2010-06-05 21:17:37 +04:00
ret = PTR_ERR ( ksym ) ? : - ENOENT ;
2018-06-22 18:38:50 +03:00
pr_warn ( " %s: Unknown symbol %s (err %d) \n " ,
mod - > name , name , ret ) ;
2005-04-17 02:20:36 +04:00
break ;
default :
/* Divert to percpu allocation if a percpu var. */
2010-08-05 22:59:10 +04:00
if ( sym [ i ] . st_shndx = = info - > index . pcpu )
2010-03-10 12:56:10 +03:00
secbase = ( unsigned long ) mod_percpu ( mod ) ;
2005-04-17 02:20:36 +04:00
else
2010-08-05 22:59:10 +04:00
secbase = info - > sechdrs [ sym [ i ] . st_shndx ] . sh_addr ;
2005-04-17 02:20:36 +04:00
sym [ i ] . st_value + = secbase ;
break ;
}
}
return ret ;
}
2010-08-05 22:59:10 +04:00
static int apply_relocations ( struct module * mod , const struct load_info * info )
2010-08-05 22:59:05 +04:00
{
unsigned int i ;
int err = 0 ;
/* Now do relocations. */
2010-08-05 22:59:10 +04:00
for ( i = 1 ; i < info - > hdr - > e_shnum ; i + + ) {
unsigned int infosec = info - > sechdrs [ i ] . sh_info ;
2010-08-05 22:59:05 +04:00
/* Not a valid relocation section? */
2010-08-05 22:59:10 +04:00
if ( infosec > = info - > hdr - > e_shnum )
2010-08-05 22:59:05 +04:00
continue ;
/* Don't bother with non-allocated sections */
2010-08-05 22:59:10 +04:00
if ( ! ( info - > sechdrs [ infosec ] . sh_flags & SHF_ALLOC ) )
2010-08-05 22:59:05 +04:00
continue ;
2016-03-23 03:03:16 +03:00
if ( info - > sechdrs [ i ] . sh_flags & SHF_RELA_LIVEPATCH )
livepatch: Apply vmlinux-specific KLP relocations early
KLP relocations are livepatch-specific relocations which are applied to
a KLP module's text or data. They exist for two reasons:
1) Unexported symbols: replacement functions often need to access
unexported symbols (e.g. static functions), which "normal"
relocations don't allow.
2) Late module patching: this is the ability for a KLP module to
bypass normal module dependencies, such that the KLP module can be
loaded *before* a to-be-patched module. This means that
relocations which need to access symbols in the to-be-patched
module might need to be applied to the KLP module well after it has
been loaded.
Non-late-patched KLP relocations are applied from the KLP module's init
function. That usually works fine, unless the patched code wants to use
alternatives, paravirt patching, jump tables, or some other special
section which needs relocations. Then we run into ordering issues and
crashes.
In order for those special sections to work properly, the KLP
relocations should be applied *before* the special section init code
runs, such as apply_paravirt(), apply_alternatives(), or
jump_label_apply_nops().
You might think the obvious solution would be to move the KLP relocation
initialization earlier, but it's not necessarily that simple. The
problem is the above-mentioned late module patching, for which KLP
relocations can get applied well after the KLP module is loaded.
To "fix" this issue in the past, we created .klp.arch sections:
.klp.arch.{module}..altinstructions
.klp.arch.{module}..parainstructions
Those sections allow KLP late module patching code to call
apply_paravirt() and apply_alternatives() after the module-specific KLP
relocations (.klp.rela.{module}.{section}) have been applied.
But that has a lot of drawbacks, including code complexity, the need for
arch-specific code, and the (per-arch) danger that we missed some
special section -- for example the __jump_table section which is used
for jump labels.
It turns out there's a simpler and more functional approach. There are
two kinds of KLP relocation sections:
1) vmlinux-specific KLP relocation sections
.klp.rela.vmlinux.{sec}
These are relocations (applied to the KLP module) which reference
unexported vmlinux symbols.
2) module-specific KLP relocation sections
.klp.rela.{module}.{sec}:
These are relocations (applied to the KLP module) which reference
unexported or exported module symbols.
Up until now, these have been treated the same. However, they're
inherently different.
Because of late module patching, module-specific KLP relocations can be
applied very late, thus they can create the ordering headaches described
above.
But vmlinux-specific KLP relocations don't have that problem. There's
nothing to prevent them from being applied earlier. So apply them at
the same time as normal relocations, when the KLP module is being
loaded.
This means that for vmlinux-specific KLP relocations, we no longer have
any ordering issues. vmlinux-referencing jump labels, alternatives, and
paravirt patching will work automatically, without the need for the
.klp.arch hacks.
All that said, for module-specific KLP relocations, the ordering
problems still exist and we *do* still need .klp.arch. Or do we? Stay
tuned.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2020-04-29 18:24:44 +03:00
err = klp_apply_section_relocs ( mod , info - > sechdrs ,
info - > secstrings ,
info - > strtab ,
info - > index . sym , i ,
NULL ) ;
else if ( info - > sechdrs [ i ] . sh_type = = SHT_REL )
2010-08-05 22:59:10 +04:00
err = apply_relocate ( info - > sechdrs , info - > strtab ,
info - > index . sym , i , mod ) ;
else if ( info - > sechdrs [ i ] . sh_type = = SHT_RELA )
err = apply_relocate_add ( info - > sechdrs , info - > strtab ,
info - > index . sym , i , mod ) ;
2010-08-05 22:59:05 +04:00
if ( err < 0 )
break ;
}
return err ;
}
2008-12-31 14:31:18 +03:00
/* Additional bytes needed by arch in front of individual sections */
unsigned int __weak arch_mod_section_prepend ( struct module * mod ,
unsigned int section )
{
/* default implementation just returns zero */
return 0 ;
}
2005-04-17 02:20:36 +04:00
/* Update size with this section: return offset. */
2008-12-31 14:31:18 +03:00
static long get_offset ( struct module * mod , unsigned int * size ,
Elf_Shdr * sechdr , unsigned int section )
2005-04-17 02:20:36 +04:00
{
long ret ;
2008-12-31 14:31:18 +03:00
* size + = arch_mod_section_prepend ( mod , section ) ;
2005-04-17 02:20:36 +04:00
ret = ALIGN ( * size , sechdr - > sh_addralign ? : 1 ) ;
* size = ret + sechdr - > sh_size ;
return ret ;
}
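The offset bookkeeping in get_offset() can be tried out in userspace. The sketch below is not the kernel code: align_up() and place_section() are hypothetical stand-ins for ALIGN() and get_offset(), and the arch-specific prepend is assumed to be zero, as in the __weak default above.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical userspace analogue of get_offset(): reserve sh_size bytes
 * at the next suitably aligned offset, grow *size to cover them, and
 * return the offset. An alignment of 0 is treated as 1, mirroring the
 * "?: 1" fallback in the original.
 */
static size_t place_section(size_t *size, size_t sh_size, size_t sh_addralign)
{
	size_t off = align_up(*size, sh_addralign ? sh_addralign : 1);

	*size = off + sh_size;
	return off;
}

/* Usage: three sections placed back to back into one running total. */
void place_section_demo(void)
{
	size_t total = 0;

	assert(place_section(&total, 10, 4) == 0);   /* first section starts at 0 */
	assert(place_section(&total, 8, 16) == 16);  /* 10 rounded up to 16 */
	assert(place_section(&total, 3, 0) == 24);   /* alignment 0 behaves like 1 */
	assert(total == 27);
}
```

The running total doubles as the layout size, which is why layout_sections() can pass &mod->core_layout.size or &mod->init_layout.size directly.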
/* Lay out the SHF_ALLOC sections in a way not dissimilar to how ld
might -- code, read-only data, read-write data, small data. Tally
sizes, and place the offsets into sh_entsize fields: high bit means it
belongs in init. */
2010-08-05 22:59:10 +04:00
static void layout_sections ( struct module * mod , struct load_info * info )
2005-04-17 02:20:36 +04:00
{
static unsigned long const masks [ ] [ 2 ] = {
/* NOTE: all executable code must be the first section
* in this array; otherwise modify the text_size
* finder in the two loops below */
{ SHF_EXECINSTR | SHF_ALLOC , ARCH_SHF_SMALL } ,
{ SHF_ALLOC , SHF_WRITE | ARCH_SHF_SMALL } ,
2016-07-27 05:36:21 +03:00
{ SHF_RO_AFTER_INIT | SHF_ALLOC , ARCH_SHF_SMALL } ,
2005-04-17 02:20:36 +04:00
{ SHF_WRITE | SHF_ALLOC , ARCH_SHF_SMALL } ,
{ ARCH_SHF_SMALL | SHF_ALLOC , 0 }
} ;
unsigned int m , i ;
2010-08-05 22:59:10 +04:00
for ( i = 0 ; i < info - > hdr - > e_shnum ; i + + )
info - > sechdrs [ i ] . sh_entsize = ~ 0UL ;
2005-04-17 02:20:36 +04:00
2011-12-06 23:11:31 +04:00
pr_debug ( " Core section allocation order: \n " ) ;
2005-04-17 02:20:36 +04:00
for ( m = 0 ; m < ARRAY_SIZE ( masks ) ; + + m ) {
2010-08-05 22:59:10 +04:00
for ( i = 0 ; i < info - > hdr - > e_shnum ; + + i ) {
Elf_Shdr * s = & info - > sechdrs [ i ] ;
const char * sname = info - > secstrings + s - > sh_name ;
2005-04-17 02:20:36 +04:00
if ( ( s - > sh_flags & masks [ m ] [ 0 ] ) ! = masks [ m ] [ 0 ]
| | ( s - > sh_flags & masks [ m ] [ 1 ] )
| | s - > sh_entsize ! = ~ 0UL
2020-05-14 13:36:41 +03:00
| | module_init_section ( sname ) )
2005-04-17 02:20:36 +04:00
continue ;
2015-11-26 02:14:08 +03:00
s - > sh_entsize = get_offset ( mod , & mod - > core_layout . size , s , i ) ;
2011-12-06 23:11:31 +04:00
pr_debug ( " \t %s \n " , sname ) ;
2005-04-17 02:20:36 +04:00
}
2010-11-17 00:35:16 +03:00
switch ( m ) {
case 0 : /* executable */
2015-11-26 02:14:08 +03:00
mod - > core_layout . size = debug_align ( mod - > core_layout . size ) ;
mod - > core_layout . text_size = mod - > core_layout . size ;
2010-11-17 00:35:16 +03:00
break ;
case 1 : /* RO: text and ro-data */
2015-11-26 02:14:08 +03:00
mod - > core_layout . size = debug_align ( mod - > core_layout . size ) ;
mod - > core_layout . ro_size = mod - > core_layout . size ;
2010-11-17 00:35:16 +03:00
break ;
2016-07-27 05:36:21 +03:00
case 2 : /* RO after init */
mod - > core_layout . size = debug_align ( mod - > core_layout . size ) ;
mod - > core_layout . ro_after_init_size = mod - > core_layout . size ;
break ;
case 4 : /* whole core */
2015-11-26 02:14:08 +03:00
mod - > core_layout . size = debug_align ( mod - > core_layout . size ) ;
2010-11-17 00:35:16 +03:00
break ;
}
2005-04-17 02:20:36 +04:00
}
2011-12-06 23:11:31 +04:00
pr_debug ( " Init section allocation order: \n " ) ;
2005-04-17 02:20:36 +04:00
for ( m = 0 ; m < ARRAY_SIZE ( masks ) ; + + m ) {
2010-08-05 22:59:10 +04:00
for ( i = 0 ; i < info - > hdr - > e_shnum ; + + i ) {
Elf_Shdr * s = & info - > sechdrs [ i ] ;
const char * sname = info - > secstrings + s - > sh_name ;
2005-04-17 02:20:36 +04:00
if ( ( s - > sh_flags & masks [ m ] [ 0 ] ) ! = masks [ m ] [ 0 ]
| | ( s - > sh_flags & masks [ m ] [ 1 ] )
| | s - > sh_entsize ! = ~ 0UL
2020-05-14 13:36:41 +03:00
| | ! module_init_section ( sname ) )
2005-04-17 02:20:36 +04:00
continue ;
2015-11-26 02:14:08 +03:00
s - > sh_entsize = ( get_offset ( mod , & mod - > init_layout . size , s , i )
2005-04-17 02:20:36 +04:00
| INIT_OFFSET_MASK ) ;
2011-12-06 23:11:31 +04:00
pr_debug ( " \t %s \n " , sname ) ;
2005-04-17 02:20:36 +04:00
}
2010-11-17 00:35:16 +03:00
switch ( m ) {
case 0 : /* executable */
2015-11-26 02:14:08 +03:00
mod - > init_layout . size = debug_align ( mod - > init_layout . size ) ;
mod - > init_layout . text_size = mod - > init_layout . size ;
2010-11-17 00:35:16 +03:00
break ;
case 1 : /* RO: text and ro-data */
2015-11-26 02:14:08 +03:00
mod - > init_layout . size = debug_align ( mod - > init_layout . size ) ;
mod - > init_layout . ro_size = mod - > init_layout . size ;
2010-11-17 00:35:16 +03:00
break ;
2016-07-27 05:36:21 +03:00
case 2 :
/*
* RO after init doesn't apply to init_layout (only
* core_layout), so it just takes the value of ro_size.
*/
mod - > init_layout . ro_after_init_size = mod - > init_layout . ro_size ;
break ;
case 4 : /* whole init */
2015-11-26 02:14:08 +03:00
mod - > init_layout . size = debug_align ( mod - > init_layout . size ) ;
2010-11-17 00:35:16 +03:00
break ;
}
2005-04-17 02:20:36 +04:00
}
}
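The masks[] table in layout_sections() pairs "flags that must all be set" with "flags that must all be clear", and a section is claimed by the first pass whose pair it satisfies. A minimal userspace sketch of that matching follows. The DEMO_ names and demo_first_pass() are hypothetical; the write/alloc/exec values follow the ELF specification, while the RO_AFTER_INIT value and a zero small-data flag are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SHF_WRITE         0x1UL
#define DEMO_SHF_ALLOC         0x2UL
#define DEMO_SHF_EXECINSTR     0x4UL
#define DEMO_SHF_RO_AFTER_INIT 0x00200000UL /* assumed value, illustration only */
#define DEMO_SHF_SMALL         0UL          /* assumed: no small-data flag on this arch */

/* { required flags, forbidden flags }, in the same order as masks[]. */
static const unsigned long demo_masks[][2] = {
	{ DEMO_SHF_EXECINSTR | DEMO_SHF_ALLOC, DEMO_SHF_SMALL },
	{ DEMO_SHF_ALLOC, DEMO_SHF_WRITE | DEMO_SHF_SMALL },
	{ DEMO_SHF_RO_AFTER_INIT | DEMO_SHF_ALLOC, DEMO_SHF_SMALL },
	{ DEMO_SHF_WRITE | DEMO_SHF_ALLOC, DEMO_SHF_SMALL },
	{ DEMO_SHF_SMALL | DEMO_SHF_ALLOC, 0 },
};

/* Return the index of the first pass that would claim a section, or -1. */
int demo_first_pass(unsigned long sh_flags)
{
	size_t m;

	for (m = 0; m < sizeof(demo_masks) / sizeof(demo_masks[0]); m++) {
		if ((sh_flags & demo_masks[m][0]) == demo_masks[m][0] &&
		    !(sh_flags & demo_masks[m][1]))
			return (int)m;
	}
	return -1;
}

/* Usage: typical text, read-only data and writable data sections. */
void demo_first_pass_demo(void)
{
	assert(demo_first_pass(DEMO_SHF_ALLOC | DEMO_SHF_EXECINSTR) == 0);
	assert(demo_first_pass(DEMO_SHF_ALLOC) == 1);
	assert(demo_first_pass(DEMO_SHF_ALLOC | DEMO_SHF_WRITE) == 3);
}
```

Because passes run in table order, the regions come out as text, read-only data, ro-after-init data, then writable data, which is what lets the switch statements record text_size, ro_size and ro_after_init_size as running boundaries.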
static void set_license ( struct module * mod , const char * license )
{
if ( ! license )
license = " unspecified " ;
2006-10-11 12:21:48 +04:00
if ( ! license_is_gpl_compatible ( license ) ) {
2008-10-16 09:01:41 +04:00
if ( ! test_taint ( TAINT_PROPRIETARY_MODULE ) )
2013-11-13 03:11:28 +04:00
pr_warn ( " %s: module license '%s' taints kernel. \n " ,
mod - > name , license ) ;
2013-01-21 10:47:39 +04:00
add_taint_module ( mod , TAINT_PROPRIETARY_MODULE ,
LOCKDEP_NOW_UNRELIABLE ) ;
2005-04-17 02:20:36 +04:00
}
}
/* Parse tag=value strings from .modinfo section */
static char * next_string ( char * string , unsigned long * secsize )
{
/* Skip non-zero chars */
while ( string [ 0 ] ) {
string + + ;
if ( ( * secsize ) - - < = 1 )
return NULL ;
}
/* Skip any zero padding. */
while ( ! string [ 0 ] ) {
string + + ;
if ( ( * secsize ) - - < = 1 )
return NULL ;
}
return string ;
}
2019-09-06 13:32:25 +03:00
static char * get_next_modinfo ( const struct load_info * info , const char * tag ,
char * prev )
2005-04-17 02:20:36 +04:00
{
char * p ;
unsigned int taglen = strlen ( tag ) ;
2010-08-05 22:59:10 +04:00
Elf_Shdr * infosec = & info - > sechdrs [ info - > index . info ] ;
unsigned long size = infosec - > sh_size ;
2005-04-17 02:20:36 +04:00
2018-06-22 15:00:01 +03:00
/*
* get_modinfo() calls made before rewrite_section_headers()
* must use sh_offset, as sh_addr isn't set!
*/
2019-09-06 13:32:25 +03:00
char * modinfo = ( char * ) info - > hdr + infosec - > sh_offset ;
if ( prev ) {
size - = prev - modinfo ;
modinfo = next_string ( prev , & size ) ;
}
for ( p = modinfo ; p ; p = next_string ( p , & size ) ) {
2005-04-17 02:20:36 +04:00
if ( strncmp ( p , tag , taglen ) = = 0 & & p [ taglen ] = = ' = ' )
return p + taglen + 1 ;
}
return NULL ;
}
2019-09-06 13:32:25 +03:00
static char * get_modinfo ( const struct load_info * info , const char * tag )
{
return get_next_modinfo ( info , tag , NULL ) ;
}
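The .modinfo section walked by next_string() and get_next_modinfo() is a run of NUL-terminated "tag=value" strings, possibly separated by extra NUL padding. The sketch below shows the same lookup over a plain buffer in userspace. demo_modinfo() is a hypothetical helper, not the kernel routine: it tracks a position rather than a decrementing size and only returns the first match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical lookup over a .modinfo-style buffer. Returns a pointer to
 * the value of the first "tag=value" entry, or NULL if the tag is absent
 * or the buffer ends in an unterminated string.
 */
const char *demo_modinfo(const char *buf, size_t len, const char *tag)
{
	size_t taglen = strlen(tag);
	size_t pos = 0;

	while (pos < len) {
		const char *p = buf + pos;
		const char *end = memchr(p, '\0', len - pos);
		size_t n;

		if (!end)
			return NULL;	/* unterminated trailing string */
		n = (size_t)(end - p);
		/* n > taglen guarantees p[taglen] is inside this string. */
		if (n > taglen && p[taglen] == '=' &&
		    strncmp(p, tag, taglen) == 0)
			return p + taglen + 1;
		pos += n + 1;	/* padding NULs are empty strings: advance by 1 */
	}
	return NULL;
}

/* Usage: two entries separated by one byte of zero padding. */
void demo_modinfo_demo(void)
{
	static const char buf[] = "license=GPL\0\0author=Jane\0version=1.0";

	assert(strcmp(demo_modinfo(buf, sizeof(buf), "license"), "GPL") == 0);
	assert(strcmp(demo_modinfo(buf, sizeof(buf), "author"), "Jane") == 0);
	assert(demo_modinfo(buf, sizeof(buf), "vermagic") == NULL);
}
```

Checking for '=' immediately after the tag is what keeps a query for "lic" from matching "license=GPL", the same guard the p[taglen] test provides above.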
2010-08-05 22:59:10 +04:00
static void setup_modinfo ( struct module * mod , struct load_info * info )
[PATCH] modules: add version and srcversion to sysfs
This patch adds version and srcversion files to
/sys/module/${modulename} containing the version and srcversion fields
of the module's modinfo section (if present).
/sys/module/e1000
|-- srcversion
`-- version
This patch differs slightly from the version posted in January, as it
now uses the new kstrdup() call in -mm.
Why put this in sysfs?
a) Tools like DKMS, which deal with changing out individual kernel
modules without replacing the whole kernel, can behave smarter if they
can tell the version of a given module. The autoinstaller feature, for
example, which determines if your system has a "good" version of a
driver (i.e. if the one provided by DKMS has a newer version than that
provided by the kernel package installed), and to automatically compile
and install a newer version if DKMS has it but your kernel doesn't yet
have that version.
b) Because sysadmins manually, or with tools like DKMS, can switch out
modules on the file system, you can't count on 'modinfo foo.ko', which
looks at /lib/modules/${kernelver}/... actually matching what is loaded
into the kernel already. Hence asking sysfs for this.
c) as the unbind-driver-from-device work takes shape, it will be
possible to rebind a driver that's built-in (no .ko to modinfo for the
version) to a newly loaded module. sysfs will have the
currently-built-in version info, for comparison.
d) tech support scripts can then easily grab the version info for what's
running presently - a question I get often.
There has been renewed interest in this patch on linux-scsi by driver
authors.
As the idea originated from GregKH, I leave his Signed-off-by: intact,
though the implementation is nearly completely new. Compiled and run on
x86 and x86_64.
From: Matthew Dobson <colpatch@us.ibm.com>
build fix
From: Thierry Vignaud <tvignaud@mandriva.com>
build fix
From: Matthew Dobson <colpatch@us.ibm.com>
warning fix
Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 09:05:15 +04:00
{
struct module_attribute * attr ;
int i ;
for ( i = 0 ; ( attr = modinfo_attrs [ i ] ) ; i + + ) {
if ( attr - > setup )
2010-08-05 22:59:10 +04:00
attr - > setup ( mod , get_modinfo ( info , attr - > attr . name ) ) ;
2005-06-24 09:05:15 +04:00
}
}
2009-09-25 10:32:58 +04:00
static void free_modinfo ( struct module * mod )
{
struct module_attribute * attr ;
int i ;
for ( i = 0 ; ( attr = modinfo_attrs [ i ] ) ; i + + ) {
if ( attr - > free )
attr - > free ( mod ) ;
}
}
2005-04-17 02:20:36 +04:00
# ifdef CONFIG_KALLSYMS
2008-07-24 18:41:48 +04:00
2018-11-19 19:43:58 +03:00
/* Lookup exported symbol in given range of kernel_symbols */
static const struct kernel_symbol * lookup_exported_symbol ( const char * name ,
const struct kernel_symbol * start ,
const struct kernel_symbol * stop )
2008-07-24 18:41:48 +04:00
{
2011-05-19 00:35:59 +04:00
return bsearch ( name , start , stop - start ,
sizeof ( struct kernel_symbol ) , cmp_name ) ;
2008-07-24 18:41:48 +04:00
}
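lookup_exported_symbol() relies on the symbol range being sorted by name so that bsearch() can be used. A userspace sketch of the same pattern with the C library's bsearch() follows; struct demo_symbol, demo_cmp_name() and demo_lookup() are hypothetical names, and the table stores plain name pointers rather than the kernel's struct kernel_symbol layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct demo_symbol {
	const char *name;
	unsigned long value;
};

/* bsearch() comparator: the key is a name, the element is a table entry. */
static int demo_cmp_name(const void *key, const void *elem)
{
	const struct demo_symbol *sym = elem;

	return strcmp(key, sym->name);
}

/* Look up name in the half-open range [start, stop), sorted by name. */
const struct demo_symbol *demo_lookup(const char *name,
				      const struct demo_symbol *start,
				      const struct demo_symbol *stop)
{
	return bsearch(name, start, (size_t)(stop - start),
		       sizeof(*start), demo_cmp_name);
}

/* Usage: the table must already be in strcmp() order. */
void demo_lookup_demo(void)
{
	static const struct demo_symbol table[] = {
		{ "kfree",   0x1000 },
		{ "kmalloc", 0x2000 },
		{ "printk",  0x3000 },
	};
	const struct demo_symbol *s = demo_lookup("kmalloc", table, table + 3);

	assert(s != NULL && s->value == 0x2000);
	assert(demo_lookup("vmalloc", table, table + 3) == NULL);
}
```

The start/stop pair mirrors how the caller passes either the built-in __ksymtab bounds or a module's syms array plus its length.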
2009-01-05 17:40:10 +03:00
static int is_exported ( const char * name , unsigned long value ,
const struct module * mod )
2005-04-17 02:20:36 +04:00
{
2009-01-05 17:40:10 +03:00
const struct kernel_symbol * ks ;
if ( ! mod )
2018-11-19 19:43:58 +03:00
ks = lookup_exported_symbol ( name , __start___ksymtab , __stop___ksymtab ) ;
2006-02-08 23:16:45 +03:00
else
2018-11-19 19:43:58 +03:00
ks = lookup_exported_symbol ( name , mod - > syms , mod - > syms + mod - > num_syms ) ;
2018-08-22 07:56:09 +03:00
return ks ! = NULL & & kernel_symbol_value ( ks ) = = value ;
2005-04-17 02:20:36 +04:00
}
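kernel_symbol_value() hides how a symbol's address is stored. With place-relative references, an entry holds a signed 32-bit offset measured from the field's own address, so the absolute address is recovered by adding the two. The sketch below illustrates that arithmetic in userspace; struct demo_rel_symbol and demo_offset_to_ptr() are hypothetical names, and the real layout depends on CONFIG_HAVE_ARCH_PREL32_RELOCATIONS.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical entry: the offset is relative to &value_offset itself. */
struct demo_rel_symbol {
	int32_t value_offset;
};

/* Recover the absolute address from a place-relative 32-bit offset. */
void *demo_offset_to_ptr(const int32_t *off)
{
	return (void *)((uintptr_t)off + (intptr_t)*off);
}

/* Entry and target share one object so they are certain to be within 2 GiB. */
struct demo_pair {
	struct demo_rel_symbol sym;
	int target;
};

/* Usage: encode the distance to target, then decode it again. */
void demo_rel_symbol_demo(void)
{
	static struct demo_pair pair;

	pair.sym.value_offset = (int32_t)((intptr_t)&pair.target -
					  (intptr_t)&pair.sym.value_offset);
	assert(demo_offset_to_ptr(&pair.sym.value_offset) == (void *)&pair.target);
}
```

Because the stored value is a difference between two addresses that move together, it stays valid wherever the image is loaded, so no runtime relocation of the entry is needed.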
/* As per nm */
2010-08-05 22:59:07 +04:00
static char elf_type ( const Elf_Sym * sym , const struct load_info * info )
2005-04-17 02:20:36 +04:00
{
2010-08-05 22:59:07 +04:00
const Elf_Shdr * sechdrs = info - > sechdrs ;
2005-04-17 02:20:36 +04:00
if ( ELF_ST_BIND ( sym - > st_info ) = = STB_WEAK ) {
if ( ELF_ST_TYPE ( sym - > st_info ) = = STT_OBJECT )
return ' v ' ;
else
return ' w ' ;
}
if ( sym - > st_shndx = = SHN_UNDEF )
return ' U ' ;
2015-11-26 05:48:06 +03:00
if ( sym - > st_shndx = = SHN_ABS | | sym - > st_shndx = = info - > index . pcpu )
2005-04-17 02:20:36 +04:00
return ' a ' ;
if ( sym - > st_shndx > = SHN_LORESERVE )
return ' ? ' ;
if ( sechdrs [ sym - > st_shndx ] . sh_flags & SHF_EXECINSTR )
return ' t ' ;
if ( sechdrs [ sym - > st_shndx ] . sh_flags & SHF_ALLOC
& & sechdrs [ sym - > st_shndx ] . sh_type ! = SHT_NOBITS ) {
if ( ! ( sechdrs [ sym - > st_shndx ] . sh_flags & SHF_WRITE ) )
return ' r ' ;
else if ( sechdrs [ sym - > st_shndx ] . sh_flags & ARCH_SHF_SMALL )
return ' g ' ;
else
return ' d ' ;
}
if ( sechdrs [ sym - > st_shndx ] . sh_type = = SHT_NOBITS ) {
if ( sechdrs [ sym - > st_shndx ] . sh_flags & ARCH_SHF_SMALL )
return ' s ' ;
else
return ' b ' ;
}
2010-08-05 22:59:07 +04:00
if ( strstarts ( info - > secstrings + sechdrs [ sym - > st_shndx ] . sh_name ,
" .debug " ) ) {
2005-04-17 02:20:36 +04:00
return ' n ' ;
2010-08-05 22:59:07 +04:00
}
2005-04-17 02:20:36 +04:00
return ' ? ' ;
}
2009-07-06 17:50:42 +04:00
static bool is_core_symbol ( const Elf_Sym * src , const Elf_Shdr * sechdrs ,
2015-11-26 05:48:06 +03:00
unsigned int shnum , unsigned int pcpundx )
2009-07-06 17:50:42 +04:00
{
const Elf_Shdr * sec ;
if ( src - > st_shndx = = SHN_UNDEF
| | src - > st_shndx > = shnum
| | ! src - > st_name )
return false ;
2015-11-26 05:48:06 +03:00
# ifdef CONFIG_KALLSYMS_ALL
if ( src - > st_shndx = = pcpundx )
return true ;
# endif
2009-07-06 17:50:42 +04:00
sec = sechdrs + src - > st_shndx ;
if ( ! ( sec - > sh_flags & SHF_ALLOC )
# ifndef CONFIG_KALLSYMS_ALL
| | ! ( sec - > sh_flags & SHF_EXECINSTR )
# endif
| | ( sec - > sh_entsize & INIT_OFFSET_MASK ) )
return false ;
return true ;
}
2012-01-13 03:02:14 +04:00
/*
* We only allocate and copy the strings needed by the parts of symtab
* we keep. This is simple, but has the effect of making multiple
* copies of duplicates. We could be more sophisticated, see
* linux-kernel thread starting with
* <73defb5e4bca04a6431392cc341112b1@localhost>.
*/
2010-08-05 22:59:10 +04:00
static void layout_symtab ( struct module * mod , struct load_info * info )
2009-07-06 17:50:42 +04:00
{
2010-08-05 22:59:10 +04:00
Elf_Shdr * symsect = info - > sechdrs + info - > index . sym ;
Elf_Shdr * strsect = info - > sechdrs + info - > index . str ;
2009-07-06 17:50:42 +04:00
const Elf_Sym * src ;
2012-12-05 05:59:04 +04:00
unsigned int i , nsrc , ndst , strtab_size = 0 ;
2009-07-06 17:50:42 +04:00
/* Put symbol section at end of init part of module. */
symsect - > sh_flags | = SHF_ALLOC ;
2015-11-26 02:14:08 +03:00
symsect - > sh_entsize = get_offset ( mod , & mod - > init_layout . size , symsect ,
2010-08-05 22:59:10 +04:00
info - > index . sym ) | INIT_OFFSET_MASK ;
2011-12-06 23:11:31 +04:00
pr_debug ( " \t %s \n " , info - > secstrings + symsect - > sh_name ) ;
2009-07-06 17:50:42 +04:00
2010-08-05 22:59:10 +04:00
src = ( void * ) info - > hdr + symsect - > sh_offset ;
2009-07-06 17:50:42 +04:00
nsrc = symsect - > sh_size / sizeof ( * src ) ;
2011-11-13 07:08:55 +04:00
2012-01-13 03:02:14 +04:00
/* Compute total space required for the core symbols' strtab. */
2012-10-25 04:19:25 +04:00
for ( ndst = i = 0 ; i < nsrc ; i + + ) {
2016-03-23 03:03:16 +03:00
if ( i = = 0 | | is_livepatch_module ( mod ) | |
2015-11-26 05:48:06 +03:00
is_core_symbol ( src + i , info - > sechdrs , info - > hdr - > e_shnum ,
info - > index . pcpu ) ) {
2012-10-25 04:19:25 +04:00
strtab_size + = strlen ( & info - > strtab [ src [ i ] . st_name ] ) + 1 ;
2012-01-13 03:02:14 +04:00
ndst + + ;
2009-07-06 17:51:44 +04:00
}
2012-10-25 04:19:25 +04:00
}
2009-07-06 17:50:42 +04:00
/* Append room for core symbols at end of core part. */
2015-11-26 02:14:08 +03:00
info - > symoffs = ALIGN ( mod - > core_layout . size , symsect - > sh_addralign ? : 1 ) ;
info - > stroffs = mod - > core_layout . size = info - > symoffs + ndst * sizeof ( Elf_Sym ) ;
mod - > core_layout . size + = strtab_size ;
2019-02-25 22:59:58 +03:00
info - > core_typeoffs = mod - > core_layout . size ;
mod - > core_layout . size + = ndst * sizeof ( char ) ;
2015-11-26 02:14:08 +03:00
mod - > core_layout . size = debug_align ( mod - > core_layout . size ) ;
2009-07-06 17:50:42 +04:00
2009-07-06 17:51:44 +04:00
/* Put string table section at end of init part of module. */
strsect - > sh_flags | = SHF_ALLOC ;
2015-11-26 02:14:08 +03:00
strsect - > sh_entsize = get_offset ( mod , & mod - > init_layout . size , strsect ,
2010-08-05 22:59:10 +04:00
info - > index . str ) | INIT_OFFSET_MASK ;
2011-12-06 23:11:31 +04:00
pr_debug ( " \t %s \n " , info - > secstrings + strsect - > sh_name ) ;
modules: fix longstanding /proc/kallsyms vs module insertion race.
For CONFIG_KALLSYMS, we keep two symbol tables and two string tables.
There's one full copy, marked SHF_ALLOC and laid out at the end of the
module's init section. There's also a cut-down version that only
contains core symbols and strings, and lives in the module's core
section.
After module init (and before we free the module memory), we switch
the mod->symtab, mod->num_symtab and mod->strtab to point to the core
versions. We do this under the module_mutex.
However, kallsyms doesn't take the module_mutex: it uses
preempt_disable() and rcu tricks to walk through the modules, because
it's used in the oops path. It's also used in /proc/kallsyms.
There's nothing atomic about the change of these variables, so we can
get the old (larger!) num_symtab and the new symtab pointer; in fact
this is what I saw when trying to reproduce.
By grouping these variables together, we can use a
carefully-dereferenced pointer to ensure we always get one or the
other (the free of the module init section is already done in an RCU
callback, so that's safe). We allocate the init one at the end of the
module init section, and keep the core one inside the struct module
itself (it could also have been allocated at the end of the module
core, but that's probably overkill).
Reported-by: Weilong Chen <chenweilong@huawei.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111541
Cc: stable@kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2016-02-03 09:25:26 +03:00
/* We'll tack temporary mod_kallsyms on the end. */
mod - > init_layout . size = ALIGN ( mod - > init_layout . size ,
__alignof__ ( struct mod_kallsyms ) ) ;
info - > mod_kallsyms_init_off = mod - > init_layout . size ;
mod - > init_layout . size + = sizeof ( struct mod_kallsyms ) ;
2019-02-25 22:59:58 +03:00
info - > init_typeoffs = mod - > init_layout . size ;
mod - > init_layout . size + = nsrc * sizeof ( char ) ;
2016-02-03 09:25:26 +03:00
	mod->init_layout.size = debug_align(mod->init_layout.size);
2009-07-06 17:50:42 +04:00
}
2016-02-03 09:25:26 +03:00
/*
 * We use the full symtab and strtab which layout_symtab arranged to
 * be appended to the init section.  Later we switch to the cut-down
 * core-only ones.
 */
2010-08-05 22:59:12 +04:00
static void add_kallsyms(struct module *mod, const struct load_info *info)
2005-04-17 02:20:36 +04:00
{
2009-07-06 17:50:42 +04:00
	unsigned int i, ndst;
	const Elf_Sym *src;
	Elf_Sym *dst;
2009-07-06 17:51:44 +04:00
	char *s;
2010-08-05 22:59:07 +04:00
	Elf_Shdr *symsec = &info->sechdrs[info->index.sym];
2005-04-17 02:20:36 +04:00
2016-02-03 09:25:26 +03:00
	/* Set up to point into init section. */
	mod->kallsyms = mod->init_layout.base + info->mod_kallsyms_init_off;
	mod->kallsyms->symtab = (void *)symsec->sh_addr;
	mod->kallsyms->num_symtab = symsec->sh_size / sizeof(Elf_Sym);
2010-08-05 22:59:08 +04:00
	/* Make sure we get permanent strtab: don't use info->strtab. */
2016-02-03 09:25:26 +03:00
	mod->kallsyms->strtab = (void *)info->sechdrs[info->index.str].sh_addr;
2019-02-25 22:59:58 +03:00
	mod->kallsyms->typetab = mod->init_layout.base + info->init_typeoffs;
2005-04-17 02:20:36 +04:00
2019-02-25 22:59:58 +03:00
	/*
	 * Now populate the cut down core kallsyms for after init
	 * and set types up while we still have access to sections.
	 */
2016-02-03 09:25:26 +03:00
	mod->core_kallsyms.symtab = dst = mod->core_layout.base + info->symoffs;
	mod->core_kallsyms.strtab = s = mod->core_layout.base + info->stroffs;
2019-02-25 22:59:58 +03:00
	mod->core_kallsyms.typetab = mod->core_layout.base + info->core_typeoffs;
2016-02-03 09:25:26 +03:00
	src = mod->kallsyms->symtab;
	for (ndst = i = 0; i < mod->kallsyms->num_symtab; i++) {
2019-02-25 22:59:58 +03:00
		mod->kallsyms->typetab[i] = elf_type(src + i, info);
2016-03-23 03:03:16 +03:00
		if (i == 0 || is_livepatch_module(mod) ||
2015-11-26 05:48:06 +03:00
		    is_core_symbol(src + i, info->sechdrs, info->hdr->e_shnum,
				   info->index.pcpu)) {
2019-02-25 22:59:58 +03:00
			mod->core_kallsyms.typetab[ndst] =
				mod->kallsyms->typetab[i];
2012-10-25 04:19:25 +04:00
			dst[ndst] = src[i];
2016-02-03 09:25:26 +03:00
			dst[ndst++].st_name = s - mod->core_kallsyms.strtab;
			s += strlcpy(s, &mod->kallsyms->strtab[src[i].st_name],
2012-10-25 04:19:25 +04:00
				     KSYM_NAME_LEN) + 1;
		}
2009-07-06 17:50:42 +04:00
	}
2016-02-03 09:25:26 +03:00
	mod->core_kallsyms.num_symtab = ndst;
2005-04-17 02:20:36 +04:00
}
#else
2010-08-05 22:59:10 +04:00
static inline void layout_symtab(struct module *mod, struct load_info *info)
2009-07-06 17:50:42 +04:00
{
}
2009-10-02 02:43:54 +04:00
2010-09-20 03:58:08 +04:00
static void add_kallsyms(struct module *mod, const struct load_info *info)
2005-04-17 02:20:36 +04:00
{
}
# endif /* CONFIG_KALLSYMS */
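The commit message above fixes the race by grouping the symbol table pointer, its entry count and the string table into one structure, so that a reader sees either the init view or the core view and never a mix. The following is a minimal userspace sketch of that idea only. It substitutes C11 atomics for the kernel's RCU helpers, and `struct symtab_view`, `struct demo_module`, `publish_view()` and `count_symbols()` are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct mod_kallsyms. */
struct symtab_view {
	const char *const *names;	/* symbol names */
	unsigned int num;		/* number of entries in names */
};

/* Hypothetical stand-in for the relevant part of struct module. */
struct demo_module {
	_Atomic(struct symtab_view *) view;	/* currently published view */
	struct symtab_view core_view;		/* lives as long as the module */
};

/* Writer: publish a fully initialised view with release ordering. */
static void publish_view(struct demo_module *m, struct symtab_view *v)
{
	atomic_store_explicit(&m->view, v, memory_order_release);
}

/*
 * Reader: load the pointer exactly once and use only that snapshot,
 * so the count and the table always belong to the same view.
 */
static unsigned int count_symbols(struct demo_module *m)
{
	struct symtab_view *v =
		atomic_load_explicit(&m->view, memory_order_acquire);
	unsigned int i, n = 0;

	for (i = 0; i < v->num; i++)
		if (v->names[i] != NULL)
			n++;
	return n;
}
```

The sketch does not model the lifetime side of the problem: in the kernel, the old init view may only be freed after all readers are done with it, which the commit message says is already handled by freeing the init section from an RCU callback.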
2017-07-07 06:15:58 +03:00
static void dynamic_debug_setup(struct module *mod, struct _ddebug *debug, unsigned int num)
driver core: basic infrastructure for per-module dynamic debug messages
Base infrastructure to enable per-module debug messages.
I've introduced CONFIG_DYNAMIC_PRINTK_DEBUG, which when enabled centralizes
control of debugging statements on a per-module basis in one /proc file,
currently <debugfs>/dynamic_printk/modules. When CONFIG_DYNAMIC_PRINTK_DEBUG
is not set, debugging statements can still be enabled as before, often by
defining 'DEBUG' for the proper compilation unit. Thus, this patch set has no
effect when CONFIG_DYNAMIC_PRINTK_DEBUG is not set.
The infrastructure currently ties into all pr_debug() and dev_dbg() calls. That
is, if CONFIG_DYNAMIC_PRINTK_DEBUG is set, all pr_debug() and dev_dbg() calls
can be dynamically enabled/disabled on a per-module basis.
Future plans include extending this functionality to subsystems, that define
their own debug levels and flags.
Usage:
Dynamic debugging is controlled by the debugfs file,
<debugfs>/dynamic_printk/modules. This file contains a list of the modules that
can be enabled. The format of the file is as follows:
<module_name> <enabled=0/1>
.
.
.
<module_name> : Name of the module in which the debug call resides
<enabled=0/1> : whether the messages are enabled or not
For example:
snd_hda_intel enabled=0
fixup enabled=1
driver enabled=0
Enable a module:
$echo "set enabled=1 <module_name>" > dynamic_printk/modules
Disable a module:
$echo "set enabled=0 <module_name>" > dynamic_printk/modules
Enable all modules:
$echo "set enabled=1 all" > dynamic_printk/modules
Disable all modules:
$echo "set enabled=0 all" > dynamic_printk/modules
Finally, passing "dynamic_printk" at the command line enables
debugging for all modules. This mode can be turned off via the above
disable command.
[gkh: minor cleanups and tweaks to make the build work quietly]
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-08-13 00:46:19 +04:00
{
2010-08-05 22:59:12 +04:00
	if (!debug)
		return;
2019-03-08 03:27:48 +03:00
	ddebug_add_module(debug, num, mod->name);
2008-10-22 19:00:13 +04:00
}
2008-08-13 00:46:19 +04:00
2017-07-07 06:15:58 +03:00
static void dynamic_debug_remove(struct module *mod, struct _ddebug *debug)
2010-07-03 07:07:35 +04:00
{
	if (debug)
2017-07-07 06:15:58 +03:00
		ddebug_remove_module(mod->name);
2010-07-03 07:07:35 +04:00
}
2011-06-30 23:22:11 +04:00
void * __weak module_alloc(unsigned long size)
{
2020-06-26 06:30:47 +03:00
	return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
			GFP_KERNEL, PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS,
2020-07-04 01:15:27 +03:00
			NUMA_NO_NODE, __builtin_return_address(0));
2011-06-30 23:22:11 +04:00
}
2020-05-14 13:36:41 +03:00
bool __weak module_init_section(const char *name)
{
	return strstarts(name, ".init");
}
2019-06-07 13:49:11 +03:00
bool __weak module_exit_section(const char *name)
{
	return strstarts(name, ".exit");
}
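The two weak helpers above classify a section purely by its name prefix. As a rough userspace illustration of that test, the sketch below implements a prefix check with `strncmp()`. The names `starts_with()`, `demo_init_section()` and `demo_exit_section()` are hypothetical and are not the kernel's functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if str begins with prefix (an empty prefix always matches). */
static bool starts_with(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Hypothetical analogue of the init-section test: ".init", ".init.text", ... */
static bool demo_init_section(const char *name)
{
	return starts_with(name, ".init");
}

/* Hypothetical analogue of the exit-section test: ".exit", ".exit.data", ... */
static bool demo_exit_section(const char *name)
{
	return starts_with(name, ".exit");
}
```

Because the functions in the source are declared `__weak`, an architecture can supply its own definition when it uses additional section name patterns.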
2009-06-11 16:23:20 +04:00
#ifdef CONFIG_DEBUG_KMEMLEAK
2010-08-05 22:59:10 +04:00
static void kmemleak_load_module(const struct module *mod,
				 const struct load_info *info)
2009-06-11 16:23:20 +04:00
{
	unsigned int i;
	/* only scan the sections containing data */
2009-10-28 16:33:09 +03:00
	kmemleak_scan_area(mod, sizeof(struct module), GFP_KERNEL);
2009-06-11 16:23:20 +04:00
2010-08-05 22:59:10 +04:00
	for (i = 1; i < info->hdr->e_shnum; i++) {
2013-05-15 23:33:01 +04:00
		/* Scan all writable sections that are not executable */
		if (!(info->sechdrs[i].sh_flags & SHF_ALLOC) ||
		    !(info->sechdrs[i].sh_flags & SHF_WRITE) ||
		    (info->sechdrs[i].sh_flags & SHF_EXECINSTR))
2009-06-11 16:23:20 +04:00
			continue;
2010-08-05 22:59:10 +04:00
		kmemleak_scan_area((void *)info->sechdrs[i].sh_addr,
				   info->sechdrs[i].sh_size, GFP_KERNEL);
2009-06-11 16:23:20 +04:00
	}
}
#else
2010-08-05 22:59:10 +04:00
static inline void kmemleak_load_module(const struct module *mod,
					const struct load_info *info)
{
}
# endif
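The loop in kmemleak_load_module() skips every section that is not allocated, not writable, or executable. The sketch below restates that filter as a standalone predicate. `should_scan_section()` and the `DEMO_` constants are hypothetical names; the flag values are the ones the ELF specification assigns to SHF_WRITE, SHF_ALLOC and SHF_EXECINSTR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ELF section flag values, redefined locally to keep the sketch portable. */
enum {
	DEMO_SHF_WRITE     = 0x1,	/* section is writable at run time */
	DEMO_SHF_ALLOC     = 0x2,	/* section occupies memory when loaded */
	DEMO_SHF_EXECINSTR = 0x4,	/* section contains executable code */
};

/* A section is scanned only if it is allocated, writable and not code. */
static bool should_scan_section(uint64_t sh_flags)
{
	if (!(sh_flags & DEMO_SHF_ALLOC))
		return false;
	if (!(sh_flags & DEMO_SHF_WRITE))
		return false;
	if (sh_flags & DEMO_SHF_EXECINSTR)
		return false;
	return true;
}
```

Under this predicate a typical `.data` or `.bss` section is scanned, while `.text` (allocated and executable) and `.rodata` (allocated but read-only) are skipped.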
2012-09-26 13:09:40 +04:00
#ifdef CONFIG_MODULE_SIG
2016-04-28 02:54:01 +03:00
static int module_sig_check(struct load_info *info, int flags)
2012-09-26 13:09:40 +04:00
{
2019-08-20 03:17:40 +03:00
	int err = -ENODATA;
module: add syscall to load module from fd
As part of the effort to create a stronger boundary between root and
kernel, Chrome OS wants to be able to enforce that kernel modules are
being loaded only from our read-only crypto-hash verified (dm_verity)
root filesystem. Since the init_module syscall hands the kernel a module
as a memory blob, no reasoning about the origin of the blob can be made.
Earlier proposals for appending signatures to kernel modules would not be
useful in Chrome OS, since it would involve adding an additional set of
keys to our kernel and builds for no good reason: we already trust the
contents of our root filesystem. We don't need to verify those kernel
modules a second time. Having to do signature checking on module loading
would slow us down and be redundant. All we need to know is where a
module is coming from so we can say yes/no to loading it.
If a file descriptor is used as the source of a kernel module, many more
things can be reasoned about. In Chrome OS's case, we could enforce that
the module lives on the filesystem we expect it to live on. In the case
of IMA (or other LSMs), it would be possible, for example, to examine
extended attributes that may contain signatures over the contents of
the module.
This introduces a new syscall (on x86), similar to init_module, that has
only two arguments. The first argument is used as a file descriptor to
the module and the second argument is a pointer to the NULL terminated
string of module arguments.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (merge fixes)
2012-10-16 01:01:07 +04:00
	const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
2019-08-20 03:17:40 +03:00
	const char *reason;
2012-10-16 01:01:07 +04:00
	const void *mod = info->hdr;
2012-10-20 04:19:29 +04:00
2016-04-28 02:54:01 +03:00
	/*
	 * Require flags == 0, as a module with version information
	 * removed is no longer the module that was signed
	 */
	if (flags == 0 &&
	    info->len > markerlen &&
2012-10-16 01:01:07 +04:00
	    memcmp(mod + info->len - markerlen, MODULE_SIG_STRING, markerlen) == 0) {
2012-10-20 04:19:29 +04:00
/* We truncate the module to discard the signature */
2012-10-16 01:01:07 +04:00
		info->len -= markerlen;
2018-06-29 17:37:08 +03:00
		err = mod_verify_sig(mod, info);
2012-09-26 13:09:40 +04:00
	}
2019-08-20 03:17:40 +03:00
	switch (err) {
	case 0:
2012-09-26 13:09:40 +04:00
		info->sig_ok = true;
		return 0;
2019-08-20 03:17:40 +03:00
		/* We don't permit modules to be loaded into trusted kernels
		 * without a valid signature on them, but if we're not
		 * enforcing, certain errors are non-fatal.
		 */
	case -ENODATA:
		reason = "Loading of unsigned module";
		goto decide;
	case -ENOPKG:
		reason = "Loading of module with unsupported crypto";
		goto decide;
	case -ENOKEY:
		reason = "Loading of module with unavailable key";
	decide:
		if (is_module_sig_enforced()) {
2020-01-15 17:49:31 +03:00
			pr_notice("%s: %s is rejected\n", info->name, reason);
2019-08-20 03:17:40 +03:00
			return -EKEYREJECTED;
		}
2012-09-26 13:09:40 +04:00
2019-08-20 03:17:40 +03:00
		return security_locked_down(LOCKDOWN_MODULE_SIGNATURE);
		/* All other errors are fatal, including nomem, unparseable
		 * signatures and signature check failures - even if signatures
		 * aren't required.
		 */
	default:
		return err;
	}
2012-09-26 13:09:40 +04:00
}
#else /* !CONFIG_MODULE_SIG */
2016-04-28 02:54:01 +03:00
static int module_sig_check(struct load_info *info, int flags)
2012-09-26 13:09:40 +04:00
{
	return 0;
}
# endif /* !CONFIG_MODULE_SIG */
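module_sig_check() above decides whether a signature is present by comparing the last bytes of the module image against a fixed marker string, and then shortens the recorded length so the marker is excluded. A minimal userspace sketch of that trailing-marker test follows. `strip_sig_marker()` and `DEMO_SIG_STRING` are hypothetical names, and the marker text is assumed to match the kernel's MODULE_SIG_STRING.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed to match the kernel's MODULE_SIG_STRING. */
#define DEMO_SIG_STRING "~Module signature appended~\n"

/*
 * If buf ends with the marker, shrink *len to exclude it and return true.
 * The image must be strictly longer than the marker, as in the source.
 */
static bool strip_sig_marker(const void *buf, size_t *len)
{
	const size_t markerlen = sizeof(DEMO_SIG_STRING) - 1;
	const char *p = buf;

	if (*len <= markerlen)
		return false;
	if (memcmp(p + *len - markerlen, DEMO_SIG_STRING, markerlen) != 0)
		return false;
	*len -= markerlen;
	return true;
}
```

The sketch stops at the marker. In the source, the truncated image is then handed to mod_verify_sig(), and an absent marker leaves `err` at -ENODATA, which is treated as an unsigned module.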
2012-10-16 01:01:07 +04:00
/* Sanity checks against invalid binaries, wrong arch, weird elf version. */
static int elf_header_check(struct load_info *info)
2010-08-05 22:59:03 +04:00
{
2012-10-16 01:01:07 +04:00
	if (info->len < sizeof(*(info->hdr)))
		return -ENOEXEC;
	if (memcmp(info->hdr->e_ident, ELFMAG, SELFMAG) != 0
	    || info->hdr->e_type != ET_REL
	    || !elf_check_arch(info->hdr)
	    || info->hdr->e_shentsize != sizeof(Elf_Shdr))
		return -ENOEXEC;
	if (info->hdr->e_shoff >= info->len
	    || (info->hdr->e_shnum * sizeof(Elf_Shdr) >
		info->len - info->hdr->e_shoff))
		return -ENOEXEC;
2010-08-05 22:59:03 +04:00
2012-10-16 01:01:07 +04:00
	return 0;
}
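The header checks in elf_header_check() can be tried outside the kernel. The following is a minimal user-space sketch, assuming a 64-bit ELF image and the glibc <elf.h> definitions. The name sketch_elf_header_check() and the -1 return value are illustrative assumptions; the kernel version returns -ENOEXEC and additionally calls elf_check_arch(), which has no direct user-space equivalent here.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * User-space sketch of the checks in elf_header_check(), fixed to 64-bit ELF.
 * Returns 0 if the header looks sane, -1 otherwise (hypothetical convention).
 */
static int sketch_elf_header_check(const void *buf, size_t len)
{
	Elf64_Ehdr hdr;

	/* The buffer must hold at least a full ELF header. */
	if (len < sizeof(hdr))
		return -1;
	memcpy(&hdr, buf, sizeof(hdr));	/* copy out to avoid alignment assumptions */

	/* Magic bytes, relocatable object type, expected section header size. */
	if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0
	    || hdr.e_type != ET_REL
	    || hdr.e_shentsize != sizeof(Elf64_Shdr))
		return -1;

	/*
	 * The section header table must fit inside the buffer. Comparing the
	 * table size against (len - e_shoff), rather than adding it to
	 * e_shoff, keeps the sum from wrapping on a hostile offset.
	 */
	if (hdr.e_shoff >= len
	    || (size_t)hdr.e_shnum * sizeof(Elf64_Shdr) > len - hdr.e_shoff)
		return -1;

	return 0;
}
```

A caller would read a .ko file into memory and pass the buffer and its length; a nonzero result means the image should be rejected before any section header is dereferenced.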
2015-04-07 20:33:49 +03:00
#define COPY_CHUNK_SIZE (16*PAGE_SIZE)
static int copy_chunked_from_user(void *dst, const void __user *usrc, unsigned long len)
{
	do {
		unsigned long n = min(len, COPY_CHUNK_SIZE);
		if (copy_from_user(dst, usrc, n) != 0)
			return -EFAULT;
		cond_resched();
		dst += n;
		usrc += n;
		len -= n;
	} while (len);
	return 0;
}
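The chunking pattern in copy_chunked_from_user() bounds how much work happens between scheduling points. Below is a rough user-space analogue, assuming memcpy() in place of copy_from_user() and an optional callback in place of cond_resched(). SKETCH_CHUNK_SIZE, sketch_copy_chunked() and the chunk-count return value are hypothetical names chosen for illustration, and the sketch uses a while loop, so unlike the kernel's do/while it performs no iteration for a zero length.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_CHUNK_SIZE 4096u	/* hypothetical chunk size, not the kernel's */

/*
 * Copy len bytes from src to dst in bounded chunks, calling yield()
 * (if non-NULL) after each chunk. Returns the number of chunks copied.
 */
static size_t sketch_copy_chunked(void *dst, const void *src, size_t len,
				  void (*yield)(void))
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t chunks = 0;

	while (len) {
		/* Never copy more than one chunk per iteration. */
		size_t n = len < SKETCH_CHUNK_SIZE ? len : SKETCH_CHUNK_SIZE;

		memcpy(d, s, n);
		if (yield)
			yield();	/* stands in for cond_resched() */
		d += n;
		s += n;
		len -= n;
		chunks++;
	}
	return chunks;
}
```

Copying 10000 bytes with this sketch takes three iterations (4096 + 4096 + 1808), with a yield opportunity after each one.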
2016-03-23 03:03:16 +03:00
#ifdef CONFIG_LIVEPATCH
2016-08-25 18:04:45 +03:00
static int check_modinfo_livepatch(struct module *mod, struct load_info *info)
2016-03-23 03:03:16 +03:00
{
2016-08-25 18:04:45 +03:00
	if (get_modinfo(info, "livepatch")) {
		mod->klp = true;
		add_taint_module(mod, TAINT_LIVEPATCH, LOCKDEP_STILL_OK);
2017-01-12 19:57:44 +03:00
		pr_notice_once("%s: tainting kernel with TAINT_LIVEPATCH\n",
			       mod->name);
2016-08-25 18:04:45 +03:00
	}
2016-03-23 03:03:16 +03:00
	return 0;
}
#else /* !CONFIG_LIVEPATCH */
2016-08-25 18:04:45 +03:00
static int check_modinfo_livepatch(struct module *mod, struct load_info *info)
2016-03-23 03:03:16 +03:00
{
	if (get_modinfo(info, "livepatch")) {
		pr_err("%s: module is marked as livepatch module, but livepatch support is disabled\n",
		       mod->name);
		return -ENOEXEC;
	}
	return 0;
}
#endif /* CONFIG_LIVEPATCH */
2018-01-26 02:50:28 +03:00
static void check_modinfo_retpoline(struct module *mod, struct load_info *info)
{
	if (retpoline_module_ok(get_modinfo(info, "retpoline")))
		return;
	pr_warn("%s: loading module not compiled with retpoline compiler.\n",
		mod->name);
}
2012-10-16 01:01:07 +04:00
/* Sets info->hdr and info->len. */
static int copy_module_from_user(const void __user *umod, unsigned long len,
				  struct load_info *info)
2010-08-05 22:59:03 +04:00
{
	int err;
2012-10-16 01:01:07 +04:00
	info->len = len;
	if (info->len < sizeof(*(info->hdr)))
2010-08-05 22:59:03 +04:00
		return -ENOEXEC;
2018-07-13 21:06:02 +03:00
	err = security_kernel_load_data(LOADING_MODULE);
2012-10-16 01:02:07 +04:00
	if (err)
		return err;
2010-08-05 22:59:03 +04:00
	/* Suck in entire file: we'll want most of it. */
2020-06-02 07:51:40 +03:00
	info->hdr = __vmalloc(info->len, GFP_KERNEL | __GFP_NOWARN);
2012-10-16 01:01:07 +04:00
	if (!info->hdr)
2010-08-05 22:59:03 +04:00
		return -ENOMEM;
2015-04-07 20:33:49 +03:00
	if (copy_chunked_from_user(info->hdr, umod, info->len) != 0) {
2012-10-16 01:01:07 +04:00
		vfree(info->hdr);
		return -EFAULT;
2010-08-05 22:59:03 +04:00
	}
2012-10-16 01:01:07 +04:00
	return 0;
}
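copy_module_from_user() serves the memory-blob path, where the caller hands the kernel a buffer and a length. For comparison, the file-descriptor path described in the "module: add syscall to load module from fd" commit is reachable from user space as finit_module(2). The sketch below is a thin wrapper under some assumptions: it targets Linux with headers that define SYS_finit_module, the wrapper name sketch_load_module_from_fd() is hypothetical, and it passes 0 for the flags argument that the merged syscall takes in addition to the two arguments the commit message describes.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for the syscall() declaration in <unistd.h> */
#endif
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Ask the kernel to load the module referred to by fd.
 * params is the NUL-terminated module argument string ("" for none).
 * Returns 0 on success, -1 on failure with errno set.
 */
static int sketch_load_module_from_fd(int fd, const char *params)
{
	/* Third argument is the flags word; 0 requests default behaviour. */
	return (int)syscall(SYS_finit_module, fd, params, 0);
}
```

A caller would open() the .ko file read-only, pass the descriptor together with a parameter string, and close() the descriptor afterwards. The call needs module-loading privilege, so an unprivileged process or an invalid descriptor yields -1.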
2010-08-05 22:59:08 +04:00
static void free_copy(struct load_info *info)
{
	vfree(info->hdr);
}
2012-10-22 11:39:41 +04:00
static int rewrite_section_headers(struct load_info *info, int flags)
2010-08-05 22:59:06 +04:00
{
	unsigned int i;
	/* This should always be true, but let's be sure. */
	info->sechdrs[0].sh_addr = 0;
	for (i = 1; i < info->hdr->e_shnum; i++) {
		Elf_Shdr *shdr = &info->sechdrs[i];
		if (shdr->sh_type != SHT_NOBITS
		    && info->len < shdr->sh_offset + shdr->sh_size) {
2013-11-13 03:11:28 +04:00
			pr_err("Module len %lu truncated\n", info->len);
2010-08-05 22:59:06 +04:00
			return -ENOEXEC;
		}
		/* Mark all sections sh_addr with their address in the
		   temporary image. */
		shdr->sh_addr = (size_t)info->hdr + shdr->sh_offset;
#ifndef CONFIG_MODULE_UNLOAD
		/* Don't load .exit sections */
2019-06-07 13:49:11 +03:00
		if (module_exit_section(info->secstrings + shdr->sh_name))
2010-08-05 22:59:06 +04:00
			shdr->sh_flags &= ~(unsigned long)SHF_ALLOC;
#endif
	}
2010-08-05 22:59:07 +04:00
	/* Track but don't keep modinfo and version sections. */
2017-04-22 01:35:27 +03:00
	info->sechdrs[info->index.vers].sh_flags &= ~(unsigned long)SHF_ALLOC;
2010-08-05 22:59:07 +04:00
	info->sechdrs[info->index.info].sh_flags &= ~(unsigned long)SHF_ALLOC;
2017-04-22 01:35:27 +03:00
2010-08-05 22:59:06 +04:00
	return 0;
}
module: add load_info
Btw, here's a patch that _looks_ large, but it's really pretty trivial, and
sets things up so that it would be way easier to split off pieces of the
module loading.
The reason it looks large is that it creates a "module_info" structure
that contains all the module state that we're building up while loading,
instead of having individual variables for all the indices etc.
So the patch ends up being large, because every "symindex" access instead
becomes "info.index.sym" etc. That may be a few characters longer, but it
then means that we can just pass a pointer to that "info" structure
around, and let all the pieces fill it in very naturally.
As an example of that, the patch also moves the initialization of all
those convenience variables into a "setup_module_info()" function. And at
this point it really does become very natural to start to peel off some of
the error labels and move them into the helper functions - now the
"truncated" case is gone, and is handled inside that setup function
instead.
So maybe you don't like this approach, and it does make the variable
accesses a bit longer, but I don't think unreadably so. And the patch
really does look big and scary, but there really should be absolutely no
semantic changes - most of it was a trivial and mindless rename.
In fact, it was so mindless that I on purpose kept the existing helper
functions looking like this:
- err = check_modinfo(mod, sechdrs, infoindex, versindex);
+ err = check_modinfo(mod, info.sechdrs, info.index.info, info.index.vers);
rather than changing them to just take the "info" pointer. IOW, a second
phase (if you think the approach is ok) would change that calling
convention to just do
err = check_modinfo(mod, &info);
(and same for "layout_sections()", "layout_symtabs()" etc.) Similarly,
while right now it makes things _look_ bigger, with things like this:
versindex = find_sec(hdr, sechdrs, secstrings, "__versions");
becoming
info->index.vers = find_sec(info->hdr, info->sechdrs, info->secstrings, "__versions");
in the new "setup_module_info()" function, that's again just a result of
it being a search-and-replace patch. By using the 'info' pointer, we could
just change the 'find_sec()' interface so that it ends up being
info->index.vers = find_sec(info, "__versions");
instead, and then we'd actually have a shorter and more readable line. So
for a lot of those mindless variable name expansions there would be room
for separate cleanups.
I didn't move quite everything in there - if we do this to layout_symtabs,
for example, we'd want to move the percpu, symoffs, stroffs, *strmap
variables to be fields in that module_info structure too. But that's a
much smaller patch, I moved just the really core stuff that is currently
being set up and used in various parts.
But even in this rough form, it removes close to 70 lines from that
function (but adds 22 lines overall, of course - the structure definition,
the helper function declarations and call-sites etc etc).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
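The calling convention the message argues for, one "info" pointer instead of several loose arguments, can be sketched in isolation. The following is a simplified user-space illustration, not the kernel's definitions: struct sketch_shdr, struct sketch_load_info and sketch_find_sec() are hypothetical names, the section header is reduced to a name offset, and the real find_sec() applies further checks that are omitted here.

```c
#include <string.h>

/* Hypothetical minimal section header: only the name offset is kept. */
struct sketch_shdr {
	unsigned int name_off;
};

/* All loader state travels together, as the commit message proposes. */
struct sketch_load_info {
	const struct sketch_shdr *sechdrs;	/* section header table */
	const char *secstrings;			/* section name string table */
	unsigned int shnum;			/* number of section headers */
	struct {
		unsigned int vers, info;
	} index;				/* cached section indices */
};

/*
 * Return the index of the first section called name, or 0 if there is
 * none. Index 0 is the reserved null section, so 0 can mean "not found".
 */
static unsigned int sketch_find_sec(const struct sketch_load_info *info,
				    const char *name)
{
	unsigned int i;

	for (i = 1; i < info->shnum; i++) {
		const char *secname = info->secstrings + info->sechdrs[i].name_off;

		if (strcmp(secname, name) == 0)
			return i;
	}
	return 0;
}
```

With this shape a lookup reads info.index.vers = sketch_find_sec(&info, "__versions"), which is the shorter form the message anticipates once helpers take the structure pointer directly.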
2010-06-02 22:01:06 +04:00
/*
 * Set up our basic convenience variables (pointers to section headers,
 * search for module section index etc), and do some basic section
 * verification.
 *
2018-06-22 14:59:29 +03:00
 * Set info->mod to the temporary copy of the module in info->hdr. The final one
 * will be allocated in move_module().
2010-06-02 22:01:06 +04:00
*/
2018-06-22 14:59:29 +03:00
static int setup_load_info(struct load_info *info, int flags)
2010-06-02 22:01:06 +04:00
{
	unsigned int i;
	/* Set up the convenience variables */
	info->sechdrs = (void *)info->hdr + info->hdr->e_shoff;
2010-08-05 22:59:06 +04:00
	info->secstrings = (void *)info->hdr
		+ info->sechdrs[info->hdr->e_shstrndx].sh_offset;
2010-06-02 22:01:06 +04:00
2018-06-22 15:00:01 +03:00
	/* Try to find a name early so we can log errors with a module name */
	info->index.info = find_sec(info, ".modinfo");
2020-01-17 15:32:21 +03:00
	if (info->index.info)
2018-06-22 15:00:01 +03:00
		info->name = get_modinfo(info, "name");
2010-06-02 22:01:06 +04:00
2010-08-05 22:59:06 +04:00
	/* Find internal symbols and strings. */
	for (i = 1; i < info->hdr->e_shnum; i++) {
module: add load_info
Btw, here's a patch that _looks_ large, but it really pretty trivial, and
sets things up so that it would be way easier to split off pieces of the
module loading.
The reason it looks large is that it creates a "module_info" structure
that contains all the module state that we're building up while loading,
instead of having individual variables for all the indices etc.
So the patch ends up being large, because every "symindex" access instead
becomes "info.index.sym" etc. That may be a few characters longer, but it
then means that we can just pass a pointer to that "info" structure
around, and let all the pieces fill it in very naturally.
As an example of that, the patch also moves the initialization of all
those convenience variables into a "setup_module_info()" function. And at
this point it really does become very natural to start to peel off some of
the error labels and move them into the helper functions - now the
"truncated" case is gone, and is handled inside that setup function
instead.
So maybe you don't like this approach, and it does make the variable
accesses a bit longer, but I don't think unreadably so. And the patch
really does look big and scary, but there really should be absolutely no
semantic changes - most of it was a trivial and mindless rename.
In fact, it was so mindless that I on purpose kept the existing helper
functions looking like this:
- err = check_modinfo(mod, sechdrs, infoindex, versindex);
+ err = check_modinfo(mod, info.sechdrs, info.index.info, info.index.vers);
rather than changing them to just take the "info" pointer. IOW, a second
phase (if you think the approach is ok) would change that calling
convention to just do
err = check_modinfo(mod, &info);
(and same for "layout_sections()", "layout_symtabs()" etc.) Similarly,
while right now it makes things _look_ bigger, with things like this:
versindex = find_sec(hdr, sechdrs, secstrings, "__versions");
becoming
info->index.vers = find_sec(info->hdr, info->sechdrs, info->secstrings, "__versions");
in the new "setup_module_info()" function, that's again just a result of
it being a search-and-replace patch. By using the 'info' pointer, we could
just change the 'find_sec()' interface so that it ends up being
info->index.vers = find_sec(info, "__versions");
instead, and then we'd actually have a shorter and more readable line. So
for a lot of those mindless variable name expansions there would be room
for separate cleanups.
I didn't move quite everything in there - if we do this to layout_symtabs,
for example, we'd want to move the percpu, symoffs, stroffs, *strmap
variables to be fields in that module_info structure too. But that's a
much smaller patch, I moved just the really core stuff that is currently
being set up and used in various parts.
But even in this rough form, it removes close to 70 lines from that
function (but adds 22 lines overall, of course - the structure definition,
the helper function declarations and call-sites etc etc).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
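The calling convention the message argues for, passing one "info" pointer instead of several loose variables, can be sketched in userspace C. This is only an illustration under stated assumptions: the sketch_shdr and sketch_load_info types and the sketch_find_sec() helper are hypothetical stand-ins, not the kernel's definitions, and any extra checks the real find_sec() performs are left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an ELF section header. */
struct sketch_shdr {
	unsigned int sh_name;	/* offset of the section name in secstrings */
};

/* Hypothetical stand-in for the state gathered while loading a module. */
struct sketch_load_info {
	const struct sketch_shdr *sechdrs;
	const char *secstrings;
	unsigned int shnum;
	struct {
		unsigned int vers, mod;
	} index;
};

/* Return the index of the first section called name, or 0 if none. */
static unsigned int sketch_find_sec(const struct sketch_load_info *info,
				    const char *name)
{
	unsigned int i;

	/* Index 0 is the reserved null section, so 0 doubles as "not found". */
	for (i = 1; i < info->shnum; i++) {
		if (strcmp(info->secstrings + info->sechdrs[i].sh_name,
			   name) == 0)
			return i;
	}
	return 0;
}
```

With this shape a caller writes info.index.vers = sketch_find_sec(&info, "__versions"); which is the shorter line the message anticipates.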
2010-06-02 22:01:06 +04:00
		if (info->sechdrs[i].sh_type == SHT_SYMTAB) {
			info->index.sym = i;
			info->index.str = info->sechdrs[i].sh_link;
2010-08-05 22:59:06 +04:00
			info->strtab = (char *)info->hdr
				+ info->sechdrs[info->index.str].sh_offset;
			break;
2010-06-02 22:01:06 +04:00
		}
	}
2018-06-22 15:00:01 +03:00
	if (info->index.sym == 0) {
2020-01-17 15:32:21 +03:00
		pr_warn("%s: module has no symbols (stripped?)\n",
			info->name ?: "(missing .modinfo section or name field)");
2018-06-22 15:00:01 +03:00
		return -ENOEXEC;
	}
2010-08-05 22:59:10 +04:00
	info->index.mod = find_sec(info, ".gnu.linkonce.this_module");
2010-06-02 22:01:06 +04:00
	if (!info->index.mod) {
2017-04-22 01:35:27 +03:00
		pr_warn("%s: No module found in object\n",
2020-01-17 15:32:21 +03:00
			info->name ?: "(missing .modinfo section or name field)");
2018-06-22 14:59:29 +03:00
		return -ENOEXEC;
2010-06-02 22:01:06 +04:00
	}
	/* This is temporary: point mod into copy of data. */
2018-06-22 15:00:01 +03:00
	info->mod = (void *)info->hdr + info->sechdrs[info->index.mod].sh_offset;
2010-06-02 22:01:06 +04:00
2017-04-22 01:35:27 +03:00
	/*
2018-06-22 15:00:01 +03:00
	 * If we didn't load the .modinfo 'name' field earlier, fall back to
2017-04-22 01:35:27 +03:00
	 * on-disk struct mod 'name' field.
	 */
	if (!info->name)
2018-06-22 14:59:29 +03:00
		info->name = info->mod->name;
2017-04-22 01:35:27 +03:00
2018-06-22 15:00:01 +03:00
	if (flags & MODULE_INIT_IGNORE_MODVERSIONS)
		info->index.vers = 0; /* Pretend no __versions section! */
	else
		info->index.vers = find_sec(info, "__versions");
2010-06-02 22:01:06 +04:00
2010-08-05 22:59:10 +04:00
	info->index.pcpu = find_pcpusec(info);
2010-06-02 22:01:06 +04:00
2018-06-22 14:59:29 +03:00
	return 0;
2010-06-02 22:01:06 +04:00
}
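The symbol-table lookup in the function above can be read in isolation as the following userspace sketch. It assumes simplified, hypothetical types (sketch_sym_shdr, sketch_sym_info) rather than the kernel's Elf_Shdr and struct load_info, and it adds a bounds check on sh_link that is not part of the code shown here.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_SHT_SYMTAB 2u	/* value of SHT_SYMTAB in the ELF specification */

/* Hypothetical, simplified stand-in for an ELF section header. */
struct sketch_sym_shdr {
	unsigned int sh_type;
	unsigned int sh_link;	/* for a symbol table: index of its string table */
	size_t sh_offset;	/* offset of the section within the image */
};

struct sketch_sym_info {
	const unsigned char *image;	/* start of the in-memory module copy */
	const struct sketch_sym_shdr *sechdrs;
	unsigned int shnum;
	unsigned int sym, str;		/* filled in by the lookup */
	const char *strtab;
};

/* Locate the first symbol table and the string table it links to. */
static int sketch_find_symtab(struct sketch_sym_info *info)
{
	unsigned int i;

	info->sym = 0;
	info->str = 0;
	info->strtab = NULL;

	for (i = 1; i < info->shnum; i++) {
		if (info->sechdrs[i].sh_type != SKETCH_SHT_SYMTAB)
			continue;
		/* Added for the sketch: reject a link outside the table. */
		if (info->sechdrs[i].sh_link >= info->shnum)
			return -ENOEXEC;
		info->sym = i;
		info->str = info->sechdrs[i].sh_link;
		info->strtab = (const char *)info->image
			+ info->sechdrs[info->str].sh_offset;
		break;
	}

	/* Index 0 is never a real section, so 0 means "no symbol table". */
	if (info->sym == 0)
		return -ENOEXEC;
	return 0;
}
```

The design point is that section index 0 is reserved, so a single integer can carry both the result and the "stripped module" failure case.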
2012-10-22 11:39:41 +04:00
static int check_modinfo(struct module *mod, struct load_info *info, int flags)
2010-08-05 22:59:03 +04:00
{
2010-08-05 22:59:10 +04:00
	const char *modmagic = get_modinfo(info, "vermagic");
2010-08-05 22:59:03 +04:00
	int err;
2012-10-22 11:39:41 +04:00
	if (flags & MODULE_INIT_IGNORE_VERMAGIC)
		modmagic = NULL;
2010-08-05 22:59:03 +04:00
	/* This is allowed: modprobe --force will invalidate it. */
	if (!modmagic) {
		err = try_to_force_load(mod, "bad vermagic");
		if (err)
			return err;
2010-08-05 22:59:10 +04:00
	} else if (!same_magic(modmagic, vermagic, info->index.vers)) {
2013-11-13 03:11:28 +04:00
		pr_err("%s: version magic '%s' should be '%s'\n",
2017-04-22 01:35:27 +03:00
		       info->name, modmagic, vermagic);
2010-08-05 22:59:03 +04:00
		return -ENOEXEC;
	}
2016-04-13 04:36:12 +03:00
	if (!get_modinfo(info, "intree")) {
		if (!test_taint(TAINT_OOT_MODULE))
			pr_warn("%s: loading out-of-tree module taints kernel.\n",
				mod->name);
2013-01-21 10:47:39 +04:00
		add_taint_module(mod, TAINT_OOT_MODULE, LOCKDEP_STILL_OK);
2016-04-13 04:36:12 +03:00
	}
2011-10-24 17:12:28 +04:00
2018-01-26 02:50:28 +03:00
	check_modinfo_retpoline(mod, info);
2010-08-05 22:59:10 +04:00
	if (get_modinfo(info, "staging")) {
2013-01-21 10:47:39 +04:00
		add_taint_module(mod, TAINT_CRAP, LOCKDEP_STILL_OK);
2013-11-13 03:11:28 +04:00
		pr_warn("%s: module is from the staging directory, the quality "
			"is unknown, you have been warned.\n", mod->name);
2010-08-05 22:59:03 +04:00
	}
2010-08-05 22:59:05 +04:00
2016-08-25 18:04:45 +03:00
	err = check_modinfo_livepatch(mod, info);
2016-03-23 03:03:16 +03:00
	if (err)
		return err;
2010-08-05 22:59:05 +04:00
	/* Set up license info based on the info section */
2010-08-05 22:59:10 +04:00
	set_license(mod, get_modinfo(info, "license"));
2010-08-05 22:59:05 +04:00
2010-08-05 22:59:03 +04:00
	return 0;
}
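check_modinfo() leans on get_modinfo() for every tag it inspects ("vermagic", "intree", "staging", "license"). The lookup itself is not shown in this part of the file, so the following is only a sketch of how such a lookup might work, assuming the section is a run of NUL-terminated "tag=value" strings. The name sketch_get_modinfo and its explicit size parameter are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the value stored for tag, or NULL if the tag is
 * absent. sec points at size bytes of NUL-separated "tag=value" strings.
 */
static const char *sketch_get_modinfo(const char *sec, size_t size,
				      const char *tag)
{
	size_t taglen = strlen(tag);
	const char *p = sec;
	const char *end = sec + size;

	while (p < end) {
		const char *nul = memchr(p, '\0', (size_t)(end - p));
		size_t len;

		if (!nul)	/* unterminated trailing bytes: ignore them */
			break;
		len = (size_t)(nul - p);

		/* Match "tag=" exactly, so "lic" does not match "license". */
		if (len > taglen && p[taglen] == '=' &&
		    strncmp(p, tag, taglen) == 0)
			return p + taglen + 1;

		p = nul + 1;	/* step past this string and its terminator */
	}
	return NULL;
}
```

Under this reading, a NULL result for "intree" is what sends check_modinfo() down the out-of-tree taint path, and a non-NULL result for "staging" triggers the staging warning.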
2013-10-14 11:38:46 +04:00
static int find_module_sections(struct module *mod, struct load_info *info)
2010-08-05 22:59:02 +04:00
{
2010-08-05 22:59:10 +04:00
	mod->kp = section_objs(info, "__param",
2010-08-05 22:59:02 +04:00
			       sizeof(*mod->kp), &mod->num_kp);
2010-08-05 22:59:10 +04:00
	mod->syms = section_objs(info, "__ksymtab",
2010-08-05 22:59:02 +04:00
				 sizeof(*mod->syms), &mod->num_syms);
2010-08-05 22:59:10 +04:00
	mod->crcs = section_addr(info, "__kcrctab");
	mod->gpl_syms = section_objs(info, "__ksymtab_gpl",
2010-08-05 22:59:02 +04:00
				     sizeof(*mod->gpl_syms),
				     &mod->num_gpl_syms);
2010-08-05 22:59:10 +04:00
	mod->gpl_crcs = section_addr(info, "__kcrctab_gpl");
	mod->gpl_future_syms = section_objs(info,
2010-08-05 22:59:02 +04:00
					    "__ksymtab_gpl_future",
					    sizeof(*mod->gpl_future_syms),
					    &mod->num_gpl_future_syms);
2010-08-05 22:59:10 +04:00
	mod->gpl_future_crcs = section_addr(info, "__kcrctab_gpl_future");
2010-08-05 22:59:02 +04:00
#ifdef CONFIG_UNUSED_SYMBOLS
2010-08-05 22:59:10 +04:00
	mod->unused_syms = section_objs(info, "__ksymtab_unused",
2010-08-05 22:59:02 +04:00
					sizeof(*mod->unused_syms),
					&mod->num_unused_syms);
2010-08-05 22:59:10 +04:00
	mod->unused_crcs = section_addr(info, "__kcrctab_unused");
	mod->unused_gpl_syms = section_objs(info, "__ksymtab_unused_gpl",
2010-08-05 22:59:02 +04:00
					    sizeof(*mod->unused_gpl_syms),
					    &mod->num_unused_gpl_syms);
2010-08-05 22:59:10 +04:00
	mod->unused_gpl_crcs = section_addr(info, "__kcrctab_unused_gpl");
2010-08-05 22:59:02 +04:00
#endif
#ifdef CONFIG_CONSTRUCTORS
2010-08-05 22:59:10 +04:00
	mod->ctors = section_objs(info, ".ctors",
2010-08-05 22:59:02 +04:00
				  sizeof(*mod->ctors), &mod->num_ctors);
2013-10-14 11:38:46 +04:00
	if (!mod->ctors)
		mod->ctors = section_objs(info, ".init_array",
				sizeof(*mod->ctors), &mod->num_ctors);
	else if (find_sec(info, ".init_array")) {
		/*
		 * This shouldn't happen with same compiler and binutils
		 * building all parts of the module.
		 */
2014-11-10 02:01:29 +03:00
		pr_warn("%s: has both .ctors and .init_array.\n",
2013-10-14 11:38:46 +04:00
		       mod->name);
		return -EINVAL;
	}
2010-08-05 22:59:02 +04:00
#endif
2020-03-10 16:04:34 +03:00
	mod->noinstr_text_start = section_objs(info, ".noinstr.text", 1,
						&mod->noinstr_text_size);
2010-08-05 22:59:02 +04:00
#ifdef CONFIG_TRACEPOINTS
2011-01-27 01:26:22 +03:00
	mod->tracepoints_ptrs = section_objs(info, "__tracepoints_ptrs",
					     sizeof(*mod->tracepoints_ptrs),
					     &mod->num_tracepoints);
2010-08-05 22:59:02 +04:00
#endif
2019-04-06 02:15:00 +03:00
#ifdef CONFIG_TREE_SRCU
	mod->srcu_struct_ptrs = section_objs(info, "___srcu_struct_ptrs",
					     sizeof(*mod->srcu_struct_ptrs),
					     &mod->num_srcu_structs);
#endif
2018-12-13 03:42:37 +03:00
#ifdef CONFIG_BPF_EVENTS
	mod->bpf_raw_events = section_objs(info, "__bpf_raw_tp_map",
					   sizeof(*mod->bpf_raw_events),
					   &mod->num_bpf_raw_events);
#endif
2018-12-30 18:14:15 +03:00
#ifdef CONFIG_JUMP_LABEL
2010-09-17 19:09:00 +04:00
	mod->jump_entries = section_objs(info, "__jump_table",
					sizeof(*mod->jump_entries),
					&mod->num_jump_entries);
#endif
2010-08-05 22:59:02 +04:00
#ifdef CONFIG_EVENT_TRACING
2010-08-05 22:59:10 +04:00
	mod->trace_events = section_objs(info, "_ftrace_events",
2010-08-05 22:59:02 +04:00
					 sizeof(*mod->trace_events),
					 &mod->num_trace_events);
2017-06-01 00:56:44 +03:00
	mod->trace_evals = section_objs(info, "_ftrace_eval_map",
					sizeof(*mod->trace_evals),
					&mod->num_trace_evals);
2010-08-05 22:59:02 +04:00
#endif
tracing: Fix module use of trace_bprintk()
On use of trace_printk() there's a macro that determines if the format
is static or a variable. If it is static, it defaults to __trace_bprintk()
otherwise it uses __trace_printk().
A while ago, Lai Jiangshan added __trace_bprintk(). In that patch, we
discussed a way to allow modules to use it. The difference between
__trace_bprintk() and __trace_printk() is that for faster processing,
just the format and args are stored in the trace instead of running
it through a sprintf function. In order to do this, the format used
by the __trace_bprintk() had to be persistent.
See commit 1ba28e02a18cbdbea123836f6c98efb09cbf59ec
The problem comes with trace_bprintk() where the module is unloaded.
The pointer left in the buffer is still pointing to the format.
To solve this issue, the formats in the module were copied into kernel
core. If the same format was used, they would use the same copy (to prevent
memory leak). This all worked well until we tried to merge everything.
At the time this was written, Lai Jiangshan, Frederic Weisbecker,
Ingo Molnar and myself were all touching the same code. When this was
merged, we lost the part of it that was in module.c. This kept out the
copying of the formats and unloading the module could cause bad pointers
left in the ring buffer.
This patch adds back (with updates required for current kernel) the
module code that sets up the necessary pointers.
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-11-11 06:19:24 +03:00
# ifdef CONFIG_TRACING
mod - > trace_bprintk_fmt_start = section_objs ( info , " __trace_printk_fmt " ,
sizeof ( * mod - > trace_bprintk_fmt_start ) ,
& mod - > num_trace_bprintk_fmt ) ;
# endif
2010-08-05 22:59:02 +04:00
# ifdef CONFIG_FTRACE_MCOUNT_RECORD
/* sechdrs[0].sh_size is always zero */
module/ftrace: handle patchable-function-entry
When using patchable-function-entry, the compiler will record the
callsites into a section named "__patchable_function_entries" rather
than "__mcount_loc". Let's abstract this difference behind a new
FTRACE_CALLSITE_SECTION, so that architectures don't have to handle this
explicitly (e.g. with custom module linker scripts).
As parisc currently handles this explicitly, it is fixed up accordingly,
with its custom linker script removed. Since FTRACE_CALLSITE_SECTION is
only defined when DYNAMIC_FTRACE is selected, the parisc module loading
code is updated to only use the definition in that case. When
DYNAMIC_FTRACE is not selected, modules shouldn't have this section, so
this removes some redundant work in that case.
To make sure that this is kept up-to-date for modules and the main
kernel, a comment is added to vmlinux.lds.h, with the existing ifdeffery
simplified for legibility.
I built parisc generic-{32,64}bit_defconfig with DYNAMIC_FTRACE enabled,
and verified that the section made it into the .ko files for modules.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Helge Deller <deller@gmx.de>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Tested-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Torsten Duwe <duwe@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: linux-parisc@vger.kernel.org
2019-10-16 20:17:11 +03:00
mod - > ftrace_callsites = section_objs ( info , FTRACE_CALLSITE_SECTION ,
2010-08-05 22:59:02 +04:00
sizeof ( * mod - > ftrace_callsites ) ,
& mod - > num_ftrace_callsites ) ;
# endif
2018-01-12 20:55:03 +03:00
# ifdef CONFIG_FUNCTION_ERROR_INJECTION
mod - > ei_funcs = section_objs ( info , " _error_injection_whitelist " ,
sizeof ( * mod - > ei_funcs ) ,
& mod - > num_ei_funcs ) ;
2020-03-26 17:49:48 +03:00
# endif
# ifdef CONFIG_KPROBES
mod - > kprobes_text_start = section_objs ( info , " .kprobes.text " , 1 ,
& mod - > kprobes_text_size ) ;
2020-03-26 17:50:00 +03:00
mod - > kprobe_blacklist = section_objs ( info , " _kprobe_blacklist " ,
sizeof ( unsigned long ) ,
& mod - > num_kprobe_blacklist ) ;
2020-08-18 16:57:42 +03:00
# endif
# ifdef CONFIG_HAVE_STATIC_CALL_INLINE
mod - > static_call_sites = section_objs ( info , " .static_call_sites " ,
sizeof ( * mod - > static_call_sites ) ,
& mod - > num_static_call_sites ) ;
2017-12-11 19:36:46 +03:00
# endif
2010-08-05 22:59:12 +04:00
mod - > extable = section_objs ( info , " __ex_table " ,
sizeof ( * mod - > extable ) , & mod - > num_exentries ) ;
2010-08-05 22:59:10 +04:00
if ( section_addr ( info , " __obsparm " ) )
2013-11-13 03:11:28 +04:00
pr_warn ( " %s: Ignoring obsolete parameters \n " , mod - > name ) ;
2010-08-05 22:59:12 +04:00
2020-07-20 02:10:45 +03:00
info - > debug = section_objs ( info , " __dyndbg " ,
2010-08-05 22:59:12 +04:00
sizeof ( * info - > debug ) , & info - > num_debug ) ;
2013-10-14 11:38:46 +04:00
return 0 ;
2010-08-05 22:59:02 +04:00
}
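The calls above all follow one pattern: look up a named section, return its address, and report how many fixed-size objects it contains. The following is a minimal user-space sketch of that pattern, not the kernel's implementation. `struct demo_shdr` and `demo_section_objs` are hypothetical names introduced for illustration, and the sketch assumes that a failed lookup yields index 0, whose address and size are zero (as the "sechdrs[0].sh_size is always zero" comment suggests).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an ELF section header. */
struct demo_shdr {
	unsigned long sh_addr;	/* where the section contents live */
	unsigned long sh_size;	/* size of the section in bytes */
};

/*
 * Return the base address of section 'sec' and store the number of
 * 'object_size'-byte objects it holds in *num. Index 0 plays the role
 * of "section not found": its address and size are both zero, so the
 * caller gets a NULL pointer and a count of 0 without a special case.
 */
void *demo_section_objs(const struct demo_shdr *sechdrs, unsigned int sec,
			size_t object_size, unsigned int *num)
{
	assert(object_size != 0);	/* guard the division below */
	*num = (unsigned int)(sechdrs[sec].sh_size / object_size);
	return (void *)sechdrs[sec].sh_addr;
}
```

With this shape, a caller can assign the pointer and the count unconditionally, which is why the code above needs no per-section error handling for optional sections.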
2010-08-05 22:59:10 +04:00
static int move_module ( struct module * mod , struct load_info * info )
2010-08-05 22:59:02 +04:00
{
int i ;
void * ptr ;
/* Do the allocs. */
2015-11-26 02:14:08 +03:00
ptr = module_alloc ( mod - > core_layout . size ) ;
2010-08-05 22:59:02 +04:00
/*
* The pointer to this block is stored in the module structure
* which is inside the block . Just mark it as not being a
* leak .
*/
kmemleak_not_leak ( ptr ) ;
if ( ! ptr )
2010-08-05 22:59:08 +04:00
return - ENOMEM ;
2010-08-05 22:59:02 +04:00
2015-11-26 02:14:08 +03:00
memset ( ptr , 0 , mod - > core_layout . size ) ;
mod - > core_layout . base = ptr ;
2010-08-05 22:59:02 +04:00
2015-11-26 02:14:08 +03:00
if ( mod - > init_layout . size ) {
ptr = module_alloc ( mod - > init_layout . size ) ;
2012-12-11 03:08:33 +04:00
/*
* The pointer to this block is stored in the module structure
* which is inside the block . This block doesn ' t need to be
* scanned as it contains data and code that will be freed
* after the module is initialized .
*/
kmemleak_ignore ( ptr ) ;
if ( ! ptr ) {
2015-11-26 02:14:08 +03:00
module_memfree ( mod - > core_layout . base ) ;
2012-12-11 03:08:33 +04:00
return - ENOMEM ;
}
2015-11-26 02:14:08 +03:00
memset ( ptr , 0 , mod - > init_layout . size ) ;
mod - > init_layout . base = ptr ;
2012-12-11 03:08:33 +04:00
} else
2015-11-26 02:14:08 +03:00
mod - > init_layout . base = NULL ;
2010-08-05 22:59:02 +04:00
/* Transfer each section which specifies SHF_ALLOC */
2011-12-06 23:11:31 +04:00
pr_debug ( " final section addresses: \n " ) ;
2010-08-05 22:59:10 +04:00
for ( i = 0 ; i < info - > hdr - > e_shnum ; i + + ) {
2010-08-05 22:59:02 +04:00
void * dest ;
2010-08-05 22:59:10 +04:00
Elf_Shdr * shdr = & info - > sechdrs [ i ] ;
2010-08-05 22:59:02 +04:00
2010-08-05 22:59:10 +04:00
if ( ! ( shdr - > sh_flags & SHF_ALLOC ) )
2010-08-05 22:59:02 +04:00
continue ;
2010-08-05 22:59:10 +04:00
if ( shdr - > sh_entsize & INIT_OFFSET_MASK )
2015-11-26 02:14:08 +03:00
dest = mod - > init_layout . base
2010-08-05 22:59:10 +04:00
+ ( shdr - > sh_entsize & ~ INIT_OFFSET_MASK ) ;
2010-08-05 22:59:02 +04:00
else
2015-11-26 02:14:08 +03:00
dest = mod - > core_layout . base + shdr - > sh_entsize ;
2010-08-05 22:59:02 +04:00
2010-08-05 22:59:10 +04:00
if ( shdr - > sh_type ! = SHT_NOBITS )
memcpy ( dest , ( void * ) shdr - > sh_addr , shdr - > sh_size ) ;
2010-08-05 22:59:02 +04:00
/* Update sh_addr to point to copy in image. */
2010-08-05 22:59:10 +04:00
shdr - > sh_addr = ( unsigned long ) dest ;
2011-12-06 23:11:31 +04:00
pr_debug ( " \t 0x%lx %s \n " ,
( long ) shdr - > sh_addr , info - > secstrings + shdr - > sh_name ) ;
2010-08-05 22:59:02 +04:00
}
2010-08-05 22:59:08 +04:00
return 0 ;
2010-08-05 22:59:02 +04:00
}
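move_module ( ) relies on the layout step having stored each section's offset in sh_entsize, with one flag bit selecting the init region instead of the core region. The sketch below shows that decoding in isolation. It is an illustration under assumptions: `DEMO_INIT_OFFSET_MASK` is assumed to be the top bit of an unsigned long, and `demo_section_dest` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption: the flag is the most significant bit of an unsigned long. */
#define DEMO_INIT_OFFSET_MASK \
	(1UL << (sizeof(unsigned long) * CHAR_BIT - 1))

/*
 * Compute where a section should be copied. If the flag bit is set in
 * 'entsize', the remaining bits are an offset into the init region;
 * otherwise 'entsize' is an offset into the core region.
 */
unsigned char *demo_section_dest(unsigned char *core_base,
				 unsigned char *init_base,
				 unsigned long entsize)
{
	if (entsize & DEMO_INIT_OFFSET_MASK) {
		assert(init_base != NULL);	/* init region must exist */
		return init_base + (entsize & ~DEMO_INIT_OFFSET_MASK);
	}
	return core_base + entsize;
}
```

Packing the region choice into the offset means a single per-section field carries both pieces of information until the copy has been done.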
2010-08-05 22:59:10 +04:00
static int check_module_license_and_versions ( struct module * mod )
2010-08-05 22:59:05 +04:00
{
2016-04-13 04:36:12 +03:00
int prev_taint = test_taint ( TAINT_PROPRIETARY_MODULE ) ;
2010-08-05 22:59:05 +04:00
/*
* ndiswrapper is under GPL by itself , but loads proprietary modules .
* Don ' t use add_taint_module ( ) , as it would prevent ndiswrapper from
* using GPL - only symbols it needs .
*/
if ( strcmp ( mod - > name , " ndiswrapper " ) = = 0 )
2013-01-21 10:47:39 +04:00
add_taint ( TAINT_PROPRIETARY_MODULE , LOCKDEP_NOW_UNRELIABLE ) ;
2010-08-05 22:59:05 +04:00
/* driverloader was caught wrongly pretending to be under GPL */
if ( strcmp ( mod - > name , " driverloader " ) = = 0 )
2013-01-21 10:47:39 +04:00
add_taint_module ( mod , TAINT_PROPRIETARY_MODULE ,
LOCKDEP_NOW_UNRELIABLE ) ;
2010-08-05 22:59:05 +04:00
2012-06-22 21:49:31 +04:00
/* lve claims to be GPL but upstream won't provide source */
if ( strcmp ( mod - > name , " lve " ) = = 0 )
2013-01-21 10:47:39 +04:00
add_taint_module ( mod , TAINT_PROPRIETARY_MODULE ,
LOCKDEP_NOW_UNRELIABLE ) ;
2012-06-22 21:49:31 +04:00
2016-04-13 04:36:12 +03:00
if ( ! prev_taint & & test_taint ( TAINT_PROPRIETARY_MODULE ) )
pr_warn ( " %s: module license taints kernel. \n " , mod - > name ) ;
2010-08-05 22:59:05 +04:00
# ifdef CONFIG_MODVERSIONS
if ( ( mod - > num_syms & & ! mod - > crcs )
| | ( mod - > num_gpl_syms & & ! mod - > gpl_crcs )
| | ( mod - > num_gpl_future_syms & & ! mod - > gpl_future_crcs )
# ifdef CONFIG_UNUSED_SYMBOLS
| | ( mod - > num_unused_syms & & ! mod - > unused_crcs )
| | ( mod - > num_unused_gpl_syms & & ! mod - > unused_gpl_crcs )
# endif
) {
return try_to_force_load ( mod ,
" no versions for exported symbols " ) ;
}
# endif
return 0 ;
}
static void flush_module_icache ( const struct module * mod )
{
/*
* Flush the instruction cache , since we ' ve played with text .
* Do it before processing of module parameters , so the module
* can provide parameter accessor functions of its own .
*/
2015-11-26 02:14:08 +03:00
if ( mod - > init_layout . base )
flush_icache_range ( ( unsigned long ) mod - > init_layout . base ,
( unsigned long ) mod - > init_layout . base
+ mod - > init_layout . size ) ;
flush_icache_range ( ( unsigned long ) mod - > core_layout . base ,
( unsigned long ) mod - > core_layout . base + mod - > core_layout . size ) ;
2010-08-05 22:59:05 +04:00
}
2011-06-30 23:22:11 +04:00
int __weak module_frob_arch_sections ( Elf_Ehdr * hdr ,
Elf_Shdr * sechdrs ,
char * secstrings ,
struct module * mod )
{
return 0 ;
}
2016-07-21 09:07:56 +03:00
/* module_blacklist is a comma-separated list of module names */
static char * module_blacklist ;
2017-06-29 04:32:31 +03:00
static bool blacklisted ( const char * module_name )
2016-07-21 09:07:56 +03:00
{
const char * p ;
size_t len ;
if ( ! module_blacklist )
return false ;
for ( p = module_blacklist ; * p ; p + = len ) {
len = strcspn ( p , " , " ) ;
if ( strlen ( module_name ) = = len & & ! memcmp ( module_name , p , len ) )
return true ;
if ( p [ len ] = = ' , ' )
len + + ;
}
return false ;
}
core_param ( module_blacklist , module_blacklist , charp , 0400 ) ;
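The matching loop in blacklisted ( ) can be tried outside the kernel. The sketch below reproduces the same comma-separated, exact-match walk as a user-space function. `demo_name_in_list` is a hypothetical name, and the list is passed as a parameter rather than read from a module parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if 'name' equals one whole entry of the comma-separated
 * 'list'. Prefixes and substrings of an entry do not match.
 */
bool demo_name_in_list(const char *list, const char *name)
{
	const char *p;
	size_t len;

	assert(name != NULL);
	if (!list)
		return false;

	for (p = list; *p; p += len) {
		/* Length of the current entry, up to ',' or end of string. */
		len = strcspn(p, ",");
		if (strlen(name) == len && !memcmp(name, p, len))
			return true;
		/* Step over the separator so the next pass starts on an entry. */
		if (p[len] == ',')
			len++;
	}
	return false;
}
```

Comparing lengths before calling memcmp ( ) is what keeps "ba" from matching an entry "bar": only entries of exactly the same length are compared byte by byte.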
2012-10-22 11:39:41 +04:00
static struct module * layout_and_allocate ( struct load_info * info , int flags )
2005-04-17 02:20:36 +04:00
{
struct module * mod ;
2016-07-27 05:36:21 +03:00
unsigned int ndx ;
2010-08-05 22:59:08 +04:00
int err ;
2009-10-02 02:43:54 +04:00
2018-06-22 14:59:29 +03:00
err = check_modinfo ( info - > mod , info , flags ) ;
2010-08-05 22:59:03 +04:00
if ( err )
return ERR_PTR ( err ) ;
2005-04-17 02:20:36 +04:00
/* Allow arches to frob section contents and sizes. */
2010-08-05 22:59:10 +04:00
err = module_frob_arch_sections ( info - > hdr , info - > sechdrs ,
2018-06-22 14:59:29 +03:00
info - > secstrings , info - > mod ) ;
2005-04-17 02:20:36 +04:00
if ( err < 0 )
2013-07-03 04:36:28 +04:00
return ERR_PTR ( err ) ;
2005-04-17 02:20:36 +04:00
2020-04-03 20:13:03 +03:00
err = module_enforce_rwx_sections ( info - > hdr , info - > sechdrs ,
info - > secstrings , info - > mod ) ;
if ( err < 0 )
return ERR_PTR ( err ) ;
2013-07-03 04:36:28 +04:00
/* We will do a special allocation for per-cpu sections later. */
info - > sechdrs [ info - > index . pcpu ] . sh_flags & = ~ ( unsigned long ) SHF_ALLOC ;
2005-04-17 02:20:36 +04:00
2016-07-27 05:36:21 +03:00
/*
* Mark ro_after_init section with SHF_RO_AFTER_INIT so that
* layout_sections ( ) can put it in the right place .
* Note : ro_after_init sections also have SHF_ { WRITE , ALLOC } set .
*/
ndx = find_sec ( info , " .data..ro_after_init " ) ;
2018-09-19 09:51:43 +03:00
if ( ndx )
info - > sechdrs [ ndx ] . sh_flags | = SHF_RO_AFTER_INIT ;
/*
* Mark the __jump_table section as ro_after_init as well : these data
* structures are never modified , with the exception of entries that
* refer to code in the __init section , which are annotated as such
* at module load time .
*/
ndx = find_sec ( info , " __jump_table " ) ;
2016-07-27 05:36:21 +03:00
if ( ndx )
info - > sechdrs [ ndx ] . sh_flags | = SHF_RO_AFTER_INIT ;
2005-04-17 02:20:36 +04:00
/* Determine total sizes, and put offsets in sh_entsize. For now
this is done generically ; there doesn ' t appear to be any
special cases for the architectures . */
2018-06-22 14:59:29 +03:00
layout_sections ( info - > mod , info ) ;
layout_symtab ( info - > mod , info ) ;
2005-04-17 02:20:36 +04:00
2010-08-05 22:59:02 +04:00
/* Allocate and move to the final place */
2018-06-22 14:59:29 +03:00
err = move_module ( info - > mod , info ) ;
2010-08-05 22:59:08 +04:00
if ( err )
2013-07-03 04:36:28 +04:00
return ERR_PTR ( err ) ;
2010-08-05 22:59:08 +04:00
/* Module has been copied to its final place now: return it. */
mod = ( void * ) info - > sechdrs [ info - > index . mod ] . sh_addr ;
2010-08-05 22:59:10 +04:00
kmemleak_load_module ( mod , info ) ;
2010-08-05 22:59:08 +04:00
return mod ;
}
/* mod is no longer valid after this! */
static void module_deallocate ( struct module * mod , struct load_info * info )
{
percpu_modfree ( mod ) ;
2015-01-20 01:37:04 +03:00
module_arch_freeing_init ( mod ) ;
2015-11-26 02:14:08 +03:00
module_memfree ( mod - > init_layout . base ) ;
module_memfree ( mod - > core_layout . base ) ;
2010-08-05 22:59:08 +04:00
}
2011-06-30 23:22:11 +04:00
int __weak module_finalize ( const Elf_Ehdr * hdr ,
const Elf_Shdr * sechdrs ,
struct module * me )
{
return 0 ;
}
2010-08-05 22:59:12 +04:00
static int post_relocation ( struct module * mod , const struct load_info * info )
{
2010-08-05 22:59:13 +04:00
/* Sort exception table now relocations are done. */
2010-08-05 22:59:12 +04:00
sort_extable ( mod - > extable , mod - > extable + mod - > num_exentries ) ;
/* Copy relocated percpu area over. */
percpu_modcopy ( mod , ( void * ) info - > sechdrs [ info - > index . pcpu ] . sh_addr ,
info - > sechdrs [ info - > index . pcpu ] . sh_size ) ;
2010-08-05 22:59:13 +04:00
/* Setup kallsyms-specific fields. */
2010-08-05 22:59:12 +04:00
add_kallsyms ( mod , info ) ;
/* Arch-specific module finalizing. */
return module_finalize ( info - > hdr , info - > sechdrs , mod ) ;
}
2012-09-28 09:01:03 +04:00
/* Is this module of this name done loading? No locks held. */
static bool finished_loading ( const char * name )
{
struct module * mod ;
bool ret ;
2015-02-11 07:31:13 +03:00
/*
* The module_mutex should not be a heavily contended lock ;
* if we get the occasional sleep here , we ' ll go an extra iteration
* in the wait_event_interruptible ( ) , which is harmless .
*/
sched_annotate_sleep ( ) ;
2012-09-28 09:01:03 +04:00
mutex_lock ( & module_mutex ) ;
2013-07-02 10:05:11 +04:00
mod = find_module_all ( name , strlen ( name ) , true ) ;
2019-05-29 14:26:25 +03:00
ret = ! mod | | mod - > state = = MODULE_STATE_LIVE ;
2012-09-28 09:01:03 +04:00
mutex_unlock ( & module_mutex ) ;
return ret ;
}
module: add syscall to load module from fd
As part of the effort to create a stronger boundary between root and
kernel, Chrome OS wants to be able to enforce that kernel modules are
being loaded only from our read-only crypto-hash verified (dm_verity)
root filesystem. Since the init_module syscall hands the kernel a module
as a memory blob, no reasoning about the origin of the blob can be made.
Earlier proposals for appending signatures to kernel modules would not be
useful in Chrome OS, since it would involve adding an additional set of
keys to our kernel and builds for no good reason: we already trust the
contents of our root filesystem. We don't need to verify those kernel
modules a second time. Having to do signature checking on module loading
would slow us down and be redundant. All we need to know is where a
module is coming from so we can say yes/no to loading it.
If a file descriptor is used as the source of a kernel module, many more
things can be reasoned about. In Chrome OS's case, we could enforce that
the module lives on the filesystem we expect it to live on. In the case
of IMA (or other LSMs), it would be possible, for example, to examine
extended attributes that may contain signatures over the contents of
the module.
This introduces a new syscall (on x86), similar to init_module, that has
only two arguments. The first argument is used as a file descriptor to
the module and the second argument is a pointer to the NULL terminated
string of module arguments.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (merge fixes)
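For illustration, a user-space caller of this syscall might look like the sketch below. It is a hedged example, not part of the commit: `demo_load_module` is a hypothetical helper, and it assumes a Linux system whose headers define SYS_finit_module. Note that the syscall as exposed on current kernels takes a third flags argument in addition to the two described above; the sketch passes 0 for it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Open the module file at 'path' and hand the descriptor to the kernel
 * together with the parameter string. Returns 0 on success, or -1 with
 * errno set (for example ENOENT for a missing file, or EPERM when the
 * caller is not allowed to load modules).
 */
int demo_load_module(const char *path, const char *params)
{
	int fd, ret, saved_errno;

	assert(path != NULL && params != NULL);

	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	ret = (int)syscall(SYS_finit_module, fd, params, 0);

	/* Preserve the syscall's errno across close(). */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return ret;
}
```

Because the kernel receives a descriptor rather than a memory blob, it can inspect where the file lives before deciding whether to load it, which is the point the commit message makes.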
2012-10-16 01:01:07 +04:00
/* Call module constructors. */
static void do_mod_ctors ( struct module * mod )
{
# ifdef CONFIG_CONSTRUCTORS
unsigned long i ;
for ( i = 0 ; i < mod - > num_ctors ; i + + )
mod - > ctors [ i ] ( ) ;
# endif
}
2015-01-20 01:37:05 +03:00
/* Defers freeing of module_init on success, in case kallsyms is still traversing it */
struct mod_initfree {
2019-04-26 03:11:37 +03:00
struct llist_node node ;
2015-01-20 01:37:05 +03:00
void * module_init ;
} ;
2019-04-26 03:11:37 +03:00
static void do_free_init ( struct work_struct * w )
2015-01-20 01:37:05 +03:00
{
2019-04-26 03:11:37 +03:00
struct llist_node * pos , * n , * list ;
struct mod_initfree * initfree ;
list = llist_del_all ( & init_free_list ) ;
synchronize_rcu ( ) ;
llist_for_each_safe ( pos , n , list ) {
initfree = container_of ( pos , struct mod_initfree , node ) ;
module_memfree ( initfree - > module_init ) ;
kfree ( initfree ) ;
}
2015-01-20 01:37:05 +03:00
}
2019-04-26 03:11:37 +03:00
static int __init modules_wq_init ( void )
{
INIT_WORK ( & init_free_wq , do_free_init ) ;
init_llist_head ( & init_free_list ) ;
return 0 ;
}
module_init ( modules_wq_init ) ;
2015-02-18 00:46:50 +03:00
/*
* This is where the real work happens .
*
* Keep it uninlined to provide a reliable breakpoint target , e . g . for the gdb
* helper command ' lx - symbols ' .
*/
static noinline int do_init_module ( struct module * mod )
2012-10-16 01:01:07 +04:00
{
int ret = 0 ;
2015-01-20 01:37:05 +03:00
struct mod_initfree * freeinit ;
freeinit = kmalloc ( sizeof ( * freeinit ) , GFP_KERNEL ) ;
if ( ! freeinit ) {
ret = - ENOMEM ;
goto fail ;
}
2015-11-26 02:14:08 +03:00
freeinit - > module_init = mod - > init_layout . base ;
2012-10-16 01:01:07 +04:00
2013-01-16 06:52:51 +04:00
/*
* We want to find out whether @ mod uses async during init . Clear
* PF_USED_ASYNC . async_schedule * ( ) will set it .
*/
current - > flags & = ~ PF_USED_ASYNC ;
2012-10-16 01:01:07 +04:00
do_mod_ctors ( mod ) ;
/* Start the module */
if ( mod - > init ! = NULL )
ret = do_one_initcall ( mod - > init ) ;
if ( ret < 0 ) {
2015-01-20 01:37:05 +03:00
goto fail_free_freeinit ;
2012-10-16 01:01:07 +04:00
}
if ( ret > 0 ) {
2013-11-13 03:11:28 +04:00
pr_warn ( " %s: '%s'->init suspiciously returned %d, it should "
" follow 0/-E convention \n "
" %s: loading module anyway... \n " ,
__func__ , mod - > name , ret , __func__ ) ;
2012-10-16 01:01:07 +04:00
dump_stack ( ) ;
}
/* Now it's a first class citizen! */
mod - > state = MODULE_STATE_LIVE ;
blocking_notifier_call_chain ( & module_notify_list ,
MODULE_STATE_LIVE , mod ) ;
2013-01-16 06:52:51 +04:00
/*
* We need to finish all async code before the module init sequence
* is done . This has potential to deadlock . For example , a newly
* detected block device can trigger request_module ( ) of the
* default iosched from async probing task . Once userland helper
* reaches here , async_synchronize_full ( ) will wait on the async
* task waiting on request_module ( ) and deadlock .
*
* This deadlock is avoided by performing async_synchronize_full ( )
* iff module init queued any async jobs . This isn ' t a full
* solution as it will deadlock the same if module loading from
* async jobs nests more than once ; however , due to the various
* constraints , this hack seems to be the best option for now .
* Please refer to the following thread for details .
*
* http : //thread.gmane.org/gmane.linux.kernel/1420814
*/
2015-03-31 02:20:05 +03:00
if ( ! mod - > async_probe_requested & & ( current - > flags & PF_USED_ASYNC ) )
2013-01-16 06:52:51 +04:00
async_synchronize_full ( ) ;
2012-10-16 01:01:07 +04:00
2017-09-01 15:35:38 +03:00
ftrace_free_mem ( mod , mod - > init_layout . base , mod - > init_layout . base +
2017-03-04 02:00:22 +03:00
mod - > init_layout . size ) ;
2012-10-16 01:01:07 +04:00
mutex_lock ( & module_mutex ) ;
/* Drop initial reference. */
module_put ( mod ) ;
trim_init_extable ( mod ) ;
# ifdef CONFIG_KALLSYMS
modules: fix longstanding /proc/kallsyms vs module insertion race.
For CONFIG_KALLSYMS, we keep two symbol tables and two string tables.
There's one full copy, marked SHF_ALLOC and laid out at the end of the
module's init section. There's also a cut-down version that only
contains core symbols and strings, and lives in the module's core
section.
After module init (and before we free the module memory), we switch
the mod->symtab, mod->num_symtab and mod->strtab to point to the core
versions. We do this under the module_mutex.
However, kallsyms doesn't take the module_mutex: it uses
preempt_disable() and rcu tricks to walk through the modules, because
it's used in the oops path. It's also used in /proc/kallsyms.
There's nothing atomic about the change of these variables, so we can
get the old (larger!) num_symtab and the new symtab pointer; in fact
this is what I saw when trying to reproduce.
By grouping these variables together, we can use a
carefully-dereferenced pointer to ensure we always get one or the
other (the free of the module init section is already done in an RCU
callback, so that's safe). We allocate the init one at the end of the
module init section, and keep the core one inside the struct module
itself (it could also have been allocated at the end of the module
core, but that's probably overkill).
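
The grouping idea can be sketched in userspace C11, assuming atomics as a
stand-in for rcu_assign_pointer() and rcu_dereference(). The names
sym_view, demo_module, switch_to_core and count_symbols are hypothetical
and are not kernel API; the sketch only shows why a reader that loads one
pointer can never pair an old count with a new table.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the grouped symtab/num_symtab/strtab fields. */
struct sym_view {
	const char *const *names;	/* symbol names */
	size_t num;			/* number of entries in names */
};

struct demo_module {
	_Atomic(const struct sym_view *) view;	/* the only field readers load */
	struct sym_view core;			/* cut-down view kept in the module */
};

/* Writer: publish the core view with release ordering. */
static void switch_to_core(struct demo_module *m)
{
	const struct sym_view *core = &m->core;

	atomic_store_explicit(&m->view, core, memory_order_release);
}

/* Reader: load the pointer once, then use only that snapshot. */
static size_t count_symbols(struct demo_module *m)
{
	const struct sym_view *v =
		atomic_load_explicit(&m->view, memory_order_acquire);
	size_t n = 0;

	for (size_t i = 0; i < v->num; i++) {
		/* names and num come from the same snapshot, so i is in range. */
		assert(v->names[i] != NULL);
		n++;
	}
	return n;
}
```

The sketch leaves out the part RCU handles in the kernel: keeping the old
view's memory alive until every reader that loaded it has finished.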
Reported-by: Weilong Chen <chenweilong@huawei.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111541
Cc: stable@kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2016-02-03 09:25:26 +03:00
/* Switch to core kallsyms now init is done: kallsyms may be walking! */
rcu_assign_pointer ( mod - > kallsyms , & mod - > core_kallsyms ) ;
2012-10-16 01:01:07 +04:00
# endif
2016-07-27 05:36:21 +03:00
module_enable_ro ( mod , true ) ;
2015-05-27 04:39:37 +03:00
mod_tree_remove_init ( mod ) ;
2015-01-20 01:37:04 +03:00
module_arch_freeing_init ( mod ) ;
2015-11-26 02:14:08 +03:00
mod - > init_layout . base = NULL ;
mod - > init_layout . size = 0 ;
mod - > init_layout . ro_size = 0 ;
2016-07-27 05:36:21 +03:00
mod - > init_layout . ro_after_init_size = 0 ;
2015-11-26 02:14:08 +03:00
mod - > init_layout . text_size = 0 ;
2015-01-20 01:37:05 +03:00
/*
* We want to free module_init, but be aware that kallsyms may be
2015-05-27 04:39:35 +03:00
* walking this with preempt disabled. In all the failure paths, we
2018-11-07 06:17:01 +03:00
* call synchronize_rcu(), but we don't want to slow down the success
2019-04-26 03:11:37 +03:00
* path. module_memfree() cannot be called in an interrupt, so do the
* work and call synchronize_rcu() in a work queue.
*
2018-05-12 02:01:42 +03:00
* Note that module_alloc() on most architectures creates W+X page
* mappings which won't be cleaned up until do_free_init() runs. Any
* code such as mark_rodata_ro() which depends on those mappings to
* be cleaned up needs to sync with the queued work - ie
2018-11-07 06:17:01 +03:00
* rcu_barrier()
2015-01-20 01:37:05 +03:00
*/
2019-04-26 03:11:37 +03:00
if ( llist_add ( & freeinit - > node , & init_free_list ) )
schedule_work ( & init_free_wq ) ;
2012-10-16 01:01:07 +04:00
mutex_unlock ( & module_mutex ) ;
wake_up_all ( & module_wq ) ;
return 0 ;
2015-01-20 01:37:05 +03:00
fail_free_freeinit :
kfree ( freeinit ) ;
fail :
/* Try to protect us from buggy refcounters. */
mod - > state = MODULE_STATE_GOING ;
2018-11-07 06:17:01 +03:00
synchronize_rcu ( ) ;
2015-01-20 01:37:05 +03:00
module_put ( mod ) ;
blocking_notifier_call_chain ( & module_notify_list ,
MODULE_STATE_GOING , mod ) ;
2016-03-17 03:55:39 +03:00
klp_module_going ( mod ) ;
2016-02-17 01:32:33 +03:00
ftrace_release_mod ( mod ) ;
2015-01-20 01:37:05 +03:00
free_module ( mod ) ;
wake_up_all ( & module_wq ) ;
return ret ;
2012-10-16 01:01:07 +04:00
}
static int may_init_module(void)
{
	if (!capable(CAP_SYS_MODULE) || modules_disabled)
		return -EPERM;
	return 0;
}
2013-01-21 10:48:59 +04:00
/*
* We try to place it in the list now to make sure it's unique before
* we dedicate too many resources. In particular, temporary percpu
* memory exhaustion.
*/
static int add_unformed_module ( struct module * mod )
{
int err ;
struct module * old ;
mod - > state = MODULE_STATE_UNFORMED ;
again :
mutex_lock ( & module_mutex ) ;
2013-07-02 10:05:11 +04:00
old = find_module_all ( mod - > name , strlen ( mod - > name ) , true ) ;
if ( old ! = NULL ) {
2019-05-29 14:26:25 +03:00
if ( old - > state ! = MODULE_STATE_LIVE ) {
2013-01-21 10:48:59 +04:00
/* Wait in case it fails to load. */
mutex_unlock ( & module_mutex ) ;
2015-02-11 07:31:13 +03:00
err = wait_event_interruptible ( module_wq ,
finished_loading ( mod - > name ) ) ;
2013-01-21 10:48:59 +04:00
if ( err )
goto out_unlocked ;
goto again ;
}
err = - EEXIST ;
goto out ;
}
2015-05-27 04:39:38 +03:00
mod_update_bounds ( mod ) ;
2013-01-21 10:48:59 +04:00
list_add_rcu ( & mod - > list , & modules ) ;
2015-05-27 04:39:37 +03:00
mod_tree_insert ( mod ) ;
2013-01-21 10:48:59 +04:00
err = 0 ;
out :
mutex_unlock ( & module_mutex ) ;
out_unlocked :
return err ;
}
static int complete_formation ( struct module * mod , struct load_info * info )
{
int err ;
mutex_lock ( & module_mutex ) ;
/* Find duplicate symbols (must be called under lock). */
2018-11-19 19:43:58 +03:00
err = verify_exported_symbols ( mod ) ;
2013-01-21 10:48:59 +04:00
if ( err < 0 )
goto out ;
/* This relies on module_mutex for list integrity. */
module_bug_finalize ( info - > hdr , info - > sechdrs , mod ) ;
2016-07-27 05:36:21 +03:00
module_enable_ro ( mod , false ) ;
2015-11-26 02:15:08 +03:00
module_enable_nx ( mod ) ;
2019-12-09 18:33:30 +03:00
module_enable_x ( mod ) ;
2014-05-14 05:24:19 +04:00
2013-01-21 10:48:59 +04:00
/* Mark state as coming so strong_try_module_get() ignores us,
* but kallsyms etc. can see us. */
mod - > state = MODULE_STATE_COMING ;
2014-05-14 05:24:19 +04:00
mutex_unlock ( & module_mutex ) ;
return 0 ;
2013-01-21 10:48:59 +04:00
out :
mutex_unlock ( & module_mutex ) ;
return err ;
}
2016-03-17 03:55:38 +03:00
static int prepare_coming_module ( struct module * mod )
{
2016-03-17 03:55:39 +03:00
int err ;
2016-03-17 03:55:38 +03:00
ftrace_module_enable ( mod ) ;
2016-03-17 03:55:39 +03:00
err = klp_module_coming ( mod ) ;
if ( err )
return err ;
2020-08-18 16:57:38 +03:00
err = blocking_notifier_call_chain_robust ( & module_notify_list ,
MODULE_STATE_COMING , MODULE_STATE_GOING , mod ) ;
err = notifier_to_errno ( err ) ;
if ( err )
klp_module_going ( mod ) ;
return err ;
2016-03-17 03:55:38 +03:00
}
module: add extra argument for parse_params() callback
This adds an extra argument onto parse_params() to be used
as a way to make the unused callback a bit more useful and
generic by allowing the caller to pass on a data structure
of its choice. An example use case is to allow us to easily
make module parameters for every module which we will do
next.
@ parse @
identifier name, args, params, num, level_min, level_max;
identifier unknown, param, val, doing;
type s16;
@@
extern char *parse_args(const char *name,
char *args,
const struct kernel_param *params,
unsigned num,
s16 level_min,
s16 level_max,
+ void *arg,
int (*unknown)(char *param, char *val,
const char *doing
+ , void *arg
));
@ parse_mod @
identifier name, args, params, num, level_min, level_max;
identifier unknown, param, val, doing;
type s16;
@@
char *parse_args(const char *name,
char *args,
const struct kernel_param *params,
unsigned num,
s16 level_min,
s16 level_max,
+ void *arg,
int (*unknown)(char *param, char *val,
const char *doing
+ , void *arg
))
{
...
}
@ parse_args_found @
expression R, E1, E2, E3, E4, E5, E6;
identifier func;
@@
(
R =
parse_args(E1, E2, E3, E4, E5, E6,
+ NULL,
func);
|
R =
parse_args(E1, E2, E3, E4, E5, E6,
+ NULL,
&func);
|
R =
parse_args(E1, E2, E3, E4, E5, E6,
+ NULL,
NULL);
|
parse_args(E1, E2, E3, E4, E5, E6,
+ NULL,
func);
|
parse_args(E1, E2, E3, E4, E5, E6,
+ NULL,
&func);
|
parse_args(E1, E2, E3, E4, E5, E6,
+ NULL,
NULL);
)
@ parse_args_unused depends on parse_args_found @
identifier parse_args_found.func;
@@
int func(char *param, char *val, const char *unused
+ , void *arg
)
{
...
}
@ mod_unused depends on parse_args_found @
identifier parse_args_found.func;
expression A1, A2, A3;
@@
- func(A1, A2, A3);
+ func(A1, A2, A3, NULL);
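
The effect of the extra argument can be sketched outside the kernel. This
is a simplified illustration, not the kernel's parse_args(): the names
parse_demo_args, demo_state and demo_unknown are hypothetical, and the
parser only splits on spaces and '='. It shows a caller handing its own
structure through to the callback for unknown parameters.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/* Callback type carrying the new trailing caller-supplied pointer. */
typedef int (*unknown_cb)(char *param, char *val, const char *doing,
			  void *arg);

/* Simplified stand-in for parse_args(): every token is treated as unknown. */
static int parse_demo_args(const char *doing, char *args, void *arg,
			   unknown_cb unknown)
{
	char *save = NULL;

	for (char *tok = strtok_r(args, " ", &save); tok;
	     tok = strtok_r(NULL, " ", &save)) {
		char *val = strchr(tok, '=');
		int ret;

		if (val)
			*val++ = '\0';	/* split "name=value" in place */
		ret = unknown(tok, val, doing, arg);
		if (ret)
			return ret;
	}
	return 0;
}

/* Hypothetical caller-owned state reached through the void pointer. */
struct demo_state {
	int async_probe_requested;
	int unknown_seen;
};

static int demo_unknown(char *param, char *val, const char *doing, void *arg)
{
	struct demo_state *st = arg;

	(void)val;
	(void)doing;
	assert(st != NULL);
	if (strcmp(param, "async_probe") == 0)
		st->async_probe_requested = 1;
	else
		st->unknown_seen++;
	return 0;
}
```

unknown_module_param_cb() in this file uses the same pattern: it receives
the struct module through arg and sets async_probe_requested on it.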
Generated-by: Coccinelle SmPL
Cc: cocci@systeme.lip6.fr
Cc: Tejun Heo <tj@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Felipe Contreras <felipe.contreras@gmail.com>
Cc: Ewan Milne <emilne@redhat.com>
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Tejun Heo <tj@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-31 02:20:03 +03:00
static int unknown_module_param_cb ( char * param , char * val , const char * modname ,
void * arg )
2013-07-02 10:05:12 +04:00
{
2015-03-31 02:20:05 +03:00
struct module * mod = arg ;
int ret ;
if ( strcmp ( param , " async_probe " ) = = 0 ) {
mod - > async_probe_requested = true ;
return 0 ;
}
2014-11-10 02:01:29 +03:00
/* Check for magic 'dyndbg' arg */
2015-03-31 02:20:05 +03:00
ret = ddebug_dyndbg_module_param_cb ( param , val , modname ) ;
2013-11-13 03:11:28 +04:00
if ( ret ! = 0 )
pr_warn ( " %s: unknown parameter '%s' ignored \n " , modname , param ) ;
2013-07-02 10:05:12 +04:00
return 0 ;
}
2010-08-05 22:59:08 +04:00
/* Allocate and load the module: note that size of section 0 is always
zero, and we rely on this for optional sections. */
2012-10-22 11:39:41 +04:00
static int load_module ( struct load_info * info , const char __user * uargs ,
int flags )
2010-08-05 22:59:08 +04:00
{
2013-01-21 10:48:59 +04:00
struct module * mod ;
2018-06-22 15:00:01 +03:00
long err = 0 ;
2014-04-28 06:04:33 +04:00
char * after_dashes ;
2010-08-05 22:59:08 +04:00
2018-06-22 15:00:01 +03:00
err = elf_header_check ( info ) ;
if ( err )
goto free_copy ;
err = setup_load_info ( info , flags ) ;
if ( err )
goto free_copy ;
if ( blacklisted ( info - > name ) ) {
err = - EPERM ;
goto free_copy ;
}
2016-04-28 02:54:01 +03:00
err = module_sig_check ( info , flags ) ;
2012-10-16 01:01:07 +04:00
if ( err )
goto free_copy ;
2010-08-05 22:59:08 +04:00
2018-06-22 15:00:01 +03:00
err = rewrite_section_headers ( info , flags ) ;
2010-08-05 22:59:08 +04:00
if ( err )
2012-10-16 01:01:07 +04:00
goto free_copy ;
2010-08-05 22:59:08 +04:00
2018-06-22 15:00:01 +03:00
/* Check module struct version now, before we try to use module. */
if ( ! check_modstruct_version ( info , info - > mod ) ) {
err = - ENOEXEC ;
goto free_copy ;
}
2010-08-05 22:59:08 +04:00
/* Figure out module layout, and allocate all the memory. */
2012-10-22 11:39:41 +04:00
mod = layout_and_allocate ( info , flags ) ;
2010-08-05 22:59:02 +04:00
if ( IS_ERR ( mod ) ) {
err = PTR_ERR ( mod ) ;
2010-08-05 22:59:08 +04:00
goto free_copy ;
2005-04-17 02:20:36 +04:00
}
2017-02-04 21:10:38 +03:00
audit_log_kern_module ( mod - > name ) ;
2013-01-21 10:48:59 +04:00
/* Reserve our place in the list. */
err = add_unformed_module ( mod ) ;
if ( err )
2013-01-12 06:57:34 +04:00
goto free_module ;
2012-09-26 13:09:40 +04:00
# ifdef CONFIG_MODULE_SIG
2012-10-16 01:01:07 +04:00
mod - > sig_ok = info - > sig_ok ;
2013-01-21 10:33:02 +04:00
if ( ! mod - > sig_ok ) {
2013-11-13 03:11:28 +04:00
pr_notice_once ( " %s: module verification failed: signature "
2015-02-06 07:39:57 +03:00
" and/or required key missing - tainting "
2013-11-13 03:11:28 +04:00
" kernel \n " , mod - > name ) ;
Fix: module signature vs tracepoints: add new TAINT_UNSIGNED_MODULE
Users have reported being unable to trace non-signed modules loaded
within a kernel supporting module signature.
This is caused by tracepoint.c:tracepoint_module_coming() refusing to
take into account tracepoints sitting within force-loaded modules
(TAINT_FORCED_MODULE). The reason for this check, in the first place, is
that a force-loaded module may have a struct module incompatible with
the layout expected by the kernel, and can thus cause a kernel crash
upon forced load of that module on a kernel with CONFIG_TRACEPOINTS=y.
Tracepoints, however, specifically accept TAINT_OOT_MODULE and
TAINT_CRAP, since those modules do not lead to the "very likely system
crash" issue cited above for force-loaded modules.
With kernels having CONFIG_MODULE_SIG=y (signed modules), a non-signed
module is tainted re-using the TAINT_FORCED_MODULE taint flag.
Unfortunately, this means that Tracepoints treat that module as a
force-loaded module, and thus silently refuse to consider any tracepoint
within this module.
Since an unsigned module does not fit within the "very likely system
crash" category of tainting, add a new TAINT_UNSIGNED_MODULE taint flag
to specifically address this taint behavior, and accept those modules
within Tracepoints. We use the letter 'X' as a taint flag character for
a module being loaded that doesn't know how to sign its name (proposed
by Steven Rostedt).
Also add the missing 'O' entry to trace event show_module_flags() list
for the sake of completeness.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
NAKed-by: Ingo Molnar <mingo@redhat.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: David Howells <dhowells@redhat.com>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-03-13 05:41:30 +04:00
add_taint_module ( mod , TAINT_UNSIGNED_MODULE , LOCKDEP_STILL_OK ) ;
2013-01-21 10:33:02 +04:00
}
2012-09-26 13:09:40 +04:00
# endif
2013-07-03 04:36:28 +04:00
/* To avoid stressing percpu allocator, do this once we're unique. */
2013-07-03 04:36:29 +04:00
err = percpu_modalloc ( mod , info ) ;
2013-07-03 04:36:28 +04:00
if ( err )
goto unlink_mod ;
2010-08-05 22:59:10 +04:00
/* Now module is in final location, initialize linked lists, etc. */
2010-08-05 22:59:04 +04:00
err = module_unload_init ( mod ) ;
if ( err )
2013-01-12 06:57:34 +04:00
goto unlink_mod ;
2005-04-17 02:20:36 +04:00
2015-06-26 00:14:38 +03:00
init_param_lock ( mod ) ;
module: add per-module param_lock
Add a "param_lock" mutex to each module, and update params.c to use
the correct built-in or module mutex while locking kernel params.
Remove the kparam_block_sysfs_r/w() macros, replace them with direct
calls to kernel_param_[un]lock(module).
The kernel param code currently uses a single mutex to protect
modification of any and all kernel params. While this generally works,
there is one specific problem with it; a module callback function
cannot safely load another module, i.e. with request_module() or even
with indirect calls such as crypto_has_alg(). If the module to be
loaded has any of its params configured (e.g. with a /etc/modprobe.d/*
config file), then the attempt will result in a deadlock between the
first module param callback waiting for modprobe, and modprobe trying to
lock the single kernel param mutex to set the new module's param.
This fixes that by using per-module mutexes, so that each individual module
is protected against concurrent changes in its own kernel params, but is
not blocked by changes to other module params. All built-in modules
continue to use the built-in mutex, since they will always be loaded at
runtime and references (e.g. request_module(), crypto_has_alg()) to them
will never cause load-time param changing.
This also simplifies the interface used by modules to block sysfs access
to their params; while there are currently functions to block and unblock
sysfs param access which are split up by read and write and expect a single
kernel param to be passed, their actual operation is identical and applies
to all params, not just the one passed to them; they simply lock and unlock
the global param mutex. They are replaced with direct calls to
kernel_param_[un]lock(THIS_MODULE), which locks THIS_MODULE's param_lock, or
if the module is built-in, it locks the built-in mutex.
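
The lock selection described above can be sketched in userspace C11. This
is an illustration under stated assumptions, not the kernel
implementation: atomic_flag try-locks stand in for mutexes, and the names
demo_module, param_lock_for, demo_param_trylock and demo_param_unlock are
hypothetical. A NULL module pointer stands for built-in code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical module with its own parameter lock. */
struct demo_module {
	atomic_flag param_lock;
};

/* Single lock shared by all built-in code. */
static atomic_flag builtin_param_lock = ATOMIC_FLAG_INIT;

/* Pick the per-module lock, or the built-in lock when mod is NULL. */
static atomic_flag *param_lock_for(struct demo_module *mod)
{
	return mod ? &mod->param_lock : &builtin_param_lock;
}

/* Returns true if the lock was free and is now held by the caller. */
static bool demo_param_trylock(struct demo_module *mod)
{
	return !atomic_flag_test_and_set_explicit(param_lock_for(mod),
						  memory_order_acquire);
}

static void demo_param_unlock(struct demo_module *mod)
{
	atomic_flag_clear_explicit(param_lock_for(mod), memory_order_release);
}
```

With one lock per module, holding the lock of one module no longer
prevents another module's parameters from being set, which is the
deadlock the message describes.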
Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-06-16 23:48:52 +03:00
2010-08-05 22:59:05 +04:00
/* Now we've got everything in the final locations, we can
* find optional sections. */
2013-10-14 11:38:46 +04:00
err = find_module_sections ( mod , info ) ;
if ( err )
goto free_unload ;
2008-02-29 01:11:02 +03:00
2010-08-05 22:59:10 +04:00
err = check_module_license_and_versions ( mod ) ;
2010-08-05 22:59:05 +04:00
if ( err )
goto free_unload ;
2006-01-08 12:03:41 +03:00
[PATCH] modules: add version and srcversion to sysfs
This patch adds version and srcversion files to
/sys/module/${modulename} containing the version and srcversion fields
of the module's modinfo section (if present).
/sys/module/e1000
|-- srcversion
`-- version
This patch differs slightly from the version posted in January, as it
now uses the new kstrdup() call in -mm.
Why put this in sysfs?
a) Tools like DKMS, which deal with changing out individual kernel
modules without replacing the whole kernel, can behave smarter if they
can tell the version of a given module. The autoinstaller feature, for
example, determines whether your system has a "good" version of a
driver (i.e. whether the one provided by DKMS has a newer version than that
provided by the installed kernel package), and automatically compiles
and installs a newer version if DKMS has it but your kernel doesn't yet
have that version.
b) Because sysadmins manually, or with tools like DKMS, can switch out
modules on the file system, you can't count on 'modinfo foo.ko', which
looks at /lib/modules/${kernelver}/... actually matching what is loaded
into the kernel already. Hence asking sysfs for this.
c) as the unbind-driver-from-device work takes shape, it will be
possible to rebind a driver that's built-in (no .ko to modinfo for the
version) to a newly loaded module. sysfs will have the
currently-built-in version info, for comparison.
d) tech support scripts can then easily grab the version info for what's
running presently - a question I get often.
There has been renewed interest in this patch on linux-scsi by driver
authors.
As the idea originated from GregKH, I leave his Signed-off-by: intact,
though the implementation is nearly completely new. Compiled and run on
x86 and x86_64.
From: Matthew Dobson <colpatch@us.ibm.com>
build fix
From: Thierry Vignaud <tvignaud@mandriva.com>
build fix
From: Matthew Dobson <colpatch@us.ibm.com>
warning fix
Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 09:05:15 +04:00
/* Set up MODINFO_ATTR fields */
2012-10-16 01:01:07 +04:00
setup_modinfo ( mod , info ) ;
[PATCH] modules: add version and srcversion to sysfs
This patch adds version and srcversion files to
/sys/module/${modulename} containing the version and srcversion fields
of the module's modinfo section (if present).
/sys/module/e1000
|-- srcversion
`-- version
This patch differs slightly from the version posted in January, as it
now uses the new kstrdup() call in -mm.
Why put this in sysfs?
a) Tools like DKMS, which deal with changing out individual kernel
modules without replacing the whole kernel, can behave smarter if they
can tell the version of a given module. The autoinstaller feature, for
example, determines if your system has a "good" version of a
driver (i.e. if the one provided by DKMS has a newer version than that
provided by the kernel package installed), and automatically compiles
and installs a newer version if DKMS has it but your kernel doesn't yet
have that version.
b) Because sysadmins manually, or with tools like DKMS, can switch out
modules on the file system, you can't count on 'modinfo foo.ko', which
looks at /lib/modules/${kernelver}/... actually matching what is loaded
into the kernel already. Hence asking sysfs for this.
c) as the unbind-driver-from-device work takes shape, it will be
possible to rebind a driver that's built-in (no .ko to modinfo for the
version) to a newly loaded module. sysfs will have the
currently-built-in version info, for comparison.
d) tech support scripts can then easily grab the version info for what's
running presently - a question I get often.
There has been renewed interest in this patch on linux-scsi by driver
authors.
As the idea originated from GregKH, I leave his Signed-off-by: intact,
though the implementation is nearly completely new. Compiled and run on
x86 and x86_64.
From: Matthew Dobson <colpatch@us.ibm.com>
build fix
From: Thierry Vignaud <tvignaud@mandriva.com>
build fix
From: Matthew Dobson <colpatch@us.ibm.com>
warning fix
Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 09:05:15 +04:00
2005-04-17 02:20:36 +04:00
/* Fix up syms, so that st_value is a pointer to location. */
2012-10-16 01:01:07 +04:00
err = simplify_symbols ( mod , info ) ;
2005-04-17 02:20:36 +04:00
if ( err < 0 )
2010-08-05 22:59:08 +04:00
goto free_modinfo ;
2005-04-17 02:20:36 +04:00
2012-10-16 01:01:07 +04:00
err = apply_relocations ( mod , info ) ;
2010-08-05 22:59:05 +04:00
if ( err < 0 )
2010-08-05 22:59:08 +04:00
goto free_modinfo ;
2005-04-17 02:20:36 +04:00
2012-10-16 01:01:07 +04:00
err = post_relocation ( mod , info ) ;
2005-04-17 02:20:36 +04:00
if ( err < 0 )
2010-08-05 22:59:08 +04:00
goto free_modinfo ;
2005-04-17 02:20:36 +04:00
2010-08-05 22:59:05 +04:00
flush_module_icache ( mod ) ;
2005-09-07 02:17:11 +04:00
2010-08-05 22:59:10 +04:00
/* Now copy in args */
mod->args = strndup_user ( uargs , ~ 0UL >> 1 ) ;
if ( IS_ERR ( mod->args ) ) {
err = PTR_ERR ( mod->args ) ;
goto free_arch_cleanup ;
}
2006-03-25 14:07:05 +03:00
2017-07-07 06:15:58 +03:00
dynamic_debug_setup ( mod , info->debug , info->num_debug ) ;
2010-07-03 07:07:35 +04:00
2014-04-24 18:40:12 +04:00
/* Ftrace init must be called in the MODULE_STATE_UNFORMED state */
ftrace_module_init ( mod ) ;
2013-01-21 10:48:59 +04:00
/* Finally it's fully formed, ready to start executing. */
err = complete_formation ( mod , info ) ;
if ( err )
2013-01-12 06:57:34 +04:00
goto ddebug_cleanup ;
2010-06-05 21:17:37 +04:00
2016-03-17 03:55:38 +03:00
err = prepare_coming_module ( mod ) ;
if ( err )
goto bug_cleanup ;
2010-08-05 22:59:13 +04:00
/* Module is ready to execute: parsing args may do that. */
2014-04-28 06:04:33 +04:00
after_dashes = parse_args ( mod->name , mod->args , mod->kp , mod->num_kp ,
2016-02-03 09:25:26 +03:00
- 32768 , 32767 , mod ,
module: add extra argument for parse_params() callback
This adds an extra argument onto parse_params() to be used
as a way to make the unused callback a bit more useful and
generic by allowing the caller to pass on a data structure
of its choice. An example use case is to allow us to easily
make module parameters for every module which we will do
next.
@ parse @
identifier name, args, params, num, level_min, level_max;
identifier unknown, param, val, doing;
type s16;
@@
extern char *parse_args(const char *name,
char *args,
const struct kernel_param *params,
unsigned num,
s16 level_min,
s16 level_max,
+ void *arg,
int (*unknown)(char *param, char *val,
const char *doing
+ , void *arg
));
@ parse_mod @
identifier name, args, params, num, level_min, level_max;
identifier unknown, param, val, doing;
type s16;
@@
char *parse_args(const char *name,
char *args,
const struct kernel_param *params,
unsigned num,
s16 level_min,
s16 level_max,
+ void *arg,
int (*unknown)(char *param, char *val,
const char *doing
+ , void *arg
))
{
...
}
@ parse_args_found @
expression R, E1, E2, E3, E4, E5, E6;
identifier func;
@@
(
R =
parse_args(E1, E2, E3, E4, E5, E6,
+ NULL,
func);
|
R =
parse_args(E1, E2, E3, E4, E5, E6,
+ NULL,
&func);
|
R =
parse_args(E1, E2, E3, E4, E5, E6,
+ NULL,
NULL);
|
parse_args(E1, E2, E3, E4, E5, E6,
+ NULL,
func);
|
parse_args(E1, E2, E3, E4, E5, E6,
+ NULL,
&func);
|
parse_args(E1, E2, E3, E4, E5, E6,
+ NULL,
NULL);
)
@ parse_args_unused depends on parse_args_found @
identifier parse_args_found.func;
@@
int func(char *param, char *val, const char *unused
+ , void *arg
)
{
...
}
@ mod_unused depends on parse_args_found @
identifier parse_args_found.func;
expression A1, A2, A3;
@@
- func(A1, A2, A3);
+ func(A1, A2, A3, NULL);
Generated-by: Coccinelle SmPL
Cc: cocci@systeme.lip6.fr
Cc: Tejun Heo <tj@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Felipe Contreras <felipe.contreras@gmail.com>
Cc: Ewan Milne <emilne@redhat.com>
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Tejun Heo <tj@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-31 02:20:03 +03:00
unknown_module_param_cb ) ;
2014-04-28 06:04:33 +04:00
if ( IS_ERR ( after_dashes ) ) {
err = PTR_ERR ( after_dashes ) ;
2016-03-17 03:55:38 +03:00
goto coming_cleanup ;
2014-04-28 06:04:33 +04:00
} else if ( after_dashes ) {
pr_warn ( "%s: parameters '%s' after `--' ignored\n" ,
mod->name , after_dashes ) ;
}
2005-04-17 02:20:36 +04:00
2017-02-04 21:10:38 +03:00
/* Link in to sysfs. */
2012-10-16 01:01:07 +04:00
err = mod_sysfs_setup ( mod , info , mod->kp , mod->num_kp ) ;
2005-04-17 02:20:36 +04:00
if ( err < 0 )
2016-03-17 03:55:38 +03:00
goto coming_cleanup ;
2010-06-05 21:17:36 +04:00
2016-03-23 03:03:16 +03:00
if ( is_livepatch_module ( mod ) ) {
err = copy_module_elf ( mod , info ) ;
if ( err < 0 )
goto sysfs_cleanup ;
}
2012-01-13 03:02:14 +04:00
/* Get rid of temporary copy. */
2012-10-16 01:01:07 +04:00
free_copy ( info ) ;
2005-04-17 02:20:36 +04:00
/* Done! */
2010-08-05 22:59:13 +04:00
trace_module_load ( mod ) ;
2012-10-16 01:01:07 +04:00
return do_init_module ( mod ) ;
2005-04-17 02:20:36 +04:00
2016-03-23 03:03:16 +03:00
sysfs_cleanup :
mod_sysfs_teardown ( mod ) ;
2016-03-17 03:55:38 +03:00
coming_cleanup :
2016-10-20 19:18:12 +03:00
mod->state = MODULE_STATE_GOING ;
2017-02-11 01:06:22 +03:00
destroy_params ( mod->kp , mod->num_kp ) ;
2016-03-17 03:55:38 +03:00
blocking_notifier_call_chain ( & module_notify_list ,
MODULE_STATE_GOING , mod ) ;
2016-03-17 03:55:39 +03:00
klp_module_going ( mod ) ;
2013-01-12 06:57:34 +04:00
bug_cleanup :
/* module_bug_cleanup needs module_mutex protection */
2010-06-05 21:17:36 +04:00
mutex_lock ( & module_mutex ) ;
2010-10-05 22:29:27 +04:00
module_bug_cleanup ( mod ) ;
2013-01-21 08:22:58 +04:00
mutex_unlock ( & module_mutex ) ;
2014-08-15 22:43:37 +04:00
2013-01-21 10:48:59 +04:00
ddebug_cleanup :
2018-01-08 08:11:21 +03:00
ftrace_release_mod ( mod ) ;
2017-07-07 06:15:58 +03:00
dynamic_debug_remove ( mod , info->debug ) ;
2018-11-07 06:17:01 +03:00
synchronize_rcu ( ) ;
2010-08-05 22:59:10 +04:00
kfree ( mod->args ) ;
free_arch_cleanup :
2005-04-17 02:20:36 +04:00
module_arch_cleanup ( mod ) ;
2010-08-05 22:59:08 +04:00
free_modinfo :
2009-09-25 10:32:58 +04:00
free_modinfo ( mod ) ;
2010-08-05 22:59:05 +04:00
free_unload :
2005-04-17 02:20:36 +04:00
module_unload_free ( mod ) ;
2013-01-12 06:57:34 +04:00
unlink_mod :
mutex_lock ( & module_mutex ) ;
/* Unlink carefully: kallsyms could be walking list. */
list_del_rcu ( & mod->list ) ;
2015-07-09 00:18:06 +03:00
mod_tree_remove ( mod ) ;
2013-01-12 06:57:34 +04:00
wake_up_all ( & module_wq ) ;
2015-05-27 04:39:35 +03:00
/* Wait for RCU-sched synchronizing before releasing mod->list. */
2018-11-07 06:17:01 +03:00
synchronize_rcu ( ) ;
2013-01-12 06:57:34 +04:00
mutex_unlock ( & module_mutex ) ;
2010-08-05 22:59:08 +04:00
free_module :
2015-02-26 18:23:11 +03:00
/* Free lock-classes; relies on the preceding sync_rcu() */
2015-11-26 02:14:08 +03:00
lockdep_free_key_range ( mod->core_layout . base , mod->core_layout . size ) ;
2015-02-26 18:23:11 +03:00
2012-10-16 01:01:07 +04:00
module_deallocate ( mod , info ) ;
2010-08-05 22:59:08 +04:00
free_copy :
2012-10-16 01:01:07 +04:00
free_copy ( info ) ;
return err ;
2009-06-18 03:28:03 +04:00
}
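The error labels that close load_module() above (sysfs_cleanup down to free_copy) undo the setup steps in the reverse of the order they were performed, and each failure jumps to the label that matches how far setup got. A minimal user-space sketch of that pattern follows. It is not kernel code: the names stage(), undo(), setup() and the trace buffer are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Records which teardown steps ran, in order, e.g. "BA". */
static char trace[8];

/* Append one teardown step to the trace. */
static void undo(char step)
{
	size_t n = strlen(trace);

	assert(n + 1 < sizeof(trace));
	trace[n] = step;
	trace[n + 1] = '\0';
}

/* Hypothetical setup stage: fails with -1 when its number equals fail_at. */
static int stage(int nr, int fail_at)
{
	return nr == fail_at ? -1 : 0;
}

/*
 * Runs stages 1..3 in order. A failure in stage N jumps to the label that
 * undoes stages N-1..1; the labels fall through in reverse order of setup.
 */
static int setup(int fail_at)
{
	int err;

	trace[0] = '\0';

	err = stage(1, fail_at);
	if (err < 0)
		goto out;

	err = stage(2, fail_at);
	if (err < 0)
		goto undo_a;	/* only stage 1 needs undoing */

	err = stage(3, fail_at);
	if (err < 0)
		goto undo_b;	/* stages 2 and 1 need undoing */

	return 0;

undo_b:
	undo('B');
undo_a:
	undo('A');
out:
	return err;
}
```

Calling setup(3) leaves "BA" in trace, setup(2) leaves "A", and setup(0) succeeds with an empty trace. The fall-through between labels is what keeps each failure site down to a single goto.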
2009-01-14 16:14:10 +03:00
SYSCALL_DEFINE3 ( init_module , void __user * , umod ,
unsigned long , len , const char __user * , uargs )
2005-04-17 02:20:36 +04:00
{
2012-10-16 01:01:07 +04:00
int err ;
struct load_info info = { } ;
2005-04-17 02:20:36 +04:00
2012-10-16 01:01:07 +04:00
err = may_init_module ( ) ;
if ( err )
return err ;
2005-04-17 02:20:36 +04:00
2012-10-16 01:01:07 +04:00
pr_debug ( "init_module: umod=%p, len=%lu, uargs=%p\n" ,
umod , len , uargs ) ;
2005-04-17 02:20:36 +04:00
2012-10-16 01:01:07 +04:00
err = copy_module_from_user ( umod , len , & info ) ;
if ( err )
return err ;
2005-04-17 02:20:36 +04:00
2012-10-22 11:39:41 +04:00
return load_module ( & info , uargs , 0 ) ;
2012-10-16 01:01:07 +04:00
}
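From user space, the init_module entry point above receives a pointer to the module image, the image length, and the argument string. A minimal sketch of a caller follows, assuming Linux and the C library's generic syscall() wrapper. try_init_module() is a hypothetical helper name, not part of any library.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper around the raw init_module system call.
 * Returns 0 on success or a negative errno value on failure.
 */
static int try_init_module(const void *image, unsigned long len,
			   const char *args)
{
	long ret = syscall(SYS_init_module, image, len, args);

	return ret == 0 ? 0 : -errno;
}
```

Passing an empty image, as in try_init_module(NULL, 0, ""), exercises only the error path: the call fails before anything is loaded. The exact errno depends on the caller's privileges and is typically EPERM for an unprivileged process.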
2010-11-29 21:15:42 +03:00
2012-10-22 11:39:41 +04:00
SYSCALL_DEFINE3 ( finit_module , int , fd , const char __user * , uargs , int , flags )
2012-10-16 01:01:07 +04:00
{
	struct load_info info = { };
2015-12-30 15:35:30 +03:00
	loff_t size;
	void *hdr;
	int err;
2010-11-29 21:15:42 +03:00
2012-10-16 01:01:07 +04:00
	err = may_init_module();
	if (err)
		return err;
2005-04-17 02:20:36 +04:00
2012-10-22 11:39:41 +04:00
	pr_debug("finit_module: fd=%d, uargs=%p, flags=%i\n", fd, uargs, flags);
2008-03-10 21:43:52 +03:00
2012-10-22 11:39:41 +04:00
	if (flags & ~(MODULE_INIT_IGNORE_MODVERSIONS
		      |MODULE_INIT_IGNORE_VERMAGIC))
		return -EINVAL;
async: Fix module loading async-work regression
Several drivers use asynchronous work to do device discovery, and we
synchronize with them in the compiled-in case before we actually try to
mount root filesystems etc.
However, when compiled as modules, that synchronization is missing - the
module loading completes, but the driver hasn't actually finished
probing for devices, and that means that any user mode that expects to
use the devices after the 'insmod' is now potentially broken.
We already saw one case of a similar issue in the ACPI battery code,
where the kernel itself expected the module to be all done, and unmapped
the init memory - but the async device discovery was still running.
That got hacked around by just removing the "__init" (see commit
5d38258ec026921a7b266f4047ebeaa75db358e5 "ACPI battery: fix async boot
oops"), but the real fix is to just make the module loading wait for all
async work to be completed.
It will slow down module loading, but since common devices should be
built in anyway, and since the bug is really annoying and hard to handle
from user space (and caused several S3 resume regressions), the simple
fix to wait is the right one.
This fixes at least
http://bugzilla.kernel.org/show_bug.cgi?id=13063
but probably a few other bugzilla entries too (12936, for example), and
is confirmed to fix Rafael's storage driver breakage after resume bug
report (no bugzilla entry).
We should also be able to now revert that ACPI battery fix.
Reported-and-tested-by: Rafael J. Wysocki <rjw@suse.com>
Tested-by: Heinz Diehl <htd@fancy-poultry.org>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-10 23:17:41 +04:00
2015-12-30 15:35:30 +03:00
	err = kernel_read_file_from_fd(fd, &hdr, &size, INT_MAX,
				       READING_MODULE);
2012-10-16 01:01:07 +04:00
	if (err)
		return err;
2015-12-30 15:35:30 +03:00
	info.hdr = hdr;
	info.len = size;
2005-04-17 02:20:36 +04:00
2012-10-22 11:39:41 +04:00
	return load_module(&info, uargs, flags);
2005-04-17 02:20:36 +04:00
}
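The flag check in finit_module() rejects any bit outside the two known flags. A minimal user-space sketch of that check follows. The helper name check_finit_flags is hypothetical, and the two constants are redefined locally with the values published in the kernel's uapi <linux/module.h>, so that the fragment builds without kernel headers.

```c
#include <errno.h>

/* Local copies of the uapi flag values, so the sketch builds anywhere. */
#define MODULE_INIT_IGNORE_MODVERSIONS	1
#define MODULE_INIT_IGNORE_VERMAGIC	2

/*
 * Hypothetical helper (not a kernel function): returns 0 when only known
 * flag bits are set, and -EINVAL otherwise, as the syscall body does.
 */
static int check_finit_flags(int flags)
{
	const int known = MODULE_INIT_IGNORE_MODVERSIONS |
			  MODULE_INIT_IGNORE_VERMAGIC;

	/* Any bit outside the known mask is rejected. */
	if (flags & ~known)
		return -EINVAL;
	return 0;
}
```

Masking with the complement of the known set means that a flag bit added by a later kernel is refused by an older one, rather than silently ignored.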
static inline int within(unsigned long addr, void *start, unsigned long size)
{
	return ((void *)addr >= start && (void *)addr < start + size);
}
#ifdef CONFIG_KALLSYMS
/*
 * This ignores the intensely annoying "mapping symbols" found
 * in ARM ELF files: $a, $t and $d.
 */
static inline int is_arm_mapping_symbol(const char *str)
{
2014-07-27 01:59:01 +04:00
	if (str[0] == '.' && str[1] == 'L')
		return true;
2014-09-17 01:37:18 +04:00
	return str[0] == '$' && strchr("axtd", str[1])
2005-04-17 02:20:36 +04:00
	       && (str[2] == '\0' || str[2] == '.');
}
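To make the filter above easier to try outside the kernel, here is a stand-alone sketch of the same test. The function name is_mapping_symbol_sketch is hypothetical. Unlike the kernel helper, the sketch has an explicit guard for a bare "$" name; this is an addition, made because strchr() also matches the terminating NUL of its search string.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical stand-alone variant of the mapping-symbol filter.
 * Accepts compiler-local ".L" labels and ARM-style "$a", "$x", "$t", "$d"
 * names, optionally followed by ".<suffix>".
 */
static bool is_mapping_symbol_sketch(const char *str)
{
	if (str[0] == '.' && str[1] == 'L')
		return true;

	/* Assumption of this sketch: stop early on "" and on a bare "$". */
	if (str[0] != '$' || str[1] == '\0')
		return false;

	return strchr("axtd", str[1]) != NULL &&
	       (str[2] == '\0' || str[2] == '.');
}
```

With this sketch, "$a" and "$t.12" are treated as mapping symbols, while "$ab" and "main" are not.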
2018-11-19 19:43:58 +03:00
static const char *kallsyms_symbol_name(struct mod_kallsyms *kallsyms, unsigned int symnum)
2016-02-03 09:25:26 +03:00
{
modules: fix longstanding /proc/kallsyms vs module insertion race.
For CONFIG_KALLSYMS, we keep two symbol tables and two string tables.
There's one full copy, marked SHF_ALLOC and laid out at the end of the
module's init section. There's also a cut-down version that only
contains core symbols and strings, and lives in the module's core
section.
After module init (and before we free the module memory), we switch
the mod->symtab, mod->num_symtab and mod->strtab to point to the core
versions. We do this under the module_mutex.
However, kallsyms doesn't take the module_mutex: it uses
preempt_disable() and rcu tricks to walk through the modules, because
it's used in the oops path. It's also used in /proc/kallsyms.
There's nothing atomic about the change of these variables, so we can
get the old (larger!) num_symtab and the new symtab pointer; in fact
this is what I saw when trying to reproduce.
By grouping these variables together, we can use a
carefully-dereferenced pointer to ensure we always get one or the
other (the free of the module init section is already done in an RCU
callback, so that's safe). We allocate the init one at the end of the
module init section, and keep the core one inside the struct module
itself (it could also have been allocated at the end of the module
core, but that's probably overkill).
Reported-by: Weilong Chen <chenweilong@huawei.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111541
Cc: stable@kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2016-02-03 09:25:26 +03:00
	return kallsyms->strtab + kallsyms->symtab[symnum].st_name;
2016-02-03 09:25:26 +03:00
}
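The name lookup above is a string-table offset: st_name is a byte offset into strtab, and the symbol table, its length and the string table are grouped in one structure so that a reader sees a consistent set of the three. A simplified user-space sketch follows; the types sym_sketch and kallsyms_sketch are hypothetical stand-ins for Elf_Sym and struct mod_kallsyms, reduced to the fields used here.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for Elf_Sym. */
struct sym_sketch {
	unsigned int st_name;	/* byte offset of the name in strtab */
	unsigned long st_value;	/* symbol address */
};

/* Hypothetical stand-in for the grouped table: all three travel together. */
struct kallsyms_sketch {
	const struct sym_sketch *symtab;
	unsigned int num_symtab;
	const char *strtab;
};

/* Resolve a symbol index to its NUL-terminated name. */
static const char *symbol_name_sketch(const struct kallsyms_sketch *k,
				      unsigned int symnum)
{
	return k->strtab + k->symtab[symnum].st_name;
}
```

Because the three fields are reached through a single pointer, switching from the init-time table to the core table is one pointer update rather than three separate stores.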
2018-11-19 19:43:58 +03:00
/*
 * Given a module and address, find the corresponding symbol and return its name
 * while providing its size and offset if needed.
 */
static const char *find_kallsyms_symbol(struct module *mod,
					unsigned long addr,
					unsigned long *size,
					unsigned long *offset)
2005-04-17 02:20:36 +04:00
{
	unsigned int i, best = 0;
ARM: module: Fix function kallsyms on Thumb-2
Thumb-2 functions have the lowest bit set in the symbol value in the
symtab. When kallsyms are generated for the vmlinux, the kallsyms are
generated from the output of nm, and nm clears the lowest bit.
$ arm-linux-gnueabihf-readelf -a vmlinux | grep show_interrupts
95947: 8015dc89 686 FUNC GLOBAL DEFAULT 2 show_interrupts
$ arm-linux-gnueabihf-nm vmlinux | grep show_interrupts
8015dc88 T show_interrupts
$ cat /proc/kallsyms | grep show_interrupts
8015dc88 T show_interrupts
However, for modules, the kallsyms uses the values in the symbol table
without modification, so for functions in modules, the lowest bit is set
in kallsyms.
$ arm-linux-gnueabihf-readelf -a drivers/net/tun.ko | grep tun_get_socket
333: 00002d4d 36 FUNC GLOBAL DEFAULT 1 tun_get_socket
$ arm-linux-gnueabihf-nm drivers/net/tun.ko | grep tun_get_socket
00002d4c T tun_get_socket
$ cat /proc/kallsyms | grep tun_get_socket
7f802d4d t tun_get_socket [tun]
Because of this, the symbol+offset of the crashing instruction shown in
oopses is incorrect when the crash is in a module. For example, given a
tun_get_socket which starts like this,
00002d4c <tun_get_socket>:
2d4c: 6943 ldr r3, [r0, #20]
2d4e: 4a07 ldr r2, [pc, #28]
2d50: 4293 cmp r3, r2
a crash when tun_get_socket is called with NULL results in:
PC is at tun_xdp+0xa3/0xa4 [tun]
pc : [<7f802d4c>]
As can be seen, the "PC is at" line reports the wrong symbol name, and
the symbol+offset will point to the wrong source line if it is passed to
gdb.
To solve this, add a way for archs to fixup the reading of these module
kallsyms values, and use that to clear the lowest bit for function
symbols on Thumb-2.
After the fix:
# cat /proc/kallsyms | grep tun_get_socket
7f802d4c t tun_get_socket [tun]
PC is at tun_get_socket+0x0/0x24 [tun]
pc : [<7f802d4c>]
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2018-12-14 19:05:55 +03:00
	unsigned long nextval, bestval;
2016-02-03 09:25:26 +03:00
	struct mod_kallsyms *kallsyms = rcu_dereference_sched(mod->kallsyms);
2005-04-17 02:20:36 +04:00
	/* At worst, next value is at end of module */
2009-01-07 01:41:49 +03:00
	if (within_module_init(addr, mod))
2015-11-26 02:14:08 +03:00
		nextval = (unsigned long)mod->init_layout.base + mod->init_layout.text_size;
2007-10-18 14:06:07 +04:00
	else
2015-11-26 02:14:08 +03:00
		nextval = (unsigned long)mod->core_layout.base + mod->core_layout.text_size;
2005-04-17 02:20:36 +04:00
2018-12-14 19:05:55 +03:00
	bestval = kallsyms_symbol_value(&kallsyms->symtab[best]);
2011-03-31 05:57:33 +04:00
	/* Scan for closest preceding symbol, and next symbol. (ELF
2007-10-18 14:06:07 +04:00
	   starts real symbols at 1). */
2016-02-03 09:25:26 +03:00
	for (i = 1; i < kallsyms->num_symtab; i++) {
2018-12-14 19:05:55 +03:00
		const Elf_Sym *sym = &kallsyms->symtab[i];
		unsigned long thisval = kallsyms_symbol_value(sym);
		if (sym->st_shndx == SHN_UNDEF)
2005-04-17 02:20:36 +04:00
			continue;
		/* We ignore unnamed symbols: they're uninformative
		 * and inserted at a whim. */
2018-11-19 19:43:58 +03:00
		if (*kallsyms_symbol_name(kallsyms, i) == '\0'
		    || is_arm_mapping_symbol(kallsyms_symbol_name(kallsyms, i)))
2016-02-03 09:25:26 +03:00
			continue;
2018-12-14 19:05:55 +03:00
		if (thisval <= addr && thisval > bestval) {
2005-04-17 02:20:36 +04:00
			best = i;
2018-12-14 19:05:55 +03:00
			bestval = thisval;
		}
		if (thisval > addr && thisval < nextval)
			nextval = thisval;
2005-04-17 02:20:36 +04:00
	}
	if (!best)
		return NULL;
2007-05-08 11:28:41 +04:00
	if (size)
2018-12-14 19:05:55 +03:00
		*size = nextval - bestval;
2007-05-08 11:28:41 +04:00
	if (offset)
2018-12-14 19:05:55 +03:00
		*offset = addr - bestval;
2018-11-19 19:43:58 +03:00
	return kallsyms_symbol_name(kallsyms, best);
2005-04-17 02:20:36 +04:00
}
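The scan in find_kallsyms_symbol() keeps two running values: the closest symbol at or below the address, and the closest symbol above it. The first gives the offset and the difference between the two gives the size. The sketch below reproduces that scan on a plain array in user space. All names in it (demo_sym, symbol_value_sketch, find_symbol_sketch) are hypothetical, and the value fixup is only modelled on the Thumb-2 description in the commit message above: it clears bit 0 of function symbols.

```c
#include <stddef.h>

/* Hypothetical, reduced symbol record with the name already resolved. */
struct demo_sym {
	const char *name;
	unsigned long value;	/* raw value as stored in the table */
	int is_func;		/* nonzero for function symbols */
};

/* Assumed fixup: drop the low bit of function symbol values. */
static unsigned long symbol_value_sketch(const struct demo_sym *sym)
{
	return sym->is_func ? (sym->value & ~1UL) : sym->value;
}

/*
 * Find the symbol covering addr. 'end' bounds the size of the last symbol.
 * Index 0 is the reserved null symbol, so a result of 0 means "not found".
 */
static const char *find_symbol_sketch(const struct demo_sym *tab, unsigned int n,
				      unsigned long addr, unsigned long end,
				      unsigned long *size, unsigned long *offset)
{
	unsigned int i, best = 0;
	unsigned long nextval = end;
	unsigned long bestval = symbol_value_sketch(&tab[0]);

	for (i = 1; i < n; i++) {
		unsigned long thisval = symbol_value_sketch(&tab[i]);

		if (tab[i].name[0] == '\0')	/* skip unnamed symbols */
			continue;
		if (thisval <= addr && thisval > bestval) {
			best = i;
			bestval = thisval;
		}
		if (thisval > addr && thisval < nextval)
			nextval = thisval;
	}
	if (!best)
		return NULL;
	if (size)
		*size = nextval - bestval;
	if (offset)
		*offset = addr - bestval;
	return tab[best].name;
}
```

Take a table holding tun_get_socket at raw value 0x2d4d and a following function at raw value 0x2d71 (a made-up neighbour). A lookup of 0x2d4c then resolves to tun_get_socket with offset 0x0 and size 0x24, matching the "After the fix" output quoted in the commit message.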
sections: split dereference_function_descriptor()
There are two format specifiers to print out a pointer in symbolic
format: '%pS/%ps' and '%pF/%pf'. On most architectures, the two
mean exactly the same thing, but some architectures (ia64, ppc64,
parisc64) use an indirect pointer for C function pointers, where
the function pointer points to a function descriptor (which in
turn contains the actual pointer to the code). The '%pF/%pf', when
used appropriately, automatically does the appropriate function
descriptor dereference on such architectures.
The "when used appropriately" part is tricky. Basically this is
a subtle ABI detail, specific to some platforms, that made it to
the API level and people can be unaware of it and miss the whole
"we need to dereference the function" business out. [1] proves
that point (note that it fixes only '%pF' and '%pS', there might
be '%pf' and '%ps' cases as well).
It appears that we can handle everything within the affected
arches and make '%pS/%ps' smart enough to retire '%pF/%pf'.
Function descriptors live in .opd elf section and all affected
arches (ia64, ppc64, parisc64) handle it properly for kernel
and modules. So we, technically, can decide if the dereference
is needed by simply looking at the pointer: if it belongs to
.opd section then we need to dereference it.
The kernel and modules have their own .opd sections, obviously,
that's why we need to split dereference_function_descriptor()
and use separate kernel and module dereference arch callbacks.
This patch does the first step, it
a) adds dereference_kernel_function_descriptor() function.
b) adds a weak alias to dereference_module_function_descriptor()
function.
So, for the time being, we will have:
1) dereference_function_descriptor()
A generic function that simply dereferences the pointer. There are a
bunch of places that call it: kgdbts, init/main.c, extable, etc.
2) dereference_kernel_function_descriptor()
A function to call on kernel symbols that does kernel .opd section
address range test.
3) dereference_module_function_descriptor()
A function to call on modules' symbols that does modules' .opd
section address range test.
[1] https://marc.info/?l=linux-kernel&m=150472969730573
Link: http://lkml.kernel.org/r/20171109234830.5067-2-sergey.senozhatsky@gmail.com
To: Fenghua Yu <fenghua.yu@intel.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Paul Mackerras <paulus@samba.org>
To: Michael Ellerman <mpe@ellerman.id.au>
To: James Bottomley <jejb@parisc-linux.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-ia64@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Tony Luck <tony.luck@intel.com> #ia64
Tested-by: Santosh Sivaraj <santosh@fossix.org> #powerpc
Tested-by: Helge Deller <deller@gmx.de> #parisc64
Signed-off-by: Petr Mladek <pmladek@suse.com>
2017-11-10 02:48:25 +03:00
void * __weak dereference_module_function_descriptor(struct module *mod,
						     void *ptr)
{
	return ptr;
}
2008-01-30 01:13:22 +03:00
/* For kallsyms to ask for address resolution.  NULL means not found.  Careful
 * not to lock to avoid deadlock on oopses, simply disable preemption. */
2008-02-08 15:18:43 +03:00
const char *module_address_lookup(unsigned long addr,
2008-01-30 01:13:22 +03:00
				  unsigned long *size,
				  unsigned long *offset,
				  char **modname,
				  char *namebuf)
2005-04-17 02:20:36 +04:00
{
2008-01-14 11:55:03 +03:00
	const char *ret = NULL;
2015-05-27 04:39:37 +03:00
	struct module *mod;
2005-04-17 02:20:36 +04:00
2008-01-14 11:55:03 +03:00
	preempt_disable();
2015-05-27 04:39:37 +03:00
	mod = __module_address(addr);
	if (mod) {
		if (modname)
			*modname = mod->name;
2018-11-19 19:43:58 +03:00
		ret = find_kallsyms_symbol(mod, addr, size, offset);
2005-04-17 02:20:36 +04:00
	}
2008-01-30 01:13:22 +03:00
	/* Make a copy in here where it's safe */
	if (ret) {
		strncpy(namebuf, ret, KSYM_NAME_LEN - 1);
		ret = namebuf;
	}
2008-01-14 11:55:03 +03:00
	preempt_enable();
2015-05-27 04:39:37 +03:00
2008-02-08 15:18:43 +03:00
	return ret;
2005-04-17 02:20:36 +04:00
}
2007-05-08 11:28:43 +04:00
int lookup_module_symbol_name ( unsigned long addr , char * symname )
{
struct module * mod ;
2008-01-14 11:55:03 +03:00
preempt_disable ( ) ;
2008-08-30 12:09:00 +04:00
list_for_each_entry_rcu ( mod , & modules , list ) {
2013-01-12 05:08:44 +04:00
if ( mod - > state = = MODULE_STATE_UNFORMED )
continue ;
2014-07-27 01:54:01 +04:00
if ( within_module ( addr , mod ) ) {
2007-05-08 11:28:43 +04:00
const char * sym ;
2018-11-19 19:43:58 +03:00
sym = find_kallsyms_symbol ( mod , addr , NULL , NULL ) ;
2007-05-08 11:28:43 +04:00
if ( ! sym )
goto out ;
2018-11-19 19:43:58 +03:00
2007-07-17 15:03:51 +04:00
strlcpy ( symname , sym , KSYM_NAME_LEN ) ;
2008-01-14 11:55:03 +03:00
preempt_enable ( ) ;
2007-05-08 11:28:43 +04:00
return 0 ;
}
}
out :
2008-01-14 11:55:03 +03:00
preempt_enable ( ) ;
2007-05-08 11:28:43 +04:00
return - ERANGE ;
}
2007-05-08 11:28:47 +04:00
int lookup_module_symbol_attrs ( unsigned long addr , unsigned long * size ,
unsigned long * offset , char * modname , char * name )
{
struct module * mod ;
2008-01-14 11:55:03 +03:00
preempt_disable ( ) ;
2008-08-30 12:09:00 +04:00
list_for_each_entry_rcu ( mod , & modules , list ) {
2013-01-12 05:08:44 +04:00
if ( mod - > state = = MODULE_STATE_UNFORMED )
continue ;
2014-07-27 01:54:01 +04:00
if ( within_module ( addr , mod ) ) {
2007-05-08 11:28:47 +04:00
const char * sym ;
2018-11-19 19:43:58 +03:00
sym = find_kallsyms_symbol ( mod , addr , size , offset ) ;
2007-05-08 11:28:47 +04:00
if ( ! sym )
goto out ;
if ( modname )
2007-07-17 15:03:51 +04:00
strlcpy ( modname , mod - > name , MODULE_NAME_LEN ) ;
2007-05-08 11:28:47 +04:00
if ( name )
2007-07-17 15:03:51 +04:00
strlcpy ( name , sym , KSYM_NAME_LEN ) ;
2008-01-14 11:55:03 +03:00
preempt_enable ( ) ;
2007-05-08 11:28:47 +04:00
return 0 ;
}
}
out :
2008-01-14 11:55:03 +03:00
preempt_enable ( ) ;
2007-05-08 11:28:47 +04:00
return - ERANGE ;
}
2007-05-08 11:28:39 +04:00
int module_get_kallsym ( unsigned int symnum , unsigned long * value , char * type ,
char * name , char * module_name , int * exported )
2005-04-17 02:20:36 +04:00
{
struct module * mod ;
2008-01-14 11:55:03 +03:00
preempt_disable ( ) ;
2008-08-30 12:09:00 +04:00
list_for_each_entry_rcu ( mod , & modules , list ) {
modules: fix longstanding /proc/kallsyms vs module insertion race.
For CONFIG_KALLSYMS, we keep two symbol tables and two string tables.
There's one full copy, marked SHF_ALLOC and laid out at the end of the
module's init section. There's also a cut-down version that only
contains core symbols and strings, and lives in the module's core
section.
After module init (and before we free the module memory), we switch
the mod->symtab, mod->num_symtab and mod->strtab to point to the core
versions. We do this under the module_mutex.
However, kallsyms doesn't take the module_mutex: it uses
preempt_disable() and rcu tricks to walk through the modules, because
it's used in the oops path. It's also used in /proc/kallsyms.
There's nothing atomic about the change of these variables, so we can
get the old (larger!) num_symtab and the new symtab pointer; in fact
this is what I saw when trying to reproduce.
By grouping these variables together, we can use a
carefully-dereferenced pointer to ensure we always get one or the
other (the free of the module init section is already done in an RCU
callback, so that's safe). We allocate the init one at the end of the
module init section, and keep the core one inside the struct module
itself (it could also have been allocated at the end of the module
core, but that's probably overkill).
Reported-by: Weilong Chen <chenweilong@huawei.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111541
Cc: stable@kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2016-02-03 09:25:26 +03:00
struct mod_kallsyms * kallsyms ;
2013-01-12 05:08:44 +04:00
if ( mod - > state = = MODULE_STATE_UNFORMED )
continue ;
2016-02-03 09:25:26 +03:00
kallsyms = rcu_dereference_sched ( mod - > kallsyms ) ;
if ( symnum < kallsyms - > num_symtab ) {
ARM: module: Fix function kallsyms on Thumb-2
Thumb-2 functions have the lowest bit set in the symbol value in the
symtab. When kallsyms are generated for the vmlinux, the kallsyms are
generated from the output of nm, and nm clears the lowest bit.
$ arm-linux-gnueabihf-readelf -a vmlinux | grep show_interrupts
95947: 8015dc89 686 FUNC GLOBAL DEFAULT 2 show_interrupts
$ arm-linux-gnueabihf-nm vmlinux | grep show_interrupts
8015dc88 T show_interrupts
$ cat /proc/kallsyms | grep show_interrupts
8015dc88 T show_interrupts
However, for modules, the kallsyms uses the values in the symbol table
without modification, so for functions in modules, the lowest bit is set
in kallsyms.
$ arm-linux-gnueabihf-readelf -a drivers/net/tun.ko | grep tun_get_socket
333: 00002d4d 36 FUNC GLOBAL DEFAULT 1 tun_get_socket
$ arm-linux-gnueabihf-nm drivers/net/tun.ko | grep tun_get_socket
00002d4c T tun_get_socket
$ cat /proc/kallsyms | grep tun_get_socket
7f802d4d t tun_get_socket [tun]
Because of this, the symbol+offset of the crashing instruction shown in
oopses is incorrect when the crash is in a module. For example, given a
tun_get_socket which starts like this,
00002d4c <tun_get_socket>:
2d4c: 6943 ldr r3, [r0, #20]
2d4e: 4a07 ldr r2, [pc, #28]
2d50: 4293 cmp r3, r2
a crash when tun_get_socket is called with NULL results in:
PC is at tun_xdp+0xa3/0xa4 [tun]
pc : [<7f802d4c>]
As can be seen, the "PC is at" line reports the wrong symbol name, and
the symbol+offset will point to the wrong source line if it is passed to
gdb.
To solve this, add a way for archs to fixup the reading of these module
kallsyms values, and use that to clear the lowest bit for function
symbols on Thumb-2.
After the fix:
# cat /proc/kallsyms | grep tun_get_socket
7f802d4c t tun_get_socket [tun]
PC is at tun_get_socket+0x0/0x24 [tun]
pc : [<7f802d4c>]
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2018-12-14 19:05:55 +03:00
const Elf_Sym * sym = & kallsyms - > symtab [ symnum ] ;
* value = kallsyms_symbol_value ( sym ) ;
2019-02-25 22:59:58 +03:00
* type = kallsyms - > typetab [ symnum ] ;
2018-11-19 19:43:58 +03:00
strlcpy ( name , kallsyms_symbol_name ( kallsyms , symnum ) , KSYM_NAME_LEN ) ;
2007-07-17 15:03:51 +04:00
strlcpy ( module_name , mod - > name , MODULE_NAME_LEN ) ;
2009-01-05 17:40:10 +03:00
* exported = is_exported ( name , * value , mod ) ;
2008-01-14 11:55:03 +03:00
preempt_enable ( ) ;
2007-05-08 11:28:39 +04:00
return 0 ;
2005-04-17 02:20:36 +04:00
}
2016-02-03 09:25:26 +03:00
symnum - = kallsyms - > num_symtab ;
2005-04-17 02:20:36 +04:00
}
2008-01-14 11:55:03 +03:00
preempt_enable ( ) ;
2007-05-08 11:28:39 +04:00
return - ERANGE ;
2005-04-17 02:20:36 +04:00
}
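module_get_kallsym() treats symnum as one flat index across every module's symbol table: if the index does not fit in the current table, the table's size is subtracted and the walk moves on. A minimal user-space sketch of that indexing scheme follows; struct sym_table and nth_symbol() are hypothetical names, and no locking or RCU is modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one module's symbol table. */
struct sym_table {
	const char *const *names;
	unsigned int count;
};

/* Return the name at flat index symnum, or NULL when it is out of range. */
static const char *nth_symbol(const struct sym_table *tables, size_t ntables,
			      unsigned int symnum)
{
	assert(tables != NULL || ntables == 0);
	for (size_t t = 0; t < ntables; t++) {
		if (symnum < tables[t].count)
			return tables[t].names[symnum];
		/* Not in this table: rebase the index for the next one. */
		symnum -= tables[t].count;
	}
	return NULL;
}
```

Running off the end of the last table corresponds to the -ERANGE return in the function above.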
2018-11-19 19:43:58 +03:00
/* Given a module and name of symbol, find and return the symbol's value */
static unsigned long find_kallsyms_symbol_value ( struct module * mod , const char * name )
2005-04-17 02:20:36 +04:00
{
unsigned int i ;
2016-02-03 09:25:26 +03:00
struct mod_kallsyms * kallsyms = rcu_dereference_sched ( mod - > kallsyms ) ;
2005-04-17 02:20:36 +04:00
2018-12-14 19:05:55 +03:00
for ( i = 0 ; i < kallsyms - > num_symtab ; i + + ) {
const Elf_Sym * sym = & kallsyms - > symtab [ i ] ;
2018-11-19 19:43:58 +03:00
if ( strcmp ( name , kallsyms_symbol_name ( kallsyms , i ) ) = = 0 & &
2018-12-14 19:05:55 +03:00
sym - > st_shndx ! = SHN_UNDEF )
return kallsyms_symbol_value ( sym ) ;
}
2005-04-17 02:20:36 +04:00
return 0 ;
}
/* Look for this name: can be of form module:name. */
unsigned long module_kallsyms_lookup_name(const char *name)
{
	struct module *mod;
	char *colon;
	unsigned long ret = 0;
	/* Don't lock: we're in enough trouble already. */
2008-01-14 11:55:03 +03:00
	preempt_disable();
2017-04-23 20:23:43 +03:00
	if ((colon = strnchr(name, MODULE_NAME_LEN, ':')) != NULL) {
2013-07-02 10:05:11 +04:00
		if ((mod = find_module_all(name, colon - name, false)) != NULL)
2018-11-19 19:43:58 +03:00
			ret = find_kallsyms_symbol_value(mod, colon + 1);
2005-04-17 02:20:36 +04:00
	} else {
2013-01-12 05:08:44 +04:00
		list_for_each_entry_rcu(mod, &modules, list) {
			if (mod->state == MODULE_STATE_UNFORMED)
				continue;
2018-11-19 19:43:58 +03:00
			if ((ret = find_kallsyms_symbol_value(mod, name)) != 0)
2005-04-17 02:20:36 +04:00
				break;
2013-01-12 05:08:44 +04:00
		}
2005-04-17 02:20:36 +04:00
	}
2008-01-14 11:55:03 +03:00
	preempt_enable();
2005-04-17 02:20:36 +04:00
	return ret;
}
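The "module:name" form handled above relies on a bounded search for the colon, so that a colon far into a long symbol name is not mistaken for a module separator. A user-space sketch of that split is shown here. split_module_symbol() and SKETCH_MODULE_NAME_LEN are hypothetical, and the limit of 56 is an assumption made for the example only.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MODULE_NAME_LEN 56	/* assumed limit, for this sketch only */

/*
 * Look for ':' within the first SKETCH_MODULE_NAME_LEN bytes of name.
 * If found, store the length of the module part in *modlen and return
 * a pointer to the symbol part. Otherwise set *modlen to 0 and return
 * name itself.
 */
static const char *split_module_symbol(const char *name, size_t *modlen)
{
	assert(name != NULL && modlen != NULL);
	for (size_t i = 0; i < SKETCH_MODULE_NAME_LEN && name[i] != '\0'; i++) {
		if (name[i] == ':') {
			*modlen = i;
			return name + i + 1;
		}
	}
	*modlen = 0;
	return name;
}
```

A caller can tell the two cases apart by comparing the returned pointer with name, which mirrors the if/else on colon in the function above.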
2008-12-06 03:03:58 +03:00
int module_kallsyms_on_each_symbol ( int ( * fn ) ( void * , const char * ,
struct module * , unsigned long ) ,
void * data )
{
struct module * mod ;
unsigned int i ;
int ret ;
2015-05-27 04:39:35 +03:00
module_assert_mutex ( ) ;
2008-12-06 03:03:58 +03:00
list_for_each_entry ( mod , & modules , list ) {
2016-02-03 09:25:26 +03:00
/* We hold module_mutex: no need for rcu_dereference_sched */
struct mod_kallsyms * kallsyms = mod - > kallsyms ;
2013-01-12 05:08:44 +04:00
if ( mod - > state = = MODULE_STATE_UNFORMED )
continue ;
2016-02-03 09:25:26 +03:00
for ( i = 0 ; i < kallsyms - > num_symtab ; i + + ) {
2018-12-14 19:05:55 +03:00
const Elf_Sym * sym = & kallsyms - > symtab [ i ] ;
2018-06-05 11:22:52 +03:00
2018-12-14 19:05:55 +03:00
if ( sym - > st_shndx = = SHN_UNDEF )
2018-06-05 11:22:52 +03:00
continue ;
2018-11-19 19:43:58 +03:00
ret = fn ( data , kallsyms_symbol_name ( kallsyms , i ) ,
2018-12-14 19:05:55 +03:00
mod , kallsyms_symbol_value ( sym ) ) ;
2008-12-06 03:03:58 +03:00
if ( ret ! = 0 )
return ret ;
}
}
return 0 ;
}
2005-04-17 02:20:36 +04:00
# endif /* CONFIG_KALLSYMS */
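The kallsyms race fix quoted in this section groups the symbol table pointer and its length into one structure, so that a reader picks up both through a single pointer load. The sketch below illustrates that idea with C11 atomics instead of the kernel's RCU primitives. All names are hypothetical, and the lifetime of the old table (handled by an RCU callback in the kernel) is deliberately not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Table pointer and length travel together. */
struct sketch_kallsyms {
	const int *symtab;
	unsigned int num_symtab;
};

struct sketch_module {
	_Atomic(struct sketch_kallsyms *) kallsyms;	/* current set */
	struct sketch_kallsyms core;			/* kept inside the module */
};

/* Writer: fill in the core set completely, then publish it in one store. */
static void switch_to_core(struct sketch_module *mod, const int *symtab,
			   unsigned int n)
{
	assert(mod != NULL);
	mod->core.symtab = symtab;
	mod->core.num_symtab = n;
	atomic_store_explicit(&mod->kallsyms, &mod->core, memory_order_release);
}

/* Reader: one acquire load yields a matching (symtab, num_symtab) pair. */
static long sum_symbols(struct sketch_module *mod)
{
	assert(mod != NULL);
	const struct sketch_kallsyms *k =
		atomic_load_explicit(&mod->kallsyms, memory_order_acquire);
	long sum = 0;

	for (unsigned int i = 0; i < k->num_symtab; i++)
		sum += k->symtab[i];
	return sum;
}
```

Because the reader never combines a length from one set with a pointer from the other, it cannot walk past the end of the smaller core table.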
2016-09-21 14:47:22 +03:00
/* Maximum number of characters written by module_flags() */
#define MODULE_FLAGS_BUF_SIZE (TAINT_FLAGS_COUNT + 4)
/* Keep in sync with MODULE_FLAGS_BUF_SIZE !!! */
2008-01-25 23:08:33 +03:00
static char *module_flags(struct module *mod, char *buf)
2006-10-11 12:21:48 +04:00
{
	int bx = 0;
2013-01-12 05:08:44 +04:00
	BUG_ON(mod->state == MODULE_STATE_UNFORMED);
2008-01-25 23:08:33 +03:00
	if (mod->taints ||
	    mod->state == MODULE_STATE_GOING ||
	    mod->state == MODULE_STATE_COMING) {
2006-10-11 12:21:48 +04:00
		buf[bx++] = '(';
2012-01-13 03:02:15 +04:00
		bx += module_flags_taint(mod, buf + bx);
2008-01-25 23:08:33 +03:00
		/* Show a - for module-is-being-unloaded */
		if (mod->state == MODULE_STATE_GOING)
			buf[bx++] = '-';
		/* Show a + for module-is-being-loaded */
		if (mod->state == MODULE_STATE_COMING)
			buf[bx++] = '+';
2006-10-11 12:21:48 +04:00
		buf[bx++] = ')';
	}
	buf[bx] = '\0';
	return buf;
}
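The buffer size used with module_flags() follows from what it writes: an opening parenthesis, the taint characters, at most one state mark, a closing parenthesis and the terminating NUL, hence "taint count + 4". A user-space sketch of the same string construction is given here. sketch_flags() and enum sketch_state are hypothetical, and the taint letters are passed in as a ready-made string rather than derived from a bitmask.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sketch_state { SKETCH_LIVE, SKETCH_COMING, SKETCH_GOING };

/*
 * buf must hold at least strlen(taints) + 4 bytes:
 * '(' + taints + one state mark + ')' + NUL.
 */
static char *sketch_flags(enum sketch_state state, const char *taints,
			  char *buf)
{
	size_t bx = 0;

	assert(taints != NULL && buf != NULL);
	if (taints[0] != '\0' || state != SKETCH_LIVE) {
		size_t n = strlen(taints);

		buf[bx++] = '(';
		memcpy(buf + bx, taints, n);
		bx += n;
		if (state == SKETCH_GOING)
			buf[bx++] = '-';	/* being unloaded */
		if (state == SKETCH_COMING)
			buf[bx++] = '+';	/* being loaded */
		buf[bx++] = ')';
	}
	buf[bx] = '\0';
	return buf;
}
```

A live, untainted module yields the empty string, so nothing extra is printed for it.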
2008-10-06 13:19:27 +04:00
#ifdef CONFIG_PROC_FS
/* Called by the /proc file system to return a list of modules. */
static void *m_start(struct seq_file *m, loff_t *pos)
{
	mutex_lock(&module_mutex);
	return seq_list_start(&modules, *pos);
}
static void *m_next(struct seq_file *m, void *p, loff_t *pos)
{
	return seq_list_next(p, &modules, pos);
}
static void m_stop(struct seq_file *m, void *p)
{
	mutex_unlock(&module_mutex);
}
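The three callbacks above follow the seq_file iteration protocol: start takes the lock and positions the cursor, next advances it, and stop releases the lock. A toy user-space driver for the same protocol might look like the following. Everything here is hypothetical: a counter stands in for module_mutex, an array stands in for the module list, and the real seq_file core does considerably more (buffering, restarts at a saved position).

```c
#include <assert.h>
#include <stddef.h>

static const int items[] = { 10, 20, 30 };
#define SKETCH_NITEMS (sizeof(items) / sizeof(items[0]))

static int lock_depth;	/* stands in for module_mutex */

/* Take the "lock" and return the element at *pos, or NULL at the end. */
static const int *it_start(size_t *pos)
{
	assert(pos != NULL);
	lock_depth++;
	return *pos < SKETCH_NITEMS ? &items[*pos] : NULL;
}

/* Advance the position and return the next element, or NULL at the end. */
static const int *it_next(size_t *pos)
{
	assert(pos != NULL);
	++*pos;
	return *pos < SKETCH_NITEMS ? &items[*pos] : NULL;
}

/* Drop the "lock"; called once per start, even if start returned NULL. */
static void it_stop(void)
{
	lock_depth--;
}

/* Drive one full pass; the loop body plays the role of the show step. */
static long walk_all(void)
{
	size_t pos = 0;
	long sum = 0;

	for (const int *p = it_start(&pos); p != NULL; p = it_next(&pos))
		sum += *p;
	it_stop();
	return sum;
}
```

Pairing every start with exactly one stop is what keeps the lock balanced across a read.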
2005-04-17 02:20:36 +04:00
static int m_show(struct seq_file *m, void *p)
{
	struct module *mod = list_entry(p, struct module, list);
2016-09-21 14:47:22 +03:00
	char buf[MODULE_FLAGS_BUF_SIZE];
2017-11-29 21:30:13 +03:00
	void *value;
2006-10-11 12:21:48 +04:00
2013-01-12 05:08:44 +04:00
	/* We always ignore unformed modules. */
	if (mod->state == MODULE_STATE_UNFORMED)
		return 0;
2008-07-23 04:24:27 +04:00
	seq_printf(m, "%s %u",
2015-11-26 02:14:08 +03:00
		   mod->name, mod->init_layout.size + mod->core_layout.size);
2005-04-17 02:20:36 +04:00
	print_unload_info(m, mod);
	/* Informative for users. */
	seq_printf(m, " %s",
2014-11-10 02:01:29 +03:00
		   mod->state == MODULE_STATE_GOING ? "Unloading" :
		   mod->state == MODULE_STATE_COMING ? "Loading" :
2005-04-17 02:20:36 +04:00
		   "Live");
	/* Used by oprofile and other similar tools. */
2017-11-29 21:30:13 +03:00
	value = m->private ? NULL : mod->core_layout.base;
	seq_printf(m, " 0x%px", value);
2005-04-17 02:20:36 +04:00
2006-10-11 12:21:48 +04:00
	/* Taints info */
	if (mod->taints)
2008-01-25 23:08:33 +03:00
		seq_printf(m, " %s", module_flags(mod, buf));
2006-10-11 12:21:48 +04:00
2014-11-10 02:01:29 +03:00
	seq_puts(m, "\n");
2005-04-17 02:20:36 +04:00
	return 0;
}
/* Format: modulename size refcount deps state address [taints]
   Where refcount is a number or -, deps is a comma-separated list
   of depends or -, and state is Live, Loading or Unloading.
*/
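As a reading aid, one line in the layout printed by m_show() (name, size, refcount, deps, state, address, optional taints) can be parsed in user space roughly as follows. struct modules_line and parse_modules_line() are hypothetical, the field widths are arbitrary choices for the sketch, and the optional trailing taint string is ignored.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for the leading fields of one /proc/modules line. */
struct modules_line {
	char name[64];
	unsigned int size;
	char refcount[16];	/* a number, or "-" */
	char deps[256];		/* comma-separated list, or "-" */
	char state[16];		/* Live, Loading or Unloading */
	unsigned long addr;
};

/* Return 0 when all six leading fields were parsed, -1 otherwise. */
static int parse_modules_line(const char *line, struct modules_line *out)
{
	assert(line != NULL && out != NULL);
	int n = sscanf(line, "%63s %u %15s %255s %15s %lx",
		       out->name, &out->size, out->refcount, out->deps,
		       out->state, &out->addr);
	return n == 6 ? 0 : -1;
}
```

The refcount is read as text because it may be "-" rather than a number, and the address reads as 0 for readers that are not allowed to see kernel pointers.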
2008-10-06 13:19:27 +04:00
static const struct seq_operations modules_op = {
2005-04-17 02:20:36 +04:00
. start = m_start ,
. next = m_next ,
. stop = m_stop ,
. show = m_show
} ;
2017-11-13 05:44:23 +03:00
/*
* This also sets the " private " pointer to non - NULL if the
* kernel pointers should be hidden ( so you can just test
* " m->private " to see if you should keep the values private ) .
*
* We use the same logic as for / proc / kallsyms .
*/
2008-10-06 13:19:27 +04:00
static int modules_open ( struct inode * inode , struct file * file )
{
2017-11-13 05:44:23 +03:00
int err = seq_open ( file , & modules_op ) ;
if ( ! err ) {
struct seq_file * m = file - > private_data ;
2020-07-03 00:43:59 +03:00
m - > private = kallsyms_show_value ( file - > f_cred ) ? NULL : ( void * ) 8ul ;
2017-11-13 05:44:23 +03:00
}
2018-03-06 18:16:24 +03:00
return err ;
2008-10-06 13:19:27 +04:00
}
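modules_open() stores a decision, not data, in the private pointer: NULL means addresses may be shown, and any non-NULL value means they must be hidden. A small user-space sketch of the same flag-in-a-pointer pattern follows; struct sketch_seq, sketch_open() and sketch_shown_base() are hypothetical, and the permission check is reduced to a boolean argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the part of struct seq_file used here. */
struct sketch_seq {
	void *private;
};

/* Record at open time whether this reader may see real addresses. */
static void sketch_open(struct sketch_seq *m, bool may_see_pointers)
{
	assert(m != NULL);
	/* Any non-NULL marker works; it is tested but never dereferenced. */
	m->private = may_see_pointers ? NULL : (void *)8ul;
}

/* Return the address to print: the real one, or NULL when hidden. */
static const void *sketch_shown_base(const struct sketch_seq *m,
				     const void *base)
{
	assert(m != NULL);
	return m->private ? NULL : base;
}
```

Deciding once at open time means the check uses the opener's credentials rather than those of whoever later reads the file.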
2020-02-04 04:37:17 +03:00
static const struct proc_ops modules_proc_ops = {
proc: faster open/read/close with "permanent" files
Now that "struct proc_ops" exists, we can start putting stuff there which
could not fly with VFS "struct file_operations"...
Most of the fs/proc/inode.c file is dedicated to making open/read/.../close
reliable in the event of disappearing /proc entries, which usually happens
when a module is being removed. Files like /proc/cpuinfo, which never
disappear, simply do not need such protection.
Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such
"permanent" files.
Enable "permanent" flag for
/proc/cpuinfo
/proc/kmsg
/proc/modules
/proc/slabinfo
/proc/stat
/proc/sysvipc/*
/proc/swaps
More will come once I figure out a foolproof way to prevent out-of-tree module
authors from marking their stuff "permanent" for performance reasons
when it is not.
This should help with scalability: benchmark is "read /proc/cpuinfo R times
by N threads scattered over the system".
   N       R   t, s (before)   t, s (after)
-----------------------------------------------------
  64    4096   1.582458        1.530502       -3.2%
 256    4096   6.371926        6.125168       -3.9%
1024    4096   25.64888        24.47528       -4.6%
Benchmark source:
#include <chrono>
#include <iostream>
#include <thread>
#include <vector>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sched.h>
#include <cstdlib>
const int NR_CPUS = sysconf(_SC_NPROCESSORS_ONLN);
int N;
const char *filename;
int R;
int xxx = 0;
int glue(int n)
{
cpu_set_t m;
CPU_ZERO(&m);
CPU_SET(n, &m);
return sched_setaffinity(0, sizeof(cpu_set_t), &m);
}
void f(int n)
{
glue(n % NR_CPUS);
while (*(volatile int *)&xxx == 0) {
}
for (int i = 0; i < R; i++) {
int fd = open(filename, O_RDONLY);
char buf[4096];
ssize_t rv = read(fd, buf, sizeof(buf));
asm volatile ("" :: "g" (rv));
close(fd);
}
}
int main(int argc, char *argv[])
{
if (argc < 4) {
std::cerr << "usage: " << argv[0] << ' ' << "N /proc/filename R\n";
return 1;
}
N = atoi(argv[1]);
filename = argv[2];
R = atoi(argv[3]);
for (int i = 0; i < NR_CPUS; i++) {
if (glue(i) == 0)
break;
}
std::vector<std::thread> T;
T.reserve(N);
for (int i = 0; i < N; i++) {
T.emplace_back(f, i);
}
auto t0 = std::chrono::system_clock::now();
{
*(volatile int *)&xxx = 1;
for (auto& t: T) {
t.join();
}
}
auto t1 = std::chrono::system_clock::now();
std::chrono::duration<double> dt = t1 - t0;
std::cout << dt.count() << '\n';
return 0;
}
P.S.:
Explicit randomization marker is added because adding non-function pointer
will silently disable structure layout randomization.
[akpm@linux-foundation.org: coding style fixes]
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Joe Perches <joe@perches.com>
Link: http://lkml.kernel.org/r/20200222201539.GA22576@avx2
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-07 06:09:01 +03:00
. proc_flags = PROC_ENTRY_PERMANENT ,
2020-02-04 04:37:17 +03:00
. proc_open = modules_open ,
. proc_read = seq_read ,
. proc_lseek = seq_lseek ,
. proc_release = seq_release ,
2008-10-06 13:19:27 +04:00
} ;
static int __init proc_modules_init ( void )
{
2020-02-04 04:37:17 +03:00
proc_create ( " modules " , 0 , NULL , & modules_proc_ops ) ;
2008-10-06 13:19:27 +04:00
return 0 ;
}
module_init ( proc_modules_init ) ;
# endif
2005-04-17 02:20:36 +04:00
/* Given an address, look for it in the module exception tables. */
const struct exception_table_entry * search_module_extables ( unsigned long addr )
{
const struct exception_table_entry * e = NULL ;
struct module * mod ;
2007-07-16 10:41:46 +04:00
preempt_disable ( ) ;
2017-02-08 17:48:01 +03:00
mod = __module_address ( addr ) ;
if ( ! mod )
goto out ;
2007-10-18 14:06:07 +04:00
2017-02-08 17:48:01 +03:00
if ( ! mod - > num_exentries )
goto out ;
e = search_extable ( mod - > extable ,
2017-07-11 01:51:58 +03:00
mod - > num_exentries ,
2017-02-08 17:48:01 +03:00
addr ) ;
out :
2007-07-16 10:41:46 +04:00
preempt_enable ( ) ;
2005-04-17 02:20:36 +04:00
2017-02-08 17:48:01 +03:00
/*
 * If we found an entry, we are executing inside that module right now,
 * so it cannot be unloaded and no reference count is needed.
 */
2005-04-17 02:20:36 +04:00
return e ;
}
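/*
 * Illustration only, not part of this file: a simplified userspace sketch
 * of the lookup that search_module_extables() delegates to. It assumes a
 * table sorted by faulting-instruction address with absolute addresses in
 * each entry; real kernel entries are architecture-specific and often hold
 * relative offsets. struct ex_entry_model and model_search_extable() are
 * hypothetical names.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical simplified entry: faulting instruction and its fixup. */
struct ex_entry_model {
	unsigned long insn;
	unsigned long fixup;
};

/* bsearch() comparator: the key is the faulting address. */
static int cmp_ex_model(const void *key, const void *elt)
{
	unsigned long addr = *(const unsigned long *)key;
	const struct ex_entry_model *e = elt;

	if (addr < e->insn)
		return -1;
	if (addr > e->insn)
		return 1;
	return 0;
}

/* Returns the matching entry, or NULL if the address has no fixup. */
const struct ex_entry_model *
model_search_extable(const struct ex_entry_model *table, size_t n,
		     unsigned long addr)
{
	/* Same early exit as the num_exentries check above. */
	if (n == 0)
		return NULL;
	assert(table != NULL);
	return bsearch(&addr, table, n, sizeof(*table), cmp_ex_model);
}
```

/*
 * Only an exact match on the instruction address counts: an address that
 * falls between two entries has no fixup and yields NULL.
 */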
2006-07-03 11:24:24 +04:00
/*
2009-03-31 23:05:31 +04:00
* is_module_address - is this address inside a module ?
* @ addr : the address to check .
*
* See is_module_text_address ( ) if you simply want to see if the address
* is code ( not data ) .
2006-07-03 11:24:24 +04:00
*/
2009-03-31 23:05:31 +04:00
bool is_module_address ( unsigned long addr )
2006-07-03 11:24:24 +04:00
{
2009-03-31 23:05:31 +04:00
bool ret ;
2006-07-03 11:24:24 +04:00
2007-07-16 10:41:46 +04:00
preempt_disable ( ) ;
2009-03-31 23:05:31 +04:00
ret = __module_address ( addr ) ! = NULL ;
2007-07-16 10:41:46 +04:00
preempt_enable ( ) ;
2006-07-03 11:24:24 +04:00
2009-03-31 23:05:31 +04:00
return ret ;
2006-07-03 11:24:24 +04:00
}
2009-03-31 23:05:31 +04:00
/*
* __module_address - get the module which contains an address .
* @ addr : the address .
*
* Must be called with preempt disabled or module mutex held so that
* module doesn't get freed during this.
*/
2009-04-05 22:04:19 +04:00
struct module * __module_address ( unsigned long addr )
2005-04-17 02:20:36 +04:00
{
struct module * mod ;
2008-07-23 04:24:28 +04:00
if ( addr < module_addr_min | | addr > module_addr_max )
return NULL ;
2015-05-27 04:39:35 +03:00
module_assert_mutex_or_preempt ( ) ;
2015-05-27 04:39:37 +03:00
mod = mod_find ( addr ) ;
2015-05-27 04:39:37 +03:00
if ( mod ) {
BUG_ON ( ! within_module ( addr , mod ) ) ;
2013-01-12 05:08:44 +04:00
if ( mod - > state = = MODULE_STATE_UNFORMED )
2015-05-27 04:39:37 +03:00
mod = NULL ;
2013-01-12 05:08:44 +04:00
}
2015-05-27 04:39:37 +03:00
return mod ;
2005-04-17 02:20:36 +04:00
}
2009-03-31 23:05:31 +04:00
/*
* is_module_text_address - is this address inside module code ?
* @ addr : the address to check .
*
* See is_module_address ( ) if you simply want to see if the address is
* anywhere in a module . See kernel_text_address ( ) for testing if an
* address corresponds to kernel or module code .
*/
bool is_module_text_address ( unsigned long addr )
{
bool ret ;
preempt_disable ( ) ;
ret = __module_text_address ( addr ) ! = NULL ;
preempt_enable ( ) ;
return ret ;
}
/*
* __module_text_address - get the module whose code contains an address .
* @ addr : the address .
*
* Must be called with preempt disabled or module mutex held so that
* module doesn't get freed during this.
*/
struct module * __module_text_address ( unsigned long addr )
{
struct module * mod = __module_address ( addr ) ;
if ( mod ) {
/* Make sure it's within the text section. */
2015-11-26 02:14:08 +03:00
if ( ! within ( addr , mod - > init_layout . base , mod - > init_layout . text_size )
& & ! within ( addr , mod - > core_layout . base , mod - > core_layout . text_size ) )
2009-03-31 23:05:31 +04:00
mod = NULL ;
}
return mod ;
}
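/*
 * Illustration only, not part of this file: a userspace model of the two
 * lookups above. It assumes, as the within() calls suggest, that each
 * layout keeps its text at the start of the region, so the first text_size
 * bytes are code and the rest is data. mod_find() is replaced by a plain
 * linear scan, and locking and module state are left out. All type and
 * function names here are hypothetical.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct layout_model {
	unsigned long base;
	unsigned long size;		/* whole region */
	unsigned long text_size;	/* leading part that holds code */
};

struct module_model {
	const char *name;
	struct layout_model init_layout;
	struct layout_model core_layout;
};

/* True if addr lies in [base, base + size). */
static bool within_model(unsigned long addr, unsigned long base,
			 unsigned long size)
{
	return addr >= base && addr - base < size;
}

/* Any address inside the module, code or data. */
const struct module_model *
model_module_address(const struct module_model *mods, size_t n,
		     unsigned long addr)
{
	assert(mods != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		const struct module_model *m = &mods[i];

		if (within_model(addr, m->init_layout.base, m->init_layout.size) ||
		    within_model(addr, m->core_layout.base, m->core_layout.size))
			return m;
	}
	return NULL;
}

/* Only addresses inside the text part of either layout. */
const struct module_model *
model_module_text_address(const struct module_model *mods, size_t n,
			  unsigned long addr)
{
	const struct module_model *m = model_module_address(mods, n, addr);

	if (m &&
	    !within_model(addr, m->init_layout.base, m->init_layout.text_size) &&
	    !within_model(addr, m->core_layout.base, m->core_layout.text_size))
		m = NULL;
	return m;
}
```

/*
 * The text lookup is a strict refinement of the general one: it first
 * finds the owning module, then rejects addresses that land in its data.
 */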
2005-04-17 02:20:36 +04:00
/* Don't grab lock, we're oopsing. */
void print_modules ( void )
{
struct module * mod ;
2016-09-21 14:47:22 +03:00
char buf [ MODULE_FLAGS_BUF_SIZE ] ;
2005-04-17 02:20:36 +04:00
2009-06-16 22:07:14 +04:00
printk ( KERN_DEFAULT " Modules linked in: " ) ;
2008-08-30 12:09:00 +04:00
/* Most callers should already have preempt disabled, but make sure */
preempt_disable ( ) ;
2013-01-12 05:08:44 +04:00
list_for_each_entry_rcu ( mod , & modules , list ) {
if ( mod - > state = = MODULE_STATE_UNFORMED )
continue ;
2014-02-03 04:43:13 +04:00
pr_cont ( " %s%s " , mod - > name , module_flags ( mod , buf ) ) ;
2013-01-12 05:08:44 +04:00
}
2008-08-30 12:09:00 +04:00
preempt_enable ( ) ;
2008-01-25 23:08:33 +03:00
if ( last_unloaded_module [ 0 ] )
2014-02-03 04:43:13 +04:00
pr_cont ( " [last unloaded: %s] " , last_unloaded_module ) ;
pr_cont ( " \n " ) ;
2005-04-17 02:20:36 +04:00
}
# ifdef CONFIG_MODVERSIONS
2009-03-31 23:05:34 +04:00
/* Generate the signature for all relevant module structures here.
* If these change, we don't want to try to parse the module. */
void module_layout ( struct module * mod ,
struct modversion_info * ver ,
struct kernel_param * kp ,
struct kernel_symbol * ks ,
2011-01-27 01:26:22 +03:00
struct tracepoint * const * tp )
2009-03-31 23:05:34 +04:00
{
}
EXPORT_SYMBOL ( module_layout ) ;
2005-04-17 02:20:36 +04:00
# endif