5274 Commits

Matthew Wilcox
d6427f8179 xarray: Move multiorder account test in-kernel
Move this test to the in-kernel test suite, and enhance it to test
several different orders.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:46:46 -04:00
Matthew Wilcox
372266ba02 radix tree test suite: Convert tag_tagged_items to XArray
The tag_tagged_items() function is supposed to test the page-writeback
tagging code.  Since that has been converted to the XArray, there's
not much point in testing the radix tree's tagging code.  This requires
using the pthread mutex embedded in the xarray instead of an external
lock, so remove the pthread mutexes which protect xarrays/radix trees.
Also remove radix_tree_iter_tag_set() as this was the last user.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:46:45 -04:00
Matthew Wilcox
adb9d9c4cc radix tree: Remove radix_tree_clear_tags
The page cache was the only user of this interface and it has now
been converted to the XArray.  Transform the test into a test of
xas_init_marks().

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:46:45 -04:00
Matthew Wilcox
8cf2f98411 radix tree: Remove radix_tree_maybe_preload_order
This function was only used by the page cache which is now converted
to the XArray.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:46:45 -04:00
Matthew Wilcox
2956c6644b radix tree: Remove split/join code
radix_tree_split and radix_tree_join were never used upstream.  Remove
them; if they're needed in future they will be replaced by XArray
equivalents.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:46:44 -04:00
Matthew Wilcox
1cf56f9d67 radix tree: Remove radix_tree_update_node_t
The only user of this functionality was the workingset code, and it's
now been converted to the XArray.  Remove __radix_tree_delete_node()
entirely as it was also only used by the workingset code.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:46:44 -04:00
Matthew Wilcox
7b8d046fba shmem: Convert shmem_alloc_hugepage to XArray
xa_find() is a slightly easier API to use than
radix_tree_gang_lookup_slot() because it contains its own RCU locking.
This commit removes the last user of radix_tree_gang_lookup_slot()
so remove the function too.
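
For illustration, the replacement lookup might look roughly like this (a sketch with a hypothetical caller; it assumes mapping->i_pages is already an XArray after this series):

    #include <linux/fs.h>
    #include <linux/xarray.h>

    static void *first_present(struct address_space *mapping,
                               unsigned long start)
    {
            unsigned long index = start;

            /* Unlike radix_tree_gang_lookup_slot(), xa_find() takes the
             * RCU read lock internally; index is updated to the slot
             * where the entry was found. */
            return xa_find(&mapping->i_pages, &index, ULONG_MAX, XA_PRESENT);
    }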

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:46:40 -04:00
Matthew Wilcox
e21a29552f shmem: Convert find_swap_entry to XArray
This is a 1:1 conversion.  The major part of this patch is converting
the test framework from userspace to kernel space and mirroring the
algorithm now used in find_swap_entry().

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:46:39 -04:00
Matthew Wilcox
a97e7904c0 mm: Convert workingset to XArray
We construct an XA_STATE and use it to delete the node with
xas_store() rather than adding a special function for this unique
use case.  Includes a test that simulates this usage for the
test suite.
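
A simplified sketch of that pattern (the real code in mm/workingset.c positions the xa_state on the node itself; mapping and index here are placeholders):

    /* Called with the xarray lock held. */
    static void erase_node_entry(struct address_space *mapping,
                                 unsigned long index)
    {
            XA_STATE(xas, &mapping->i_pages, index);

            xas_set_update(&xas, workingset_update_node);
            xas_store(&xas, NULL);  /* node accounting runs via the callback */
    }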

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:46:36 -04:00
Matthew Wilcox
74d609585d page cache: Add and replace pages using the XArray
Use the XArray APIs to add and replace pages in the page cache.  This
removes two uses of the radix tree preload API and is significantly
shorter code.  It also removes the last user of __radix_tree_create()
outside radix-tree.c itself, so make it static.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:46:33 -04:00
Matthew Wilcox
f32f004cdd ida: Convert to XArray
Use the XA_TRACK_FREE ability to track which entries have a free bit,
similarly to how it uses the radix tree's IDR_FREE tag.  This eliminates
the per-cpu ida_bitmap preload, and fixes the memory consumption
regression I introduced when making the IDR able to store any pointer.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:46:33 -04:00
Matthew Wilcox
371c752dc6 xarray: Track free entries in an XArray
Add the optional ability to track which entries in an XArray are free
and provide xa_alloc() to replace most of the functionality of the IDR.
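
A minimal sketch of the allocating API (the xa_alloc() signature shown is the one introduced by this series; later kernels changed it):

    static DEFINE_XARRAY_ALLOC(things);     /* tracks free entries */

    static int thing_add(void *thing, u32 *id)
    {
            /* Store thing at the lowest free index in [0, 1023];
             * the chosen index is returned in *id. */
            return xa_alloc(&things, id, 1023, thing, GFP_KERNEL);
    }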

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:46:32 -04:00
Matthew Wilcox
9f14d4f1f1 xarray: Add xa_reserve and xa_release
This function reserves a slot in the XArray for users which need
to acquire multiple locks before storing their entry in the tree and
so cannot use a plain xa_store().
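
The intended calling pattern, roughly (a sketch; the surrounding locks are the caller's own):

    static int install_entry(struct xarray *xa, unsigned long index,
                             void *entry)
    {
            int err;

            /* Reserve the slot while it is still safe to sleep. */
            err = xa_reserve(xa, index, GFP_KERNEL);
            if (err)
                    return err;

            /* ... acquire the caller's other locks here ... */

            /* The slot already exists, so this store needs no memory;
             * xa_release(xa, index) would undo an unused reservation. */
            xa_store(xa, index, entry, GFP_NOWAIT);
            return 0;
    }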

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:46:00 -04:00
Matthew Wilcox
2264f5132f xarray: Add xas_create_range
This hopefully temporary function is useful for users who have not yet
been converted to multi-index entries.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:45:59 -04:00
Matthew Wilcox
4e99d4e957 xarray: Add xas_for_each_conflict
This iterator iterates over each entry that is stored in the index or
indices specified by the xa_state.  This is intended for use in a
conditional store of a multi-index entry, or to allow entries which are
about to be removed from the xarray to be disposed of properly.
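
A rough sketch of the conditional-store pattern (hypothetical caller):

    static void *store_unless_occupied(struct xarray *xa,
                                       unsigned long index, void *new)
    {
            XA_STATE(xas, xa, index);
            void *entry;

            xas_lock(&xas);
            /* Visit every entry the proposed store would overwrite. */
            xas_for_each_conflict(&xas, entry) {
                    xas_unlock(&xas);
                    return entry;           /* report the conflict */
            }
            xas_store(&xas, new);
            xas_unlock(&xas);
            return NULL;
    }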

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:45:59 -04:00
Matthew Wilcox
64d3e9a9e0 xarray: Step through an XArray
The xas_next and xas_prev functions move the xas index by one position,
and adjust the rest of the iterator state to match it.  This is more
efficient than calling xas_set() as it keeps the iterator at the leaves
of the tree instead of walking the iterator from the root each time.
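
For example (sketch; RCU protects the walk):

    static void peek_neighbours(struct xarray *xa, unsigned long index)
    {
            XA_STATE(xas, xa, index);
            void *entry;

            rcu_read_lock();
            entry = xas_load(&xas);    /* position the cursor at index */
            entry = xas_next(&xas);    /* index + 1, staying at the leaf */
            entry = xas_prev(&xas);    /* back to index, no walk from root */
            pr_debug("entry at %lu: %p\n", xas.xa_index, entry);
            rcu_read_unlock();
    }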

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:45:59 -04:00
Matthew Wilcox
687149fca1 xarray: Destroy an XArray
This function frees all the internal memory allocated to the xarray
and reinitialises it to be empty.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:45:59 -04:00
Matthew Wilcox
80a0a1a9a3 xarray: Extract entries from an XArray
The xa_extract function combines the functionality of
radix_tree_gang_lookup() and radix_tree_gang_lookup_tagged().
It extracts entries matching the specified filter into a normal array.
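
For illustration (sketch; a mark can be passed as the filter to mimic the tagged variant):

    static unsigned int copy_present(struct xarray *xa, void **dst,
                                     unsigned int n)
    {
            /* Copy up to n present entries from the whole index range
             * into dst[]; returns how many were copied. */
            return xa_extract(xa, dst, 0, ULONG_MAX, n, XA_PRESENT);
    }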

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:45:58 -04:00
Matthew Wilcox
b803b42823 xarray: Add XArray iterators
The xa_for_each iterator allows the user to efficiently walk a range
of the array, executing the loop body once for each entry in that
range that matches the filter.  This commit also includes xa_find()
and xa_find_after() which are helper functions for xa_for_each() but
may also be useful in their own right.

In the xas family of functions, we have xas_for_each(), xas_find(),
xas_next_entry(), xas_for_each_tagged(), xas_find_tagged(),
xas_next_tagged() and xas_pause().
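
For example (sketch; the five-argument form of xa_for_each() is the one introduced by this commit and was simplified in later kernels):

    static void dump_entries(struct xarray *xa)
    {
            void *entry;
            unsigned long index = 0;

            /* XA_PRESENT matches all entries; a mark would filter. */
            xa_for_each(xa, entry, index, ULONG_MAX, XA_PRESENT)
                    pr_info("index %lu: %p\n", index, entry);
    }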

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:45:58 -04:00
Matthew Wilcox
41aec91f55 xarray: Add XArray conditional store operations
Like cmpxchg(), xa_cmpxchg will only store to the index if the current
entry matches the old entry.  It returns the current entry, which is
usually more useful than the errno returned by radix_tree_insert().
For the users who really only want the errno, the xa_insert() wrapper
provides a more convenient calling convention.
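
For example (sketch with a hypothetical caller):

    static int replace_if_unchanged(struct xarray *xa, unsigned long index,
                                    void *old, void *new)
    {
            /* Returns the entry that was there, in the style of cmpxchg();
             * xa_insert(xa, index, entry, gfp) is the errno-only wrapper. */
            void *curr = xa_cmpxchg(xa, index, old, new, GFP_KERNEL);

            if (xa_is_err(curr))
                    return xa_err(curr);
            return curr == old ? 0 : -EBUSY;
    }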

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:45:58 -04:00
Matthew Wilcox
58d6ea3085 xarray: Add XArray unconditional store operations
xa_store() differs from radix_tree_insert() in that it will overwrite an
existing element in the array rather than returning an error.  This is
the behaviour which most users want, and those that want more complex
behaviour generally want to use the xas family of routines anyway.

For memory allocation, xa_store() will first attempt to request memory
from the slab allocator; if memory is not immediately available, it will
drop the xa_lock and allocate memory, keeping a pointer in the xa_state.
It does not use the per-CPU preload caches, although those will continue
to exist until all radix tree users are converted to the XArray.

This patch also includes xa_erase() and __xa_erase() for a streamlined
way to store NULL.  Since there is no need to allocate memory in order
to store a NULL in the XArray, we do not need to trouble the user with
deciding what memory allocation flags to use.
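
A minimal sketch of the two calls:

    static DEFINE_XARRAY(array);

    static void *store_then_erase(void *item)
    {
            /* Overwrites anything at index 3; may allocate, hence the
             * GFP flags. */
            void *old = xa_store(&array, 3, item, GFP_KERNEL);

            /* Storing NULL never allocates, so no GFP flags needed. */
            xa_erase(&array, 3);
            return old;
    }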

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:45:57 -04:00
Matthew Wilcox
9b89a03551 xarray: Add XArray marks
XArray marks are like the radix tree tags, only slightly more strongly
typed.  They are renamed in order to distinguish them from tagged
pointers.  This commit adds the basic get/set/clear operations.
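
For example (sketch; XA_MARK_0..XA_MARK_2 correspond to the old radix tree tags):

    static void toggle_mark(struct xarray *xa, unsigned long index)
    {
            xa_set_mark(xa, index, XA_MARK_0);
            if (xa_get_mark(xa, index, XA_MARK_0))
                    xa_clear_mark(xa, index, XA_MARK_0);
    }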

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:45:57 -04:00
Matthew Wilcox
ad3d6c7263 xarray: Add XArray load operation
The xa_load function brings with it a lot of infrastructure; xa_empty(),
xa_is_err(), and large chunks of the XArray advanced API that are used
to implement xa_load.
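
The basic call, for reference (sketch):

    static bool slot_present(struct xarray *xa, unsigned long index)
    {
            /* xa_load() handles RCU locking itself; NULL means empty. */
            return xa_load(xa, index) != NULL;
    }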

As the test-suite demonstrates, it is possible to use the XArray functions
on a radix tree.  The radix tree functions depend on the GFP flags being
stored in the root of the tree, so it's not possible to use the radix
tree functions on an XArray.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-21 10:45:57 -04:00
Matthew Wilcox
01959dfe77 xarray: Define struct xa_node
This is a direct replacement for struct radix_tree_node.  A couple of
struct members have changed name, so convert those.  Use a #define so
that radix tree users continue to work without change.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-10-21 10:45:56 -04:00
Matthew Wilcox
f8d5d0cc14 xarray: Add definition of struct xarray
This is a direct replacement for struct radix_tree_root.  Some of the
struct members have changed name; convert those, and use a #define so
that radix_tree users continue to work without change.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-10-21 10:45:53 -04:00
David S. Miller
2e2d6f0342 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
net/sched/cls_api.c has overlapping changes to a call to
nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL
to the 5th argument, and another (from 'net-next') added cb->extack
instead of NULL to the 6th argument.

net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to
code which moved (to mr_table_dump) in 'net-next'.  Thanks to David
Ahern for the heads up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-19 11:03:06 -07:00
Waiman Long
01a14bda11 locking/lockdep: Make global debug_locks* variables read-mostly
Make the frequently used lockdep global variable debug_locks read-mostly.
As debug_locks_silent is sometimes used together with debug_locks,
it is also made read-mostly so that they can be close together.

With false cacheline sharing, a cacheline contention problem can happen
depending on what gets put into the same cacheline as debug_locks.
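
The change itself amounts to roughly this (lib/debug_locks.c, sketch):

    /* Keep both flags together in the read-mostly section. */
    int debug_locks __read_mostly = 1;
    int debug_locks_silent __read_mostly;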

Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1539913518-15598-2-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-19 07:53:18 +02:00
Waiman Long
9506a7425b locking/lockdep: Fix debug_locks off performance problem
It was found that when debug_locks was turned off because of a problem
found by the lockdep code, the system performance could drop quite
significantly when the lock_stat code was also configured into the
kernel. For instance, parallel kernel build time on a 4-socket x86-64
server nearly doubled.

Further analysis traced the cause of the slowdown back to the frequent
calls to debug_locks_off() from the __lock_acquired() function, probably
due to some inconsistent lockdep states with debug_locks off. The
debug_locks_off() function did an unconditional atomic xchg to write a 0
value into debug_locks, which had already been set to 0. This led to
severe cacheline contention in the cacheline that held debug_locks. As
debug_locks is referenced in quite a few different places in the kernel,
this greatly slowed down system performance.

To prevent this thrashing of the debug_locks cacheline, lock_acquired()
and lock_contended() now check the state of debug_locks before
proceeding. The debug_locks_off() function is also modified to check
debug_locks before calling __debug_locks_off().
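
A sketch of the reworked debug_locks_off() (simplified):

    int debug_locks_off(void)
    {
            /* The plain read avoids the xchg, and hence the cacheline
             * invalidation, once debug_locks is already 0. */
            if (debug_locks && __debug_locks_off()) {
                    if (!debug_locks_silent) {
                            console_verbose();
                            return 1;
                    }
            }
            return 0;
    }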

Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1539913518-15598-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-19 07:53:17 +02:00
Alexander Shishkin
93048c0944 lib: Fix ia64 bootloader linkage
kbuild robot reports that since commit ce76d938dd98 ("lib: Add memcat_p():
paste 2 pointer arrays together") the ia64/hp/sim/boot fails to link:

> LD      arch/ia64/hp/sim/boot/bootloader
> lib/string.o: In function `__memcat_p':
> string.c:(.text+0x1f22): undefined reference to `__kmalloc'
> string.c:(.text+0x1ff2): undefined reference to `__kmalloc'
> make[1]: *** [arch/ia64/hp/sim/boot/Makefile:37: arch/ia64/hp/sim/boot/bootloader] Error 1

The reason is that the above commit, via __memcat_p(), adds a call to
__kmalloc to string.o, which happens to be used in the bootloader, where
there is no kmalloc or slab or anything.

Since the linker would only pull in objects that contain referenced
symbols, moving __memcat_p() to a different compilation unit solves the
problem.

Fixes: ce76d938dd98 ("lib: Add memcat_p(): paste 2 pointer arrays together")
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reported-by: kbuild test robot <lkp@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-16 13:45:44 +02:00
Matthew Wilcox
c994b12945 test_ida: Fix lockdep warning
The IDA was declared on the stack instead of statically, so lockdep
triggered a warning that it was improperly initialised.

Reported-by: 0day bot
Tested-by: Rong Chen <rong.a.chen@intel.com>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-10-15 16:31:29 -04:00
David S. Miller
d864991b22 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts were easy to resolve using immediate context mostly,
except the cls_u32.c one where I simply took the entire HEAD
chunk.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 21:38:46 -07:00
Geert Uytterhoeven
94ac8f2074 doc: printk-formats: Remove bogus kobject references for device nodes
When converting from text to rst, the kobjects section and its sole
subsection about device tree nodes were coalesced into a single section,
yielding an inconsistent result.

Remove all references to kobjects, as
  1. Device tree object pointers are not compatible with kobject pointers
     (the former may embed the latter, though), and
  2. there are no printk formats defined for kobject types.

Update the vsprintf() source code comments to match the above.

Fixes: b3ed23213eab1e08 ("doc: convert printk-formats.txt to rst")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-10-12 11:38:18 -06:00
Greg Kroah-Hartman
a291ab2d40 * Fix a stack overflow in lib/bch.c

Merge tag 'mtd/fixes-for-4.19-rc8' of git://git.infradead.org/linux-mtd

Boris writes:
  "mdt: fix for 4.19-rc8

   * Fix a stack overflow in lib/bch.c"

* tag 'mtd/fixes-for-4.19-rc8' of git://git.infradead.org/linux-mtd:
  lib/bch: fix possible stack overrun
2018-10-12 12:54:26 +02:00
Geert Uytterhoeven
431bca2430 lib/vsprintf: Hash printed address for netdev bits fallback
The handler for "%pN" falls back to printing the raw pointer value when
using a different format than the (sole supported) special format
"%pNF", potentially leaking sensitive information regarding the kernel
layout in memory.

Avoid this leak by printing the hashed address instead.
Note that there are no in-tree users of the fallback.
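
The shape of the fix, as a sketch (the fallback branch of the %pN handler in lib/vsprintf.c; exact context may differ):

    default:
            /* Print a hash of the pointer rather than the raw value. */
            return ptr_to_id(buf, end, addr, spec);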

Fixes: ad67b74d2469d9b8 ("printk: hash addresses printed with %p")
Link: http://lkml.kernel.org/r/20181011084249.4520-4-geert+renesas@glider.be
To: "Tobin C . Harding" <me@tobin.cc>
To: Andrew Morton <akpm@linux-foundation.org>
To: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-10-12 11:26:55 +02:00
Geert Uytterhoeven
ec12bc2909 lib/vsprintf: Hash legacy clock addresses
On platforms using the Common Clock Framework, "%pC" prints the clock's
name. On legacy platforms, it prints the clock's unhashed address,
potentially leaking sensitive information regarding the kernel layout in
memory.

Avoid this leak by printing the hashed address instead.  To distinguish
between clocks, a 32-bit unique identifier is as good as an actual
pointer value.

Fixes: ad67b74d2469d9b8 ("printk: hash addresses printed with %p")
Link: http://lkml.kernel.org/r/20181011084249.4520-3-geert+renesas@glider.be
To: "Tobin C . Harding" <me@tobin.cc>
To: Andrew Morton <akpm@linux-foundation.org>
To: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-10-12 11:24:41 +02:00
Geert Uytterhoeven
9073dac14e lib/vsprintf: Prepare for more general use of ptr_to_id()
Move the function and its dependencies up so it can be called from
special pointer type formatting routines.

Link: http://lkml.kernel.org/r/20181011084249.4520-2-geert+renesas@glider.be
To: "Tobin C . Harding" <me@tobin.cc>
To: Andrew Morton <akpm@linux-foundation.org>
To: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[pmladek@suse.com: Split into separate patch]
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-10-12 11:23:13 +02:00
Geert Uytterhoeven
f31b224c14 lib/vsprintf: Make ptr argument const in ptr_to_id()
Make the ptr argument const to avoid adding casts in future callers.

Link: http://lkml.kernel.org/r/20181011084249.4520-2-geert+renesas@glider.be
To: "Tobin C . Harding" <me@tobin.cc>
To: Andrew Morton <akpm@linux-foundation.org>
To: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[pmladek@suse.com: split into separate patch]
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-10-12 11:14:07 +02:00
Arnd Bergmann
f0fe77f601 lib/bch: fix possible stack overrun
The previous patch introduced very large kernel stack usage and a Makefile
change to hide the warning about it.

From what I can tell, a number of things went wrong here:

- The BCH_MAX_T constant was set to the maximum value for 'n',
  not the maximum for 't', which is much smaller.

- The stack usage is actually larger than the entire kernel stack
  on some architectures that can use 4KB stacks (m68k, sh, c6x), which
  leads to an immediate overrun.

- The justification in the patch description claimed that nothing
  changed; however, that is not the case even without the two points above:
  the configuration is machine specific, and most boards never use the
  maximum BCH_ECC_WORDS() length but instead have something much smaller.
  That maximum would only apply to machines that use both the maximum
  block size and the maximum ECC strength.

The largest value for 't' that I could find is '32', which in turn leads
to a 60 byte array instead of 2048 bytes. Making it '64' for future
extension also seems worthwhile, with 120 bytes for the array. Anything
larger won't fit into the OOB area on NAND flash.

With that changed, the warning can be enabled again.
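
In code terms the core of the fix is tiny (lib/bch.c, sketch):

    /* Bound on-stack arrays by the maximum ECC strength 't' (64, with
     * headroom), not by the maximum codeword length 'n'. */
    #define BCH_MAX_T 64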

Only linux-4.19+ contains the breakage, so this is only needed
as a stable backport if it does not make it into the release.

Fixes: 02361bc77888 ("lib/bch: Remove VLA usage")
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-10-12 09:17:46 +02:00
Kees Cook
0bb95f80a3 Makefile: Globally enable VLA warning
Now that Variable Length Arrays (VLAs) have been entirely removed[1]
from the kernel, enable the VLA warning globally. The only exceptions
to this are the KASan and UBSan tests which are explicitly checking that
VLAs trigger their respective tests.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
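
The change is essentially one line in the top-level Makefile, wrapped in cc-option so older compilers still build:

    KBUILD_CFLAGS += $(call cc-option,-Wvla)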

Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Airlie <airlied@linux.ie>
Cc: linux-kbuild@vger.kernel.org
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2018-10-11 08:17:50 -07:00
Alexander Shishkin
ce76d938dd lib: Add memcat_p(): paste 2 pointer arrays together
This adds a helper to paste 2 pointer arrays together, useful for merging
various types of attribute arrays. There are a few places in the kernel
tree where this is open coded, and I just added one more in the STM class.

The naming is inspired by memset_p() and memcat(), and partial credit for
it goes to Andy Shevchenko.

This patch adds the function wrapped in a type-enforcing macro and a test
module.
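
Typical usage might look like this (sketch; base_attrs and extra_attrs are hypothetical NULL-terminated pointer arrays of the same type):

    #include <linux/slab.h>
    #include <linux/string.h>
    #include <linux/sysfs.h>

    static struct attribute **merge_attrs(struct attribute **base_attrs,
                                          struct attribute **extra_attrs)
    {
            /* Returns a freshly kmalloc'ed NULL-terminated array (or
             * NULL); the macro enforces matching argument types, and the
             * caller kfree()s the result. */
            return memcat_p(base_attrs, extra_attrs);
    }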

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-11 12:12:55 +02:00
Greg Kroah-Hartman
588b593821 It was reported that trace_printk() was not reporting properly

Merge tag 'trace-v4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Steven writes:
  "vsprint fix:

   It was reported that trace_printk() was not reporting properly
   values that came after a dereference pointer.

   trace_printk() utilizes vbin_printf() and bstr_printf() to keep the
   overhead of tracing down. vbin_printf() does not do any conversions
   and just stores the string format and the raw arguments into the
   buffer. bstr_printf() is used to read the buffer and does the
   conversions to complete the printf() output.

   This can be troublesome with dereferenced pointers because the
   reference may be different from the time vbin_printf() is called to
   the time bstr_printf() is called. To fix this, a prior commit changed
   vbin_printf() to convert dereferenced pointers into strings and load
   the converted string into the buffer. But the change to bstr_printf()
   had an off-by-one error and didn't account for the nul character at
   the end of the string and this corrupted the rest of the values in
   the format that came after a dereferenced pointer."

* tag 'trace-v4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  vsprintf: Fix off-by-one bug in bstr_printf() processing dereferenced pointers
2018-10-10 22:09:44 +02:00
Vasily Gorbik
5dff03813f s390/kasan: add option for 4-level paging support
By default 3-level paging is used when the kernel is compiled with
kasan support. Add a 4-level paging option to support systems with more
than 3TB of physical memory and to cover 4-level paging specific code
with kasan as well.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:29 +02:00
David S. Miller
071a234ad7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-10-08

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) sk_lookup_[tcp|udp] and sk_release helpers from Joe Stringer which allow
BPF programs to perform lookups for sockets in a network namespace. This would
allow programs to determine early on in processing whether the stack is
expecting to receive the packet, and perform some action (eg drop,
forward somewhere) based on this information.

2) per-cpu cgroup local storage from Roman Gushchin.
Per-cpu cgroup local storage is very similar to simple cgroup storage
except all the data is per-cpu. The main goal of the per-cpu variant is to
implement super fast counters (e.g. packet counters), which require
neither lookups nor atomic operations in the fast path.
The example of these hybrid counters is in selftests/bpf/netcnt_prog.c

3) allow HW offload of programs with BPF-to-BPF function calls from Quentin Monnet

4) support more than 64-byte key/value in HW offloaded BPF maps from Jakub Kicinski

5) rename of libbpf interfaces from Andrey Ignatov.
libbpf is maturing as a library and should follow good practices in
library design and implementation to play well with other libraries.
This patch set brings consistent naming convention to global symbols.

6) relicense libbpf as LGPL-2.1 OR BSD-2-Clause from Alexei Starovoitov
to let Apache2 projects use libbpf

7) various AF_XDP fixes from Björn and Magnus
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 23:42:44 -07:00
David Ahern
a5f6cba291 netlink: Add strict version of nlmsg_parse and nla_parse
nla_parse is currently lenient on message parsing, allowing type to be 0
or greater than max expected and only logging a message

    "netlink: %d bytes leftover after parsing attributes in process `%s'."

if the netlink message has unknown data at the end after parsing. What this
could mean is that the header at the front of the attributes is actually
wrong and the parsing is shifted from what is expected.

Add a new strict version that actually fails with EINVAL if there are any
bytes remaining after the parsing loop completes, or if the attribute type
is 0 or greater than the maximum expected.
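
Callers opt in explicitly; the arguments mirror nla_parse() (sketch with hypothetical attribute names and policy):

    /* Leftover bytes after the attributes, or attribute types outside
     * 1..MY_ATTR_MAX, now fail with -EINVAL. */
    err = nla_parse_strict(tb, MY_ATTR_MAX, attrs, len, my_policy, extack);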

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
Steven Rostedt (VMware)
62165600ae vsprintf: Fix off-by-one bug in bstr_printf() processing dereferenced pointers
The functions vbin_printf() and bstr_printf() are used by trace_printk() to
try to keep the overhead down during printing. trace_printk() uses
vbin_printf() at the time of execution, as it only scans the fmt string to
record the printf values into the buffer, and then uses vbin_printf() to do
the conversions to print the string based on the format and the saved
values in the buffer.

This is an issue for dereferenced pointers, as before commit 841a915d20c7b,
the processing of the pointer could happen some time after the pointer value
was recorded (when reading the trace buffer). This means the processing of the
value at a later time could show different results, or even crash the
system, if the pointer no longer existed.

Commit 841a915d20c7b addressed this by processing dereferenced pointers at
the time of execution and saving the result in the ring buffer as a string.
bstr_printf() would then treat these pointers as normal strings, and
print the value. But there was an off-by-one bug here: after processing
the argument, the code moved the pointer by only strlen(arg), which left
the arg pointer not at the next argument in the ring buffer, but at the
nul character of the last argument. This causes any values after a
dereferenced pointer to be corrupted.
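
The fix is one character in the argument advance (sketch of the relevant lines in bstr_printf()):

    len = strlen(str_arg);
    /* ... copy str_arg to the output ... */
    args += len + 1;        /* was "args += len", leaving args on the nul */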

Cc: stable@vger.kernel.org
Fixes: 841a915d20c7b ("vsprintf: Do not have bprintf dereference pointers")
Reported-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-10-05 10:17:15 -04:00
Stefan Agner
f9b58e8c7d ARM: 8800/1: use choice for kernel unwinders
While in theory multiple unwinders could be compiled in, it does
not make sense in practice. Use a choice to make the unwinder
selection mutually exclusive and mandatory.

Even before this commit it was not possible to deselect
FRAME_POINTER. Remove the obsolete comment.

Furthermore, to produce a meaningful backtrace with FRAME_POINTER
enabled the kernel needs a specific function prologue:
    mov    ip, sp
    stmfd    sp!, {fp, ip, lr, pc}
    sub    fp, ip, #4

To get the required prologue, gcc uses the apcs and no-sched-prolog
options. These compiler options are not available in clang, and clang is
not able to generate the required prologue. Make the FRAME_POINTER
config symbol depend on !clang.
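
The resulting Kconfig shape, abridged (option names as merged; the exact dependencies are approximate):

    choice
            prompt "Choose kernel unwinder"
            default UNWINDER_ARM if AEABI

    config UNWINDER_FRAME_POINTER
            bool "Frame pointer unwinder"
            depends on !THUMB2_KERNEL && !CC_IS_CLANG
            select FRAME_POINTER

    config UNWINDER_ARM
            bool "ARM EABI stack unwinder"
            depends on AEABI
            select ARM_UNWIND

    endchoice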

Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Stefan Agner <stefan@agner.ch>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-10-04 14:48:58 +01:00
Johannes Berg
33188bd643 netlink: add validation function to policy
Add the ability to have an arbitrary validation function attached
to a netlink policy that doesn't already use the validation_data
pointer in another way.

This can be useful to validate for example the content of a binary
attribute, like in nl80211 the "(information) elements", which must
be valid streams of "u8 type, u8 length, u8 value[length]".
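
A sketch of a policy using the new hook (the attribute names and the checking helper are hypothetical; NLA_POLICY_VALIDATE_FN is the macro this commit introduces):

    static int my_validate_ies(const struct nlattr *attr,
                               struct netlink_ext_ack *extack)
    {
            /* e.g. walk the payload as a u8-type/u8-len/value stream */
            return my_check_ie_stream(nla_data(attr), nla_len(attr));
    }

    static const struct nla_policy my_policy[MY_ATTR_MAX + 1] = {
            [MY_ATTR_IE] = NLA_POLICY_VALIDATE_FN(NLA_BINARY,
                                                  my_validate_ies),
    };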

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:05:31 -07:00
Johannes Berg
3e48be05f3 netlink: add attribute range validation to policy
Without further bloating the policy structs, we can overload
the `validation_data' pointer with a struct of s16 min, max
and use those to validate ranges in NLA_{U,S}{8,16,32,64}
attributes.

It may sound strange to validate NLA_U32 with a s16 max, but
in many cases NLA_U32 is used for enums etc. since there's no
size benefit in using a smaller attribute width anyway, due
to netlink attribute alignment; in cases like that it's still
useful, particularly when the attribute really transports an
enum value.

Doing so lets us remove quite a bit of validation code, if we
can be sure that these attributes aren't used by userspace in
places where they're ignored today.

To achieve all this, split the 'type' field and introduce a
new 'validation_type' field which indicates what further
validation (beyond the validation prescribed by the type of
the attribute) is done. This currently allows for no further
validation (the default), as well as min, max and range checks.
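
In a policy this reads, for example (hypothetical attributes; the NLA_POLICY_RANGE/MIN/MAX helpers come with this commit):

    static const struct nla_policy my_policy[MY_ATTR_MAX + 1] = {
            [MY_ATTR_LEVEL]   = NLA_POLICY_RANGE(NLA_U8, 1, 100),
            [MY_ATTR_TIMEOUT] = NLA_POLICY_MIN(NLA_U32, 1),
    };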

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:05:31 -07:00
Joel Stanley
242cdad873 lib/xz: Put CRC32_POLY_LE in xz_private.h
This fixes a regression introduced by faa16bc404d72a5 ("lib: Use
existing define with polynomial").

The cleanup added a dependency on include/linux, which broke the PowerPC
boot wrapper/decompresser when KERNEL_XZ is enabled:

  BOOTCC  arch/powerpc/boot/decompress.o
 In file included from arch/powerpc/boot/../../../lib/decompress_unxz.c:233,
                 from arch/powerpc/boot/decompress.c:42:
 arch/powerpc/boot/../../../lib/xz/xz_crc32.c:18:10: fatal error:
 linux/crc32poly.h: No such file or directory
  #include <linux/crc32poly.h>
           ^~~~~~~~~~~~~~~~~~~

The powerpc decompresser is a hairy corner of the kernel. Even while building
a 64-bit kernel it needs to build a 32-bit binary and must therefore avoid
including files from include/linux.

This allows users of the xz library to avoid including headers from
'include/linux/' while still achieving the cleanup of the magic number.
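
Concretely, xz_private.h can carry its own copy of the constant (sketch; the value is the standard little-endian CRC32 polynomial):

    /* In lib/xz/xz_private.h, so pre-boot code avoids <linux/crc32poly.h>. */
    #ifndef CRC32_POLY_LE
    #define CRC32_POLY_LE 0xedb88320
    #endif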

Fixes: faa16bc404d72a5 ("lib: Use existing define with polynomial")
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: kbuild test robot <lkp@intel.com>
Suggested-by: Christophe LEROY <christophe.leroy@c-s.fr>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-02 08:44:59 +10:00
Matthew Wilcox
02c02bf12c xarray: Change definition of sibling entries
Instead of storing a pointer to the slot containing the canonical entry,
store the offset of the slot.  Produces slightly more efficient code
(~300 bytes) and simplifies the implementation.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
2018-09-29 22:47:49 -04:00