IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Pull EFI updates from Ingo Molnar:
"Main changes in this cycle were:
- Refactor the EFI memory map code into architecture neutral files
and allow drivers to permanently reserve EFI boot services regions
on x86, as well as ARM/arm64. (Matt Fleming)
- Add ARM support for the EFI ESRT driver. (Ard Biesheuvel)
- Make the EFI runtime services and efivar API interruptible by
swapping spinlocks for semaphores. (Sylvain Chouleur)
- Provide the EFI identity mapping for kexec which allows kexec to
work on SGI/UV platforms with requiring the "noefi" kernel command
line parameter. (Alex Thorlton)
- Add debugfs node to dump EFI page tables on arm64. (Ard Biesheuvel)
- Merge the EFI test driver being carried out of tree until now in
the FWTS project. (Ivan Hu)
- Expand the list of flags for classifying EFI regions as "RAM" on
arm64 so we align with the UEFI spec. (Ard Biesheuvel)
- Optimise out the EFI mixed mode if it's unsupported (CONFIG_X86_32)
or disabled (CONFIG_EFI_MIXED=n) and switch the early EFI boot
services function table for direct calls, alleviating us from
having to maintain the custom function table. (Lukas Wunner)
- Miscellaneous cleanups and fixes"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
x86/efi: Round EFI memmap reservations to EFI_PAGE_SIZE
x86/efi: Allow invocation of arbitrary boot services
x86/efi: Optimize away setup_gop32/64 if unused
x86/efi: Use kmalloc_array() in efi_call_phys_prolog()
efi/arm64: Treat regions with WT/WC set but WB cleared as memory
efi: Add efi_test driver for exporting UEFI runtime service interfaces
x86/efi: Defer efi_esrt_init until after memblock_x86_fill
efi/arm64: Add debugfs node to dump UEFI runtime page tables
x86/efi: Remove unused find_bits() function
fs/efivarfs: Fix double kfree() in error path
x86/efi: Map in physical addresses in efi_map_region_fixed
lib/ucs2_string: Speed up ucs2_utf8size()
firmware-gsmi: Delete an unnecessary check before the function call "dma_pool_destroy"
x86/efi: Initialize status to ensure garbage is not returned on small size
efi: Replace runtime services spinlock with semaphore
efi: Don't use spinlocks for efi vars
efi: Use a file local lock for efivars
efi/arm*: esrt: Add missing call to efi_esrt_init()
efi/esrt: Use memremap not ioremap to access ESRT table in memory
x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data
...
kbuild test robot reports:
lib/dma-debug.c: In function 'debug_dma_map_resource':
>> lib/dma-debug.c:1541:16: error: implicit declaration of function '__phys_to_pfn' [-Werror=implicit-function-declaration]
entry->pfn = __phys_to_pfn(addr);
^~~~~~~~~~~~~
ia64 does not provide __phys_to_pfn(), use the PHYS_PFN() alias.
Fixes: 0e74b34dfc3318bf ("dma-debug: add support for resource mappings")
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
put_cpu_var takes the percpu data, not the data returned from
get_cpu_var.
This doesn't change the behavior.
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* the only remaining callers of "short" fault-ins are just as happy with generic
variants (both in lib/iov_iter.c); switch them to multipage variants, kill the
"short" ones
* rename the multipage variants to now available plain ones.
* get rid of compat macro defining iov_iter_fault_in_multipage_readable by
expanding it in its only user.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Specifying the aligned attributes to the char data[NDISKS][PAGE_SIZE],
char recovi[PAGE_SIZE] and char recovi[PAGE_SIZE] arrays, so that all
malloc memory is page boundary aligned.
Without these alignment attributes, the test causes a segfault in
userspace when the NDISKS are changed to 4 from 16.
The RAID stripes will be page aligned anyway, so we want to test what
the kernel actually will execute.
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
A MMIO mapped resource can not be represented by a struct page so a new
debug type is needed to handle this. This patch add such type and
functionality to add/remove entries and how to translate them to a
physical address.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The fixes to the radix tree test suite show that the multi-order case is
broken. The basic reason is that the radix tree code uses tagged
pointers with the "internal" bit in the low bits, and calculating the
pointer indices was supposed to mask off those bits. But gcc will
notice that we then use the index to re-create the pointer, and will
avoid doing the arithmetic and use the tagged pointer directly.
This cleans the code up, using the existing is_sibling_entry() helper to
validate the sibling pointer range (instead of open-coding it), and
using entry_to_node() to mask off the low tag bit from the pointer. And
once you do that, you might as well just use the now cleaned-up pointer
directly.
[ Side note: the multi-order code isn't actually ever used in the kernel
right now, and the only reason I didn't just delete all that code is
that Kirill Shutemov piped up and said:
"Well, my ext4-with-huge-pages patchset[1] uses multi-order entries.
It also converts shmem-with-huge-pages and hugetlb to them.
I'm okay with converting it to other mechanism, but I need
something. (I looked into Konstantin's RFC patchset[2]. It looks
okay, but I don't feel myself qualified to review it as I don't
know much about radix-tree internals.)"
[1] http://lkml.kernel.org/r/20160915115523.29737-1-kirill.shutemov@linux.intel.com
[2] http://lkml.kernel.org/r/147230727479.9957.1087787722571077339.stgit@zurg ]
Reported-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Cedric Blancher <cedric.blancher@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the indefinitiley -> indefinitely typo in Kconfig.debug.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160922205513.17821-1-vivien.didelot@savoirfairelinux.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Optimize RAID6 xor_syndrome functions to take advantage of the 512-bit
ZMM integer instructions introduced in AVX512.
AVX512 optimized xor_syndrome functions, which is simply based on sse2.c
written by hpa.
The patch was tested and benchmarked before submission on
a hardware that has AVX512 flags to support such instructions
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Adding avx512 gen_syndrome and recovery functions so as to allow code to
be compiled and tested successfully in userspace.
This patch is tested in userspace and improvement in performace is
observed.
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Optimize RAID6 recovery functions to take advantage of
the 512-bit ZMM integer instructions introduced in AVX512.
AVX512 optimized recovery functions, which is simply based
on recov_avx2.c written by Jim Kukunas
This patch was tested and benchmarked before submission on
a hardware that has AVX512 flags to support such instructions
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Optimize RAID6 gen_syndrom functions to take advantage of
the 512-bit ZMM integer instructions introduced in AVX512.
AVX512 optimized gen_syndrom functions, which is simply based
on avx2.c written by Yuanhan Liu and sse2.c written by hpa.
The patch was tested and benchmarked before submission on
a hardware that has AVX512 flags to support such instructions
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
This commit introduces a generic library to estimate either the min or
max value of a time-varying variable over a recent time window. This
is code originally from Kathleen Nichols. The current form of the code
is from Van Jacobson.
A single struct minmax_sample will track the estimated windowed-max
value of the series if you call minmax_running_max() or the estimated
windowed-min value of the series if you call minmax_running_min().
Nearly equivalent code is already in place for minimum RTT estimation
in the TCP stack. This commit extracts that code and generalizes it to
handle both min and max. Moving the code here reduces the footprint
and complexity of the TCP code base and makes the filter generally
available for other parts of the codebase, including an upcoming TCP
congestion control module.
This library works well for time series where the measurements are
smoothly increasing or decreasing.
Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some architectures use a hardware defined structure at address zero.
Checking for a null pointer will result in many ubsan reports.
Allow users to disable the null sanitizer.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The insecure_elasticity setting is an ugly wart brought out by
users who need to insert duplicate objects (that is, distinct
objects with identical keys) into the same table.
In fact, those users have a much bigger problem. Once those
duplicate objects are inserted, they don't have an interface to
find them (unless you count the walker interface which walks
over the entire table).
Some users have resorted to doing a manual walk over the hash
table which is of course broken because they don't handle the
potential existence of multiple hash tables. The result is that
they will break sporadically when they encounter a hash table
resize/rehash.
This patch provides a way out for those users, at the expense
of an extra pointer per object. Essentially each object is now
a list of objects carrying the same key. The hash table will
only see the lists so nothing changes as far as rhashtable is
concerned.
To use this new interface, you need to insert a struct rhlist_head
into your objects instead of struct rhash_head. While the hash
table is unchanged, for type-safety you'll need to use struct
rhltable instead of struct rhashtable. All the existing interfaces
have been duplicated for rhlist, including the hash table walker.
One missing feature is nulls marking because AFAIK the only potential
user of it does not need duplicate objects. Should anyone need
this it shouldn't be too hard to add.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Install the callbacks via the state machine.
This is just a temporary vehicle to keep the interface working for now,
It'll be replaced by the sysfs interface which allows to step through the
hotplug state machine step by step.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160906170457.32393-15-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Variable weight is not being initialized to zero before it is
used to compute the weight sum. Ensure it is initialized to zero.
Found with static analysis with cppcheck:
[lib/sbitmap.c:177]: (error) Uninitialized variable: weight
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
... by turning it into what used to be multipages counterpart
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If we have a bunch of high-numbered bits allocated and then we resize
the struct sbitmap_queue, when those bits get cleared, we'll update the
hint and then have to re-randomize it repeatedly. Avoid that by checking
that the cleared bit is still a valid hint. No measurable performance
difference in the common case.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Now that workqueue can handle work item queueing from very early
during boot, there is no need to gate schedule_work() while
!keventd_up(). Remove it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
After a struct sbitmap_queue is resized smaller, the allocation hints
may still be set to bits beyond the new depth of the bitmap. This means
that, for example, if the number of blk-mq tags is reduced through
sysfs, more requests than the nominal queue depth may be in flight.
It's tempting to fix this at resize time by doing a one-time
reinitialization of the hints, but this can race with
__sbitmap_queue_get() updating the hint. Instead, check the hint before
we use it. This caused no measurable performance difference in my
synthetic benchmarks.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
In order to get good cache behavior from a sbitmap, we want each CPU to
stick to its own cacheline(s) as much as possible. This might happen
naturally as the bitmap gets filled up and the alloc_hint values spread
out, but we really want this behavior from the start. blk-mq apparently
intended to do this, but the code to do this was never wired up. Get rid
of the dead code and make it part of the sbitmap library.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Again, there's no point in passing this in every time. Make it part of
struct sbitmap_queue and clean up the API.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Allocating your own per-cpu allocation hint separately makes for an
awkward API. Instead, allocate the per-cpu hint as part of the struct
sbitmap_queue. There's no point for a struct sbitmap_queue without the
cache, but you can still use a bare struct sbitmap.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The original bt_alloc() we converted from was using kzalloc(), not
kzalloc_node(), to allocate the wait queues. This was probably an
oversight, so fix it for sbitmap_queue_init_node().
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This is a generally useful data structure, so make it available to
anyone else who might want to use it. It's also a nice cleanup
separating the allocation logic from the rest of the tag handling logic.
The code is behind a new Kconfig option, CONFIG_SBITMAP, which is only
selected by CONFIG_BLOCK for now.
This should be a complete noop functionality-wise.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This will avoid a potential read-after-free if collect_syscall()
(e.g. /proc/PID/syscall) is called on an exiting task.
Reported-by: Jann Horn <jann@thejh.net>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/0bfd8e6d4729c97745d3781a29610a33d0a8091d.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit d5709f7ab776 ("flow_dissector: For stripped vlan, get vlan
info from skb->vlan_tci") made flow dissector look at vlan_proto
when vlan is present. Since test_bpf sets skb->vlan_tci to ~0
(including VLAN_TAG_PRESENT) we have to populate skb->vlan_proto.
Fixes false negative on test #24:
test_bpf: #24 LD_PAYLOAD_OFF jited:0 175 ret 0 != 42 FAIL (1 times)
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/mediatek/mtk_eth_soc.c
drivers/net/ethernet/qlogic/qed/qed_dcbx.c
drivers/net/phy/Kconfig
All conflicts were cases of overlapping commits.
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to calculate the string length on every loop iteration.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Peter Jones <pjones@redhat.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for your net-next
tree. Most relevant updates are the removal of per-conntrack timers to
use a workqueue/garbage collection approach instead from Florian
Westphal, the hash and numgen expression for nf_tables from Laura
Garcia, updates on nf_tables hash set to honor the NLM_F_EXCL flag,
removal of ip_conntrack sysctl and many other incremental updates on our
Netfilter codebase.
More specifically, they are:
1) Retrieve only 4 bytes to fetch ports in case of non-linear skb
transport area in dccp, sctp, tcp, udp and udplite protocol
conntrackers, from Gao Feng.
2) Missing whitespace on error message in physdev match, from Hangbin Liu.
3) Skip redundant IPv4 checksum calculation in nf_dup_ipv4, from Liping Zhang.
4) Add nf_ct_expires() helper function and use it, from Florian Westphal.
5) Replace opencoded nf_ct_kill() call in IPVS conntrack support, also
from Florian.
6) Rename nf_tables set implementation to nft_set_{name}.c
7) Introduce the hash expression to allow arbitrary hashing of selector
concatenations, from Laura Garcia Liebana.
8) Remove ip_conntrack sysctl backward compatibility code, this code has
been around for long time already, and we have two interfaces to do
this already: nf_conntrack sysctl and ctnetlink.
9) Use nf_conntrack_get_ht() helper function whenever possible, instead
of opencoding fetch of hashtable pointer and size, patch from Liping Zhang.
10) Add quota expression for nf_tables.
11) Add number generator expression for nf_tables, this supports
incremental and random generators that can be combined with maps,
very useful for load balancing purpose, again from Laura Garcia Liebana.
12) Fix a typo in a debug message in FTP conntrack helper, from Colin Ian King.
13) Introduce a nft_chain_parse_hook() helper function to parse chain hook
configuration, this is used by a follow up patch to perform better chain
update validation.
14) Add rhashtable_lookup_get_insert_key() to rhashtable and use it from the
nft_set_hash implementation to honor the NLM_F_EXCL flag.
15) Missing nulls check in nf_conntrack from nf_conntrack_tuple_taken(),
patch from Florian Westphal.
16) Don't use the DYING bit to know if the conntrack event has been already
delivered, instead a state variable to track event re-delivery
states, also from Florian.
17) Remove the per-conntrack timer, use the workqueue approach that was
discussed during the NFWS, from Florian Westphal.
18) Use the netlink conntrack table dump path to kill stale entries,
again from Florian.
19) Add a garbage collector to get rid of stale conntracks, from
Florian.
20) Reschedule garbage collector if eviction rate is high.
21) Get rid of the __nf_ct_kill_acct() helper.
22) Use ARPHRD_ETHER instead of hardcoded 1 from ARP logger.
23) Make nf_log_set() interface assertive on unsupported families.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Some versions of gcc don't like tests for the value of an undefined
preprocessor symbol, even in the #else branch of an #ifndef:
lib/test_hash.c:224:7: warning: "HAVE_ARCH__HASH_32" is not defined [-Wundef]
#elif HAVE_ARCH__HASH_32 != 1
^
lib/test_hash.c:229:7: warning: "HAVE_ARCH_HASH_32" is not defined [-Wundef]
#elif HAVE_ARCH_HASH_32 != 1
^
lib/test_hash.c:234:7: warning: "HAVE_ARCH_HASH_64" is not defined [-Wundef]
#elif HAVE_ARCH_HASH_64 != 1
^
Seen with gcc 4.9, not seen with 4.1.2.
Change the logic to only check the value inside an #ifdef to fix this.
Fixes: 468a9428521e7d00 ("<linux/hash.h>: Add support for architecture-specific functions")
Link: http://lkml.kernel.org/r/20160829214952.1334674-4-arnd@arndb.de
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: George Spelvin <linux@sciencehorizons.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The XC instruction can be used to improve the speed of the raid6
recovery. The loops now operate on blocks of 256 bytes.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There are three usercopy warnings which are currently being silenced for
gcc 4.6 and newer:
1) "copy_from_user() buffer size is too small" compile warning/error
This is a static warning which happens when object size and copy size
are both const, and copy size > object size. I didn't see any false
positives for this one. So the function warning attribute seems to
be working fine here.
Note this scenario is always a bug and so I think it should be
changed to *always* be an error, regardless of
CONFIG_DEBUG_STRICT_USER_COPY_CHECKS.
2) "copy_from_user() buffer size is not provably correct" compile warning
This is another static warning which happens when I enable
__compiletime_object_size() for new compilers (and
CONFIG_DEBUG_STRICT_USER_COPY_CHECKS). It happens when object size
is const, but copy size is *not*. In this case there's no way to
compare the two at build time, so it gives the warning. (Note the
warning is a byproduct of the fact that gcc has no way of knowing
whether the overflow function will be called, so the call isn't dead
code and the warning attribute is activated.)
So this warning seems to only indicate "this is an unusual pattern,
maybe you should check it out" rather than "this is a bug".
I get 102(!) of these warnings with allyesconfig and the
__compiletime_object_size() gcc check removed. I don't know if there
are any real bugs hiding in there, but from looking at a small
sample, I didn't see any. According to Kees, it does sometimes find
real bugs. But the false positive rate seems high.
3) "Buffer overflow detected" runtime warning
This is a runtime warning where object size is const, and copy size >
object size.
All three warnings (both static and runtime) were completely disabled
for gcc 4.6 with the following commit:
2fb0815c9ee6 ("gcc4: disable __compiletime_object_size for GCC 4.6+")
That commit mistakenly assumed that the false positives were caused by a
gcc bug in __compiletime_object_size(). But in fact,
__compiletime_object_size() seems to be working fine. The false
positives were instead triggered by #2 above. (Though I don't have an
explanation for why the warnings supposedly only started showing up in
gcc 4.6.)
So remove warning #2 to get rid of all the false positives, and re-enable
warnings #1 and #3 by reverting the above commit.
Furthermore, since #1 is a real bug which is detected at compile time,
upgrade it to always be an error.
Having done all that, CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is no longer
needed.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If vmalloc() was successful, do not attempt a kmalloc_array()
Fixes: 4cf0b354d92e ("rhashtable: avoid large lock-array allocations")
Reported-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch modifies __rhashtable_insert_fast() so it returns the
existing object that clashes with the one that you want to insert.
In case the object is successfully inserted, NULL is returned.
Otherwise, you get an error via ERR_PTR().
This patch adapts the existing callers of __rhashtable_insert_fast()
so they handle this new logic, and it adds a new
rhashtable_lookup_get_insert_key() interface to fetch this existing
object.
nf_tables needs this change to improve handling of EEXIST cases via
honoring the NLM_F_EXCL flag and by checking if the data part of the
mapping matches what we have.
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
If we're using CONFIG_VMAP_STACK=y and we manage to point an sg entry
at the stack, then either the sg page will be in highmem or sg_virt()
will return the direct-map alias. In neither case will the existing
check_for_stack() implementation realize that it's a stack page.
Fix it by explicitly checking for stack pages.
This has no effect by itself. It's broken out for ease of review.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/448460622731312298bf19dcbacb1606e75de7a9.1470907718.git.luto@kernel.org
[ Minor edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The commit 8f6fd83c6c5ec66a4a70c728535ddcdfef4f3697 ("rhashtable:
accept GFP flags in rhashtable_walk_init") added a GFP flag argument
to rhashtable_walk_init because some users wish to use the walker
in an unsleepable context.
In fact we don't need to allocate memory in rhashtable_walk_init
at all. The walker is always paired with an iterator so we could
just stash ourselves there.
This patch does that by introducing a new enter function to replace
the existing init function. This way we don't have to churn all
the existing users again.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Buffers powersave frame test is reversed in cfg80211, fix from Felix
Fietkau.
2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme.
3) Fix some tg3 ethtool logic bugs, and one that would cause no
interrupts to be generated when rx-coalescing is set to 0. From
Satish Baddipadige and Siva Reddy Kallam.
4) QLCNIC mailbox corruption and napi budget handling fix from Manish
Chopra.
5) Fix fib_trie logic when walking the trie during /proc/net/route
output than can access a stale node pointer. From David Forster.
6) Several sctp_diag fixes from Phil Sutter.
7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel.
8) Checksum fixup fixes in bpf from Daniel Borkmann.
9) Memork leaks in nfnetlink, from Liping Zhang.
10) Use after free in rxrpc, from David Howells.
11) Use after free in new skb_array code of macvtap driver, from Jason
Wang.
12) Calipso resource leak, from Colin Ian King.
13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang.
14) Fix bpf non-linear packet write helpers, from Daniel Borkmann.
15) Fix lockdep splats in macsec, from Sabrina Dubroca.
16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF
handling.
17) Various tc-action bug fixes, from CONG Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
net_sched: allow flushing tc police actions
net_sched: unify the init logic for act_police
net_sched: convert tcf_exts from list to pointer array
net_sched: move tc offload macros to pkt_cls.h
net_sched: fix a typo in tc_for_each_action()
net_sched: remove an unnecessary list_del()
net_sched: remove the leftover cleanup_a()
mlxsw: spectrum: Allow packets to be trapped from any PG
mlxsw: spectrum: Unmap 802.1Q FID before destroying it
mlxsw: spectrum: Add missing rollbacks in error path
mlxsw: reg: Fix missing op field fill-up
mlxsw: spectrum: Trap loop-backed packets
mlxsw: spectrum: Add missing packet traps
mlxsw: spectrum: Mark port as active before registering it
mlxsw: spectrum: Create PVID vPort before registering netdevice
mlxsw: spectrum: Remove redundant errors from the code
mlxsw: spectrum: Don't return upon error in removal path
i40e: check for and deal with non-contiguous TCs
ixgbe: Re-enable ability to toggle VLAN filtering
ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
...