53634 Commits

Author SHA1 Message Date
Joe Stringer
67e89ac328 bpf: Fix dev pointer dereference from sk_skb
Dan Carpenter reports:

The patch 6acc9b432e67: "bpf: Add helper to retrieve socket in BPF"
from Oct 2, 2018, leads to the following Smatch complaint:

    net/core/filter.c:4893 bpf_sk_lookup()
    error: we previously assumed 'skb->dev' could be null (see line 4885)

Fix this issue by checking skb->dev before using it.

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-13 23:03:08 -07:00
Al Viro
ce5a983191 kill TIOCSERGSTRUCT
Once upon a time a bunch of serial drivers used to provide that;
today it's only amiserial and it's FUBAR - the structure being
copied to userland includes kernel pointers, fields with
config-dependent size, etc.  No userland code using it could
possibly survive - e.g. enabling lockdep definitely changes the
layout.  Besides, it's a massive infoleak.

Kill it.  If somebody needs that data for debugging purposes, they
can bloody well expose it saner ways.  Assuming anyone does debugging
of amiserial in the first place, that is.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-10-13 00:50:53 -04:00
Al Viro
f0193d3ea7 change semantics of ldisc ->compat_ioctl()
First of all, make it return int.  Returning long when native method
had never allowed that is ridiculous and inconvenient.

More importantly, change the caller; if ldisc ->compat_ioctl() is NULL
or returns -ENOIOCTLCMD, tty_compat_ioctl() will try to feed cmd and
compat_ptr(arg) to ldisc's native ->ioctl().

That simplifies ->compat_ioctl() instances quite a bit - they only
need to deal with ioctls that are neither generic tty ones (those
would get shunted off to tty_ioctl()) nor simple compat pointer ones.

Note that something like TCFLSH won't reach ->compat_ioctl(),
even if ldisc ->ioctl() does handle it - it will be recognized
earlier and passed to tty_ioctl() (and ultimately - ldisc ->ioctl()).

For many ldiscs it means that NULL ->compat_ioctl() does the
right thing.  Those where it won't serve (see e.g. n_r3964.c) are
also easily dealt with - we need to handle the numeric-argument
ioctls (calling the native instance) and, if such would exist,
the ioctls that need layout conversion, etc.

All in-tree ldiscs dealt with.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-10-13 00:50:53 -04:00
Al Viro
6a9daed31c rfcomm: get rid of mentioning TIOC[SG]SERIAL
no support there

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-10-13 00:50:33 -04:00
David S. Miller
d864991b22 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts were easy to resolve using immediate context mostly,
except the cls_u32.c one where I simply too the entire HEAD
chunk.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 21:38:46 -07:00
David S. Miller
e32cf9a386 Highlights:
* merge net-next, so I can finish the hwsim workqueue removal
  * fix TXQ NULL pointer issue that was reported multiple times
  * minstrel cleanups from Felix
  * simplify lib80211 code by not using skcipher, note that this
    will conflict with the crypto tree (and this new code here
    should be used)
  * use new netlink policy validation in nl80211
  * fix up SAE (part of WPA3) in client-mode
  * FTM responder support in the stack
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlvAgX8ACgkQB8qZga/f
 l8Sg6A/+IhbIRerfje8kVIWNG3fUa5oF/vXYRfNRDeNwHu6OVYer5uj6bonkSSoU
 lpWivhuF1XxdIQx6npr8j7R6bw8sbKMKrpnR7ObFaaCmxjmjgMNY4pkkmzSRv3zi
 LP91woUQ2mFv16U9E+9HROBJXgeZfU96bBBR4mfqKrhnhFz6O8SYa9JvRpylcLTz
 HY+a+ezIztbrCPImBUOkSfUPuoeWSJ8Y4LN2zhqbY/ZUk3n4iFOD44bAN346u3+q
 CzfL+e+PKjuSiwPrjrKMIp9emiAE1zeWPUcmnSAYU/s9zISoZ8v5L/kSBanVnvJI
 DIHJ6LB+aETm1gnj1dSnboXRKpUaW63nbkoxwTy15JrdIMysh9vDaXWok6p85mDq
 FVbFx1MzBcYPCMrACEBhhRz71eZHFtyrGyqUm1EZcdcNeQtg+eJdXvhTiowg5JhH
 Csl+ZloZfqrd0aPjaM3F819aVMKcJORdQhrfTN+GYW81mdZyZO3VWmzd3kI8Pjuc
 WRZmykYTub/MbDFjE5g1LRJ4jZ/bThIl71i8hYEopqfim9URE1RPm9mSMUv9ggVL
 /xNIIMhw3H/+P3g0z2a0Z06hT/dScHsZ9If5Srb699/NUOiES5zl43eXidBzuAbI
 36DHaHhf6k8/e3aaLOS1oD3AjpiW28uxq7baeLfd1B8l9WmmgZM=
 =RqMD
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2018-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Highlights:
 * merge net-next, so I can finish the hwsim workqueue removal
 * fix TXQ NULL pointer issue that was reported multiple times
 * minstrel cleanups from Felix
 * simplify lib80211 code by not using skcipher, note that this
   will conflict with the crypto tree (and this new code here
   should be used)
 * use new netlink policy validation in nl80211
 * fix up SAE (part of WPA3) in client-mode
 * FTM responder support in the stack
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 10:56:56 -07:00
Nikolay Aleksandrov
9163a0fc1f net: bridge: add support for per-port vlan stats
This patch adds an option to have per-port vlan stats instead of the
default global stats. The option can be set only when there are no port
vlans in the bridge since we need to allocate the stats if it is set
when vlans are being added to ports (and respectively free them
when being deleted). Also bump RTNL_MAX_TYPE as the bridge is the
largest user of options. The current stats design allows us to add
these without any changes to the fast-path, it all comes down to
the per-vlan stats pointer which, if this option is enabled, will
be allocated for each port vlan instead of using the global bridge-wide
one.

CC: bridge@lists.linux-foundation.org
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 10:18:58 -07:00
David Ahern
859bd2ef1f net: Evict neighbor entries on carrier down
When a link's carrier goes down it could be a sign of the port changing
networks. If the new network has overlapping addresses with the old one,
then the kernel will continue trying to use neighbor entries established
based on the old network until the entries finally age out - meaning a
potentially long delay with communications not working.

This patch evicts neighbor entries on carrier down with the exception of
those marked permanent. Permanent entries are managed by userspace (either
an admin or a routing daemon such as FRR).

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 09:47:39 -07:00
David Ahern
7c6bb7d2fa net/ipv6: Add knob to skip DELROUTE message on device down
Another difference between IPv4 and IPv6 is the generation of RTM_DELROUTE
notifications when a device is taken down (admin down) or deleted. IPv4
does not generate a message for routes evicted by the down or delete;
IPv6 does. A NOS at scale really needs to avoid these messages and have
IPv4 and IPv6 behave similarly, relying on userspace to handle link
notifications and evict the routes.

At this point existing user behavior needs to be preserved. Since
notifications are a global action (not per app) the only way to preserve
existing behavior and allow the messages to be skipped is to add a new
sysctl (net/ipv6/route/skip_notify_on_dev_down) which can be set to
disable the notificatioons.

IPv6 route code already supports the option to skip the message (it is
used for multipath routes for example). Besides the new sysctl we need
to pass the skip_notify setting through the generic fib6_clean and
fib6_walk functions to fib6_clean_node and to set skip_notify on calls
to __ip_del_rt for the addrconf_ifdown path.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 09:47:02 -07:00
Anilkumar Kolli
f8252e7b5a mac80211: implement ieee80211_tx_rate_update to update rate
Current mac80211 has provision to update tx status through
ieee80211_tx_status() and ieee80211_tx_status_ext(). But
drivers like ath10k updates the tx status from the skb except
txrate, txrate will be updated from a different path, peer stats.

Using ieee80211_tx_status_ext() in two different paths
(one for the stats, one for the tx rate) would duplicate
the stats instead.

To avoid this stats duplication, ieee80211_tx_rate_update()
is implemented.

Signed-off-by: Anilkumar Kolli <akolli@codeaurora.org>
[minor commit message editing, use initializers in code]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-12 13:05:40 +02:00
Ankita Bajaj
0d4e14a32d nl80211: Add per peer statistics to compute FCS error rate
Add support for drivers to report the total number of MPDUs received
and the number of MPDUs received with an FCS error from a specific
peer. These counters will be incremented only when the TA of the
frame matches the MAC address of the peer irrespective of FCS
error.

It should be noted that the TA field in the frame might be corrupted
when there is an FCS error and TA matching logic would fail in such
cases. Hence, FCS error counter might not be fully accurate, but it can
provide help in detecting bad RX links in significant number of cases.
This FCS error counter without full accuracy can be used, e.g., to
trigger a kick-out of a connected client with a bad link in AP mode to
force such a client to roam to another AP.

Signed-off-by: Ankita Bajaj <bankita@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-12 12:56:34 +02:00
Pradeep Kumar Chitrapu
bc847970f4 mac80211: support FTM responder configuration/statistics
New bss param ftm_responder is used to notify the driver to
enable fine timing request (FTM) responder role in AP mode.

Plumb the new cfg80211 API for FTM responder statistics through to
the driver API in mac80211.

Signed-off-by: David Spinadel <david.spinadel@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Pradeep Kumar Chitrapu <pradeepc@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-12 12:46:09 +02:00
Greg Kroah-Hartman
90ad18418c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David writes:
  "Networking

   1) RXRPC receive path fixes from David Howells.

   2) Re-export __skb_recv_udp(), from Jiri Kosina.

   3) Fix refcounting in u32 classificer, from Al Viro.

   4) Userspace netlink ABI fixes from Eugene Syromiatnikov.

   5) Don't double iounmap on rmmod in ena driver, from Arthur
      Kiyanovski.

   6) Fix devlink string attribute handling, we must pull a copy into a
      kernel buffer if the lifetime extends past the netlink request.
      From Moshe Shemesh.

   7) Fix hangs in RDS, from Ka-Cheong Poon.

   8) Fix recursive locking lockdep warnings in tipc, from Ying Xue.

   9) Clear RX irq correctly in socionext, from Ilias Apalodimas.

   10) bcm_sf2 fixes from Florian Fainelli."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
  net: dsa: bcm_sf2: Call setup during switch resume
  net: dsa: bcm_sf2: Fix unbind ordering
  net: phy: sfp: remove sfp_mutex's definition
  r8169: set RX_MULTI_EN bit in RxConfig for 8168F-family chips
  net: socionext: clear rx irq correctly
  net/mlx4_core: Fix warnings during boot on driverinit param set failures
  tipc: eliminate possible recursive locking detected by LOCKDEP
  selftests: udpgso_bench.sh explicitly requires bash
  selftests: rtnetlink.sh explicitly requires bash.
  qmi_wwan: Added support for Gemalto's Cinterion ALASxx WWAN interface
  tipc: queue socket protocol error messages into socket receive buffer
  tipc: set link tolerance correctly in broadcast link
  net: ipv4: don't let PMTU updates increase route MTU
  net: ipv4: update fnhe_pmtu when first hop's MTU changes
  net/ipv6: stop leaking percpu memory in fib6 info
  rds: RDS (tcp) hangs on sendto() to unresponding address
  net: make skb_partial_csum_set() more robust against overflows
  devlink: Add helper function for safely copy string param
  devlink: Fix param cmode driverinit for string type
  devlink: Fix param set handling for string type
  ...
2018-10-12 09:01:59 +02:00
Ying Xue
a1f8dd34e6 tipc: eliminate possible recursive locking detected by LOCKDEP
When booting kernel with LOCKDEP option, below warning info was found:

WARNING: possible recursive locking detected
4.19.0-rc7+ #14 Not tainted
--------------------------------------------
swapper/0/1 is trying to acquire lock:
00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
include/linux/spinlock.h:334 [inline]
00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850

but task is already holding lock:
00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
include/linux/spinlock.h:334 [inline]
00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&list->lock)->rlock#4);
  lock(&(&list->lock)->rlock#4);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by swapper/0/1:
 #0: 00000000f7539d34 (pernet_ops_rwsem){+.+.}, at:
register_pernet_subsys+0x19/0x40 net/core/net_namespace.c:1051
 #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
spin_lock_bh include/linux/spinlock.h:334 [inline]
 #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849

stack backtrace:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7+ #14
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1af/0x295 lib/dump_stack.c:113
 print_deadlock_bug kernel/locking/lockdep.c:1759 [inline]
 check_deadlock kernel/locking/lockdep.c:1803 [inline]
 validate_chain kernel/locking/lockdep.c:2399 [inline]
 __lock_acquire+0xf1e/0x3c60 kernel/locking/lockdep.c:3411
 lock_acquire+0x1db/0x520 kernel/locking/lockdep.c:3900
 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
 _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
 spin_lock_bh include/linux/spinlock.h:334 [inline]
 tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
 tipc_link_bc_create+0xb5/0x1f0 net/tipc/link.c:526
 tipc_bcast_init+0x59b/0xab0 net/tipc/bcast.c:521
 tipc_init_net+0x472/0x610 net/tipc/core.c:82
 ops_init+0xf7/0x520 net/core/net_namespace.c:129
 __register_pernet_operations net/core/net_namespace.c:940 [inline]
 register_pernet_operations+0x453/0xac0 net/core/net_namespace.c:1011
 register_pernet_subsys+0x28/0x40 net/core/net_namespace.c:1052
 tipc_init+0x83/0x104 net/tipc/core.c:140
 do_one_initcall+0x109/0x70a init/main.c:885
 do_initcall_level init/main.c:953 [inline]
 do_initcalls init/main.c:961 [inline]
 do_basic_setup init/main.c:979 [inline]
 kernel_init_freeable+0x4bd/0x57f init/main.c:1144
 kernel_init+0x13/0x180 init/main.c:1063
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413

The reason why the noise above was complained by LOCKDEP is because we
nested to hold l->wakeupq.lock and l->inputq->lock in tipc_link_reset
function. In fact it's unnecessary to move skb buffer from l->wakeupq
queue to l->inputq queue while holding the two locks at the same time.
Instead, we can move skb buffers in l->wakeupq queue to a temporary
list first and then move the buffers of the temporary list to l->inputq
queue, which is also safe for us.

Fixes: 3f32d0be6c16 ("tipc: lock wakeup & inputq at tipc_link_reset()")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:23:48 -07:00
Greg Kroah-Hartman
834d3cd294 Fix open-coded multiplication arguments to allocators
- Fixes several new open-coded multiplications added in the 4.19 merge window.
 -----BEGIN PGP SIGNATURE-----
 Comment: Kees Cook <kees@outflux.net>
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlu/fokWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJsB/EACgKV77Sad5Luyr3rCmUtGcQ7az
 yLIrqvGcxC55ZEoZwHmSjxiN+5X2kDF6SEFrebvDKFSbiRoC0a1IWRC4pWTpBhTs
 +i1qHVTlOrwBZFTwOn2uklvgkkUfjatG/6zWc7l/Ye070Hekk0SnbMozlggCOJRm
 yKglXaBx9MKmj/T60Vpfve4ubBLM0zSuRPlsBON2qUUp2YTHbEqHOoYawfSK4RuF
 y2hzZc5A0/F7TionkHjrkdEJ8jRkwii2x4iM9KSdhNRxBT0lZkk3xpD6PjRaXCzt
 N2BMU17kftI5498QyKHXdTYCuVPqTpm+Z3d/q+YTbjdpXre1xcZU06ZT9Bqa+LwB
 pRaN4eqd7nLFKvCQYnUp0GuDj5pxd3Xz2dpC0IkaliEM8xYad1+NZRq7SkRJYOpM
 /y05GRdln9ULJF/pet5IS6LtXY+FSn4z+9e+ztVIPQ/kJUqvmyKfWPpdp6TPtwjC
 vb9cbKD7LRPoBfrY0efPXe4aixCwmc4Ob4kljCZtkyrpV+iImYQn9XqTblU7sbHa
 Om8FxGxdX7Xu9HUoT7uHeb8ZNg1g0/XWAEhs7pY22fzHT14T+0fYRz8njmlrw3ed
 dRdzydOxkJMcCVKLitoiw2X1yNRRHtGbXq/UhrHMNbEkOzf73/3fYZK68849FaEK
 1oFOX/N/OI5kp7pNAQ==
 =NS8/
 -----END PGP SIGNATURE-----

Merge tag 'alloc-args-v4.19-rc8' of https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Kees writes:
  "Fix open-coded multiplication arguments to allocators

   - Fixes several new open-coded multiplications added in the 4.19
     merge window."

* tag 'alloc-args-v4.19-rc8' of https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  treewide: Replace more open-coded allocation size multiplications
2018-10-11 19:10:30 +02:00
Jouni Malinen
efb543e61c mac80211: Extend SAE authentication in infra BSS STA mode
Previous implementation of SAE authentication in infrastructure BSS was
somewhat restricting and not exactly clean way of handling the two
auth() operations. This ended up removing and re-adding the STA entry
for the AP in the middle of authentication and also messing up
authentication state tracking through the sequence of four
Authentication frames. Furthermore, this did not work if the AP ended up
sending out SAE Confirm (auth trans #2) immediately after SAE Commit
(auth trans #1) before the station had time to transmit its SAE Confirm.

Clean up authentication state handling for the SAE case to allow two
rounds of auth() calls without dropping all state between those
operations. Track peer Confirmed status and mark authentication
completed only once both ends have confirmed.

ieee80211_mgd_auth() check for EBUSY cases is now handling only the
pending association (ifmgd->assoc_data) while all pending authentication
(ifmgd->auth_data) cases are allowed to proceed to allow user space to
start a new connection attempt from scratch even if the previously
requested authentication is still waiting completion. This is needed to
avoid making SAE error cases with retries take excessive amount of time
with no means for the user space to stop that (apart from setting the
netdev down).

As an extra bonus, the end of ieee80211_rx_mgmt_auth() can be cleaned up
to avoid the extra copy of the cfg80211_rx_mlme_mgmt() call for ongoing
SAE authentication since the new ieee80211_mark_sta_auth() helper
function can handle both completion of authentication and updates to the
STA entry under the same condition and there is no need to return from
the function between those operations.

Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:01:08 +02:00
Jouni Malinen
8d7432a2f5 mac80211: Move ieee80211_mgd_auth() EBUSY check to be before allocation
This makes it easier to conditionally replace full allocation of
auth_data to use reallocation for the case of continuing SAE
authentication. Furthermore, there was not really any point in having
this check done so late in the function after having already completed
number of steps that cannot be used anyway in the error case.

Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:01:07 +02:00
Jouni Malinen
fc107a9330 mac80211: Helper function for marking STA authenticated
Authentication exchange can be completed in both TX and RX paths for
SAE, so move this common functionality into a helper function to avoid
having to implement practically the same operations in two places when
extending SAE implementation in the following commits.

Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:01:05 +02:00
Felix Fietkau
506dbf90c1 mac80211: rc80211_minstrel: remove variance / stddev calculation
When there are few packets (e.g. for sampling attempts), the exponentially
weighted variance is usually vastly overestimated, making the resulting data
essentially useless. As far as I know, there has not been any practical use
for this, so let's not waste any cycles on it.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:01:05 +02:00
Felix Fietkau
f4ec7cb0f9 mac80211: minstrel: do not sample rates 3 times slower than max_prob_rate
These rates are highly unlikely to be used quickly, even if the link
deteriorates rapidly. This improves throughput in cases where CCK rates
are not reliable enough to be skipped entirely during sampling.
Sampling these rates regularly can cost a lot of airtime.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:01:04 +02:00
Felix Fietkau
972b66b86f mac80211: minstrel: fix sampling/reporting of CCK rates in HT mode
Long/short preamble selection cannot be sampled separately, since it
depends on the BSS state. Because of that, sampling attempts to
currently not used preamble modes are not counted in the statistics,
which leads to CCK rates being sampled too often.

Fix statistics accounting for long/short preamble by increasing the
index where necessary.
Fix excessive CCK rate sampling by dropping unsupported sample attempts.

This improves throughput on 2.4 GHz channels

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:01:03 +02:00
Felix Fietkau
80df9be67c mac80211: minstrel: fix CCK rate group streams value
Fixes a harmless underflow issue when CCK rates are actively being used

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:01:03 +02:00
Felix Fietkau
37439f2d6e mac80211: minstrel: fix using short preamble CCK rates on HT clients
mi->supported[MINSTREL_CCK_GROUP] needs to be updated
short preamble rates need to be marked as supported regardless of
whether it's currently enabled. Its state can change at any time without
a rate_update call.

Fixes: 782dda00ab8e ("mac80211: minstrel_ht: move short preamble check out of get_rate")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:01:02 +02:00
Felix Fietkau
202df504d7 mac80211: minstrel: reduce minstrel_mcs_groups size
By storing a shift value for all duration values of a group, we can
reduce precision by a neglegible amount to make it fit into a u16 value.
This improves cache footprint and reduces size:

Before:
   text    data     bss     dec     hex filename
  10024     116       0   10140    279c rc80211_minstrel_ht.o

After:
   text    data     bss     dec     hex filename
   9368     116       0    9484    250c rc80211_minstrel_ht.o

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:01:02 +02:00
Felix Fietkau
b1c4f68337 mac80211: minstrel: merge with minstrel_ht, always enable VHT support
Legacy-only devices are not very common and the overhead of the extra
code for HT and VHT rates is not big enough to justify all those extra
lines of code to make it optional.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:01:01 +02:00
Felix Fietkau
5b5e87314e mac80211: minstrel: remove unnecessary debugfs cleanup code
debugfs entries are cleaned up by debugfs_remove_recursive already.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:01:00 +02:00
Chaitanya T K
f458e832ba mac80211: minstrel: Enable STBC and LDPC for VHT Rates
If peer support reception of STBC and LDPC, enable them for better
performance.

Signed-off-by: Chaitanya TK <chaitanya.mgit@gmail.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:01:00 +02:00
Johannes Berg
42dca5ef24 mac80211: avoid reflecting frames back to the client
I'm not really sure exactly _why_ I've been carrying a note
for what's probably _years_ to check that we don't do this,
but we clearly do reflect frames back to the station itself
if it sends such.

One way or the other, it's useless since the station doesn't
really need the AP to talk to itself, so suppress it.

While at it, clarify some of the logic by removing skb->data
references in favour of the destination address (pointer) we
already have separately.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:00:59 +02:00
Johannes Berg
3d7af87835 nl80211: use netlink policy validation function for elements
Instead of open-coding a lot of calls to is_valid_ie_attr(),
add this validation directly to the policy, now that we can.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:00:59 +02:00
Johannes Berg
ab0d76f682 nl80211: use policy range validation where applicable
Many range checks can be done in the policy, move them
there. A few in mesh are added in the code (taken out of
the macros) because they don't fit into the s16 range in
the policy validation.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-10-11 16:00:58 +02:00
Florian Westphal
9dffff200f xfrm: policy: use hlist rcu variants on insert
bydst table/list lookups use rcu, so insertions must use rcu versions.

Fixes: a7c44247f704e ("xfrm: policy: make xfrm_policy_lookup_bytype lockless")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-10-11 13:24:46 +02:00
Alexei Starovoitov
9f7e43da6a net/xfrm: fix out-of-bounds packet access
BUG: KASAN: slab-out-of-bounds in _decode_session6+0x1331/0x14e0
net/ipv6/xfrm6_policy.c:161
Read of size 1 at addr ffff8801d882eec7 by task syz-executor1/6667
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
  print_address_description+0x6c/0x20b mm/kasan/report.c:256
  kasan_report_error mm/kasan/report.c:354 [inline]
  kasan_report.cold.7+0x242/0x30d mm/kasan/report.c:412
  __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:430
  _decode_session6+0x1331/0x14e0 net/ipv6/xfrm6_policy.c:161
  __xfrm_decode_session+0x71/0x140 net/xfrm/xfrm_policy.c:2299
  xfrm_decode_session include/net/xfrm.h:1232 [inline]
  vti6_tnl_xmit+0x3c3/0x1bc1 net/ipv6/ip6_vti.c:542
  __netdev_start_xmit include/linux/netdevice.h:4313 [inline]
  netdev_start_xmit include/linux/netdevice.h:4322 [inline]
  xmit_one net/core/dev.c:3217 [inline]
  dev_hard_start_xmit+0x272/0xc10 net/core/dev.c:3233
  __dev_queue_xmit+0x2ab2/0x3870 net/core/dev.c:3803
  dev_queue_xmit+0x17/0x20 net/core/dev.c:3836

Reported-by: syzbot+acffccec848dc13fe459@syzkaller.appspotmail.com
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-10-11 13:24:46 +02:00
Pablo Neira Ayuso
d701d81172 netfilter: nft_compat: do not dump private area
Zero pad private area, otherwise we expose private kernel pointer to
userspace. This patch also zeroes the tail area after the ->matchsize
and ->targetsize that results from XT_ALIGN().

Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-11 11:29:53 +02:00
Taehee Yoo
18c0ab8736 netfilter: xt_TEE: add missing code to get interface index in checkentry.
checkentry(tee_tg_check) should initialize priv->oif from dev if possible.
But only netdevice notifier handler can set that.
Hence priv->oif is always -1 until notifier handler is called.

Fixes: 9e2f6c5d78db ("netfilter: Rework xt_TEE netdevice notifier")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-11 11:29:14 +02:00
Taehee Yoo
f24d2d4f95 netfilter: xt_TEE: fix wrong interface selection
TEE netdevice notifier handler checks only interface name. however
each netns can have same interface name. hence other netns's interface
could be selected.

test commands:
   %ip netns add vm1
   %iptables -I INPUT -p icmp -j TEE --gateway 192.168.1.1 --oif enp2s0
   %ip link set enp2s0 netns vm1

Above rule is in the root netns. but that rule could get enp2s0
ifindex of vm1 by notifier handler.

After this patch, TEE rule is added to the per-netns list.

Fixes: 9e2f6c5d78db ("netfilter: Rework xt_TEE netdevice notifier")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-11 11:29:14 +02:00
Fernando Fernandez Mancera
4a3e71b7b7 netfilter: nft_osf: usage from output path is not valid
The nft_osf extension, like xt_osf, is not supported from the output
path.

Fixes: b96af92d6eaf ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf")
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-11 11:29:14 +02:00
Pablo Neira Ayuso
3b18d5eba4 netfilter: nft_set_rbtree: allow loose matching of closing element in interval
Allow to find closest matching for the right side of an interval (end
flag set on) so we allow lookups in inner ranges, eg. 10-20 in 5-25.

Fixes: ba0e4d9917b4 ("netfilter: nf_tables: get set elements via netlink")
Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-11 11:29:14 +02:00
Björn Töpel
cee271678d xsk: do not call synchronize_net() under RCU read lock
The XSKMAP update and delete functions called synchronize_net(), which
can sleep. It is not allowed to sleep during an RCU read section.

Instead we need to make sure that the sock sk_destruct (xsk_destruct)
function is asynchronously called after an RCU grace period. Setting
the SOCK_RCU_FREE flag for XDP sockets takes care of this.

Fixes: fbfc504a24f5 ("bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-11 10:19:01 +02:00
Parthasarathy Bhuvaragan
e7eb058238 tipc: queue socket protocol error messages into socket receive buffer
In tipc_sk_filter_rcv(), when we detect protocol messages with error we
call tipc_sk_conn_proto_rcv() and let it reset the connection and notify
the socket by calling sk->sk_state_change().

However, tipc_sk_filter_rcv() may have been called from the function
tipc_backlog_rcv(), in which case the socket lock is held and the socket
already awake. This means that the sk_state_change() call is ignored and
the error notification lost. Now the receive queue will remain empty and
the socket sleeps forever.

In this commit, we convert the protocol message into a connection abort
message and enqueue it into the socket's receive queue. By this addition
to the above state change we cover all conditions.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:56:07 -07:00
Jon Maloy
047491ea33 tipc: set link tolerance correctly in broadcast link
In the patch referred to below we added link tolerance as an additional
criteria for declaring broadcast transmission "stale" and resetting the
affected links.

However, the 'tolerance' field of the broadcast link is never set, and
remains at zero. This renders the whole commit without the intended
improving effect, but luckily also with no negative effect.

In this commit we add the missing initialization.

Fixes: a4dc70d46cf1 ("tipc: extend link reset criteria for stale packet retransmission")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:56:07 -07:00
Eric Dumazet
f98ebd47fd net: sched: avoid writing on noop_qdisc
While noop_qdisc.gso_skb and noop_qdisc.skb_bad_txq are not used
in other places, it seems not correct to overwrite their fields
in dev_init_scheduler_queue().

noop_qdisc is essentially a shared and read-only object, even if
it is not marked as const because of some implementation detail.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:49:16 -07:00
David Ahern
d8a66aa254 net/mpls: Implement handler for strict data checking on dumps
Without CONFIG_INET enabled compiles fail with:

net/mpls/af_mpls.o: In function `mpls_dump_routes':
af_mpls.c:(.text+0xed0): undefined reference to `ip_valid_fib_dump_req'

The preference is for MPLS to use the same handler as ipv4 and ipv6
to allow consistency when doing a dump for AF_UNSPEC which walks
all address families invoking the route dump handler. If INET is
disabled then fallback to an MPLS version which can be tighter on
the data checks.

Fixes: e8ba330ac0c5 ("rtnetlink: Update fib dumps for strict data checking")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:46:17 -07:00
Sabrina Dubroca
28d35bcdd3 net: ipv4: don't let PMTU updates increase route MTU
When an MTU update with PMTU smaller than net.ipv4.route.min_pmtu is
received, we must clamp its value. However, we can receive a PMTU
exception with PMTU < old_mtu < ip_rt_min_pmtu, which would lead to an
increase in PMTU.

To fix this, take the smallest of the old MTU and ip_rt_min_pmtu.

Before this patch, in case of an update, the exception's MTU would
always change. Now, an exception can have only its lock flag updated,
but not the MTU, so we need to add a check on locking to the following
"is this exception getting updated, or close to expiring?" test.

Fixes: d52e5a7e7ca4 ("ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:44:46 -07:00
Sabrina Dubroca
af7d6cce53 net: ipv4: update fnhe_pmtu when first hop's MTU changes
Since commit 5aad1de5ea2c ("ipv4: use separate genid for next hop
exceptions"), exceptions get deprecated separately from cached
routes. In particular, administrative changes don't clear PMTU anymore.

As Stefano described in commit e9fa1495d738 ("ipv6: Reflect MTU changes
on PMTU of exceptions for MTU-less routes"), the PMTU discovered before
the local MTU change can become stale:
 - if the local MTU is now lower than the PMTU, that PMTU is now
   incorrect
 - if the local MTU was the lowest value in the path, and is increased,
   we might discover a higher PMTU

Similarly to what commit e9fa1495d738 did for IPv6, update PMTU in those
cases.

If the exception was locked, the discovered PMTU was smaller than the
minimal accepted PMTU. In that case, if the new local MTU is smaller
than the current PMTU, let PMTU discovery figure out if locking of the
exception is still needed.

To do this, we need to know the old link MTU in the NETDEV_CHANGEMTU
notifier. By the time the notifier is called, dev->mtu has been
changed. This patch adds the old MTU as additional information in the
notifier structure, and a new call_netdevice_notifiers_u32() function.

Fixes: 5aad1de5ea2c ("ipv4: use separate genid for next hop exceptions")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:44:46 -07:00
Mike Rapoport
7abab7b9b4 net/ipv6: stop leaking percpu memory in fib6 info
The fib6_info_alloc() function allocates percpu memory to hold per CPU
pointers to rt6_info, but this memory is never freed. Fix it.

Fixes: a64efe142f5e ("net/ipv6: introduce fib6_info struct and helpers")
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:42:54 -07:00
David S. Miller
49b538e79b RxRPC fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAW7vWy/u3V2unywtrAQJkeA//UTmUmv9bL8k38XvnnHQ82iTyIjvZ40md
 9XbNagwjkxuDv2eqW1rTxk5+Po596Nb4s/cBvpyAABjCA+zoT0/fklK5PqsSKFla
 ibAHlrBiv/1FlSUSTf4dvAfVVszRAItCP4OhKTU+0fkX3Hq75T7Lty1h92Yk/ijx
 ccqSV9K/nB4WN9bbg+sDngu3RAVR3VJ5RhVSmrnE9xd3M9ihz2RHacJ6GWNQf5PV
 j8cKuP3qY/2K245hb4sGn1i9OO6/WVVQhnojYA+jR8Vg4+8mY0Y++MLsW8ywlsuo
 xlrDCVPBaR6YqjnlDoOXIXYZ8EbEsmy5FP51zfEZJqeKJcVPLavQyoku0xJ8fHgr
 oK/3DBUkwgi0hiq4dC1CrmXZRSbBQqGTEJMA6AKh2ApXb9BkmhmqRGMx+2nqocwF
 TjAQmCu0FJTAYdwJuqzv5SRFjocjrHGcYDpQ27CgNqSiiDlh8CZ1ej/mbV+o8pUQ
 tOrpnRCLY8Nl55AgJbl2260mKcpcKQcRUwggKAOzqRjXwcy9q3KKm8H149qejJSl
 AQihDkPVqBZrv5ODKR0z88e5cVF9ad+vBPd+2TrRVGtTWhm/q8N/0Hh0M2sYDSc6
 vLklYfwI4U5U1/naJC9WCkW+ybj5ekVNHgW91ICvQVNG0TH2zfE7sUGMQn6lRZL8
 jfpLTVbtKfI=
 =q6E6
 -----END PGP SIGNATURE-----

Merge tag 'rxrpc-fixes-20181008' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Fix packet reception code

Here are a set of patches that prepares for and fix problems in rxrpc's
package reception code.  There serious problems are:

 (A) There's a window between binding the socket and setting the data_ready
     hook in which packets can find their way into the UDP socket's receive
     queues.

 (B) The skb_recv_udp() will return an error (and clear the error state) if
     there was an error on the Tx side.  rxrpc doesn't handle this.

 (C) The rxrpc data_ready handler doesn't fully drain the UDP receive
     queue.

 (D) The rxrpc data_ready handler assumes it is called in a non-reentrant
 state.

The second patch fixes (A) - (C); the third patch renders (B) and (C)
non-issues by using the recap_rcv hook instead of data_ready - and the
final patch fixes (D).  That last is the most complex.

The preparatory patches are:

 (1) Fix some places that are doing things in the wrong net namespace.

 (2) Stop taking the rcu read lock as it's held by the IP input routine in
     the call chain.

 (3) Only end the Tx phase if *we* rotated the final packet out of the Tx
     buffer.

 (4) Don't assume that the call state won't change after dropping the
     call_state lock.

 (5) Only take receive window and MTU suze parameters from an ACK packet if
     it's the latest ACK packet.

 (6) Record connection-level abort information correctly.

 (7) Fix a trace line.

And then there are three main patches - note that these are mixed in with
the preparatory patches somewhat:

 (1) Fix the setup window (A), skb_recv_udp() error check (B) and packet
     drainage (C).

 (2) Switch to using the encap_rcv instead of data_ready to cut out the
     effects of the UDP read queues and get the packets delivered directly.

 (3) Add more locking into the various packet input paths to defend against
     re-entrance (D).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:27:38 -07:00
Yuchung Cheng
ffd177dea5 tcp: refactor DCTCP ECN ACK handling
DCTCP has two parts - a new ECN signalling mechanism and the response
function to it. The first part can be used by other congestion
control for DCTCP-ECN deployed networks. This patch moves that part
into a separate tcp_dctcp.h to be used by other congestion control
module (like how Yeah uses Vegas algorithmas). For example, BBR is
experimenting such ECN signal currently
https://tinyurl.com/ietf-102-iccrg-bbr2

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:26:00 -07:00
David Ahern
ed792e28c4 net/ipv6: Make ipv6_route_table_template static
ipv6_route_table_template is exported but there are no users outside
of route.c. Make it static.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:25:10 -07:00
David Ahern
e75fa0735c rtnetlink: Update comment in rtnl_stats_dump regarding strict data checking
The NLM_F_DUMP_PROPER_HDR netlink flag was replaced by a setsockopt.
Update the comment in rtnl_stats_dump.

Fixes: 841891ec0c65 ("rtnetlink: Update rtnl_stats_dump for strict data checking")
Reported-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:24:51 -07:00
David Ahern
4565d7e5a3 rtnetlink: Move ifm in valid_fdb_dump_legacy to closer to use
Move setting of local variable ifm to after the message parsing in
valid_fdb_dump_legacy. Avoid potential future use of unchecked variable.

Fixes: 8dfbda19a21b ("rtnetlink: Move input checking for rtnl_fdb_dump to helper")
Reported-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:24:33 -07:00