Commit Graph

57583 Commits

Author SHA1 Message Date
Willem de Bruijn
c6af0c227a ip: support SO_MARK cmsg
Enable setting skb->mark for UDP and RAW sockets using cmsg.

This is analogous to existing support for TOS, TTL, txtime, etc.

Packet sockets already support this as of commit c7d39e3263
("packet: support per-packet fwmark for af_packet sendmsg").

Similar to other fields, implement by
1. initialize the sockcm_cookie.mark from socket option sk_mark
2. optionally overwrite this in ip_cmsg_send/ip6_datagram_send_ctl
3. initialize inet_cork.mark from sockcm_cookie.mark
4. initialize each (usually just one) skb->mark from inet_cork.mark

Step 1 is handled in one location for most protocols by ipcm_init_sk
as of commit 351782067b ("ipv4: ipcm_cookie initializers").

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-13 21:44:19 +02:00
David S. Miller
022c10d6c7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Fix error path of nf_tables_updobj(), from Dan Carpenter.

2) Move large structure away from stack in the nf_tables offload
   infrastructure, from Arnd Bergmann.

3) Move indirect flow_block logic to nf_tables_offload.

4) Support for synproxy objects, from Fernando Fernandez Mancera.

5) Support for fwd and dup offload.

6) Add __nft_offload_get_chain() helper, this implicitly fixes missing
   mutex and check for offload flags in the indirect block support,
   patch from wenxu.

7) Remove rules on device unregistration, from wenxu. This includes
   two preparation patches to reuse nft_flow_offload_chain() and
   nft_flow_offload_rule().

Large batch from Jeremy Sowden to make a second pass to the
CONFIG_HEADER_TEST support and a bit of housekeeping:

8) Missing include guard in conntrack label header, from Jeremy Sowden.

9) A few coding style errors: trailing whitespace, incorrect indent in
   Kconfig, and semicolons at the end of function definitions.

10) Remove unused ipt_init() and ip6t_init() declarations.

11) Inline xt_hashlimit, ebt_802_3 and xt_physdev headers. They are
    only used once.

12) Update include directive in several netfilter files.

13) Remove unused include/net/netfilter/ipv6/nf_conntrack_icmpv6.h.

14) Move nf_ip6_ext_hdr() to include/linux/netfilter_ipv6.h

15) Move several synproxy structure definitions to nf_synproxy.h

16) Move nf_bridge_frag_data structure to include/linux/netfilter_bridge.h

17) Clean up static inline definitions in nf_conntrack_ecache.h.

18) Replace defined(CONFIG...) || defined(CONFIG...MODULE) with IS_ENABLED(CONFIG...).

19) Missing inline function conditional definitions based on Kconfig
    preferences in synproxy and nf_conntrack_timeout.

20) Update br_nf_pre_routing_ipv6() definition.

21) Move conntrack code in linux/skbuff.h to nf_conntrack headers.

22) Several patches to remove superfluous CONFIG_NETFILTER and
    CONFIG_NF_CONNTRACK checks in headers, coming from the initial batch
    support for CONFIG_HEADER_TEST for netfilter.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-13 14:22:43 +01:00
Jeremy Sowden
261db6c2fb netfilter: conntrack: move code to linux/nf_conntrack_common.h.
Move some `struct nf_conntrack` code from linux/skbuff.h to
linux/nf_conntrack_common.h.  Together with a couple of helpers for
getting and setting skb->_nfct, it allows us to remove
CONFIG_NF_CONNTRACK checks from net/netfilter/nf_conntrack.h.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13 12:47:11 +02:00
Jeremy Sowden
46705b070c netfilter: move nf_bridge_frag_data struct definition to a more appropriate header.
There is a struct definition function in nf_conntrack_bridge.h which is
not specific to conntrack and is used elswhere in netfilter.  Move it
into netfilter_bridge.h.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13 12:35:33 +02:00
Jeremy Sowden
44dde23698 netfilter: move inline nf_ip6_ext_hdr() function to a more appropriate header.
There is an inline function in ip6_tables.h which is not specific to
ip6tables and is used elswhere in netfilter.  Move it into
netfilter_ipv6.h and update the callers.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13 12:34:09 +02:00
Jeremy Sowden
8bf3cbe32b netfilter: remove nf_conntrack_icmpv6.h header.
nf_conntrack_icmpv6.h contains two object macros which duplicate macros
in linux/icmpv6.h.  The latter definitions are also visible wherever it
is included, so remove it.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13 12:33:06 +02:00
Jeremy Sowden
40d102cde0 netfilter: update include directives.
Include some headers in files which require them, and remove others
which are not required.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13 12:33:06 +02:00
Jeremy Sowden
85cfbc25e5 netfilter: inline xt_hashlimit, ebt_802_3 and xt_physdev headers
Three netfilter headers are only included once.  Inline their contents
at those sites and remove them.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13 12:32:48 +02:00
Jeremy Sowden
b0edba2af7 netfilter: fix coding-style errors.
Several header-files, Kconfig files and Makefiles have trailing
white-space.  Remove it.

In netfilter/Kconfig, indent the type of CONFIG_NETFILTER_NETLINK_ACCT
correctly.

There are semicolons at the end of two function definitions in
include/net/netfilter/nf_conntrack_acct.h and
include/net/netfilter/nf_conntrack_ecache.h. Remove them.

Fix indentation in nf_conntrack_l4proto.h.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13 11:39:38 +02:00
wenxu
06d392cbe3 netfilter: nf_tables_offload: remove rules when the device unregisters
If the net_device unregisters, clean up the offload rules before the
chain is destroy.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13 10:58:10 +02:00
wenxu
e211aab73d netfilter: nf_tables_offload: refactor the nft_flow_offload_rule function
Pass rule, chain and flow_rule object parameters to nft_flow_offload_rule
to reuse it.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13 10:12:05 +02:00
wenxu
8fc618c52d netfilter: nf_tables_offload: refactor the nft_flow_offload_chain function
Pass chain and policy parameters to nft_flow_offload_chain to reuse it.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13 10:11:57 +02:00
wenxu
504882db83 netfilter: nf_tables_offload: add __nft_offload_get_chain function
Add __nft_offload_get_chain function to get basechain from device. This
function requires that caller holds the per-netns nftables mutex. This
patch implicitly fixes missing offload flags check and proper mutex from
nft_indr_block_cb().

Fixes: 9a32669fec ("netfilter: nf_tables_offload: support indr block call")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13 01:09:15 +02:00
George McCollister
f4073e9164 net: dsa: microchip: remove NET_DSA_TAG_KSZ_COMMON
Remove the superfluous NET_DSA_TAG_KSZ_COMMON and just use the existing
NET_DSA_TAG_KSZ. Update the description to mention the three switch
families it supports. No functional change.

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-12 11:36:12 +01:00
Eric Dumazet
051ba67447 tcp: force a PSH flag on TSO packets
When tcp sends a TSO packet, adding a PSH flag on it
reduces the sojourn time of GRO packet in GRO receivers.

This is particularly the case under pressure, since RX queues
receive packets for many concurrent flows.

A sender can give a hint to GRO engines when it is
appropriate to flush a super-packet, especially when pacing
is in the picture, since next packet is probably delayed by
one ms.

Having less packets in GRO engine reduces chance
of LRU eviction or inflated RTT, and reduces GRO cost.

We found recently that we must not set the PSH flag on
individual full-size MSS segments [1] :

 Under pressure (CWR state), we better let the packet sit
 for a small delay (depending on NAPI logic) so that the
 ACK packet is delayed, and thus next packet we send is
 also delayed a bit. Eventually the bottleneck queue can
 be drained. DCTCP flows with CWND=1 have demonstrated
 the issue.

This patch allows to slowdown the aggregate traffic without
involving high resolution timers on senders and/or
receivers.

It has been used at Google for about four years,
and has been discussed at various networking conferences.

[1] segments smaller than MSS already have PSH flag set
    by tcp_sendmsg() / tcp_mark_push(), unless MSG_MORE
    has been requested by the user.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11 23:59:01 +01:00
Stefano Brivio
cbfd68913c ipv6: Don't use dst gateway directly in ip6_confirm_neigh()
This is the equivalent of commit 2c6b55f45d ("ipv6: fix neighbour
resolution with raw socket") for ip6_confirm_neigh(): we can send a
packet with MSG_CONFIRM on a raw socket for a connected route, so the
gateway would be :: here, and we should pick the next hop using
rt6_nexthop() instead.

This was found by code review and, to the best of my knowledge, doesn't
actually fix a practical issue: the destination address from the packet
is not considered while confirming a neighbour, as ip6_confirm_neigh()
calls choose_neigh_daddr() without passing the packet, so there are no
similar issues as the one fixed by said commit.

A possible source of issues with the existing implementation might come
from the fact that, if we have a cached dst, we won't consider it,
while rt6_nexthop() takes care of that. I might just not be creative
enough to find a practical problem here: the only way to affect this
with cached routes is to have one coming from an ICMPv6 redirect, but
if the next hop is a directly connected host, there should be no
topology for which a redirect applies here, and tests with redirected
routes show no differences for MSG_CONFIRM (and MSG_PROBE) packets on
raw sockets destined to a directly connected host.

However, directly using the dst gateway here is not consistent anymore
with neighbour resolution, and, in general, as we want the next hop,
using rt6_nexthop() looks like the only sane way to fetch it.

Reported-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11 23:52:17 +01:00
David S. Miller
c1b3ddf7c3 We have a number of changes, but things are settling down:
* a fix in the new 6 GHz channel support
  * a fix for recent minstrel (rate control) updates
    for an infinite loop
  * handle interface type changes better wrt. management frame
    registrations (for management frames sent to userspace)
  * add in-BSS RX time to survey information
  * handle HW rfkill properly if !CONFIG_RFKILL
  * send deauth on IBSS station expiry, to avoid state mismatches
  * handle deferred crypto tailroom updates in mac80211 better
    when device restart happens
  * fix a spectre-v1 - really a continuation of a previous patch
  * advertise NL80211_CMD_UPDATE_FT_IES as supported if so
  * add some missing parsing in VHT extended NSS support
  * support HE in mac80211_hwsim
  * let mac80211 drivers determine the max MTU themselves
 along with the usual cleanups etc.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl148l4ACgkQB8qZga/f
 l8TafQ//Yjunnxq49ClKMDxyKwxIBvLRAMr4D3QwaWAcWUpar1/V0Ft/5glnHtbZ
 7QptwbxVNUk/N68hYi8wlpMpvfzv/nShZD8QBNS1bk5E3Gng3yow03LhOx5iaYb2
 KJXS1GE4jwkMD7Xn65+eMeb8rt1vEj8LleX91cguilq+y5YbcNFsP1nil2RtQyBU
 cf5i8CBu4I5rTBoFaRvcz2xn+blqPSm2/piA+yXjzFp9vyVmhD+FjR5T482u48pj
 wi/1zersGVUzBNElnZOKg67XPir1fcJqCfLILr7okPItWuYXHdVHGfn+arhimK4W
 dyIpv1EfCe0lwKl4VTdhXt1GwhKvWCc2Ja7lz/RnDisGq9CYPJNfW7jFgR3tw4eg
 SccnUhnxRIgD1V2KDcvsRadPo+2YsBJBmC61JRX1K6L3zpHLbktDyzxvTwCeVQ9A
 TQp5bmQcKqDoq+/60JNHI6IxMbmX/vc2PC7dENGWUkem/JEmSWBB1wcL9gsRkhVi
 c4uHyvFeXJaIV7YFA36hWCfa0fr+UUYxfzxudRSvxq/tTpayqBKu6fr3Pvv0SqTj
 /BKkezoIdJntClhJv4PcDkZMva4uMCPtHCero9eICX5J+4AanD6caRNefX6L0PfF
 DtN1sscVOy9zY2fV4tqvn3IASmOE5yB/dxjtKkMYyNHkZsGiDI0=
 =HAuT
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2019-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
We have a number of changes, but things are settling down:
 * a fix in the new 6 GHz channel support
 * a fix for recent minstrel (rate control) updates
   for an infinite loop
 * handle interface type changes better wrt. management frame
   registrations (for management frames sent to userspace)
 * add in-BSS RX time to survey information
 * handle HW rfkill properly if !CONFIG_RFKILL
 * send deauth on IBSS station expiry, to avoid state mismatches
 * handle deferred crypto tailroom updates in mac80211 better
   when device restart happens
 * fix a spectre-v1 - really a continuation of a previous patch
 * advertise NL80211_CMD_UPDATE_FT_IES as supported if so
 * add some missing parsing in VHT extended NSS support
 * support HE in mac80211_hwsim
 * let mac80211 drivers determine the max MTU themselves
along with the usual cleanups etc.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-11 14:57:17 +01:00
Denis Kenzior
c1d3ad84ea cfg80211: Purge frame registrations on iftype change
Currently frame registrations are not purged, even when changing the
interface type.  This can lead to potentially weird situations where
frames possibly not allowed on a given interface type remain registered
due to the type switching happening after registration.

The kernel currently relies on userspace apps to actually purge the
registrations themselves, this is not something that the kernel should
rely on.

Add a call to cfg80211_mlme_purge_registrations() to forcefully remove
any registrations left over prior to switching the iftype.

Cc: stable@vger.kernel.org
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Link: https://lore.kernel.org/r/20190828211110.15005-1-denkenz@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11 10:45:10 +02:00
Masashi Honma
4b2c5a14cd nl80211: Fix possible Spectre-v1 for CQM RSSI thresholds
commit 1222a16014 ("nl80211: Fix possible Spectre-v1 for CQM
RSSI thresholds") was incomplete and requires one more fix to
prevent accessing to rssi_thresholds[n] because user can control
rssi_thresholds[i] values to make i reach to n. For example,
rssi_thresholds = {-400, -300, -200, -100} when last is -34.

Cc: stable@vger.kernel.org
Fixes: 1222a16014 ("nl80211: Fix possible Spectre-v1 for CQM RSSI thresholds")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
Link: https://lore.kernel.org/r/20190908005653.17433-1-masashi.honma@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11 09:33:29 +02:00
Wen Gong
06354665f9 mac80211: allow drivers to set max MTU
Make it possibly for drivers to adjust the default max_mtu
by storing it in the hardware struct and using that value
for all interfaces.

Signed-off-by: Wen Gong <wgong@codeaurora.org>
Link: https://lore.kernel.org/r/1567738137-31748-1-git-send-email-wgong@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11 09:33:29 +02:00
zhong jiang
24f6d765c8 cfg80211: Do not compare with boolean in nl80211_common_reg_change_event
With the help of boolinit.cocci, we use !nl80211_reg_change_event_fill
instead of (nl80211_reg_change_event_fill == false). Meanwhile, Clean
up the code.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Link: https://lore.kernel.org/r/1567657537-65472-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11 09:33:29 +02:00
Johannes Berg
4b08d1b6a9 mac80211: IBSS: send deauth when expiring inactive STAs
When we expire an inactive station, try to send it a deauth. This
helps if it's actually still around, and just has issues with
beacon distribution (or we do), and it will not also remove us.
Then, if we have shared state, this may not be reset properly,
causing problems; for example, we saw a case where aggregation
sessions weren't removed properly (due to the TX start being
offloaded to firmware and it relying on deauth for stop), causing
a lot of traffic to get lost due to the SN reset after remove/add
of the peer.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/20190830112451.21655-9-luca@coelho.fi
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11 09:33:29 +02:00
Luca Coelho
753a9a729f mac80211: don't check if key is NULL in ieee80211_key_link()
We already assume that key is not NULL and dereference it in a few
other places before we check whether it is NULL, so the check is
unnecessary.  Remove it.

Fixes: 96fc6efb9a ("mac80211: IEEE 802.11 Extended Key ID support")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/20190830112451.21655-8-luca@coelho.fi
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11 09:33:28 +02:00
Lior Cohen
624ff4b210 mac80211: clear crypto tx tailroom counter upon keys enable
In case we got a fw restart while roaming from encrypted AP to
non-encrypted one, we might end up with hitting a warning on the pending
counter crypto_tx_tailroom_pending_dec having a non-zero value.

The following comment taken from net/mac80211/key.c explains the rational
for the delayed tailroom needed:

	/*
	* The reason for the delayed tailroom needed decrementing is to
	* make roaming faster: during roaming, all keys are first deleted
	* and then new keys are installed. The first new key causes the
	* crypto_tx_tailroom_needed_cnt to go from 0 to 1, which invokes
	* the cost of synchronize_net() (which can be slow). Avoid this
	* by deferring the crypto_tx_tailroom_needed_cnt decrementing on
	* key removal for a while, so if we roam the value is larger than
	* zero and no 0->1 transition happens.
	*
	* The cost is that if the AP switching was from an AP with keys
	* to one without, we still allocate tailroom while it would no
	* longer be needed. However, in the typical (fast) roaming case
	* within an ESS this usually won't happen.
	*/

The next flow lead to the warning eventually reported as a bug:
1. Disconnect from encrypted AP
2. Set crypto_tx_tailroom_pending_dec = 1 for the key
3. Schedule work
4. Reconnect to non-encrypted AP
5. Add a new key, setting the tailroom counter = 1
6. Got FW restart while pending counter is set ---> hit the warning

While on it, the ieee80211_reset_crypto_tx_tailroom() func was merged into
its single caller ieee80211_reenable_keys (previously called
ieee80211_enable_keys). Also, we reset the crypto_tx_tailroom_pending_dec
and remove the counters warning as we just reset both.

Signed-off-by: Lior Cohen <lior2.cohen@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/20190830112451.21655-7-luca@coelho.fi
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11 09:33:28 +02:00
Johannes Berg
1c9559734e mac80211: remove unnecessary key condition
When we reach this point, the key cannot be NULL. Remove the condition
that suggests otherwise.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/20190830112451.21655-6-luca@coelho.fi
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11 09:33:28 +02:00
Johannes Berg
5462632488 mac80211: list features in WEP/TKIP disable in better order
"HE/HT/VHT" is a bit confusing since really the order of
development (and possible support) is different - change
this to "HT/VHT/HE".

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/20190830112451.21655-4-luca@coelho.fi
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11 09:13:42 +02:00
Johannes Berg
3cfe91c4c3 cfg80211: always shut down on HW rfkill
When the RFKILL subsystem isn't available, then rfkill_blocked()
always returns false. In the case of hardware rfkill this will
be wrong though, as if the hardware reported being killed then
it cannot operate any longer.

Since we only ever call the rfkill_sync work in this case, just
rename it to rfkill_block and always pass "true" for the blocked
parameter, rather than passing rfkill_blocked().

We rely on the underlying driver to still reject any new attempt
to bring up the device by itself.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/20190830112451.21655-2-luca@coelho.fi
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11 09:13:26 +02:00
Mordechay Goodstein
e5c0b0fff6 mac80211: vht: add support VHT EXT NSS BW in parsing VHT
This fixes was missed in parsing the vht capabilities max bw
support.

Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Fixes: e80d642552 ("mac80211: copy VHT EXT NSS BW Support/Capable data to station")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/20190830114057.22197-1-luca@coelho.fi
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11 09:13:03 +02:00
Arend van Spriel
df5d7a88bc cfg80211: fix boundary value in ieee80211_frequency_to_channel()
The boundary value used for the 6G band was incorrect as it would
result in invalid 6G channel number for certain frequencies.

Reported-by: Amar Singhal <asinghal@codeaurora.org>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Link: https://lore.kernel.org/r/1567510772-24263-1-git-send-email-arend.vanspriel@broadcom.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-09-11 09:12:55 +02:00
Pablo Neira Ayuso
be2861dc36 netfilter: nft_{fwd,dup}_netdev: add offload support
This patch adds support for packet mirroring and redirection. The
nft_fwd_dup_netdev_offload() function configures the flow_action object
for the fwd and the dup actions.

Extend nft_flow_rule_destroy() to release the net_device object when the
flow_rule object is released, since nft_fwd_dup_netdev_offload() bumps
the net_device reference counter.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: wenxu <wenxu@ucloud.cn>
2019-09-10 22:44:29 +02:00
Fernando Fernandez Mancera
ee394f96ad netfilter: nft_synproxy: add synproxy stateful object support
Register a new synproxy stateful object type into the stateful object
infrastructure.

Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-10 22:35:37 +02:00
Dirk van der Merwe
5bbd21df5a devlink: add 'reset_dev_on_drv_probe' param
Add the 'reset_dev_on_drv_probe' devlink parameter, controlling the
device reset policy on driver probe.

This parameter is useful in conjunction with the existing
'fw_load_policy' parameter.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-10 17:29:26 +01:00
Pablo Neira Ayuso
3474a2c62f netfilter: nf_tables_offload: move indirect flow_block callback logic to core
Add nft_offload_init() and nft_offload_exit() function to deal with the
init and the exit path of the offload infrastructure.

Rename nft_indr_block_get_and_ing_cmd() to nft_indr_block_cb().

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-08 19:18:04 +02:00
Arnd Bergmann
b44492afd2 netfilter: nf_tables_offload: avoid excessive stack usage
The nft_offload_ctx structure is much too large to put on the
stack:

net/netfilter/nf_tables_offload.c:31:23: error: stack frame size of 1200 bytes in function 'nft_flow_rule_create' [-Werror,-Wframe-larger-than=]

Use dynamic allocation here, as we do elsewhere in the same
function.

Fixes: c9626a2cbd ("netfilter: nf_tables: add hardware offload support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-08 18:16:59 +02:00
Dan Carpenter
b74ae9618b netfilter: nf_tables: Fix an Oops in nf_tables_updobj() error handling
The "newobj" is an error pointer so we can't pass it to kfree().  It
doesn't need to be freed so we can remove that and I also renamed the
error label.

Fixes: d62d0ba97b ("netfilter: nf_tables: Introduce stateful object update operation")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-08 18:10:13 +02:00
Jakub Kicinski
e681cc603a net/tls: align non temporal copy to cache lines
Unlike normal TCP code TLS has to touch the cache lines
it copies into to fill header info. On memory-heavy workloads
having non temporal stores and normal accesses targeting
the same cache line leads to significant overhead.

Measured 3% overhead running 3600 round robin connections
with additional memory heavy workload.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 18:10:34 +02:00
Jakub Kicinski
e7b159a48b net/tls: remove the record tail optimization
For TLS device offload the tag/message authentication code are
filled in by the device. The kernel merely reserves space for
them. Because device overwrites it, the contents of the tag make
do no matter. Current code tries to save space by reusing the
header as the tag. This, however, leads to an additional frag
being created and defeats buffer coalescing (which trickles
all the way down to the drivers).

Remove this optimization, and try to allocate the space for
the tag in the usual way, leave the memory uninitialized.
If memory allocation fails rewind the record pointer so that
we use the already copied user data as tag.

Note that the optimization was actually buggy, as the tag
for TLS 1.2 is 16 bytes, but header is just 13, so the reuse
may had looked past the end of the page..

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 18:10:34 +02:00
Jakub Kicinski
d4774ac0d4 net/tls: use RCU for the adder to the offload record list
All modifications to TLS record list happen under the socket
lock. Since records form an ordered queue readers are only
concerned about elements being removed, additions can happen
concurrently.

Use RCU primitives to ensure the correct access types
(READ_ONCE/WRITE_ONCE).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 18:10:34 +02:00
Jakub Kicinski
7ccd451912 net/tls: unref frags in order
It's generally more cache friendly to walk arrays in order,
especially those which are likely not in cache.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 18:10:34 +02:00
David S. Miller
fcd8c62709 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2019-09-06

Here's the main bluetooth-next pull request for the 5.4 kernel.

 - Cleanups & fixes to btrtl driver
 - Fixes for Realtek devices in btusb, e.g. for suspend handling
 - Firmware loading support for BCM4345C5
 - hidp_send_message() return value handling fixes
 - Added support for utilizing Fast Advertising Interval
 - Various other minor cleanups & fixes

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 18:07:27 +02:00
Hangbin Liu
0079ad8e8d ipmr: remove hard code cache_resolve_queue_len limit
This is a re-post of previous patch wrote by David Miller[1].

Phil Karn reported[2] that on busy networks with lots of unresolved
multicast routing entries, the creation of new multicast group routes
can be extremely slow and unreliable.

The reason is we hard-coded multicast route entries with unresolved source
addresses(cache_resolve_queue_len) to 10. If some multicast route never
resolves and the unresolved source addresses increased, there will
be no ability to create new multicast route cache.

To resolve this issue, we need either add a sysctl entry to make the
cache_resolve_queue_len configurable, or just remove cache_resolve_queue_len
limit directly, as we already have the socket receive queue limits of mrouted
socket, pointed by David.

>From my side, I'd perfer to remove the cache_resolve_queue_len limit instead
of creating two more(IPv4 and IPv6 version) sysctl entry.

[1] https://lkml.org/lkml/2018/7/22/11
[2] https://lkml.org/lkml/2018/7/21/343

v3: instead of remove cache_resolve_queue_len totally, let's only remove
the hard code limit when allocate the unresolved cache, as Eric Dumazet
suggested, so we don't need to re-count it in other places.

v2: hold the mfc_unres_lock while walking the unresolved list in
queue_count(), as Nikolay Aleksandrov remind.

Reported-by: Phil Karn <karn@ka9q.net>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 17:49:00 +02:00
Eric Dumazet
b58662a5f7 tcp: ulp: fix possible crash in tcp_diag_get_aux_size()
tcp_diag_get_aux_size() can be called with sockets in any state.

icsk_ulp_ops is only present for full sockets.

For SYN_RECV or TIME_WAIT ones we would access garbage.

Fixes: 61723b3932 ("tcp: ulp: add functions to dump ulp-specific information")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Luke Hsiao <lukehsiao@google.com>
Reported-by: Neal Cardwell <ncardwell@google.com>
Cc: Davide Caratti <dcaratti@redhat.com>
Acked-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 17:32:28 +02:00
Jiri Pirko
3dd97a0827 net: fib_notifier: move fib_notifier_ops from struct net into per-net struct
No need for fib_notifier_ops to be in struct net. It is used only by
fib_notifier as a private data. Use net_generic to introduce per-net
fib_notifier struct and move fib_notifier_ops there.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 17:28:22 +02:00
David S. Miller
b8f6a0eeb9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Add nft_reg_store64() and nft_reg_load64() helpers, from Ander Juaristi.

2) Time matching support, also from Ander Juaristi.

3) VLAN support for nfnetlink_log, from Michael Braun.

4) Support for set element deletions from the packet path, also from Ander.

5) Remove __read_mostly from conntrack spinlock, from Li RongQing.

6) Support for updating stateful objects, this also includes the initial
   client for this infrastructure: the quota extension. A follow up fix
   for the control plane also comes in this batch. Patches from
   Fernando Fernandez Mancera.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-07 16:31:30 +02:00
David S. Miller
1e46c09ec1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add the ability to use unaligned chunks in the AF_XDP umem. By
   relaxing where the chunks can be placed, it allows to use an
   arbitrary buffer size and place whenever there is a free
   address in the umem. Helps more seamless DPDK AF_XDP driver
   integration. Support for i40e, ixgbe and mlx5e, from Kevin and
   Maxim.

2) Addition of a wakeup flag for AF_XDP tx and fill rings so the
   application can wake up the kernel for rx/tx processing which
   avoids busy-spinning of the latter, useful when app and driver
   is located on the same core. Support for i40e, ixgbe and mlx5e,
   from Magnus and Maxim.

3) bpftool fixes for printf()-like functions so compiler can actually
   enforce checks, bpftool build system improvements for custom output
   directories, and addition of 'bpftool map freeze' command, from Quentin.

4) Support attaching/detaching XDP programs from 'bpftool net' command,
   from Daniel.

5) Automatic xskmap cleanup when AF_XDP socket is released, and several
   barrier/{read,write}_once fixes in AF_XDP code, from Björn.

6) Relicense of bpf_helpers.h/bpf_endian.h for future libbpf
   inclusion as well as libbpf versioning improvements, from Andrii.

7) Several new BPF kselftests for verifier precision tracking, from Alexei.

8) Several BPF kselftest fixes wrt endianess to run on s390x, from Ilya.

9) And more BPF kselftest improvements all over the place, from Stanislav.

10) Add simple BPF map op cache for nfp driver to batch dumps, from Jakub.

11) AF_XDP socket umem mapping improvements for 32bit archs, from Ivan.

12) Add BPF-to-BPF call and BTF line info support for s390x JIT, from Yauheni.

13) Small optimization in arm64 JIT to spare 1 insns for BPF_MOD, from Jerin.

14) Fix an error check in bpf_tcp_gen_syncookie() helper, from Petar.

15) Various minor fixes and cleanups, from Nathan, Masahiro, Masanari,
    Peter, Wei, Yue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 16:49:17 +02:00
Dan Elkouby
8bb3537095 Bluetooth: hidp: Fix assumptions on the return value of hidp_send_message
hidp_send_message was changed to return non-zero values on success,
which some other bits did not expect. This caused spurious errors to be
propagated through the stack, breaking some drivers, such as hid-sony
for the Dualshock 4 in Bluetooth mode.

As pointed out by Dan Carpenter, hid-microsoft directly relied on that
assumption as well.

Fixes: 48d9cc9d85 ("Bluetooth: hidp: Let hidp_send_message return number of queued bytes")

Signed-off-by: Dan Elkouby <streetwalkermc@gmail.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2019-09-06 15:55:40 +02:00
David Dai
d1967e495a net_sched: act_police: add 2 new attributes to support police 64bit rate and peakrate
For high speed adapter like Mellanox CX-5 card, it can reach upto
100 Gbits per second bandwidth. Currently htb already supports 64bit rate
in tc utility. However police action rate and peakrate are still limited
to 32bit value (upto 32 Gbits per second). Add 2 new attributes
TCA_POLICE_RATE64 and TCA_POLICE_RATE64 in kernel for 64bit support
so that tc utility can use them for 64bit rate and peakrate value to
break the 32bit limit, and still keep the backward binary compatibility.

Tested-by: David Dai <zdai@linux.vnet.ibm.com>
Signed-off-by: David Dai <zdai@linux.vnet.ibm.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 15:02:16 +02:00
Paul Blakey
95a7233c45 net: openvswitch: Set OvS recirc_id from tc chain index
Offloaded OvS datapath rules are translated one to one to tc rules,
for example the following simplified OvS rule:

recirc_id(0),in_port(dev1),eth_type(0x0800),ct_state(-trk) actions:ct(),recirc(2)

Will be translated to the following tc rule:

$ tc filter add dev dev1 ingress \
	    prio 1 chain 0 proto ip \
		flower tcp ct_state -trk \
		action ct pipe \
		action goto chain 2

Received packets will first travel though tc, and if they aren't stolen
by it, like in the above rule, they will continue to OvS datapath.
Since we already did some actions (action ct in this case) which might
modify the packets, and updated action stats, we would like to continue
the proccessing with the correct recirc_id in OvS (here recirc_id(2))
where we left off.

To support this, introduce a new skb extension for tc, which
will be used for translating tc chain to ovs recirc_id to
handle these miss cases. Last tc chain index will be set
by tc goto chain action and read by OvS datapath.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 14:59:18 +02:00
Gustavo A. R. Silva
72bb169e02 Bluetooth: mgmt: Use struct_size() helper
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct mgmt_rp_get_connections {
	...
        struct mgmt_addr_info addr[0];
} __packed;

Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.

So, replace the following form:

sizeof(*rp) + (i * sizeof(struct mgmt_addr_info));

with:

struct_size(rp, addr, i)

Also, notice that, in this case, variable rp_len is not necessary,
hence it is removed.

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2019-09-05 17:27:22 +02:00
Nishka Dasgupta
569428dabc Bluetooth: 6lowpan: Make variable header_ops constant
Static variable header_ops, of type header_ops, is used only once, when
it is assigned to field header_ops of a variable having type net_device.
This corresponding field is declared as const in the definition of
net_device. Hence make header_ops constant as well to protect it from
unnecessary modification.
Issue found with Coccinelle.

Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2019-09-05 17:27:21 +02:00