67884 Commits

Author SHA1 Message Date
Nathan Errera
6d50176428 mac80211: introduce channel switch disconnect function
Introduce a disconnect function that can be used when a
channel switch error occurs. The channel switch can request to
block the tx, and so, we need to make sure we do not send a deauth
frame in this case.

Signed-off-by: Nathan Errera <nathan.errera@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20211129152938.cd2a615a0702.I9edb14785586344af17644b610ab5be109dcef00@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-12-20 10:36:51 +01:00
Johannes Berg
3bb1ccc4ed cfg80211: simplify cfg80211_chandef_valid()
There are a lot of duplicate checks in this function to
check the delta between the control channel and CF1.
With the addition of 320 MHz, this will become even more.
Simplify the code so that the common checks are done
only once for multiple bandwidths.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20211129152938.2d0240b07f11.I759e8e990f5386ba2b56ffb2488a8d4e16e22c1b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-12-20 10:36:24 +01:00
Ilan Peer
cee04f3c3a mac80211: Remove a couple of obsolete TODO
The HE capability IE is an extension IE so remove
an irrelevant comments.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20211129152938.550b95b5fca7.Ia31395e880172aefcc0a8c70ed060f84b94bdb83@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-12-20 10:36:14 +01:00
P Praneesh
57553c3a6c mac80211: fix FEC flag in radio tap header
In mac80211, while building radiotap header
IEEE80211_RADIOTAP_MCS_HAVE_FEC flag is missing when LDPC enabled
from driver, hence LDPC is not updated properly in radiotap header.
Fix that by adding HAVE_FEC flag while building radiotap header.

Signed-off-by: P Praneesh <quic_ppranees@quicinc.com>
Link: https://lore.kernel.org/r/1638294648-844-2-git-send-email-quic_ppranees@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-12-20 10:25:00 +01:00
Felix Fietkau
6a789ba679 mac80211: use coarse boottime for airtime fairness code
The time values used by the airtime fairness code only need to be accurate
enough to cover station activity detection.
Using ktime_get_coarse_boottime_ns instead of ktime_get_boottime_ns will
drop the accuracy down to jiffies intervals, but at the same time saves
a lot of CPU cycles in a hot path

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20211217114258.14619-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-12-20 10:24:41 +01:00
Greg Kroah-Hartman
35eaa42c4a Linux 5.16-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmG/rsoeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGLW8H/0sfRUxZNJ+DgO6W
 0LpyIx1DF7S2x1wOQGl7VZsjdy+dtuOW4BtEoqhuQWgQ2sr0zbavbeyZYTXD1+d6
 I1h5C6JMD138JkoqZNdeyWwsalAFtVItH+lzszUGKTxueTbKCJTkGMerBZVu/gVY
 U3QYZHEkUH6o/2fINYfSMhEgyJpjVEUoaztB2drsC0tNUGeDjTmrhlRrWIKZ6Kn+
 oAiMqsomSBF09TZ++GEqMYDTm39CwNDPJzKGyyOx86ydcg4ErWvjp1jfAUX0cvFG
 dso2K5mDzTewdF8bW0A1KvsQaliMFpxgZ5CTblYQ6HvyltjRj+0B8207VzHQXBKV
 YFZ8y9A=
 =mfug
 -----END PGP SIGNATURE-----

Merge 5.16-rc6 into tty-next

We need the tty/serial fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-20 10:00:30 +01:00
Baowen Zheng
c86e0209dc flow_offload: validate flags of filter and actions
Add process to validate flags of filter and actions when adding
a tc filter.

We need to prevent adding filter with flags conflicts with its actions.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:48 +00:00
Baowen Zheng
13926d19a1 flow_offload: add reoffload process to update hw_count
Add reoffload process to update hw_count when driver
is inserted or removed.

We will delete the action if it is with skip_sw flag and
not offloaded to any hardware in reoffload process.

When reoffloading actions, we still offload the actions
that are added independent of filters.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:48 +00:00
Baowen Zheng
e8cb5bcf6e net: sched: save full flags for tc action
Save full action flags and return user flags when return flags to
user space.

Save full action flags to distinguish if the action is created
independent from classifier.

We made this change mainly for further patch to reoffload tc actions.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:48 +00:00
Baowen Zheng
c7a66f8d8a flow_offload: add process to update action stats from hardware
When collecting stats for actions update them using both
hardware and software counters.

Stats update process should not run in context of preempt_disable.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:48 +00:00
Baowen Zheng
bcd6436858 flow_offload: rename exts stats update functions with hw
Rename exts stats update functions with hw for readability.

We make this change also to update stats from hw for an action
when it is offloaded to hw as a single action.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:48 +00:00
Baowen Zheng
7adc576512 flow_offload: add skip_hw and skip_sw to control if offload the action
We add skip_hw and skip_sw for user to control if offload the action
to hardware.

We also add in_hw_count for user to indicate if the action is offloaded
to any hardware.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:48 +00:00
Baowen Zheng
8cbfe939ab flow_offload: allow user to offload tc action to net device
Use flow_indr_dev_register/flow_indr_dev_setup_offload to
offload tc action.

We need to call tc_cleanup_flow_action to clean up tc action entry since
in tc_setup_action, some actions may hold dev refcnt, especially the mirror
action.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:48 +00:00
Baowen Zheng
c54e1d920f flow_offload: add ops to tc_action_ops for flow action setup
Add a new ops to tc_action_ops for flow action setup.

Refactor function tc_setup_flow_action to use this new ops.

We make this change to facilitate to add standalone action module.

We will also use this ops to offload action independent of filter
in following patch.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:48 +00:00
Baowen Zheng
9c1c0e124c flow_offload: rename offload functions with offload instead of flow
To improves readability, we rename offload functions with offload instead
of flow.

The term flow is related to exact matches, so we rename these functions
with offload.

We make this change to facilitate single action offload functions naming.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:48 +00:00
Baowen Zheng
5a9959008f flow_offload: add index to flow_action_entry structure
Add index to flow_action_entry structure and delete index from police and
gate child structure.

We make this change to offload tc action for driver to identify a tc
action.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:47 +00:00
Baowen Zheng
40bd094d65 flow_offload: fill flags to action structure
Fill flags to action structure to allow user control if
the action should be offloaded to hardware or not.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-19 14:08:47 +00:00
Hao Luo
216e3cd2f2 bpf: Add MEM_RDONLY for helper args that are pointers to rdonly mem.
Some helper functions may modify its arguments, for example,
bpf_d_path, bpf_get_stack etc. Previously, their argument types
were marked as ARG_PTR_TO_MEM, which is compatible with read-only
mem types, such as PTR_TO_RDONLY_BUF. Therefore it's legitimate,
but technically incorrect, to modify a read-only memory by passing
it into one of such helper functions.

This patch tags the bpf_args compatible with immutable memory with
MEM_RDONLY flag. The arguments that don't have this flag will be
only compatible with mutable memory types, preventing the helper
from modifying a read-only memory. The bpf_args that have
MEM_RDONLY are compatible with both mutable memory and immutable
memory.

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211217003152.48334-9-haoluo@google.com
2021-12-18 13:27:41 -08:00
Hao Luo
20b2aff4bc bpf: Introduce MEM_RDONLY flag
This patch introduce a flag MEM_RDONLY to tag a reg value
pointing to read-only memory. It makes the following changes:

1. PTR_TO_RDWR_BUF -> PTR_TO_BUF
2. PTR_TO_RDONLY_BUF -> PTR_TO_BUF | MEM_RDONLY

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211217003152.48334-6-haoluo@google.com
2021-12-18 13:27:41 -08:00
Hao Luo
c25b2ae136 bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL
We have introduced a new type to make bpf_reg composable, by
allocating bits in the type to represent flags.

One of the flags is PTR_MAYBE_NULL which indicates a pointer
may be NULL. This patch switches the qualified reg_types to
use this flag. The reg_types changed in this patch include:

1. PTR_TO_MAP_VALUE_OR_NULL
2. PTR_TO_SOCKET_OR_NULL
3. PTR_TO_SOCK_COMMON_OR_NULL
4. PTR_TO_TCP_SOCK_OR_NULL
5. PTR_TO_BTF_ID_OR_NULL
6. PTR_TO_MEM_OR_NULL
7. PTR_TO_RDONLY_BUF_OR_NULL
8. PTR_TO_RDWR_BUF_OR_NULL

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20211217003152.48334-5-haoluo@google.com
2021-12-18 13:27:23 -08:00
Yajun Deng
f85b244ee3 xdp: move the if dev statements to the first
The xdp_rxq_info_unreg() called by xdp_rxq_info_reg() is meaningless when
dev is NULL, so move the if dev statements to the first.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-18 12:35:49 +00:00
Lin Ma
1ade48d0c2 ax25: NPD bug when detaching AX25 device
The existing cleanup routine implementation is not well synchronized
with the syscall routine. When a device is detaching, below race could
occur.

static int ax25_sendmsg(...) {
  ...
  lock_sock()
  ax25 = sk_to_ax25(sk);
  if (ax25->ax25_dev == NULL) // CHECK
  ...
  ax25_queue_xmit(skb, ax25->ax25_dev->dev); // USE
  ...
}

static void ax25_kill_by_device(...) {
  ...
  if (s->ax25_dev == ax25_dev) {
    s->ax25_dev = NULL;
    ...
}

Other syscall functions like ax25_getsockopt, ax25_getname,
ax25_info_show also suffer from similar races. To fix them, this patch
introduce lock_sock() into ax25_kill_by_device in order to guarantee
that the nullify action in cleanup routine cannot proceed when another
socket request is pending.

Signed-off-by: Hanjie Wu <nagi@zju.edu.cn>
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-18 12:33:56 +00:00
zhuxinran
f27456693b 9p/trans_virtio: Fix typo in the comment for p9_virtio_create()
couldlook ==> could look

Link: https://lkml.kernel.org/r/20211216061439.4186-1-zhuran@mail.ustc.edu.cn
Signed-off-by: zhuxinran <zhuran@mail.ustc.edu.cn>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2021-12-18 21:04:43 +09:00
Jean Sacren
59060a47ca mptcp: clean up harmless false expressions
entry->addr.id is u8 with a range from 0 to 255 and MAX_ADDR_ID is 255.
We should drop both false expressions of (entry->addr.id > MAX_ADDR_ID).

We should also remove the obsolete parentheses in the first if branch.

Use U8_MAX for MAX_ADDR_ID and add a comment to show the link to
mptcp_addr_info.id as suggested by Mr. Matthieu Baerts.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17 19:27:05 -08:00
Paolo Abeni
3ce0852c86 mptcp: enforce HoL-blocking estimation
The MPTCP packet scheduler has sub-optimal behavior with asymmetric
subflows: if the faster subflow-level cwin is closed, the packet
scheduler can enqueue "too much" data on a slower subflow.

When all the data on the faster subflow is acked, if the mptcp-level
cwin is closed, and link utilization becomes suboptimal.

The solution is implementing blest-like[1] HoL-blocking estimation,
transmitting only on the subflow with the shorter estimated time to
flush the queued memory. If such subflows cwin is closed, we wait
even if other subflows are available.

This is quite simpler than the original blest implementation, as we
leverage the pacing rate provided by the TCP socket. To get a more
accurate estimation for the subflow linger-time, we maintain a
per-subflow weighted average of such info.

Additionally drop magic numbers usage in favor of newly defined
macros and use more meaningful names for status variable.

[1] http://dl.ifip.org/db/conf/networking/networking2016/1570234725.pdf

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/137
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17 19:27:04 -08:00
Hoang Le
f845fe5819 Revert "tipc: use consistent GFP flags"
This reverts commit 86c3a3e964d910a62eeb277d60b2a60ebefa9feb.

The tipc_aead_init() function can be calling from an interrupt routine.
This allocation might sleep with GFP_KERNEL flag, hence the following BUG
is reported.

[   17.657509] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:230
[   17.660916] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/3
[   17.664093] preempt_count: 302, expected: 0
[   17.665619] RCU nest depth: 2, expected: 0
[   17.667163] Preemption disabled at:
[   17.667165] [<0000000000000000>] 0x0
[   17.669753] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Tainted: G        W         5.16.0-rc4+ #1
[   17.673006] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[   17.675540] Call Trace:
[   17.676285]  <IRQ>
[   17.676913]  dump_stack_lvl+0x34/0x44
[   17.678033]  __might_resched.cold+0xd6/0x10f
[   17.679311]  kmem_cache_alloc_trace+0x14d/0x220
[   17.680663]  tipc_crypto_start+0x4a/0x2b0 [tipc]
[   17.682146]  ? kmem_cache_alloc_trace+0xd3/0x220
[   17.683545]  tipc_node_create+0x2f0/0x790 [tipc]
[   17.684956]  tipc_node_check_dest+0x72/0x680 [tipc]
[   17.686706]  ? ___cache_free+0x31/0x350
[   17.688008]  ? skb_release_data+0x128/0x140
[   17.689431]  tipc_disc_rcv+0x479/0x510 [tipc]
[   17.690904]  tipc_rcv+0x71c/0x730 [tipc]
[   17.692219]  ? __netif_receive_skb_core+0xb7/0xf60
[   17.693856]  tipc_l2_rcv_msg+0x5e/0x90 [tipc]
[   17.695333]  __netif_receive_skb_list_core+0x20b/0x260
[   17.697072]  netif_receive_skb_list_internal+0x1bf/0x2e0
[   17.698870]  ? dev_gro_receive+0x4c2/0x680
[   17.700255]  napi_complete_done+0x6f/0x180
[   17.701657]  virtnet_poll+0x29c/0x42e [virtio_net]
[   17.703262]  __napi_poll+0x2c/0x170
[   17.704429]  net_rx_action+0x22f/0x280
[   17.705706]  __do_softirq+0xfd/0x30a
[   17.706921]  common_interrupt+0xa4/0xc0
[   17.708206]  </IRQ>
[   17.708922]  <TASK>
[   17.709651]  asm_common_interrupt+0x1e/0x40
[   17.711078] RIP: 0010:default_idle+0x18/0x20

Fixes: 86c3a3e964d9 ("tipc: use consistent GFP flags")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Link: https://lore.kernel.org/r/20211217030059.5947-1-hoang.h.le@dektech.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17 19:18:48 -08:00
Paul Blakey
635d448a1c net: openvswitch: Fix matching zone id for invalid conns arriving from tc
Zone id is not restored if we passed ct and ct rejected the connection,
as there is no ct info on the skb.

Save the zone from tc skb cb to tc skb extension and pass it on to
ovs, use that info to restore the zone id for invalid connections.

Fixes: d29334c15d33 ("net/sched: act_api: fix miss set post_ct for ovs after do conntrack in act_ct")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17 18:06:36 -08:00
Paul Blakey
3849595866 net/sched: flow_dissector: Fix matching on zone id for invalid conns
If ct rejects a flow, it removes the conntrack info from the skb.
act_ct sets the post_ct variable so the dissector will see this case
as an +tracked +invalid state, but the zone id is lost with the
conntrack info.

To restore the zone id on such cases, set the last executed zone,
via the tc control block, when passing ct, and read it back in the
dissector if there is no ct info on the skb (invalid connection).

Fixes: 7baf2429a1a9 ("net/sched: cls_flower add CT_FLAGS_INVALID flag support")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17 18:06:35 -08:00
Paul Blakey
ec624fe740 net/sched: Extend qdisc control block with tc control block
BPF layer extends the qdisc control block via struct bpf_skb_data_end
and because of that there is no more room to add variables to the
qdisc layer control block without going over the skb->cb size.

Extend the qdisc control block with a tc control block,
and move all tc related variables to there as a pre-step for
extending the tc control block with additional members.

Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-17 18:06:35 -08:00
Magnus Karlsson
0706a78f31 Revert "xsk: Do not sleep in poll() when need_wakeup set"
This reverts commit bd0687c18e635b63233dc87f38058cd728802ab4.

This patch causes a Tx only workload to go to sleep even when it does
not have to, leading to misserable performance in skb mode. It fixed
one rare problem but created a much worse one, so this need to be
reverted while I try to craft a proper solution to the original
problem.

Fixes: bd0687c18e63 ("xsk: Do not sleep in poll() when need_wakeup set")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211217145646.26449-1-magnus.karlsson@gmail.com
2021-12-17 23:07:23 +01:00
Manivannan Sadhasivam
227fee5fc9 bus: mhi: core: Add an API for auto queueing buffers for DL channel
Add a new API "mhi_prepare_for_transfer_autoqueue" for using with client
drivers like QRTR to request MHI core to autoqueue buffers for the DL
channel along with starting both UL and DL channels.

So far, the "auto_queue" flag specified by the controller drivers in
channel definition served this purpose but this will be removed at some
point in future.

Cc: netdev@vger.kernel.org
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Co-developed-by: Loic Poulain <loic.poulain@linaro.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20211216081227.237749-9-manivannan.sadhasivam@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-17 17:17:14 +01:00
David S. Miller
8ca4090fec Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Fix UAF in set catch-all element, from Eric Dumazet.

2) Fix MAC mangling for multicast/loopback traffic in nfnetlink_queue
   and nfnetlink_log, from Ignacy Gawędzki.

3) Remove expired entries from ctnetlink dump path regardless the tuple
   direction, from Florian Westphal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-17 10:52:04 +00:00
Antony Antony
68ac0f3810 xfrm: state and policy should fail if XFRMA_IF_ID 0
xfrm ineterface does not allow xfrm if_id = 0
fail to create or update xfrm state and policy.

With this commit:
 ip xfrm policy add src 192.0.2.1 dst 192.0.2.2 dir out if_id 0
 RTNETLINK answers: Invalid argument

 ip xfrm state add src 192.0.2.1 dst 192.0.2.2 proto esp spi 1 \
            reqid 1 mode tunnel aead 'rfc4106(gcm(aes))' \
            0x1111111111111111111111111111111111111111 96 if_id 0
 RTNETLINK answers: Invalid argument

v1->v2 change:
 - add Fixes: tag

Fixes: 9f8550e4bd9d ("xfrm: fix disable_xfrm sysctl when used on xfrm interfaces")
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-12-17 07:17:35 +01:00
Antony Antony
8dce439195 xfrm: interface with if_id 0 should return error
xfrm interface if_id = 0 would cause xfrm policy lookup errors since
Commit 9f8550e4bd9d.

Now explicitly fail to create an xfrm interface when if_id = 0

With this commit:
 ip link add ipsec0  type xfrm dev lo  if_id 0
 Error: if_id must be non zero.

v1->v2 change:
 - add Fixes: tag

Fixes: 9f8550e4bd9d ("xfrm: fix disable_xfrm sysctl when used on xfrm interfaces")
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Reviewed-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-12-17 07:17:13 +01:00
Jakub Kicinski
7cd2802d74 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16 16:13:19 -08:00
Linus Torvalds
180f3bcfe3 Networking fixes for 5.16-rc6, including fixes from mac80211, wifi, bpf.
Current release - regressions:
 
  - dpaa2-eth: fix buffer overrun when reporting ethtool statistics
 
 Current release - new code bugs:
 
  - bpf: fix incorrect state pruning for <8B spill/fill
 
  - iavf:
      - add missing unlocks in iavf_watchdog_task()
      - do not override the adapter state in the watchdog task (again)
 
  - mlxsw: spectrum_router: consolidate MAC profiles when possible
 
 Previous releases - regressions:
 
  - mac80211, fix:
      - rate control, avoid driver crash for retransmitted frames
      - regression in SSN handling of addba tx
      - a memory leak where sta_info is not freed
      - marking TX-during-stop for TX in in_reconfig, prevent stall
 
  - cfg80211: acquire wiphy mutex on regulatory work
 
  - wifi drivers: fix build regressions and LED config dependency
 
  - virtio_net: fix rx_drops stat for small pkts
 
  - dsa: mv88e6xxx: unforce speed & duplex in mac_link_down()
 
 Previous releases - always broken:
 
  - bpf, fix:
     - kernel address leakage in atomic fetch
     - kernel address leakage in atomic cmpxchg's r0 aux reg
     - signed bounds propagation after mov32
     - extable fixup offset
     - extable address check
 
  - mac80211:
      - fix the size used for building probe request
      - send ADDBA requests using the tid/queue of the aggregation
        session
      - agg-tx: don't schedule_and_wake_txq() under sta->lock,
        avoid deadlocks
      - validate extended element ID is present
 
  - mptcp:
      - never allow the PM to close a listener subflow (null-defer)
      - clear 'kern' flag from fallback sockets, prevent crash
      - fix deadlock in __mptcp_push_pending()
 
  - inet_diag: fix kernel-infoleak for UDP sockets
 
  - xsk: do not sleep in poll() when need_wakeup set
 
  - smc: avoid very long waits in smc_release()
 
  - sch_ets: don't remove idle classes from the round-robin list
 
  - netdevsim:
      - zero-initialize memory for bpf map's value, prevent info leak
      - don't let user space overwrite read only (max) ethtool parms
 
  - ixgbe: set X550 MDIO speed before talking to PHY
 
  - stmmac:
      - fix null-deref in flower deletion w/ VLAN prio Rx steering
      - dwmac-rk: fix oob read in rk_gmac_setup
 
  - ice: time stamping fixes
 
  - systemport: add global locking for descriptor life cycle
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmG7rdUACgkQMUZtbf5S
 IrtRvw//etsgeg2+zxe+fBSbe7ZihcCB4yzWUoRDdNzPrLNLsnWxKT1wYblDcZft
 b1f/SpTy9ycfg+fspn2qET8gzydn4m9xHkjmlQPzmXB9tdIDF6mECFTAXYlar1hQ
 RQIijpfZYyrZeGdgHpsyq72YC4dpNdbZrxmQFVdpMr3cK8P2N0Dn32bBVa//+jb+
 LCv3Uw9C0yNbqhtRIiukkWIE20+/pXtKm0uErDVmvonqFMWPo6mYD0C2PwC20PwR
 Kv5ok6jH+44fCSwDoLChbB+Wes0AtrIQdUvUwXGXaF3MDfZl+24oLkX5xJl3EHWT
 90Mh0k0NhRORgBZ3NItwK7OliohrRHCYxlAXPjg1Dicxl+kxl0wPlva8v64eAA+u
 ZhwXwaQpCrZNdKoxHJw9kQ/CmbggtxcWkVolbZp3TzDjYY1E7qxuwg51YMhGmGT1
 FPjradYGvHKi+thizJiEdiZaMKRc8bpaL0hbpROxFQvfjNwFOwREQhtnXYP3W5Kd
 lK88fWaH86dxqL+ABvbrMnSZKuNlSL8R/CROWpZuF+vyLRXaxhAvYRrL79bgmkKq
 zvImnh1mFovdyKGJhibFMdy92X14z8FzoyX3VQuFcl9EB+2NQXnNZ6abDLJlufZX
 A0jQ5r46Ce/yyaXXmS61PrP7Pf5sxhs/69fqAIDQfSSzpyUKHd4=
 =VIbd
 -----END PGP SIGNATURE-----

Merge tag 'net-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from mac80211, wifi, bpf.

  Relatively large batches of fixes from BPF and the WiFi stack, calm in
  general networking.

  Current release - regressions:

   - dpaa2-eth: fix buffer overrun when reporting ethtool statistics

  Current release - new code bugs:

   - bpf: fix incorrect state pruning for <8B spill/fill

   - iavf:
       - add missing unlocks in iavf_watchdog_task()
       - do not override the adapter state in the watchdog task (again)

   - mlxsw: spectrum_router: consolidate MAC profiles when possible

  Previous releases - regressions:

   - mac80211 fixes:
       - rate control, avoid driver crash for retransmitted frames
       - regression in SSN handling of addba tx
       - a memory leak where sta_info is not freed
       - marking TX-during-stop for TX in in_reconfig, prevent stall

   - cfg80211: acquire wiphy mutex on regulatory work

   - wifi drivers: fix build regressions and LED config dependency

   - virtio_net: fix rx_drops stat for small pkts

   - dsa: mv88e6xxx: unforce speed & duplex in mac_link_down()

  Previous releases - always broken:

   - bpf fixes:
       - kernel address leakage in atomic fetch
       - kernel address leakage in atomic cmpxchg's r0 aux reg
       - signed bounds propagation after mov32
       - extable fixup offset
       - extable address check

   - mac80211:
       - fix the size used for building probe request
       - send ADDBA requests using the tid/queue of the aggregation
         session
       - agg-tx: don't schedule_and_wake_txq() under sta->lock, avoid
         deadlocks
       - validate extended element ID is present

   - mptcp:
       - never allow the PM to close a listener subflow (null-defer)
       - clear 'kern' flag from fallback sockets, prevent crash
       - fix deadlock in __mptcp_push_pending()

   - inet_diag: fix kernel-infoleak for UDP sockets

   - xsk: do not sleep in poll() when need_wakeup set

   - smc: avoid very long waits in smc_release()

   - sch_ets: don't remove idle classes from the round-robin list

   - netdevsim:
       - zero-initialize memory for bpf map's value, prevent info leak
       - don't let user space overwrite read only (max) ethtool parms

   - ixgbe: set X550 MDIO speed before talking to PHY

   - stmmac:
       - fix null-deref in flower deletion w/ VLAN prio Rx steering
       - dwmac-rk: fix oob read in rk_gmac_setup

   - ice: time stamping fixes

   - systemport: add global locking for descriptor life cycle"

* tag 'net-5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (89 commits)
  bpf, selftests: Fix racing issue in btf_skc_cls_ingress test
  selftest/bpf: Add a test that reads various addresses.
  bpf: Fix extable address check.
  bpf: Fix extable fixup offset.
  bpf, selftests: Add test case trying to taint map value pointer
  bpf: Make 32->64 bounds propagation slightly more robust
  bpf: Fix signed bounds propagation after mov32
  sit: do not call ipip6_dev_free() from sit_init_net()
  net: systemport: Add global locking for descriptor lifecycle
  net/smc: Prevent smc_release() from long blocking
  net: Fix double 0x prefix print in SKB dump
  virtio_net: fix rx_drops stat for small pkts
  dsa: mv88e6xxx: fix debug print for SPEED_UNFORCED
  sfc_ef100: potential dereference of null pointer
  net: stmmac: dwmac-rk: fix oob read in rk_gmac_setup
  net: usb: lan78xx: add Allied Telesis AT29M2-AF
  net/packet: rx_owner_map depends on pg_vec
  netdevsim: Zero-initialize memory for new map's value in function nsim_bpf_map_alloc
  dpaa2-eth: fix ethtool statistics
  ixgbe: set X550 MDIO speed before talking to PHY
  ...
2021-12-16 15:02:14 -08:00
Jakub Kicinski
aef2feda97 add missing bpf-cgroup.h includes
We're about to break the cgroup-defs.h -> bpf-cgroup.h dependency,
make sure those who actually need more than the definition of
struct cgroup_bpf include bpf-cgroup.h explicitly.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/bpf/20211216025538.1649516-3-kuba@kernel.org
2021-12-16 14:57:09 -08:00
Jakub Kicinski
0c3e247460 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-12-16

We've added 15 non-merge commits during the last 7 day(s) which contain
a total of 12 files changed, 434 insertions(+), 30 deletions(-).

The main changes are:

1) Fix incorrect verifier state pruning behavior for <8B register spill/fill,
   from Paul Chaignon.

2) Fix x86-64 JIT's extable handling for fentry/fexit when return pointer
   is an ERR_PTR(), from Alexei Starovoitov.

3) Fix 3 different possibilities that BPF verifier missed where unprivileged
   could leak kernel addresses, from Daniel Borkmann.

4) Fix xsk's poll behavior under need_wakeup flag, from Magnus Karlsson.

5) Fix an oob-write in test_verifier due to a missed MAX_NR_MAPS bump,
   from Kumar Kartikeya Dwivedi.

6) Fix a race in test_btf_skc_cls_ingress selftest, from Martin KaFai Lau.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf, selftests: Fix racing issue in btf_skc_cls_ingress test
  selftest/bpf: Add a test that reads various addresses.
  bpf: Fix extable address check.
  bpf: Fix extable fixup offset.
  bpf, selftests: Add test case trying to taint map value pointer
  bpf: Make 32->64 bounds propagation slightly more robust
  bpf: Fix signed bounds propagation after mov32
  bpf, selftests: Update test case for atomic cmpxchg on r0 with pointer
  bpf: Fix kernel address leakage in atomic cmpxchg's r0 aux reg
  bpf, selftests: Add test case for atomic fetch on spilled pointer
  bpf: Fix kernel address leakage in atomic fetch
  selftests/bpf: Fix OOB write in test_verifier
  xsk: Do not sleep in poll() when need_wakeup set
  selftests/bpf: Tests for state pruning with u32 spill/fill
  bpf: Fix incorrect state pruning for <8B spill/fill
====================

Link: https://lore.kernel.org/r/20211216210005.13815-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16 13:06:49 -08:00
Eric Dumazet
e28587cc49 sit: do not call ipip6_dev_free() from sit_init_net()
ipip6_dev_free is sit dev->priv_destructor, already called
by register_netdevice() if something goes wrong.

Alternative would be to make ipip6_dev_free() robust against
multiple invocations, but other drivers do not implement this
strategy.

syzbot reported:

dst_release underflow
WARNING: CPU: 0 PID: 5059 at net/core/dst.c:173 dst_release+0xd8/0xe0 net/core/dst.c:173
Modules linked in:
CPU: 1 PID: 5059 Comm: syz-executor.4 Not tainted 5.16.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:dst_release+0xd8/0xe0 net/core/dst.c:173
Code: 4c 89 f2 89 d9 31 c0 5b 41 5e 5d e9 da d5 44 f9 e8 1d 90 5f f9 c6 05 87 48 c6 05 01 48 c7 c7 80 44 99 8b 31 c0 e8 e8 67 29 f9 <0f> 0b eb 85 0f 1f 40 00 53 48 89 fb e8 f7 8f 5f f9 48 83 c3 a8 48
RSP: 0018:ffffc9000aa5faa0 EFLAGS: 00010246
RAX: d6894a925dd15a00 RBX: 00000000ffffffff RCX: 0000000000040000
RDX: ffffc90005e19000 RSI: 000000000003ffff RDI: 0000000000040000
RBP: 0000000000000000 R08: ffffffff816a1f42 R09: ffffed1017344f2c
R10: ffffed1017344f2c R11: 0000000000000000 R12: 0000607f462b1358
R13: 1ffffffff1bfd305 R14: ffffe8ffffcb1358 R15: dffffc0000000000
FS:  00007f66c71a2700(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f88aaed5058 CR3: 0000000023e0f000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 dst_cache_destroy+0x107/0x1e0 net/core/dst_cache.c:160
 ipip6_dev_free net/ipv6/sit.c:1414 [inline]
 sit_init_net+0x229/0x550 net/ipv6/sit.c:1936
 ops_init+0x313/0x430 net/core/net_namespace.c:140
 setup_net+0x35b/0x9d0 net/core/net_namespace.c:326
 copy_net_ns+0x359/0x5c0 net/core/net_namespace.c:470
 create_new_namespaces+0x4ce/0xa00 kernel/nsproxy.c:110
 unshare_nsproxy_namespaces+0x11e/0x180 kernel/nsproxy.c:226
 ksys_unshare+0x57d/0xb50 kernel/fork.c:3075
 __do_sys_unshare kernel/fork.c:3146 [inline]
 __se_sys_unshare kernel/fork.c:3144 [inline]
 __x64_sys_unshare+0x34/0x40 kernel/fork.c:3144
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f66c882ce99
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f66c71a2168 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
RAX: ffffffffffffffda RBX: 00007f66c893ff60 RCX: 00007f66c882ce99
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000048040200
RBP: 00007f66c8886ff1 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff6634832f R14: 00007f66c71a2300 R15: 0000000000022000
 </TASK>

Fixes: cf124db566e6 ("net: Fix inconsistent teardown and release of private netdev state.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20211216111741.1387540-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16 08:38:10 -08:00
D. Wythe
5c15b3123f net/smc: Prevent smc_release() from long blocking
In nginx/wrk benchmark, there's a hung problem with high probability
on case likes that: (client will last several minutes to exit)

server: smc_run nginx

client: smc_run wrk -c 10000 -t 1 http://server

Client hangs with the following backtrace:

0 [ffffa7ce8Of3bbf8] __schedule at ffffffff9f9eOd5f
1 [ffffa7ce8Of3bc88] schedule at ffffffff9f9eløe6
2 [ffffa7ce8Of3bcaO] schedule_timeout at ffffffff9f9e3f3c
3 [ffffa7ce8Of3bd2O] wait_for_common at ffffffff9f9el9de
4 [ffffa7ce8Of3bd8O] __flush_work at ffffffff9fOfeOl3
5 [ffffa7ce8øf3bdfO] smc_release at ffffffffcO697d24 [smc]
6 [ffffa7ce8Of3be2O] __sock_release at ffffffff9f8O2e2d
7 [ffffa7ce8Of3be4ø] sock_close at ffffffff9f8ø2ebl
8 [ffffa7ce8øf3be48] __fput at ffffffff9f334f93
9 [ffffa7ce8Of3be78] task_work_run at ffffffff9flOlff5
10 [ffffa7ce8Of3beaO] do_exit at ffffffff9fOe5Ol2
11 [ffffa7ce8Of3bflO] do_group_exit at ffffffff9fOe592a
12 [ffffa7ce8Of3bf38] __x64_sys_exit_group at ffffffff9fOe5994
13 [ffffa7ce8Of3bf4O] do_syscall_64 at ffffffff9f9d4373
14 [ffffa7ce8Of3bfsO] entry_SYSCALL_64_after_hwframe at ffffffff9fa0007c

This issue dues to flush_work(), which is used to wait for
smc_connect_work() to finish in smc_release(). Once lots of
smc_connect_work() was pending or all executing work dangling,
smc_release() has to block until one worker comes to free, which
is equivalent to wait another smc_connnect_work() to finish.

In order to fix this, There are two changes:

1. For those idle smc_connect_work(), cancel it from the workqueue; for
   executing smc_connect_work(), waiting for it to finish. For that
   purpose, replace flush_work() with cancel_work_sync().

2. Since smc_connect() hold a reference for passive closing, if
   smc_connect_work() has been cancelled, release the reference.

Fixes: 24ac3a08e658 ("net/smc: rebuild nonblocking connect")
Reported-by: Tony Lu <tonylu@linux.alibaba.com>
Tested-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/1639571361-101128-1-git-send-email-alibuda@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16 08:11:05 -08:00
Florian Westphal
66495f301c fib: expand fib_rule_policy
Now that there is only one fib nla_policy there is no need to
keep the macro around.  Place it where its used.

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16 07:18:35 -08:00
Florian Westphal
92e1bcee06 fib: rules: remove duplicated nla policies
The attributes are identical in all implementations so move the ipv4 one
into the core and remove the per-family nla policies.

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16 07:18:35 -08:00
Florian Westphal
76f12e632a netfilter: ctnetlink: remove expired entries first
When dumping conntrack table to userspace via ctnetlink, check if the ct has
already expired before doing any of the 'skip' checks.

This expires dead entries faster.
/proc handler also removes outdated entries first.

Reported-by: Vitaly Zuevsky <vzuevsky@ns1.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-16 14:10:52 +01:00
Eric Dumazet
fc0d026a2f netfilter: nf_nat_masquerade: add netns refcount tracker to masq_dev_work
If compiled with CONFIG_NET_NS_REFCNT_TRACKER=y,
using put_net_track() in iterate_cleanup_work()
and netns_tracker_alloc() in nf_nat_masq_schedule()
might help us finding netns refcount imbalances.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-16 12:49:34 +01:00
Eric Dumazet
a9382d9389 netfilter: nfnetlink: add netns refcount tracker to struct nfulnl_instance
If compiled with CONFIG_NET_NS_REFCNT_TRACKER=y,
using put_net_track() in nfulnl_instance_free_rcu()
and get_net_track() in instance_create()
might help us finding netns refcount imbalances.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-16 12:49:34 +01:00
Gal Pressman
8a03ef676a net: Fix double 0x prefix print in SKB dump
When printing netdev features %pNF already takes care of the 0x prefix,
remove the explicit one.

Fixes: 6413139dfc64 ("skbuff: increase verbosity when dumping skb data")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 11:08:15 +00:00
Willem de Bruijn
ec6af094ea net/packet: rx_owner_map depends on pg_vec
Packet sockets may switch ring versions. Avoid misinterpreting state
between versions, whose fields share a union. rx_owner_map is only
allocated with a packet ring (pg_vec) and both are swapped together.
If pg_vec is NULL, meaning no packet ring was allocated, then neither
was rx_owner_map. And the field may be old state from a tpacket_v3.

Fixes: 61fad6816fc1 ("net/packet: tpacket_rcv: avoid a producer race condition")
Reported-by: Syzbot <syzbot+1ac0994a0a0c55151121@syzkaller.appspotmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211215143937.106178-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-15 17:49:36 -08:00
Jakub Kicinski
bd1d97d861 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next, mostly
rather small housekeeping patches:

1) Remove unused variable in IPVS, from GuoYong Zheng.

2) Use memset_after in conntrack, from Kees Cook.

3) Remove leftover function in nfnetlink_queue, from Florian Westphal.

4) Remove redundant test on bool in conntrack, from Bernard Zhao.

5) egress support for nft_fwd, from Lukas Wunner.

6) Make pppoe work for br_netfilter, from Florian Westphal.

7) Remove unused variable in conntrack resize routine, from luo penghao.

* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next:
  netfilter: conntrack: Remove useless assignment statements
  netfilter: bridge: add support for pppoe filtering
  netfilter: nft_fwd_netdev: Support egress hook
  netfilter: ctnetlink: remove useless type conversion to bool
  netfilter: nf_queue: remove leftover synchronize_rcu
  netfilter: conntrack: Use memset_startat() to zero struct nf_conn
  ipvs: remove unused variable for ip_vs_new_dest
====================

Link: https://lore.kernel.org/r/20211215234911.170741-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-15 17:29:28 -08:00
luo penghao
284ca7647c netfilter: conntrack: Remove useless assignment statements
The old_size assignment here will not be used anymore

The clang_analyzer complains as follows:

Value stored to 'old_size' is never read

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: luo penghao <luo.penghao@zte.com.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-16 00:17:40 +01:00
Ignacy Gawędzki
ebb966d3bd netfilter: fix regression in looped (broad|multi)cast's MAC handling
In commit 5648b5e1169f ("netfilter: nfnetlink_queue: fix OOB when mac
header was cleared"), the test for non-empty MAC header introduced in
commit 2c38de4c1f8da7 ("netfilter: fix looped (broad|multi)cast's MAC
handling") has been replaced with a test for a set MAC header.

This breaks the case when the MAC header has been reset (using
skb_reset_mac_header), as is the case with looped-back multicast
packets.  As a result, the packets ending up in NFQUEUE get a bogus
hwaddr interpreted from the first bytes of the IP header.

This patch adds a test for a non-empty MAC header in addition to the
test for a set MAC header.  The same two tests are also implemented in
nfnetlink_log.c, where the initial code of commit 2c38de4c1f8da7
("netfilter: fix looped (broad|multi)cast's MAC handling") has not been
touched, but where supposedly the same situation may happen.

Fixes: 5648b5e1169f ("netfilter: nfnetlink_queue: fix OOB when mac header was cleared")
Signed-off-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-16 00:06:24 +01:00