linux

iv/linux

Author	SHA1	Message	Date
Eric Dumazet	ac3959fd0d	tcp: remove obsolete check in __tcp_retransmit_skb() TSQ provides a nice way to avoid bufferbloat on individual socket, including retransmit packets. We can get rid of the old heuristic: /* Do not sent more than we queued. 1/4 is reserved for possible * copying overhead: fragmentation, tunneling, mangling etc. */ if (refcount_read(&sk->sk_wmem_alloc) > min_t(u32, sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2), sk->sk_sndbuf)) return -EAGAIN; This heuristic was giving false positives according to Jakub, whenever TX completions are delayed above RTT. (Ack packets are processed by TCP stack before clones are orphaned/freed) Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Jakub Kicinski <kuba@kernel.org> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 18:35:31 -08:00
Eric Dumazet	a7abf3cd76	tcp: consider using standard rtx logic in tcp_rcv_fastopen_synack() Jakub reported Data included in a Fastopen SYN that had to be retransmit would have to wait for an RTO if TX completions are slow, even with prior fix. This is because tcp_rcv_fastopen_synack() does not use standard rtx logic, meaning TSQ handler exits early in tcp_tsq_write() because tp->lost_out == tp->retrans_out Lets make tcp_rcv_fastopen_synack() use standard rtx logic, by using tcp_mark_skb_lost() on the skb thats needs to be sent again. Not this raised a warning in tcp_fastretrans_alert() during my tests since we consider the data not being aknowledged by the receiver does not mean packet was lost on the network. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Jakub Kicinski <kuba@kernel.org> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 18:35:31 -08:00
Eric Dumazet	f4dae54e48	tcp: plug skb_still_in_host_queue() to TSQ Jakub and Neil reported an increase of RTO timers whenever TX completions are delayed a bit more (by increasing NIC TX coalescing parameters) Main issue is that TCP stack has a logic preventing a packet being retransmit if the prior clone has not yet been orphaned or freed. This logic came with commit `1f3279ae0c` ("tcp: avoid retransmits of TCP packets hanging in host queues") Thankfully, in the case skb_still_in_host_queue() detects the initial clone is still in flight, it can use TSQ logic that will eventually retry later, at the moment the clone is freed or orphaned. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Neil Spring <ntspring@fb.com> Reported-by: Jakub Kicinski <kuba@kernel.org> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 18:35:31 -08:00
Hoang Huu Le	97bc84bbd4	tipc: clean up warnings detected by sparse This patch fixes the following warning from sparse: net/tipc/monitor.c:263:35: warning: incorrect type in assignment (different base types) net/tipc/monitor.c:263:35: expected unsigned int net/tipc/monitor.c:263:35: got restricted __be32 [usertype] [...] net/tipc/node.c:374:13: warning: context imbalance in 'tipc_node_read_lock' - wrong count at exit net/tipc/node.c:379:13: warning: context imbalance in 'tipc_node_read_unlock' - unexpected unlock net/tipc/node.c:384:13: warning: context imbalance in 'tipc_node_write_lock' - wrong count at exit net/tipc/node.c:389:13: warning: context imbalance in 'tipc_node_write_unlock_fast' - unexpected unlock net/tipc/node.c:404:17: warning: context imbalance in 'tipc_node_write_unlock' - unexpected unlock [...] net/tipc/crypto.c:1201:9: warning: incorrect type in initializer (different address spaces) net/tipc/crypto.c:1201:9: expected struct tipc_aead [noderef] __rcu __tmp net/tipc/crypto.c:1201:9: got struct tipc_aead [...] Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 18:06:54 -08:00
Hoang Le	1980d37565	tipc: convert dest node's address to network order (struct tipc_link_info)->dest is in network order (__be32), so we must convert the value to network order before assigning. The problem detected by sparse: net/tipc/netlink_compat.c:699:24: warning: incorrect type in assignment (different base types) net/tipc/netlink_compat.c:699:24: expected restricted __be32 [usertype] dest net/tipc/netlink_compat.c:699:24: got int Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 18:06:54 -08:00
Petr Machata	15e1dd5703	nexthop: Enable resilient next-hop groups Now that all the code is in place, stop rejecting requests to create resilient next-hop groups. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:13:00 -08:00
Petr Machata	0b4818aabc	nexthop: Notify userspace about bucket migrations Nexthop replacements et.al. are notified through netlink, but if a delayed work migrates buckets on the background, userspace will stay oblivious. Notify these as RTM_NEWNEXTHOPBUCKET events. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:13:00 -08:00
Petr Machata	187d4c6b97	nexthop: Add netlink handlers for bucket get Allow getting (but not setting) individual buckets to inspect the next hop mapped therein, idle time, and flags. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:13:00 -08:00
Petr Machata	8a1bbabb03	nexthop: Add netlink handlers for bucket dump Add a dump handler for resilient next hop buckets. When next-hop group ID is given, it walks buckets of that group, otherwise it walks buckets of all groups. It then dumps the buckets whose next hops match the given filtering criteria. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:13:00 -08:00
Petr Machata	a2601e2b1e	nexthop: Add netlink handlers for resilient nexthop groups Implement the netlink messages that allow creation and dumping of resilient nexthop groups. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:13:00 -08:00
Ido Schimmel	cfc15c1dbb	nexthop: Allow reporting activity of nexthop buckets The kernel periodically checks the idle time of nexthop buckets to determine if they are idle and can be re-populated with a new nexthop. When the resilient nexthop group is offloaded to hardware, the kernel will not see activity on nexthop buckets unless it is reported from hardware. Add a function that can be periodically called by device drivers to report activity on nexthop buckets after querying it from the underlying device. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:13:00 -08:00
Ido Schimmel	56ad5ba344	nexthop: Allow setting "offload" and "trap" indication of nexthop buckets Add a function that can be called by device drivers to set "offload" or "trap" indication on nexthop buckets following nexthop notifications and other changes such as a neighbour becoming invalid. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:12:59 -08:00
Petr Machata	7c37c7e004	nexthop: Implement notifiers for resilient nexthop groups Implement the following notifications towards drivers: - NEXTHOP_EVENT_REPLACE, when a resilient nexthop group is created. - NEXTHOP_EVENT_BUCKET_REPLACE any time there is a change in assignment of next hops to hash table buckets. That includes replacements, deletions, and delayed upkeep cycles. Some bucket notifications can be vetoed by the driver, to make it possible to propagate bucket busy-ness flags from the HW back to the algorithm. Some are however forced, e.g. if a next hop is deleted, all buckets that use this next hop simply must be migrated, whether the HW wishes so or not. - NEXTHOP_EVENT_RES_TABLE_PRE_REPLACE, before a resilient nexthop group is replaced. Usually the driver will get the bucket notifications as well, and could veto those. But in some cases, a bucket may not be migrated immediately, but during delayed upkeep, and that is too late to roll the transaction back. This notification allows the driver to take a look and veto the new proposed group up front, before anything is committed. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:12:59 -08:00
Petr Machata	283a72a559	nexthop: Add implementation of resilient next-hop groups At this moment, there is only one type of next-hop group: an mpath group, which implements the hash-threshold algorithm. To select a next hop, hash-threshold algorithm first assigns a range of hashes to each next hop in the group, and then selects the next hop by comparing the SKB hash with the individual ranges. When a next hop is removed from the group, the ranges are recomputed, which leads to reassignment of parts of hash space from one next hop to another. While there will usually be some overlap between the previous and the new distribution, some traffic flows change the next hop that they resolve to. That causes problems e.g. as established TCP connections are reset, because the traffic is forwarded to a server that is not familiar with the connection. Resilient hashing is a technique to address the above problem. Resilient next-hop group has another layer of indirection between the group itself and its constituent next hops: a hash table. The selection algorithm uses a straightforward modulo operation to choose a hash bucket, and then reads the next hop that this bucket contains, and forwards traffic there. This indirection brings an important feature. In the hash-threshold algorithm, the range of hashes associated with a next hop must be continuous. With a hash table, mapping between the hash table buckets and the individual next hops is arbitrary. Therefore when a next hop is deleted the buckets that held it are simply reassigned to other next hops. When weights of next hops in a group are altered, it may be possible to choose a subset of buckets that are currently not used for forwarding traffic, and use those to satisfy the new next-hop distribution demands, keeping the "busy" buckets intact. This way, established flows are ideally kept being forwarded to the same endpoints through the same paths as before the next-hop group change. In a nutshell, the algorithm works as follows. Each next hop has a number of buckets that it wants to have, according to its weight and the number of buckets in the hash table. In case of an event that might cause bucket allocation change, the numbers for individual next hops are updated, similarly to how ranges are updated for mpath group next hops. Following that, a new "upkeep" algorithm runs, and for idle buckets that belong to a next hop that is currently occupying more buckets than it wants (it is "overweight"), it migrates the buckets to one of the next hops that has fewer buckets than it wants (it is "underweight"). If, after this, there are still underweight next hops, another upkeep run is scheduled to a future time. Chances are there are not enough "idle" buckets to satisfy the new demands. The algorithm has knobs to select both what it means for a bucket to be idle, and for whether and when to forcefully migrate buckets if there keeps being an insufficient number of idle buckets. There are three users of the resilient data structures. - The forwarding code accesses them under RCU, and does not modify them except for updating the time a selected bucket was last used. - Netlink code, running under RTNL, which may modify the data. - The delayed upkeep code, which may modify the data. This runs unlocked, and mutual exclusion between the RTNL code and the delayed upkeep is maintained by canceling the delayed work synchronously before the RTNL code touches anything. Later it restarts the delayed work if necessary. The RTNL code has to implement next-hop group replacement, next hop removal, etc. For removal, the mpath code uses a neat trick of having a backup next hop group structure, doing the necessary changes offline, and then RCU-swapping them in. However, the hash tables for resilient hashing are about an order of magnitude larger than the groups themselves (the size might be e.g. 4K entries), and it was felt that keeping two of them is an overkill. Both the primary next-hop group and the spare therefore use the same resilient table, and writers are careful to keep all references valid for the forwarding code. The hash table references next-hop group entries from the next-hop group that is currently in the primary role (i.e. not spare). During the transition from primary to spare, the table references a mix of both the primary group and the spare. When a next hop is deleted, the corresponding buckets are not set to NULL, but instead marked as empty, so that the pointer is valid and can be used by the forwarding code. The buckets are then migrated to a new next-hop group entry during upkeep. The only times that the hash table is invalid is the very beginning and very end of its lifetime. Between those points, it is always kept valid. This patch introduces the core support code itself. It does not handle notifications towards drivers, which are kept as if the group were an mpath one. It does not handle netlink either. The only bit currently exposed to user space is the new next-hop group type, and that is currently bounced. There is therefore no way to actually access this code. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:12:59 -08:00
Ido Schimmel	710ec56223	nexthop: Add netlink defines and enumerators for resilient NH groups - RTM_NEWNEXTHOP et.al. that handle resilient groups will have a new nested attribute, NHA_RES_GROUP, whose elements are attributes NHA_RES_GROUP_. - RTM_NEWNEXTHOPBUCKET et.al. is a suite of new messages that will currently serve only for dumping of individual buckets of resilient next hop groups. For nexthop group buckets, these messages will carry a nested attribute NHA_RES_BUCKET, whose elements are attributes NHA_RES_BUCKET_. There are several reasons why a new suite of messages is created for nexthop buckets instead of overloading the information on the existing RTM_{NEW,DEL,GET}NEXTHOP messages. First, a nexthop group can contain a large number of nexthop buckets (4k is not unheard of). This imposes limits on the amount of information that can be encoded for each nexthop bucket given a netlink message is limited to 64k bytes. Second, while RTM_NEWNEXTHOPBUCKET is only used for notifications at this point, in the future it can be extended to provide user space with control over nexthop buckets configuration. - The new group type is NEXTHOP_GRP_TYPE_RES. Note that nexthop code is adjusted to bounce groups with that type for now. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:12:59 -08:00
Petr Machata	90e1a9e213	nexthop: Add a dedicated flag for multipath next-hop groups With the introduction of resilient nexthop groups, there will be two types of multipath groups: the current hash-threshold "mpath" ones, and resilient groups. Both are multipath, but to determine the fact, the system needs to consider two flags. This might prove costly in the datapath. Therefore, introduce a new flag, that should be set for next-hop groups that have more than one nexthop, and should be considered multipath. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:12:59 -08:00
Petr Machata	96a856256a	nexthop: __nh_notifier_single_info_init(): Make nh_info an argument The cited function currently uses rtnl_dereference() to get nh_info from a handed-in nexthop. However, under the resilient hashing scheme, this function will not always be called under RTNL, sometimes the mutual exclusion will be achieved differently. Therefore move the nh_info extraction from the function to its callers to make it possible to use a different synchronization guarantee. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:12:59 -08:00
Petr Machata	597f48e46b	nexthop: Pass nh_config to replace_nexthop() Currently, replace assumes that the new group that is given is a fully-formed object. But mpath groups really only have one attribute, and that is the constituent next hop configuration. This may not be universally true. From the usability perspective, it is desirable to allow the replace operation to adjust just the constituent next hop configuration and leave the group attributes as such intact. But the object that keeps track of whether an attribute was or was not given is the nh_config object, not the next hop or next-hop group. To allow (selective) attribute updates during NH group replacement, propagate `cfg' to replace_nexthop() and further to replace_nexthop_grp(). Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:12:59 -08:00
Julien Massonneau	fbbc5bc2ab	seg6: ignore routing header with segments left equal to 0 When there are 2 segments routing header, after an End.B6 action for example, the second SRH will never be handled by an action, packet will be dropped when the first SRH has segments left equal to 0. For actions that doesn't perform decapsulation (currently: End, End.X, End.T, End.B6, End.B6.Encaps), this patch adds the IP6_FH_F_SKIP_RH flag in arguments for ipv6_find_hdr(). Signed-off-by: Julien Massonneau <julien.massonneau@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:09:21 -08:00
Julien Massonneau	ee90c6ba34	seg6: add support for IPv4 decapsulation in ipv6_srh_rcv() As specified in IETF RFC 8754, section 4.3.1.2, if the upper layer header is IPv4 or IPv6, perform IPv6 decapsulation and resubmit the decapsulated packet to the IPv4 or IPv6 module. Only IPv6 decapsulation was implemented. This patch adds support for IPv4 decapsulation. Link: https://tools.ietf.org/html/rfc8754#section-4.3.1.2 Signed-off-by: Julien Massonneau <julien.massonneau@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-11 16:09:21 -08:00
Shubhankar Kuranagatti	6b9c8f46af	net: ipv4: route.c: fix space before tab The extra space before tab space has been removed. Signed-off-by: Shubhankar Kuranagatti <shubhankarvk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 15:37:19 -08:00
Shubhankar Kuranagatti	13fdb9403d	net: ipv6: route.c:fix indentation The series of space has been replaced by tab space wherever required. Signed-off-by: Shubhankar Kuranagatti <shubhankarvk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 12:45:16 -08:00
Ido Schimmel	58c04397f7	sched: act_sample: Implement stats_update callback Implement this callback in order to get the offloaded stats added to the kernel stats. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 12:45:15 -08:00
Yunsheng Lin	1ddc3229ad	skbuff: remove some unnecessary operation in skb_segment_list() gro list uses skb_shinfo(skb)->frag_list to link two skb together, and NAPI_GRO_CB(p)->last->next is used when there are more skb, see skb_gro_receive_list(). gso expects that each segmented skb is linked together using skb->next, so only the first skb->next need to set to skb_shinfo(skb)-> frag_list when doing gso list segment. It is the same reason that nskb->next does not need to be set to list_skb before goto the error handling, because nskb->next already pointers to list_skb. And nskb is also the last skb at the end of loop, so remove tail variable and use nskb instead. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 12:45:15 -08:00
Gustavo A. R. Silva	90d181ca48	net: rose: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple warnings by explicitly adding multiple break statements instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 12:45:15 -08:00
Gustavo A. R. Silva	b1866bfff9	net: core: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 12:45:15 -08:00
Gustavo A. R. Silva	ecd1c6a51f	net: bridge: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 12:45:15 -08:00
Gustavo A. R. Silva	5646fba6ea	net: ax25: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 12:45:15 -08:00
Gustavo A. R. Silva	4cdbe58b4b	decnet: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 12:45:15 -08:00
Yejune Deng	3e6f20e09a	net/rds: Drop duplicate sin and sin6 assignments There is no need to assign the msg->msg_name to sin or sin6, because there is DECLARE_SOCKADDR statement. Signed-off-by: Yejune Deng <yejune.deng@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 12:45:15 -08:00
David S. Miller	c1acda9807	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-03-09 The following pull-request contains BPF updates for your net-next tree. We've added 90 non-merge commits during the last 17 day(s) which contain a total of 114 files changed, 5158 insertions(+), 1288 deletions(-). The main changes are: 1) Faster bpf_redirect_map(), from Björn. 2) skmsg cleanup, from Cong. 3) Support for floating point types in BTF, from Ilya. 4) Documentation for sys_bpf commands, from Joe. 5) Support for sk_lookup in bpf_prog_test_run, form Lorenz. 6) Enable task local storage for tracing programs, from Song. 7) bpf_for_each_map_elem() helper, from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-09 18:07:05 -08:00
Linus Torvalds	05a59d7979	Merge git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from David Miller: 1) Fix transmissions in dynamic SMPS mode in ath9k, from Felix Fietkau. 2) TX skb error handling fix in mt76 driver, also from Felix. 3) Fix BPF_FETCH atomic in x86 JIT, from Brendan Jackman. 4) Avoid double free of percpu pointers when freeing a cloned bpf prog. From Cong Wang. 5) Use correct printf format for dma_addr_t in ath11k, from Geert Uytterhoeven. 6) Fix resolve_btfids build with older toolchains, from Kun-Chuan Hsieh. 7) Don't report truncated frames to mac80211 in mt76 driver, from Lorenzop Bianconi. 8) Fix watcdog timeout on suspend/resume of stmmac, from Joakim Zhang. 9) mscc ocelot needs NET_DEVLINK selct in Kconfig, from Arnd Bergmann. 10) Fix sign comparison bug in TCP_ZEROCOPY_RECEIVE getsockopt(), from Arjun Roy. 11) Ignore routes with deleted nexthop object in mlxsw, from Ido Schimmel. 12) Need to undo tcp early demux lookup sometimes in nf_nat, from Florian Westphal. 13) Fix gro aggregation for udp encaps with zero csum, from Daniel Borkmann. 14) Make sure to always use imp_ndo_send when necessaey, from Jason A. Donenfeld. 15) Fix TRSCER masks in sh_eth driver from Sergey Shtylyov. 16) prevent overly huge skb allocationsd in qrtr, from Pavel Skripkin. 17) Prevent rx ring copnsumer index loss of sync in enetc, from Vladimir Oltean. 18) Make sure textsearch copntrol block is large enough, from Wilem de Bruijn. 19) Revert MAC changes to r8152 leading to instability, from Hates Wang. 20) Advance iov in 9p even for empty reads, from Jissheng Zhang. 21) Double hook unregister in nftables, from PabloNeira Ayuso. 22) Fix memleak in ixgbe, fropm Dinghao Liu. 23) Avoid dups in pkt scheduler class dumps, from Maximilian Heyne. 24) Various mptcp fixes from Florian Westphal, Paolo Abeni, and Geliang Tang. 25) Fix DOI refcount bugs in cipso, from Paul Moore. 26) One too many irqsave in ibmvnic, from Junlin Yang. 27) Fix infinite loop with MPLS gso segmenting via virtio_net, from Balazs Nemeth. git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net: (164 commits) s390/qeth: fix notification for pending buffers during teardown s390/qeth: schedule TX NAPI on QAOB completion s390/qeth: improve completion of pending TX buffers s390/qeth: fix memory leak after failed TX Buffer allocation net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0 net: check if protocol extracted by virtio_net_hdr_set_proto is correct net: dsa: xrs700x: check if partner is same as port in hsr join net: lapbether: Remove netif_start_queue / netif_stop_queue atm: idt77252: fix null-ptr-dereference atm: uPD98402: fix incorrect allocation atm: fix a typo in the struct description net: qrtr: fix error return code of qrtr_sendmsg() mptcp: fix length of ADD_ADDR with port sub-option net: bonding: fix error return code of bond_neigh_init() net: enetc: allow hardware timestamping on TX queues with tc-etf enabled net: enetc: set MAC RX FIFO to recommended value net: davicom: Use platform_get_irq_optional() net: davicom: Fix regulator not turned off on driver removal net: davicom: Fix regulator not turned off on failed probe net: dsa: fix switchdev objects on bridge master mistakenly being applied on ports ...	2021-03-09 17:15:56 -08:00
Balazs Nemeth	d348ede32e	net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0 A packet with skb_inner_network_header(skb) == skb_network_header(skb) and ETH_P_MPLS_UC will prevent mpls_gso_segment from pulling any headers from the packet. Subsequently, the call to skb_mac_gso_segment will again call mpls_gso_segment with the same packet leading to an infinite loop. In addition, ensure that the header length is a multiple of four, which should hold irrespective of the number of stacked labels. Signed-off-by: Balazs Nemeth <bnemeth@redhat.com> Acked-by: Willem de Bruijn <willemb@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-09 16:12:20 -08:00
Björn Töpel	ee75aef23a	bpf, xdp: Restructure redirect actions The XDP_REDIRECT implementations for maps and non-maps are fairly similar, but obviously need to take different code paths depending on if the target is using a map or not. Today, the redirect targets for XDP either uses a map, or is based on ifindex. Here, the map type and id are added to bpf_redirect_info, instead of the actual map. Map type, map item/ifindex, and the map_id (if any) is passed to xdp_do_redirect(). For ifindex-based redirect, used by the bpf_redirect() XDP BFP helper, a special map type/id are used. Map type of UNSPEC together with map id equal to INT_MAX has the special meaning of an ifindex based redirect. Note that valid map ids are 1 inclusive, INT_MAX exclusive ([1,INT_MAX[). In addition to making the code easier to follow, using explicit type and id in bpf_redirect_info has a slight positive performance impact by avoiding a pointer indirection for the map type lookup, and instead use the cacheline for bpf_redirect_info. Since the actual map is not passed via bpf_redirect_info anymore, the map lookup is only done in the BPF helper. This means that the bpf_clear_redirect_map() function can be removed. The actual map item is RCU protected. The bpf_redirect_info flags member is not used by XDP, and not read/written any more. The map member is only written to when required/used, and not unconditionally. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210308112907.559576-3-bjorn.topel@gmail.com	2021-03-10 01:06:34 +01:00
Björn Töpel	e6a4750ffe	bpf, xdp: Make bpf_redirect_map() a map operation Currently the bpf_redirect_map() implementation dispatches to the correct map-lookup function via a switch-statement. To avoid the dispatching, this change adds bpf_redirect_map() as a map operation. Each map provides its bpf_redirect_map() version, and correct function is automatically selected by the BPF verifier. A nice side-effect of the code movement is that the map lookup functions are now local to the map implementation files, which removes one additional function call. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210308112907.559576-2-bjorn.topel@gmail.com	2021-03-10 01:06:34 +01:00
Jia-Ju Bai	179d0ba0c4	net: qrtr: fix error return code of qrtr_sendmsg() When sock_alloc_send_skb() returns NULL to skb, no error return code of qrtr_sendmsg() is assigned. To fix this bug, rc is assigned with -ENOMEM in this case. Fixes: `194ccc8829` ("net: qrtr: Support decoding incoming v2 packets") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-08 15:02:53 -08:00
Davide Caratti	27ab92d999	mptcp: fix length of ADD_ADDR with port sub-option in current Linux, MPTCP peers advertising endpoints with port numbers use a sub-option length that wrongly accounts for the trailing TCP NOP. Also, receivers will only process incoming ADD_ADDR with port having such wrong sub-option length. Fix this, making ADD_ADDR compliant to RFC8684 §3.4.1. this can be verified running tcpdump on the kselftests artifacts: unpatched kernel: [root@bottarga mptcp]# tcpdump -tnnr unpatched.pcap \| grep add-addr reading from file unpatched.pcap, link-type LINUX_SLL (Linux cooked v1), snapshot length 65535 IP 10.0.1.1.10000 > 10.0.1.2.53078: Flags [.], ack 101, win 509, options [nop,nop,TS val 214459678 ecr 521312851,mptcp add-addr v1 id 1 a00:201:2774:2d88:7436:85c3:17fd:101], length 0 IP 10.0.1.2.53078 > 10.0.1.1.10000: Flags [.], ack 101, win 502, options [nop,nop,TS val 521312852 ecr 214459678,mptcp add-addr[bad opt]] patched kernel: [root@bottarga mptcp]# tcpdump -tnnr patched.pcap \| grep add-addr reading from file patched.pcap, link-type LINUX_SLL (Linux cooked v1), snapshot length 65535 IP 10.0.1.1.10000 > 10.0.1.2.38178: Flags [.], ack 101, win 509, options [nop,nop,TS val 3728873902 ecr 2732713192,mptcp add-addr v1 id 1 10.0.2.1:10100 hmac 0xbccdfcbe59292a1f,nop,nop], length 0 IP 10.0.1.2.38178 > 10.0.1.1.10000: Flags [.], ack 101, win 502, options [nop,nop,TS val 2732713195 ecr 3728873902,mptcp add-addr v1-echo id 1 10.0.2.1:10100,nop,nop], length 0 Fixes: `22fb85ffae` ("mptcp: add port support for ADD_ADDR suboption writing") CC: stable@vger.kernel.org # 5.11+ Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Acked-and-tested-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-08 15:02:03 -08:00
Vladimir Oltean	03cbb87054	net: dsa: fix switchdev objects on bridge master mistakenly being applied on ports Tobias reports that after the blamed patch, VLAN objects being added to a bridge device are being added to all slave ports instead (swp2, swp3). ip link add br0 type bridge vlan_filtering 1 ip link set swp2 master br0 ip link set swp3 master br0 bridge vlan add dev br0 vid 100 self This is because the fix was too broad: we made dsa_port_offloads_netdev say "yes, I offload the br0 bridge" for all slave ports, but we didn't add the checks whether the switchdev object was in fact meant for the physical port or for the bridge itself. So we are reacting on events in a way in which we shouldn't. The reason why the fix was too broad is because the question itself, "does this DSA port offload this netdev", was too broad in the first place. The solution is to disambiguate the question and separate it into two different functions, one to be called for each switchdev attribute / object that has an orig_dev == net_bridge (dsa_port_offloads_bridge), and the other for orig_dev == net_bridge_port (*_offloads_bridge_port). In the case of VLAN objects on the bridge interface, this solves the problem because we know that VLAN objects are per bridge port and not per bridge. And when orig_dev is equal to the net_bridge, we offload it as a bridge, but not as a bridge port; that's how we are able to skip reacting on those events. Note that this is compatible with future plans to have explicit offloading of VLAN objects on the bridge interface as a bridge port (in DSA, this signifies that we should add that VLAN towards the CPU port). Fixes: `99b8202b17` ("net: dsa: fix SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING getting ignored") Reported-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com> Tested-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-08 11:59:00 -08:00
Björn Töpel	a23b3f5697	xsk: Update rings for load-acquire/store-release barriers Currently, the AF_XDP rings uses general smp_{r,w,}mb() barriers on the kernel-side. On most modern architectures load-acquire/store-release barriers perform better, and results in simpler code for circular ring buffers. This change updates the XDP socket rings to use load-acquire/store-release barriers. It is important to note that changing from the old smp_{r,w,}mb() barriers, to load-acquire/store-release barriers does not break compatibility. The old semantics work with the new one, and vice versa. As pointed out by "Documentation/memory-barriers.txt" in the "SMP BARRIER PAIRING" section: "General barriers pair with each other, though they also pair with most other types of barriers, albeit without multicopy atomicity. An acquire barrier pairs with a release barrier, but both may also pair with other barriers, including of course general barriers." How different barriers behaves and pairs is outlined in "tools/memory-model/Documentation/cheatsheet.txt". In order to make sure that compatibility is not broken, LKMM herd7 based litmus tests can be constructed and verified. We generalize the XDP socket ring to a one entry ring, and create two scenarios; One where the ring is full, where only the consumer can proceed, followed by the producer. One where the ring is empty, where only the producer can proceed, followed by the consumer. Each scenario is then expanded to four different tests: general producer/general consumer, general producer/acqrel consumer, acqrel producer/general consumer, acqrel producer/acqrel consumer. In total eight tests. The empty ring test: C spsc-rb+empty // Simple one entry ring: // prod cons allowed action prod cons // 0 0 => prod => 1 0 // 0 1 => cons => 0 0 // 1 0 => cons => 1 1 // 1 1 => prod => 0 1 {} // We start at prod==0, cons==0, data==0, i.e. nothing has been // written to the ring. From here only the producer can start, and // should write 1. Afterwards, consumer can continue and read 1 to // data. Can we enter state prod==1, cons==1, but consumer observed // the incorrect value of 0? P0(int prod, int cons, int data) { ... producer } P1(int prod, int cons, int data) { ... consumer } exists( 1:d=0 /\ prod=1 /\ cons=1 ); The full ring test: C spsc-rb+full // Simple one entry ring: // prod cons allowed action prod cons // 0 0 => prod => 1 0 // 0 1 => cons => 0 0 // 1 0 => cons => 1 1 // 1 1 => prod => 0 1 { prod = 1; } // We start at prod==1, cons==0, data==1, i.e. producer has // written 0, so from here only the consumer can start, and should // consume 0. Afterwards, producer can continue and write 1 to // data. Can we enter state prod==0, cons==1, but consumer observed // the write of 1? P0(int prod, int cons, int data) { ... producer } P1(int prod, int cons, int data) { ... consumer } exists( 1:d=1 /\ prod=0 /\ cons=1 ); where P0 and P1 are: P0(int prod, int cons, int data) { int p; p = READ_ONCE(prod); if (READ_ONCE(cons) == p) { WRITE_ONCE(data, 1); smp_wmb(); WRITE_ONCE(prod, p ^ 1); } } P0(int prod, int cons, int data) { int p; p = READ_ONCE(prod); if (READ_ONCE(cons) == p) { WRITE_ONCE(data, 1); smp_store_release(prod, p ^ 1); } } P1(int prod, int cons, int data) { int c; int d = -1; c = READ_ONCE(cons); if (READ_ONCE(prod) != c) { smp_rmb(); d = READ_ONCE(data); smp_mb(); WRITE_ONCE(cons, c ^ 1); } } P1(int prod, int cons, int data) { int c; int d = -1; c = READ_ONCE(cons); if (smp_load_acquire(prod) != c) { d = READ_ONCE(*data); smp_store_release(cons, c ^ 1); } } The full LKMM litmus tests are found at [1]. On x86-64 systems the l2fwd AF_XDP xdpsock sample performance increases by 1%. This is mostly due to that the smp_mb() is removed, which is a relatively expensive operation on these platforms. Weakly-ordered platforms, such as ARM64 might benefit even more. [1] https://github.com/bjoto/litmus-xsk Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210305094113.413544-2-bjorn.topel@gmail.com	2021-03-08 08:52:05 -08:00
David S. Miller	9270bbe258	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Fix incorrect enum type definition in nfnetlink_cthelper UAPI, from Dmitry V. Levin. 2) Remove extra space in deprecated automatic helper assignment notice, from Klemen Košir. 3) Drop early socket demux socket after NAT mangling, from Florian Westphal. Add a test to exercise this bug. 4) Fix bogus invalid packet report in the conntrack TCP tracker, also from Florian. 5) Fix access to xt[NFPROTO_UNSPEC] list with no mutex in target/match_revfn(), from Vasily Averin. 6) Disallow updates on the table ownership flag. 7) Fix double hook unregistration of tables with owner. 8) Remove bogus check on the table owner in __nft_release_tables(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-06 17:02:40 -08:00
Sergey Nazarov	e233febda6	CIPSO: Fix unaligned memory access in cipso_v4_gentag_hdr We need to use put_unaligned when writing 32-bit DOI value in cipso_v4_gentag_hdr to avoid unaligned memory access. v2: unneeded type cast removed as Ondrej Mosnacek suggested. Signed-off-by: Sergey Nazarov <s-nazarov@yandex.ru> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-05 13:00:38 -08:00
Xuesen Huang	d01b59c9ae	bpf: Add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_ENCAP_L2_ETH bpf_skb_adjust_room sets the inner_protocol as skb->protocol for packets encapsulation. But that is not appropriate when pushing Ethernet header. Add an option to further specify encap L2 type and set the inner_protocol as ETH_P_TEB. Suggested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Xuesen Huang <huangxuesen@kuaishou.com> Signed-off-by: Zhiyong Cheng <chengzhiyong@kuaishou.com> Signed-off-by: Li Wang <wangli09@kuaishou.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/bpf/20210304064046.6232-1-hxseverything@gmail.com	2021-03-05 16:59:00 +01:00
Lorenz Bauer	7c32e8f8bc	bpf: Add PROG_TEST_RUN support for sk_lookup programs Allow to pass sk_lookup programs to PROG_TEST_RUN. User space provides the full bpf_sk_lookup struct as context. Since the context includes a socket pointer that can't be exposed to user space we define that PROG_TEST_RUN returns the cookie of the selected socket or zero in place of the socket pointer. We don't support testing programs that select a reuseport socket, since this would mean running another (unrelated) BPF program from the sk_lookup test handler. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210303101816.36774-3-lmb@cloudflare.com	2021-03-04 19:11:29 -08:00
Lorenz Bauer	607b9cc92b	bpf: Consolidate shared test timing code Share the timing / signal interruption logic between different implementations of PROG_TEST_RUN. There is a change in behaviour as well. We check the loop exit condition before checking for pending signals. This resolves an edge case where a signal arrives during the last iteration. Instead of aborting with EINTR we return the successful result to user space. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210303101816.36774-2-lmb@cloudflare.com	2021-03-04 19:11:29 -08:00
Paul Moore	ad5d07f4a9	cipso,calipso: resolve a number of problems with the DOI refcounts The current CIPSO and CALIPSO refcounting scheme for the DOI definitions is a bit flawed in that we: 1. Don't correctly match gets/puts in netlbl_cipsov4_list(). 2. Decrement the refcount on each attempt to remove the DOI from the DOI list, only removing it from the list once the refcount drops to zero. This patch fixes these problems by adding the missing "puts" to netlbl_cipsov4_list() and introduces a more conventional, i.e. not-buggy, refcounting mechanism to the DOI definitions. Upon the addition of a DOI to the DOI list, it is initialized with a refcount of one, removing a DOI from the list removes it from the list and drops the refcount by one; "gets" and "puts" behave as expected with respect to refcounts, increasing and decreasing the DOI's refcount by one. Fixes: `b1edeb1023` ("netlabel: Replace protocol/NetLabel linking with refrerence counts") Fixes: `d7cce01504` ("netlabel: Add support for removing a CALIPSO DOI.") Reported-by: syzbot+9ec037722d2603a9f52e@syzkaller.appspotmail.com Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-04 15:26:57 -08:00
Geliang Tang	9238e900d6	mptcp: free resources when the port number is mismatched When the port number is mismatched with the announced ones, use 'goto dispose_child' to free the resources instead of using 'goto out'. This patch also moves the port number checking code in subflow_syn_recv_sock before mptcp_finish_join, otherwise subflow_drop_ctx will fail in dispose_child. Fixes: `5bc56388c7` ("mptcp: add port number check for MP_JOIN") Reported-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-04 14:30:13 -08:00
Paolo Abeni	417789df4a	mptcp: fix missing wakeup __mptcp_clean_una() can free write memory and should wake-up user-space processes when needed. When such function is invoked by the MPTCP receive path, the wakeup is not needed, as the TCP stack will later trigger subflow_write_space which will do the wakeup as needed. Other __mptcp_clean_una() call sites need an additional wakeup check Let's bundle the relevant code in a new helper and use it. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/165 Fixes: `6e628cd3a8` ("mptcp: use mptcp release_cb for delayed tasks") Fixes: `64b9cea7a0` ("mptcp: fix spurious retransmissions") Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-04 14:30:13 -08:00
Paolo Abeni	c2e6048fa1	mptcp: fix race in release_cb If we receive a MPTCP_PUSH_PENDING even from a subflow when mptcp_release_cb() is serving the previous one, the latter will be delayed up to the next release_sock(msk). Address the issue implementing a test/serve loop for such event. Additionally rename the push helper to __mptcp_push_pending() to be more consistent with the existing code. Fixes: `6e628cd3a8` ("mptcp: use mptcp release_cb for delayed tasks") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-04 14:30:13 -08:00
Paolo Abeni	2948d0a1e5	mptcp: factor out __mptcp_retrans helper() Will simplify the following patch, no functional change intended. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-04 14:30:13 -08:00
Florian Westphal	c8fe62f076	mptcp: reset 'first' and ack_hint on subflow close Just like with last_snd, we have to NULL 'first' on subflow close. ack_hint isn't strictly required (its never dereferenced), but better to clear this explicitly as well instead of making it an exception. msk->first is dereferenced unconditionally at accept time, but at that point the ssk is not on the conn_list yet -- this means worker can't see it when iterating the conn_list. Reported-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-04 14:30:13 -08:00

1 2 3 4 5 ...

64212 Commits