linux

iv/linux

Author	SHA1	Message	Date
Florian Westphal	cc1bb845ad	xfrm: policy: return NULL when inexact search needed currently policy_hash_bysel() returns the hash bucket list (for exact policies), or the inexact list (when policy uses a prefix). Searching this inexact list is slow, so it might be better to pre-sort inexact lists into a tree or another data structure for faster searching. However, due to 'any' policies, that need to be searched in any case, doing so will require that 'inexact' policies need to be handled specially to decide the best search strategy. So change hash_bysel() and return NULL if the policy can't be handled via the policy hash table. Right now, we simply use the inexact list when this happens, but future patch can then implement a different strategy. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2018-11-09 11:57:38 +01:00
Florian Westphal	a927d6af53	xfrm: policy: split list insertion into a helper ... so we can reuse this later without code duplication when we add policy to a second inexact list. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2018-11-09 11:57:30 +01:00
Florian Westphal	ceb159e30a	xfrm: security: iterate all, not inexact lists currently all non-socket policies are either hashed in the dst table, or placed on the 'inexact list'. When flushing, we first walk the table, then the (per-direction) inexact lists. When we try and get rid of the inexact lists to having "n" inexact lists (e.g. per-af inexact lists, or sorted into a tree), this walk would become more complicated. Simplify this: walk the 'all' list and skip socket policies during traversal so we don't need to handle exact and inexact policies separately anymore. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2018-11-09 11:57:22 +01:00
Florian Westphal	b69d540da7	selftests: add xfrm policy test script add a script that adds a ipsec tunnel between two network namespaces plus following policies: .0/24 -> ipsec tunnel .240/28 -> bypass .253/32 -> ipsec tunnel Then check that .254 bypasses tunnel (match /28 exception), and .2 (match /24) and .253 (match direct policy) pass through the tunnel. Abuses iptables to check if ping did resolve an ipsec policy or not. Also adds a bunch of 'block' rules that are not supposed to match. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2018-11-09 11:57:08 +01:00
Edward Cree	29e1220717	sfc: use the new __netdev_tx_sent_queue BQL optimisation As added in `3e59020abf` ("net: bql: add __netdev_tx_sent_queue()"), which see for performance rationale. Signed-off-by: Edward Cree <ecree@solarflare.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 20:01:29 -08:00
David S. Miller	eb4149c9a5	Merge branch 'net-Remove-VLAN_TAG_PRESENT-from-drivers' Michał Mirosław says: ==================== net: Remove VLAN_TAG_PRESENT from drivers This series removes VLAN_TAG_PRESENT use from network drivers in preparation to removing its special meaning. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:49:32 -08:00
Michał Mirosław	f4f9a5e6cc	gianfar: remove use of VLAN_TAG_PRESENT Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:49:32 -08:00
Michał Mirosław	9df46aefaf	OVS: remove use of VLAN_TAG_PRESENT This is a minimal change to allow removing of VLAN_TAG_PRESENT. It leaves OVS unable to use CFI bit, as fixing this would need a deeper surgery involving userspace interface. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:49:31 -08:00
Michał Mirosław	f723a1a293	cnic: remove use of VLAN_TAG_PRESENT This just removes VLAN_TAG_PRESENT use. VLAN TCI=0 special meaning is deeply embedded in the driver code and so is left as is. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:49:31 -08:00
Michał Mirosław	1ef212afa4	i40iw: remove use of VLAN_TAG_PRESENT Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:49:31 -08:00
Ilias Apalodimas	0d404a6128	net: socionext: refactor netsec_alloc_dring() return -ENOMEM directly instead of assigning it in a variable Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:42:41 -08:00
Ilias Apalodimas	4acb20b462	net: socionext: different approach on DMA Current driver dynamically allocates an skb and maps it as DMA Rx buffer. In order to prepare for upcoming XDP changes, let's introduce a different allocation scheme. Buffers are allocated dynamically and mapped into hardware. During the Rx operation the driver uses build_skb() to produce the necessary buffers for the network stack. This change increases performance ~15% on 64b packets with smmu disabled and ~5% with smmu enabled Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:42:41 -08:00
Stefan Wahren	026b907d58	net: qca_spi: Add available buffer space verification Interferences on the SPI line could distort the response of available buffer space. So at least we should check that the response doesn't exceed the maximum available buffer space. In error case increase a new error counter and retry it later. This behavior avoids buffer errors in the QCA7000, which results in an unnecessary chip reset including packet loss. Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:41:01 -08:00
David Barmann	50254256f3	sock: Reset dst when changing sk_mark via setsockopt When setting the SO_MARK socket option, if the mark changes, the dst needs to be reset so that a new route lookup is performed. This fixes the case where an application wants to change routing by setting a new sk_mark. If this is done after some packets have already been sent, the dst is cached and has no effect. Signed-off-by: David Barmann <david.barmann@stackpath.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 19:36:13 -08:00
David S. Miller	52358cb5a3	Merge branch 's390-qeth-next' Julian Wiedmann says: ==================== s390/qeth: updates 2018-11-08 please apply the following qeth patches to net-next. The first patch allows one more device type to query the FW for a MAC address, the others are all basically just removal of duplicated or unused code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:22:24 -08:00
Julian Wiedmann	ded9da1fc2	s390/qeth: don't process hsuid in qeth_l3_setup_netdev() qeth_l3_setup_netdev() checks if the hsuid attribute is set on the qeth device, and propagates it to the net_device. In the past this was needed to pick up any hsuid that was set before allocation of the net_device. With commit `d3d1b205e8` ("s390/qeth: allocate netdevice early") this is no longer necessary, qeth_l3_dev_hsuid_store() always stores the hsuid straight into dev->perm_addr. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:22:24 -08:00
Julian Wiedmann	9168f5ae38	s390/qeth: remove unused fallback in Layer3's MAC code If the CREATE ADDR sent by qeth_l3_iqd_read_initial_mac() fails, its callback sets a random MAC address on the net_device. The error then propagates back, and qeth_l3_setup_netdev() bails out without registering the net_device. Any subsequent call to qeth_l3_setup_netdev() will then attempt a fresh CREATE ADDR which either 1) also fails, or 2) sets a proper MAC address on the net_device. Consequently, the net_device will never be registered with a random MAC and we can drop the fallback code. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:22:24 -08:00
Julian Wiedmann	4fa55fa94f	s390/qeth: remove two IPA command helpers qeth_l3_send_ipa_arp_cmd() is merely a wrapper around qeth_send_control_data() now. So push the length adjustment into QETH_SETASS_BASE_LEN, and remove the wrapper. While at it, also remove some redundant 0-initializations. qeth_send_setassparms() requires that callers prepare their command parameters, so that they can be copied into the parameter area in one go. Skip the indirection, and just let callers set up the command themselves. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:22:24 -08:00
Julian Wiedmann	605c9d5f58	s390/qeth: replace open-coded cmd setup Call qeth_prepare_ipa_cmd() during setup of a new IPA cmd buffer, so that it is used for all commands. Thus ARP and SNMP requests don't have to do their own initialization. This will now also set the proper MPC protocol version for SNMP requests on L2 devices. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:22:23 -08:00
Julian Wiedmann	d7d18da1f7	s390/qeth: remove card list Re-implement the card-by-RDEV lookup by using device model concepts, and remove the now redundant list of all qeth card instances in the system. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:22:23 -08:00
Julian Wiedmann	81ec543939	s390/qeth: unify transmit code Since commit `82bf5c0867` ("s390/qeth: add support for IPv6 TSO"), qeth_xmit() also knows how to build TSO packets and is practically identical to qeth_l3_xmit(). Convert qeth_l3_xmit() into a thin wrapper that merely strips the L2 header off a packet, and calls qeth_xmit() for the actual TX processing. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:22:23 -08:00
Julian Wiedmann	5a541f6d00	s390/qeth: handle af_iucv skbs in qeth_l3_fill_header() Filling the HW header from one single function will make it easier to rip out all the duplicated transmit code in qeth_l3_xmit(). On top, this saves one conditional branch in the TSO path. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:22:23 -08:00
Julian Wiedmann	b144b99fff	s390/qeth: utilize virtual MAC for Layer2 OSD devices By default, READ MAC on a Layer2 OSD device returns the adapter's burnt-in MAC address. Given the default scenario of many virtual devices on the same adapter, qeth can't make any use of this address and therefore skips the READ MAC call for this device type. But in some configurations, the READ MAC command for a Layer2 OSD device actually returns a pre-provisioned, virtual MAC address. So enable the READ MAC code to detect this situation, and let the L2 subdriver call READ MAC for OSD devices. This also removes the QETH_LAYER2_MAC_READ flag, which protects L2 devices against calling READ MAC multiple times. Instead protect the whole call to qeth_l2_request_initial_mac(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:22:23 -08:00
Li RongQing	04087d9a89	openvswitch: remove BUG_ON from get_dpdev if local is NULL pointer, and the following access of local's dev will trigger panic, which is same as BUG_ON Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:14:59 -08:00
David S. Miller	20da4ef91c	Merge branch 'ICMP-error-handling-for-UDP-tunnels' Stefano Brivio says: ==================== ICMP error handling for UDP tunnels This series introduces ICMP error handling for UDP tunnels and encapsulations and related selftests. We need to handle ICMP errors to support PMTU discovery and route redirection -- this support is entirely missing right now: - patch 1/11 adds a socket lookup for UDP tunnels that use, by design, the same destination port on both endpoints -- i.e. VXLAN and GENEVE - patches 2/11 to 7/11 are specific to VxLAN and GENEVE - patches 8/11 and 9/11 add infrastructure for lookup of encapsulations where sent packets cannot be matched via receiving socket lookup, i.e. FoU and GUE - patches 10/11 and 11/11 are specific to FoU and GUE v2: changes are listed in the single patches ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:13:09 -08:00
Stefano Brivio	56fd865f46	selftests: pmtu: Introduce FoU and GUE PMTU exceptions tests Introduce eight tests, for FoU and GUE, with IPv4 and IPv6 payload, on IPv4 and IPv6 transport, that check that PMTU exceptions are created with the right value when exceeding the MTU on a link of the path. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:13:08 -08:00
Stefano Brivio	b8a51b38e4	fou, fou6: ICMP error handlers for FoU and GUE As the destination port in FoU and GUE receiving sockets doesn't necessarily match the remote destination port, we can't associate errors to the encapsulating tunnels with a socket lookup -- we need to blindly try them instead. This means we don't even know if we are handling errors for FoU or GUE without digging into the packets. Hence, implement a single handler for both, one for IPv4 and one for IPv6, that will check whether the packet that generated the ICMP error used a direct IP encapsulation or if it had a GUE header, and send the error to the matching protocol handler, if any. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:13:08 -08:00
Stefano Brivio	e7cc082455	udp: Support for error handlers of tunnels with arbitrary destination port ICMP error handling is currently not possible for UDP tunnels not employing a receiving socket with local destination port matching the remote one, because we have no way to look them up. Add an err_handler tunnel encapsulation operation that can be exported by tunnels in order to pass the error to the protocol implementing the encapsulation. We can't easily use a lookup function as we did for VXLAN and GENEVE, as protocol error handlers, which would be in turn called by implementations of this new operation, handle the errors themselves, together with the tunnel lookup. Without a socket, we can't be sure which encapsulation error handler is the appropriate one: encapsulation handlers (the ones for FoU and GUE introduced in the next patch, e.g.) will need to check the new error codes returned by protocol handlers to figure out if errors match the given encapsulation, and, in turn, report this error back, so that we can try all of them in __udp{4,6}_lib_err_encap_no_sk() until we have a match. v2: - Name all arguments in err_handler prototypes (David Miller) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:13:08 -08:00
Stefano Brivio	32bbd8793f	net: Convert protocol error handlers from void to int We'll need this to handle ICMP errors for tunnels without a sending socket (i.e. FoU and GUE). There, we might have to look up different types of IP tunnels, registered as network protocols, before we get a match, so we want this for the error handlers of IPPROTO_IPIP and IPPROTO_IPV6 in both inet_protos and inet6_protos. These error codes will be used in the next patch. For consistency, return sensible error codes in protocol error handlers whenever handlers can't handle errors because, even if valid, they don't match a protocol or any of its states. This has no effect on existing error handling paths. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:13:08 -08:00
Stefano Brivio	ce7336610c	selftests: pmtu: Introduce tests for IPv4/IPv6 over GENEVE over IPv4/IPv6 Use a router between endpoints, implemented via namespaces, set a low MTU between router and destination endpoint, exceed it and check PMTU value in route exceptions. v2: - Introduce IPv4 tests right away, if iproute2 doesn't support the 'df' link option they will be skipped (David Ahern) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:13:08 -08:00
Stefano Brivio	a025fb5f49	geneve: Allow configuration of DF behaviour draft-ietf-nvo3-geneve-08 says: It is strongly RECOMMENDED that Path MTU Discovery ([RFC1191], [RFC1981]) be used by setting the DF bit in the IP header when Geneve packets are transmitted over IPv4 (this is the default with IPv6). Now that ICMP error handling is working for GENEVE, we can comply with this recommendation. Make this configurable, though, to avoid breaking existing setups. By default, DF won't be set. It can be set or inherited from inner IPv4 packets. If it's configured to be inherited and we are encapsulating IPv6, it will be set. This only applies to non-lwt tunnels: if an external control plane is used, tunnel key will still control the DF flag. v2: - DF behaviour configuration only applies for non-lwt tunnels, apply DF setting only if (!geneve->collect_md) in geneve_xmit_skb() (Stephen Hemminger) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:13:08 -08:00
Stefano Brivio	a07966447f	geneve: ICMP error lookup handler Export an encap_err_lookup() operation to match an ICMP error against a valid VNI. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:13:08 -08:00
Stefano Brivio	582888792f	selftests: pmtu: Introduce tests for IPv4/IPv6 over VXLAN over IPv4/IPv6 Use a router between endpoints, implemented via namespaces, set a low MTU between router and destination endpoint, exceed it and check PMTU value in route exceptions. v2: - Change all occurrences of VxLAN to VXLAN (Jiri Benc) - Introduce IPv4 tests right away, if iproute2 doesn't support the 'df' link option they will be skipped (David Ahern) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:13:08 -08:00
Stefano Brivio	b4d3069783	vxlan: Allow configuration of DF behaviour Allow users to set the IPv4 DF bit in outgoing packets, or to inherit its value from the IPv4 inner header. If the encapsulated protocol is IPv6 and DF is configured to be inherited, always set it. For IPv4, inheriting DF from the inner header was probably intended from the very beginning judging by the comment to vxlan_xmit(), but it wasn't actually implemented -- also because it would have done more harm than good, without handling for ICMP Fragmentation Needed messages. According to RFC 7348, "Path MTU discovery MAY be used". An expired RFC draft, draft-saum-nvo3-pmtud-over-vxlan-05, whose purpose was to describe PMTUD implementation, says that "is a MUST that Vxlan gateways [...] SHOULD set the DF-bit [...]", whatever that means. Given this background, the only sane option is probably to let the user decide, and keep the current behaviour as default. This only applies to non-lwt tunnels: if an external control plane is used, tunnel key will still control the DF flag. v2: - DF behaviour configuration only applies for non-lwt tunnels, move DF setting to if (!info) block in vxlan_xmit_one() (Stephen Hemminger) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:13:08 -08:00
Stefano Brivio	c3a43b9fec	vxlan: ICMP error lookup handler Export an encap_err_lookup() operation to match an ICMP error against a valid VNI. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:13:08 -08:00
Stefano Brivio	a36e185e8c	udp: Handle ICMP errors for tunnels with same destination port on both endpoints For both IPv4 and IPv6, if we can't match errors to a socket, try tunnels before ignoring them. Look up a socket with the original source and destination ports as found in the UDP packet inside the ICMP payload, this will work for tunnels that force the same destination port for both endpoints, i.e. VXLAN and GENEVE. Actually, lwtunnels could break this assumption if they are configured by an external control plane to have different destination ports on the endpoints: in this case, we won't be able to trace ICMP messages back to them. For IPv6 redirect messages, call ip6_redirect() directly with the output interface argument set to the interface we received the packet from (as it's the very interface we should build the exception on), otherwise the new nexthop will be rejected. There's no such need for IPv4. Tunnels can now export an encap_err_lookup() operation that indicates a match. Pass the packet to the lookup function, and if the tunnel driver reports a matching association, continue with regular ICMP error handling. v2: - Added newline between network and transport header sets in __udp{4,6}_lib_err_encap() (David Miller) - Removed redundant skb_reset_network_header(skb); in __udp4_lib_err_encap() - Removed redundant reassignment of iph in __udp4_lib_err_encap() (Sabrina Dubroca) - Edited comment to __udp{4,6}_lib_err_encap() to reflect the fact this won't work with lwtunnels configured to use asymmetric ports. By the way, it's VXLAN, not VxLAN (Jiri Benc) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:13:08 -08:00
Colin Ian King	141b95d551	net: hns3: fix spelling mistake, "assertting" -> "asserting" Trivial fix to spelling mistake in dev_err error message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:07:56 -08:00
Ganesh Goudar	6d444c4efc	cxgb4: Add new T6 PCI device ids 0x608a Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:05:20 -08:00
Li RongQing	1c51dc9ad6	net/ipv6: compute anycast address hash only if dev is null avoid to compute the hash value if dev is not null, since hash value is not used Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 17:04:43 -08:00
YueHaibing	0db55093b5	net: bcmgenet: return correct value 'ret' from bcmgenet_power_down Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/broadcom/genet/bcmgenet.c: In function 'bcmgenet_power_down': drivers/net/ethernet/broadcom/genet/bcmgenet.c:1136:6: warning: variable 'ret' set but not used [-Wunused-but-set-variable] bcmgenet_power_down should return 'ret' instead of 0. Fixes: `ca8cf34190` ("net: bcmgenet: propagate errors from bcmgenet_power_down") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 16:21:41 -08:00
David S. Miller	3ed3857011	Merge branch 'net-sched-prepare-for-more-Qdisc-offloads' Jakub Kicinski says: ==================== net: sched: prepare for more Qdisc offloads This series refactors the "switchdev" Qdisc offloads a little. We have a few Qdiscs which can be fully offloaded today to the forwarding plane of switching devices. First patch adds a helper for handing statistic dumps, the code seems to be copy pasted between PRIO and RED. Second patch removes unnecessary parameter from RED offload function. Third patch makes the MQ offload use the dump helper which helps it behave much like PRIO and RED when it comes to the TCQ_F_OFFLOADED flag. Patch 4 adds a graft helper, similar to the dump helper. Patch 5 is unrelated to offloads, qdisc_graft() code seemed ripe for a small refactor - no functional changes there. Last two patches move the qdisc_put() call outside of the sch_tree_lock section for RED and PRIO. The child Qdiscs will get removed from the hierarchy under the lock, but having the put (and potentially destroy) called outside of the lock helps offload which may choose to sleep, and it should generally lower the Qdisc change impact. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 16:19:48 -08:00
Jakub Kicinski	7b8e0b6e65	net: sched: prio: delay destroying child qdiscs on change Move destroying of the old child qdiscs outside of the sch_tree_lock() section. This should improve the software qdisc replace but is even more important for offloads. Calling offloads under a spin lock is best avoided, and child's destroy would be called under sch_tree_lock(). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 16:19:48 -08:00
Jakub Kicinski	0c8d13ac96	net: sched: red: delay destroying child qdisc on replace Move destroying of the old child qdisc outside of the sch_tree_lock() section. This should improve the software qdisc replace but is even more important for offloads. Firstly calling offloads under a spin lock is best avoided. Secondly the destroy event of existing child would have been sent to the offload device before the replace, causing confusion. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 16:19:48 -08:00
Jakub Kicinski	9da93ece59	net: sched: refactor grafting Qdiscs with a parent The code for grafting Qdiscs when there is a parent has two needless indentation levels, and breaks the "keep the success path unindented" guideline. Refactor. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 16:19:48 -08:00
Jakub Kicinski	bfaee9113f	net: sched: add an offload graft helper Qdisc graft operation of offload-capable qdiscs performs a few extra steps which are identical among all the qdiscs. Add a helper to share this code. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 16:19:48 -08:00
Jakub Kicinski	58f8927399	net: sched: set TCQ_F_OFFLOADED flag for MQ PRIO and RED mark the qdisc with TCQ_F_OFFLOADED upon successful offload, make MQ do the same. The consistency will help with consistent graft callback behaviour. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 16:19:48 -08:00
Jakub Kicinski	dad54c0fab	net: sched: red: remove unnecessary red_dump_offload_stats parameter Offload dump helper does not use opt parameter, remove it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 16:19:47 -08:00
Jakub Kicinski	b592843c67	net: sched: add an offload dump helper Qdisc dump operation of offload-capable qdiscs performs a few extra steps which are identical among all the qdiscs. Add a helper to share this code. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 16:19:47 -08:00
David S. Miller	80b6265c0f	Merge branch 'net-phy-improve-and-simplify-phylib-state-machine' Heiner Kallweit says: ==================== net: phy: improve and simplify phylib state machine This patch series is based on two axioms: - During autoneg a PHY always reports the link being down - Info in clause 22/45 registers doesn't allow to differentiate between these two states: 1. Link is physically down 2. A link partner is connected and PHY is autonegotiating In both cases "link up" and "aneg finished" bits aren't set. One consequence is that having separate states PHY_NOLINK and PHY_AN isn't needed. By using these two axioms the state machine can be significantly simplified. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 15:02:06 -08:00
Heiner Kallweit	c8e977bab3	net: phy: use phy_check_link_status in more places in the state machine Use phy_check_link_status in more places in the state machine. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-08 15:02:06 -08:00

1 2 3 4 5 ...

796818 Commits