iv/linux

Author	SHA1	Message	Date
Ido Schimmel	12ee822039	selftests: mlxsw: Add a test for FIB offload indication Test that the offload indication for unicast routes is correctly set in different scenarios. IPv4 support will be added in the future. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	d5382fef70	ipv6: Stop sending in-kernel notifications for each nexthop Both listeners - mlxsw and netdevsim - of IPv6 FIB notifications are now ready to handle IPv6 multipath notifications. Therefore, stop ignoring such notifications in both drivers and stop sending notification for each added / deleted nexthop. v2: * Remove 'multipath_rt' from 'struct fib6_entry_notifier_info' Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	2d9dd7ec79	mlxsw: spectrum_router: Create IPv6 multipath routes in one go Allow the driver to create an IPv6 multipath route in one go by passing an array of sibling routes and iterating over them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	d21afd3029	mlxsw: spectrum_router: Add / delete multiple IPv6 nexthops Currently, the functions that take care of populating IPv6 nexthop groups only add / delete a single nexthop. Prepare them to handle multiple routes in one notification by passing an array of routes and adding / deleting all of them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	921bc539cb	mlxsw: spectrum_router: Pass array of routes to route handling functions Prepare the driver to handle multiple routes in a single notification by passing an array of routes to the functions that actually add / delete a route. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	94d628d1f9	mlxsw: spectrum_router: Adjust IPv6 replace logic to new notifications Previously, IPv6 replace notifications were only sent from fib6_add_rt2node(). The function only emitted such notifications if a route actually replaced another route. A previous patch added another call site in ip6_route_multipath_add() from which such notification can be emitted even if a route was merely added and did not replace another route. Adjust the driver to take this into account and potentially set the 'replace' flag to 'false' if the notified route did not replace an existing route. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	928c0b534f	mlxsw: spectrum_router: Pass multiple routes to work item Prepare the driver to process IPv6 multipath notifications by passing an array of 'struct fib6_info' instead of just one route. A reference is taken on each sibling route in order to prevent them from being freed until they are processed by the workqueue. v2: * Remove 'multipath_rt' usage Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:37 -07:00
Ido Schimmel	ccd56a5f50	mlxsw: spectrum_router: Prepare function to return errors The function mlxsw_sp_router_fib6_event() takes care of preparing the needed information for the work item that actually inserts the route into the device. When processing an IPv6 multipath route, the function will need to allocate an array to store pointers to all the sibling routes. Change the function's signature to return an error code and adjust the single call site. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:36 -07:00
Ido Schimmel	20247fcab3	mlxsw: spectrum_router: Remove processing of IPv6 append notifications No such notifications are sent by the IPv6 code, so remove them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:36 -07:00
Ido Schimmel	2881fd61b6	ipv6: Add IPv6 multipath notification for route delete If all the nexthops of a multipath route are being deleted, send one notification for the entire route, instead of one per-nexthop. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:36 -07:00
Ido Schimmel	ebee3cad83	ipv6: Add IPv6 multipath notifications for add / replace Emit a notification when a multipath route is added or replaced. Note that unlike the replace notifications sent from fib6_add_rt2node(), it is possible we are sending a 'FIB_EVENT_ENTRY_REPLACE' when a route was merely added and not replaced. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:36 -07:00
Ido Schimmel	d133e4f1fa	netdevsim: Ignore IPv6 multipath notifications In a similar fashion to previous patch, have netdevsim ignore IPv6 multipath notifications for now. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:36 -07:00
Ido Schimmel	f6c3bb7516	mlxsw: spectrum_router: Ignore IPv6 multipath notifications IPv6 multipath notifications are about to be sent, but mlxsw is not ready to process them, so ignore them. The limitation will be lifted by a subsequent patch which will also stop the kernel from sending a notification for each nexthop. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:36 -07:00
Ido Schimmel	d4b96c7b51	ipv6: Extend notifier info for multipath routes Extend the IPv6 FIB notifier info with the number of sibling routes being notified. This will later allow listeners to process one notification for a multipath route instead of N, where N is the number of nexthops. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:36 -07:00
Ido Schimmel	c82481f7ea	netlink: Add field to skip in-kernel notifications The struct includes a 'skip_notify' flag that indicates if netlink notifications to user space should be suppressed. As explained in commit `3b1137fe74` ("net: ipv6: Change notifications for multipath add to RTA_MULTIPATH"), this is useful to suppress per-nexthop RTM_NEWROUTE notifications when an IPv6 multipath route is added / deleted. Instead, one notification is sent for the entire multipath route. This concept is also useful for in-kernel notifications. Sending one in-kernel notification for the addition / deletion of an IPv6 multipath route - instead of one per-nexthop - provides a significant increase in the insertion / deletion rate to underlying devices. Add a 'skip_notify_kernel' flag to suppress in-kernel notifications. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:36 -07:00
Ido Schimmel	3de205cde4	netlink: Document all fields of 'struct nl_info' Some fields were not documented. Add documentation. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:45:36 -07:00
David S. Miller	714a485aae	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2019-06-17 This series contains updates to the iavf driver only. Akeem updates the driver to change how VLAN tags are being populated and programmed into the hardware by starting from the first member of the list until the number of allowed VLAN tags is exhausted. Mitch fixed the variable type since the variable counter starts out negative and climbs to zero, so use a signed integer instead of unsigned. Also increase the timeout to avoid erroneous errors. Fixed the driver to be able to handle when the hardware hands us a null receive descriptor with no data attached, yet is still valid. Aleksandr fixes the driver to use GFP_ATOMIC when allocating memory in atomic context. Avinash updates the driver to fix a calculation error in virtchnl regarding the valid length. Jakub does some refactoring of the commands processing the watchdog state machine to reduce the length and complexity of the function. Also declare watchdog task as delayed work and use a dedicated work queue to service the driver tasks. Paul updated the iavf_process_aq_command to call the necessary functions to be able to clear cloud filter bits that need to be cleared. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 09:33:15 -07:00
Shalom Toledo	cd4bb2a334	mlxsw: spectrum_ptp: Fix compilation on 32-bit ARM Compilation on 32-bit ARM fails after commit `992aa864dc` ("mlxsw: spectrum_ptp: Add implementation for physical hardware clock operations") because of 64-bit division: arm-linux-gnueabi-ld: drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.o: in function `mlxsw_sp1_ptp_phc_settime': spectrum_ptp.c:(.text+0x39c): undefined reference to `__aeabi_uldivmod' Fix by using div_u64(). Fixes: `992aa864dc` ("mlxsw: spectrum_ptp: Add implementation for physical hardware clock operations") Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-18 08:52:36 -07:00
David S. Miller	13091aa305	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Honestly all the conflicts were simple overlapping changes, nothing really interesting to report. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 20:20:36 -07:00
David S. Miller	f97252a8c3	Merge branch 'UDP-GSO-audit-tests' Fred Klassen says: ==================== UDP GSO audit tests Updates to UDP GSO selftests to optionally stress test the CMSG subsystem, and report the reliability and performance of both TX Timestamping and ZEROCOPY messages. ==================== Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 16:30:38 -07:00
Fred Klassen	4ffc37f5c0	net/udpgso_bench.sh test fails on error Ensure that failure on any individual test results in an overall failure of the test script. Signed-off-by: Fred Klassen <fklassen@appneta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 16:30:37 -07:00
Fred Klassen	ade90d69ff	net/udpgso_bench.sh add UDP GSO audit tests Audit tests count the total number of messages sent and compares with total number of CMSG received on error queue. Example: udp gso zerocopy timestamp audit udp rx: 1599 MB/s 1166414 calls/s udp tx: 1615 MB/s 27395 calls/s 27395 msg/s udp rx: 1634 MB/s 1192261 calls/s udp tx: 1633 MB/s 27699 calls/s 27699 msg/s udp rx: 1633 MB/s 1191358 calls/s udp tx: 1631 MB/s 27678 calls/s 27678 msg/s Summary over 4.000 seconds... sum udp tx: 1665 MB/s 82772 calls (27590/s) 82772 msgs (27590/s) Tx Timestamps: 82772 received 0 errors Zerocopy acks: 82772 received Errors are thrown if CMSG count does not equal send count, example: Summary over 4.000 seconds... sum tcp tx: 7451 MB/s 493706 calls (123426/s) 493706 msgs (123426/s) ./udpgso_bench_tx: Unexpected number of Zerocopy completions: 493706 expected 493704 received Also reduce individual test time from 4 to 3 seconds so that overall test time does not increase significantly. v3: Enhancements as per Willem de Bruijn <willemb@google.com> - document -P option for TCP audit Signed-off-by: Fred Klassen <fklassen@appneta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 16:30:37 -07:00
Fred Klassen	79ebc3c260	net/udpgso_bench_tx: options to exercise TX CMSG This enhancement adds options that facilitate load testing with additional TX CMSG options, and to optionally print results of various send CMSG operations. These options are especially useful in isolating situations where error-queue messages are lost when combined with other CMSG operations (e.g. SO_ZEROCOPY). New options: -a - count all CMSG messages and match to sent messages -T - add TX CMSG that requests TX software timestamps -H - similar to -T except request TX hardware timestamps -P - call poll() before reading error queue -v - print detailed results v2: Enhancements as per Willem de Bruijn <willemb@google.com> - Updated control and buffer parameters for recvmsg - poll() parameter cleanup - fail on bad audit results - remove TOS options - improved reporting v3: Enhancements as per Willem de Bruijn <willemb@google.com> - add SOF_TIMESTAMPING_OPT_TSONLY to eliminate MSG_TRUNC - general code cleanup Signed-off-by: Fred Klassen <fklassen@appneta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 16:30:37 -07:00
Linus Torvalds	29f785ff76	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "MS_MOVE regression fix + breakage in fsmount(2) (also introduced in this cycle, along with fsmount(2) itself). I'm still digging through the piles of mail, so there might be more fixes to follow, but these two are obvious and self-contained, so there's no point delaying those..." * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs/namespace: fix unprivileged mount propagation vfs: fsmount: add missing mntget()	2019-06-17 16:28:28 -07:00
David S. Miller	4bd366cece	Merge branch 'net-ipv4-remove-erroneous-advancement-of-list-pointer' Florian Westphal says: ==================== net: ipv4: remove erroneous advancement of list pointer Tariq reported a soft lockup on net-next that Mellanox was able to bisect to `2638eb8b50` ("net: ipv4: provide __rcu annotation for ifa_list"). While reviewing above patch I found a regression when addresses have a lifetime specified. Second patch extends rtnetlink.sh to trigger crash (without first patch applied). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 16:27:43 -07:00
Florian Westphal	3cfa148826	selftests: rtnetlink: add addresses with fixed life time This exercises kernel code paths that deal with addresses that have a limited lifetime. Without the previous fix, this triggers the following crash on net-next: BUG: KASAN: null-ptr-deref in check_lifetime+0x403/0x670 Read of size 8 at addr 0000000000000010 by task kworker [..] Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 16:27:42 -07:00
Florian Westphal	40008e9211	net: ipv4: remove erroneous advancement of list pointer Causes a crash when the lifetime expires on an address, as garbage is dereferenced soon after. This used to look like this: for (ifap = &ifa->ifa_dev->ifa_list; *ifap != NULL; ifap = &(*ifap)->ifa_next) { if (*ifap == ifa) ... but this was changed to: struct in_ifaddr *tmp; ifap = &ifa->ifa_dev->ifa_list; tmp = rtnl_dereference(*ifap); while (tmp) { tmp = rtnl_dereference(tmp->ifa_next); // Bogus if (rtnl_dereference(*ifap) == ifa) { ... ifap = &tmp->ifa_next; // Can be NULL tmp = rtnl_dereference(*ifap); // Dereference } } Remove the bogus assignment/list entry skip. Fixes: `2638eb8b50` ("net: ipv4: provide __rcu annotation for ifa_list") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 16:27:42 -07:00
Arnd Bergmann	78fe8a28fb	net: dsa: sja1105: fix ptp link error Due to a reversed dependency, it is possible to build the lower ptp driver as a loadable module and the actual driver using it as built-in, causing a link error: drivers/net/dsa/sja1105/sja1105_spi.o: In function `sja1105_static_config_upload': sja1105_spi.c:(.text+0x6f0): undefined reference to `sja1105_ptp_reset' drivers/net/dsa/sja1105/sja1105_spi.o:(.data+0x2d4): undefined reference to `sja1105et_ptp_cmd' drivers/net/dsa/sja1105/sja1105_spi.o:(.data+0x604): undefined reference to `sja1105pqrs_ptp_cmd' drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_remove': sja1105_main.c:(.text+0x8d4): undefined reference to `sja1105_ptp_clock_unregister' drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_rxtstamp_work': sja1105_main.c:(.text+0x964): undefined reference to `sja1105_tstamp_reconstruct' drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_setup': sja1105_main.c:(.text+0xb7c): undefined reference to `sja1105_ptp_clock_register' drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_port_deferred_xmit': sja1105_main.c:(.text+0x1fa0): undefined reference to `sja1105_ptpegr_ts_poll' sja1105_main.c:(.text+0x1fc4): undefined reference to `sja1105_tstamp_reconstruct' drivers/net/dsa/sja1105/sja1105_main.o:(.rodata+0x5b0): undefined reference to `sja1105_get_ts_info' Change the Makefile logic to always build the ptp module the same way as the rest. Another option would be to just add it to the same module and remove the exports, but I don't know if there was a good reason to keep them separate. Fixes: `bb77f36ac2` ("net: dsa: sja1105: Add support for the PTP clock") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 16:25:29 -07:00
Arnd Bergmann	c63d1e5c2d	net: stmmac: fix unused-variable warning When building without CONFIG_OF, we get a harmless build warning: drivers/net/ethernet/stmicro/stmmac/stmmac_main.c: In function 'stmmac_phy_setup': drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:973:22: error: unused variable 'node' [-Werror=unused-variable] struct device_node *node = priv->plat->phy_node; Reword it so we always use the local variable, by making it the fwnode pointer instead of the device_node. Fixes: `74371272f9` ("net: stmmac: Convert to phylink and remove phylib logic") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 16:24:12 -07:00
Linus Torvalds	da0f382029	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "Lots of bug fixes here: 1) Out of bounds access in __bpf_skc_lookup, from Lorenz Bauer. 2) Fix rate reporting in cfg80211_calculate_bitrate_he(), from John Crispin. 3) Use after free in psock backlog workqueue, from John Fastabend. 4) Fix source port matching in fdb peer flow rule of mlx5, from Raed Salem. 5) Use atomic_inc_not_zero() in fl6_sock_lookup(), from Eric Dumazet. 6) Network header needs to be set for packet redirect in nfp, from John Hurley. 7) Fix udp zerocopy refcnt, from Willem de Bruijn. 8) Don't assume linear buffers in vxlan and geneve error handlers, from Stefano Brivio. 9) Fix TOS matching in mlxsw, from Jiri Pirko. 10) More SCTP cookie memory leak fixes, from Neil Horman. 11) Fix VLAN filtering in rtl8366, from Linus Walluij. 12) Various TCP SACK payload size and fragmentation memory limit fixes from Eric Dumazet. 13) Use after free in pneigh_get_next(), also from Eric Dumazet. 14) LAPB control block leak fix from Jeremy Sowden" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (145 commits) lapb: fixed leak of control-blocks. 
tipc: purge deferredq list for each grp member in tipc_group_delete ax25: fix inconsistent lock state in ax25_destroy_timer neigh: fix use-after-free read in pneigh_get_next tcp: fix compile error if !CONFIG_SYSCTL hv_sock: Suppress bogus "may be used uninitialized" warnings be2net: Fix number of Rx queues used for flow hashing net: handle 802.1P vlan 0 packets properly tcp: enforce tcp_min_snd_mss in tcp_mtu_probing() tcp: add tcp_min_snd_mss sysctl tcp: tcp_fragment() should apply sane memory limits tcp: limit payload size of sacked skbs Revert "net: phylink: set the autoneg state in phylink_phy_change" bpf: fix nested bpf tracepoints with per-cpu data bpf: Fix out of bounds memory access in bpf_sk_storage vsock/virtio: set SOCK_DONE on peer shutdown net: dsa: rtl8366: Fix up VLAN filtering net: phylink: set the autoneg state in phylink_phy_change net: add high_order_alloc_disable sysctl/static key tcp: add tcp_tx_skb_cache sysctl ...	2019-06-17 15:55:34 -07:00
Mitch Williams	efa14c3985	iavf: allow null RX descriptors In some circumstances, the hardware can hand us a null receive descriptor, with no data attached but otherwise valid. Unfortunately, the driver was ill-equipped to handle such an event, and would stop processing packets at that point. To fix this, use the Descriptor Done bit instead of the size to determine whether or not a descriptor is ready to be processed. Add some checks to allow for unused buffers. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:26 -07:00
Paul Greenwalt	68dfe6348f	iavf: add call to iavf_[add|del]_cloud_filter Add call to iavf_add_cloud_filter and iavf_del_cloud_filter from iavf_process_aq_command to clear aq_required IAVF_FLAG_AQ_ADD_CLOUD_FILTER and IAVF_FLAG_AQ_DEL_CLOUD_FILTER bits. aq_required IAVF_FLAG_AQ_DEL_CLOUD_FILTER bit is being set in iavf_down and iavf_delete_clsflower, and is never cleared. aq_required IAVF_FLAG_AQ_ADD_CLOUD_FILTER bit is being set in iavf_handle_reset and iavf_configure_clsflower, and is never cleared. Since the aq_required is not zero, iavf_watchdog_task is setting the queue_delayed_work to 20 msec instead of the longer delay. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:26 -07:00
Jakub Pawlak	b66c7bc1cd	iavf: Refactor init state machine Cleanup of init state machine, move state specific code to separate functions and rewrite the iavf_init_task() function. Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:26 -07:00
Jan Sokolowski	bac8486116	iavf: Refactor the watchdog state machine Refactor the watchdog state machine implementation. Add the additional state __IAVF_COMM_FAILED to process the PF communication fails. Prepare the watchdog state machine to integrate with init state machine. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:26 -07:00
Jakub Pawlak	fdd4044ffd	iavf: Remove timer for work triggering, use delaying work instead Remove the watchdog timer, instead declare watchdog task as delayed work and use dedicated workqueue to service driver tasks. The dedicated driver workqueue iavf_wq is common for all driver instances. Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:26 -07:00
Jakub Pawlak	b476b0030e	iavf: Move commands processing to the separate function Move the commands processing outside the watchdog_task() function. This reduces the length and complexity of the function, which is mainly designed to process the watchdog state machine. Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:26 -07:00
Avinash Dayanand	16e00c25ac	iavf: Fix the math for valid length for ADq enable There was a calculation error in virtchnl regarding the valid length which was fixed recently and a corresponding change needs to go into the code while we enable ADq. Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:26 -07:00
Aleksandr Loktionov	f0a48fb441	iavf: Change GFP_KERNEL to GFP_ATOMIC in kzalloc() iavf_add_vlan() is being called in atomic context so kzalloc() needs GFP_ATOMIC. This patch fixes it. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:25 -07:00
Mitch Williams	88ec7308ea	iavf: wait longer for close to complete On some hardware/driver/architecture combinations, it may take longer than 200msec for all close operations to be completed, causing a spurious error message to be logged. Increase the timeout value to 500msec to avoid this erroneous error. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:25 -07:00
Mitch Williams	168d91cf2a	iavf: use signed variable The counter variable in iavf_clean_tx_irq starts out negative and climbs to 0. So allocating it as u16 is actually a really bad idea that just happens to work because the value underflows and overflows consistently on most architectures. Replace the u16 with an int so signed math works as expected. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:25 -07:00
Akeem G Abodunrin	c2417a7b0e	iavf: Create VLAN tag elements starting from the first element This patch changes how VLAN tags are populated and programmed into the HW. Instead of adding VF VLAN tags starting from the last member of the element list, start from the first member of the list, until the number of allowed VLAN tags is exhausted in the HW. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-06-17 15:39:25 -07:00
Christian Brauner	d728cf7916	fs/namespace: fix unprivileged mount propagation When propagating mounts across mount namespaces owned by different user namespaces it is not possible anymore to move or umount the mount in the less privileged mount namespace. Here is a reproducer: sudo mount -t tmpfs tmpfs /mnt sudo mount --make-rshared /mnt # create unprivileged user + mount namespace and preserve propagation unshare -U -m --map-root --propagation=unchanged # now change back to the original mount namespace in another terminal: sudo mkdir /mnt/aaa sudo mount -t tmpfs tmpfs /mnt/aaa # now in the unprivileged user + mount namespace mount --move /mnt/aaa /opt Unfortunately, this is a pretty big deal for userspace since this is e.g. used to inject mounts into running unprivileged containers. So this regression really needs to go away rather quickly. The problem is that a recent change falsely locked the root of the newly added mounts by setting MNT_LOCKED. Fix this by only locking the mounts on copy_mnt_ns() and not when adding a new mount. Fixes: `3bd045cc9c` ("separate copying and locking mount tree on cross-userns copies") Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Tested-by: Christian Brauner <christian@brauner.io> Acked-by: Christian Brauner <christian@brauner.io> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Christian Brauner <christian@brauner.io> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2019-06-17 17:36:09 -04:00
Eric Biggers	1b0b9cc8d3	vfs: fsmount: add missing mntget() sys_fsmount() needs to take a reference to the new mount when adding it to the anonymous mount namespace. Otherwise the filesystem can be unmounted while it's still in use, as found by syzkaller. Reported-by: Mark Rutland <mark.rutland@arm.com> Reported-by: syzbot+99de05d099a170867f22@syzkaller.appspotmail.com Reported-by: syzbot+7008b8b8ba7df475fdc8@syzkaller.appspotmail.com Fixes: `93766fbd26` ("vfs: syscall: Add fsmount() to create a mount for a superblock") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2019-06-17 17:36:07 -04:00
Jiri Pirko	f517f2716c	net: sched: cls_matchall: allow to delete filter Currently user is unable to delete the filter. See following example: $ tc filter add dev ens16np1 ingress pref 1 handle 1 matchall action drop $ tc filter show dev ens16np1 ingress filter protocol all pref 1 matchall chain 0 filter protocol all pref 1 matchall chain 0 handle 0x1 in_hw action order 1: gact action drop random type none pass val 0 index 1 ref 1 bind 1 $ tc filter del dev ens16np1 ingress pref 1 handle 1 matchall action drop RTNETLINK answers: Operation not supported Implement tcf_proto_ops->delete() op and allow user to delete the filter. Reported-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 14:05:32 -07:00
Colin Ian King	ad9bf54519	net: hns3: fix dereference of ae_dev before it is null checked Pointer ae_dev is null checked however, prior to that it is dereferenced when assigned pointer ops. Fix this by assigning pointer ops after ae_dev has been null checked. Addresses-Coverity: ("Dereference before null check") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 14:02:57 -07:00
David S. Miller	43321251e2	Merge branch 'net-sched-act_ctinfo-fixes' Kevin Darbyshire-Bryant says: ==================== net: sched: act_ctinfo: fixes This is first attempt at sending a small series. Order is important because one bug (policy validation) prevents us from encountering the more important 'OOPS' generating bug in action creation. Fix the OOPS first. Confession time: Until very recently, development of this module has been done on 'net-next' tree to 'clean compile' level with run-time testing on backports to 4.14 & 4.19 kernels under openwrt. It turns out that sched: action: based code has been under more active change than I realised. During the back & forward porting during development & testing, the critical ACT_P_CREATED return code got missed despite being in the 4.14 & 4.19 backports. I have now gone through the init functions, using act_csum as reference with a fine toothed comb and am happy they do the same things. This issue hadn't been caught till now due to another issue caused by new strict nla_parse_nested function failing parsing validation before action creation. Thanks to Marcelo Leitner <marcelo.leitner@gmail.com> for flagging extack deficiency (fixed in `733f0766c3` sched: act_ctinfo: use extack error reporting) which led to `b424e432e7` ("netlink: add validation of NLA_F_NESTED flag") and `8cb081746c` ("netlink: make validation more configurable for future strictness") which led to the policy validation fix, which then led to the action creation fix both contained in this series. If I ever get to a developer conference please feel free to tar/feather/apply cone of shame. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 14:00:30 -07:00
Kevin Darbyshire-Bryant	c197d63627	net: sched: act_ctinfo: fix policy validation Fix nla_policy definition by specifying an exact length type attribute to CTINFO action parameter block structure. Without this change, netlink parsing will fail validation and the action will not be instantiated. `8cb081746c` ("netlink: make validation more configurable for future") introduced much stricter checking to attributes being passed via netlink. Existing actions were updated to use less restrictive deprecated versions of nla_parse_nested. As a new module, act_ctinfo should be designed to use the strict checking model otherwise, well, what was the point of implementing it. Confession time: Until very recently, development of this module has been done on 'net-next' tree to 'clean compile' level with run-time testing on backports to 4.14 & 4.19 kernels under openwrt. This is how I managed to miss the run-time impacts of the new strict nla_parse_nested function. I hopefully have learned something from this (glances toward laptop running a net-next kernel) There is however a still outstanding implication on iproute2 user space in that it needs to be told to pass nested netlink messages with the nested attribute actually set. So even with this kernel fix to do things correctly you still cannot instantiate a new 'strict' nla_parse_nested based action such as act_ctinfo with iproute2's tc. Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 14:00:30 -07:00
Kevin Darbyshire-Bryant	a658c2e49f	net: sched: act_ctinfo: fix action creation Use correct return value on action creation: ACT_P_CREATED. The use of incorrect return value could result in a situation where the system thought a ctinfo module was listening but actually wasn't instantiated correctly leading to an OOPS in tcf_generic_walker(). Confession time: Until very recently, development of this module has been done on 'net-next' tree to 'clean compile' level with run-time testing on backports to 4.14 & 4.19 kernels under openwrt. During the back & forward porting during development & testing, the critical ACT_P_CREATED return code got missed despite being in the 4.14 & 4.19 backports. I have now gone through the init functions, using act_csum as reference with a fine toothed comb. Bonus, no more OOPSes. I managed to also miss this issue till now due to the new strict nla_parse_nested function failing validation before action creation. As an inexperienced developer I've learned that copy/pasting/backporting/forward porting code correctly is hard. If I ever get to a developer conference I shall don the cone of shame. Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 14:00:30 -07:00
Jason Wang	098eadce3c	vhost_net: disable zerocopy by default Vhost_net was known to suffer from HOL[1] issues which is not easy to fix. Several downstream disable the feature by default. What's more, the datapath was split and datacopy path got the support of batching and XDP support recently which makes it faster than zerocopy part for small packets transmission. It looks to me that disable zerocopy by default is more appropriate. It could be enabled by default again in the future if we fix the above issues. [1] https://patchwork.kernel.org/patch/3787671/ Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 13:58:02 -07:00
Ard Biesheuvel	c681edae33	net: ipv4: move tcp_fastopen server side code to SipHash library Using a bare block cipher in non-crypto code is almost always a bad idea, not only for security reasons (and we've seen some examples of this in the kernel in the past), but also for performance reasons. In the TCP fastopen case, we call into the bare AES block cipher one or two times (depending on whether the connection is IPv4 or IPv6). On most systems, this results in a call chain such as crypto_cipher_encrypt_one(ctx, dst, src) crypto_cipher_crt(tfm)->cit_encrypt_one(crypto_cipher_tfm(tfm), ...); aesni_encrypt kernel_fpu_begin(); aesni_enc(ctx, dst, src); // asm routine kernel_fpu_end(); It is highly unlikely that the use of special AES instructions has a benefit in this case, especially since we are doing the above twice for IPv6 connections, instead of using a transform which can process the entire input in one go. We could switch to the cbcmac(aes) shash, which would at least get rid of the duplicated overhead in some cases (i.e., today, only arm64 has an accelerated implementation of cbcmac(aes), while x86 will end up using the generic cbcmac template wrapping the AES-NI cipher, which basically ends up doing exactly the above). However, in the given context, it makes more sense to use a light-weight MAC algorithm that is more suitable for the purpose at hand, such as SipHash. Since the output size of SipHash already matches our chosen value for TCP_FASTOPEN_COOKIE_SIZE, and given that it accepts arbitrary input sizes, this greatly simplifies the code as well. NOTE: Server farms backing a single server IP for load balancing purposes and sharing a single fastopen key will be adversely affected by this change unless all systems in the pool receive their kernel upgrades at the same time. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-17 13:56:26 -07:00


842344 Commits