linux

iv/linux

Author	SHA1	Message	Date
Vladimir Oltean	6edb9e8d45	net: dsa: felix: restore multicast flood to CPU when NPI tagger reinitializes ocelot_init sets up PGID_MC to include the CPU port module, and that is fine, but the ocelot-8021q tagger removes the CPU port module from the unknown multicast replicator. So after a transition from the default ocelot tagger towards ocelot-8021q and then again towards ocelot, multicast flooding towards the CPU port module will be disabled. Fixes: e21268efbe26 ("net: dsa: felix: perform switch setup for tag_8021q") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:08:05 -08:00
Vladimir Oltean	a8b659e7ff	net: dsa: act as passthrough for bridge port flags There are multiple ways in which a PORT_BRIDGE_FLAGS attribute can be expressed by the bridge through switchdev, and not all of them can be emulated by DSA mid-layer API at the same time. One possible configuration is when the bridge offloads the port flags using a mask that has a single bit set - therefore only one feature should change. However, DSA currently groups together unicast and multicast flooding in the .port_egress_floods method, which limits our options when we try to add support for turning off broadcast flooding: do we extend .port_egress_floods with a third parameter which b53 and mv88e6xxx will ignore? But that means that the DSA layer, which currently implements the PRE_BRIDGE_FLAGS attribute all by itself, will see that .port_egress_floods is implemented, and will report that all 3 types of flooding are supported - not necessarily true. Another configuration is when the user specifies more than one flag at the same time, in the same netlink message. If we were to create one individual function per offloadable bridge port flag, we would limit the expressiveness of the switch driver of refusing certain combinations of flag values. For example, a switch may not have an explicit knob for flooding of unknown multicast, just for flooding in general. In that case, the only correct thing to do is to allow changes to BR_FLOOD and BR_MCAST_FLOOD in tandem, and never allow mismatched values. But having a separate .port_set_unicast_flood and .port_set_multicast_flood would not allow the driver to possibly reject that. Also, DSA doesn't consider it necessary to inform the driver that a SWITCHDEV_ATTR_ID_BRIDGE_MROUTER attribute was offloaded, because it just calls .port_egress_floods for the CPU port. When we'll add support for the plain SWITCHDEV_ATTR_ID_PORT_MROUTER, that will become a real problem because the flood settings will need to be held statefully in the DSA middle layer, otherwise changing the mrouter port attribute will impact the flooding attribute. And that's _assuming_ that the underlying hardware doesn't have anything else to do when a multicast router attaches to a port than flood unknown traffic to it. If it does, there will need to be a dedicated .port_set_mrouter anyway. So we need to let the DSA drivers see the exact form that the bridge passes this switchdev attribute in, otherwise we are standing in the way. Therefore we also need to use this form of language when communicating to the driver that it needs to configure its initial (before bridge join) and final (after bridge leave) port flags. The b53 and mv88e6xxx drivers are converted to the passthrough API and their implementation of .port_egress_floods is split into two: a function that configures unicast flooding and another for multicast. The mv88e6xxx implementation is quite hairy, and it turns out that the implementations of unknown unicast flooding are actually the same for 6185 and for 6352: behind the confusing names actually lie two individual bits: NO_UNKNOWN_MC -> FLOOD_UC = 0x4 = BIT(2) NO_UNKNOWN_UC -> FLOOD_MC = 0x8 = BIT(3) so there was no reason to entangle them in the first place. Whereas the 6185 writes to MV88E6185_PORT_CTL0_FORWARD_UNKNOWN of PORT_CTL0, which has the exact same bit index. I have left the implementations separate though, for the only reason that the names are different enough to confuse me, since I am not able to double-check with a user manual. The multicast flooding setting for 6185 is in a different register than for 6352 though. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:08:04 -08:00
Vladimir Oltean	e18f4c18ab	net: switchdev: pass flags and mask to both {PRE_,}BRIDGE_FLAGS attributes This switchdev attribute offers a counterproductive API for a driver writer, because although br_switchdev_set_port_flag gets passed a "flags" and a "mask", those are passed piecemeal to the driver, so while the PRE_BRIDGE_FLAGS listener knows what changed because it has the "mask", the BRIDGE_FLAGS listener doesn't, because it only has the final value. But certain drivers can offload only certain combinations of settings, like for example they cannot change unicast flooding independently of multicast flooding - they must be both on or both off. The way the information is passed to switchdev makes drivers not expressive enough, and unable to reject this request ahead of time, in the PRE_BRIDGE_FLAGS notifier, so they are forced to reject it during the deferred BRIDGE_FLAGS attribute, where the rejection is currently ignored. This patch also changes drivers to make use of the "mask" field for edge detection when possible. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:08:04 -08:00
Vladimir Oltean	5e38c15856	net: dsa: configure better brport flags when ports leave the bridge For a DSA switch port operating in standalone mode, address learning doesn't make much sense since that is a bridge function. In fact, address learning even breaks setups such as this one: +---------------------------------------------+ \| \| \| +-------------------+ \| \| \| br0 \| send receive \| \| +--------+-+--------+ +--------+ +--------+ \| \| \| \| \| \| \| \| \| \| \| \| \| swp0 \| \| swp1 \| \| swp2 \| \| swp3 \| \| \| \| \| \| \| \| \| \| \| \| +-+--------+-+--------+-+--------+-+--------+-+ \| ^ \| ^ \| \| \| \| \| +-----------+ \| \| \| +--------------------------------+ because if the switch has a single FDB (can offload a single bridge) then source address learning on swp3 can "steal" the source MAC address of swp2 from br0's FDB, because learning frames coming from swp2 will be done twice: first on the swp1 ingress port, second on the swp3 ingress port. So the hardware FDB will become out of sync with the software bridge, and when swp2 tries to send one more packet towards swp1, the ASIC will attempt to short-circuit the forwarding path and send it directly to swp3 (since that's the last port it learned that address on), which it obviously can't, because swp3 operates in standalone mode. So DSA drivers operating in standalone mode should still configure a list of bridge port flags even when they are standalone. Currently DSA attempts to call dsa_port_bridge_flags with 0, which disables egress flooding of unknown unicast and multicast, something which doesn't make much sense. For the switches that implement .port_egress_floods - b53 and mv88e6xxx, it probably doesn't matter too much either, since they can possibly inject traffic from the CPU into a standalone port, regardless of MAC DA, even if egress flooding is turned off for that port, but certainly not all DSA switches can do that - sja1105, for example, can't. So it makes sense to use a better common default there, such as "flood everything". It should also be noted that what DSA calls "dsa_port_bridge_flags()" is a degenerate name for just calling .port_egress_floods(), since nothing else is implemented - not learning, in particular. But disabling address learning, something that this driver is also coding up for, will be supported by individual drivers once .port_egress_floods is replaced with a more generic .port_bridge_flags. Previous attempts to code up this logic have been in the common bridge layer, but as pointed out by Ido Schimmel, there are corner cases that are missed when doing that: https://patchwork.kernel.org/project/netdevbpf/patch/20210209151936.97382-5-olteanv@gmail.com/ So, at least for now, let's leave DSA in charge of setting port flags before and after the bridge join and leave. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:08:04 -08:00
Vladimir Oltean	078bbb851e	net: bridge: don't print in br_switchdev_set_port_flag For the netlink interface, propagate errors through extack rather than simply printing them to the console. For the sysfs interface, we still print to the console, but at least that's one layer higher than in switchdev, which also allows us to silently ignore the offloading of flags if that is ever needed in the future. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:08:04 -08:00
Vladimir Oltean	304ae3bf1c	net: bridge: offload all port flags at once in br_setport If for example this command: ip link set swp0 type bridge_slave flood off mcast_flood off learning off succeeded at configuring BR_FLOOD and BR_MCAST_FLOOD but not at BR_LEARNING, there would be no attempt to revert the partial state in any way. Arguably, if the user changes more than one flag through the same netlink command, this one _should_ be all or nothing, which means it should be passed through switchdev as all or nothing. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:08:04 -08:00
Vladimir Oltean	4c08c586ff	net: switchdev: propagate extack to port attributes When a struct switchdev_attr is notified through switchdev, there is no way to report informational messages, unlike for struct switchdev_obj. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:08:04 -08:00
Davide Caratti	d212683805	flow_dissector: fix TTL and TOS dissection on IPv4 fragments the following command: # tc filter add dev $h2 ingress protocol ip pref 1 handle 101 flower \ $tcflags dst_ip 192.0.2.2 ip_ttl 63 action drop doesn't drop all IPv4 packets that match the configured TTL / destination address. In particular, if "fragment offset" or "more fragments" have non zero value in the IPv4 header, setting of FLOW_DISSECTOR_KEY_IP is simply ignored. Fix this dissecting IPv4 TTL and TOS before fragment info; while at it, add a selftest for tc flower's match on 'ip_ttl' that verifies the correct behavior. Fixes: 518d8a2e9bad ("net/flow_dissector: add support for dissection of misc ip header fields") Reported-by: Shuang Li <shuali@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:03:51 -08:00
David S. Miller	b0aae0bde2	octeontx2: Fix condition. Fixes: 93efb0c656837 ("octeontx2-pf: Fix out-of-bounds read in otx2_get_fecparam()") Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:56:08 -08:00
David S. Miller	4b47ad0079	Merge branch 'ipa-cleanups' Alex Elder says: ==================== net: ipa: some more cleanup Version 3 of this series uses dev_err_probe() in the second patch, as suggested by Heiner Kallweit. Version 2 was sent to ensure the series was based on current net-next/master, and added copyright updates to files touched. The original introduction is below. This is another fairly innocuous set of cleanup patches. The first was motivated by a bug found that would affect IPA v4.5. It maintain a new GSI address pointer; one is the "raw" (original mapped) address, and the other will have been adjusted if necessary for use on newer platforms. The second just quiets some unnecessary noise during early probe. The third fixes some errors that show up when IPA_VALIDATION is enabled. The last two just create helper functions to improve readability. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:54:17 -08:00
Alex Elder	6170b6dab2	net: ipa: introduce gsi_channel_initialized() Create a simple helper function that indicates whether a channel has been initialized. This abstacts/hides the details of how this is determined. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:54:17 -08:00
Alex Elder	a266ad6b5d	net: ipa: introduce ipa_table_hash_support() Introduce a new function to abstract the knowledge of whether hashed routing and filter tables are supported for a given IPA instance. IPA v4.2 is the only one that doesn't support hashed tables (now and for the foreseeable future), but the name of the helper function is better for explaining what's going on. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:54:17 -08:00
Alex Elder	2d65ed7692	net: ipa: fix register write command validation In ipa_cmd_register_write_valid() we verify that values we will supply to a REGISTER_WRITE IPA immediate command will fit in the fields that need to hold them. This patch fixes some issues in that function and ipa_cmd_register_write_offset_valid(). The dev_err() call in ipa_cmd_register_write_offset_valid() has some printf format errors: - The name of the register (corresponding to the string format specifier) was not supplied. - The IPA base offset and offset need to be supplied separately to match the other format specifiers. Also make the ~0 constant used there to compute the maximum supported offset value explicitly unsigned. There are two other issues in ipa_cmd_register_write_valid(): - There's no need to check the hash flush register for platforms (like IPA v4.2) that do not support hashed tables - The highest possible endpoint number, whose status register offset is computed, is COUNT - 1, not COUNT. Fix these problems, and add some additional commentary. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:54:17 -08:00
Alex Elder	4c7ccfcd09	net: ipa: use dev_err_probe() in ipa_clock.c When initializing the IPA core clock and interconnects, it's possible we'll get an EPROBE_DEFER error. This isn't really an error, it's just means we need to be re-probed later. Use dev_err_probe() to report the error rather than dev_err(). This avoids polluting the log with these "error" messages. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:54:17 -08:00
Alex Elder	571b1e7e58	net: ipa: use a separate pointer for adjusted GSI memory This patch actually fixes a bug, though it doesn't affect the two platforms supported currently. The fix implements GSI memory pointers a bit differently. For IPA version 4.5 and above, the address space for almost all GSI registers is adjusted downward by a fixed amount. This is currently handled by adjusting the I/O virtual address pointer after it has been mapped. The bug is that the pointer is not "de-adjusted" as it should be when it's unmapped. This patch fixes that error, but it does so by maintaining one "raw" pointer for the mapped memory range. This is assigned when the memory is mapped and used to unmap the memory. This pointer is also used to access the two registers that do not sit in the "adjusted" memory space. Rather than adjusting that pointer, we maintain a separate pointer that's an adjusted copy of the "raw" pointer, and that is used for most GSI register accesses. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:54:16 -08:00
David S. Miller	21cc70c75b	Last set of updates: * more minstrel work from Felix to reduce the probing overhead * QoS for nl80211 control port frames * STBC injection support * and a couple of small fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAmAmiEwACgkQB8qZga/f l8TVAxAAgqJ2zeDPYchCVNGUqrsPqFG6h3rBB2oKHUCrgBy150uSWmAyhvG4eiwP S4gLA/k42hHjxsmoScFGdjyaVMHv6CqcLkrPDfYsKvjZp258kw7Jbprv94KbeFkR 6ckpO3dVsyCFrUe3VqTgEtqNatixX3jqlZ6JemiU2hHI5prUPa4Fkt9m9fvwIaDO FoLywLdDjNHrOqo8qWjWDRfktGAuuFSFi1g+y5vNjlGPs6vck8ORP1/Bi9rXVxXD TrawcgID9/Ngvblckkg0yW2oqdPl/QuMPhnJRCwOQJbVqTmxcLjuDybRKfGTcw+D zd8FBCtH2lhW2MAbo3hh5977cj6DsCeRYcNb+wDePtv7uSAgoYP9G7CJyyXL061Y AOXizfDqejQuhQEWhi4oirgtwMHosESPxgW5pSmZPnjbgxBxHZJWRXN5/52PBKRH yiPQuhaSjSh5HAs8va3U8gSIqER5mIXqGoIlOTkwcaoSf/wgxoYPxeJgk4KQuRhx 6Ssky2dx7/HlYUN/tNhUbR9GkZJk5V453VTAt0LHPVYtWrry/rpUNTYUL131fPcG gs6nv/FDaWC9l/EtFE2wFpknqh+jbzt9+Vh51h1Sf3fORP3oqF6/tDAj2pgOSSUV YzmukXRJJLHowr9Xtr227yoWR0GA8A9vKlZqmE7f08fPjXdUzoE= =PXD5 -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-net-next-2021-02-12' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Last set of updates: * more minstrel work from Felix to reduce the probing overhead * QoS for nl80211 control port frames * STBC injection support * and a couple of small fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:48:52 -08:00
Gustavo A. R. Silva	93efb0c656	octeontx2-pf: Fix out-of-bounds read in otx2_get_fecparam() Code at line 967 implies that rsp->fwdata.supported_fec may be up to 4: 967: if (rsp->fwdata.supported_fec <= FEC_MAX_INDEX) If rsp->fwdata.supported_fec evaluates to 4, then there is an out-of-bounds read at line 971 because fec is an array with a maximum of 4 elements: 954 const int fec[] = { 955 ETHTOOL_FEC_OFF, 956 ETHTOOL_FEC_BASER, 957 ETHTOOL_FEC_RS, 958 ETHTOOL_FEC_BASER \| ETHTOOL_FEC_RS}; 959 #define FEC_MAX_INDEX 4 971: fecparam->fec = fec[rsp->fwdata.supported_fec]; Fix this by properly indexing fec[] with rsp->fwdata.supported_fec - 1. In this case the proper indexes 0 to 3 are used when rsp->fwdata.supported_fec evaluates to a range of 1 to 4, correspondingly. Fixes: d0cf9503e908 ("octeontx2-pf: ethtool fec mode support") Addresses-Coverity-ID: 1501722 ("Out-of-bounds read") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:47:39 -08:00
Colin Ian King	a6e0ee35ee	octeontx2-af: Fix spelling mistake "recievd" -> "received" There is a spelling mistake in the text in array rpm_rx_stats_fields, fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:45:43 -08:00
David S. Miller	79201f358d	wireless-drivers-next patches for v5.12 Second set of patches for v5.12. Last time there was a smaller pull request so unsurprisingly this time we have a big one. mt76 has new hardware support and lots of new features, iwlwifi getting new features and rtw88 got NAPI support. And the usual cleanups and fixes all over. Major changes: ath10k * support setting SAR limits via nl80211 rtw88 * support 8821 RFE type2 devices * NAPI support iwlwifi * add new FW API support * support for new So devices * support for RF interference mitigation (RFI) * support for PNVM (Platform Non-Volatile Memory, a firmware data file) from BIOS mt76 * add new mt7921e driver * 802.11 encap offload support * support for multiple pcie gen1 host interfaces on 7915 * 7915 testmode support * 7915 txbf support brcmfmac * support for CQM RSSI notifications wil6210 * support for extended DMG MCS 12.1 rate -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJgJl5/AAoJEG4XJFUm622bCrcIAIwarm51aHq8R0Py0ZYBEiIM STe/x3gjxVhzYHd438APchEMnePY6pWZ6e00GnZwMyWxZpMHK+yOYzWrewLvYI29 qyPFkjGdE6PhYjL+lJjYHn4M1cdAkO4t0FSvCy5OvOnuIgu0Yz3TXXQVxZtrzrIc 2q+bOR1kKaVBe8NOjggnxTWe4mTj6efeTkD0D5M+IbppERtmpcVLra9FSgz5IcLl hlKyNNtiNo/CmQu0bGejXR7ip+JA08f5No6TOlEQKR6pBvBvwrvgXHsq9rfQO8qA CDhz4DqfPPYMNXnJuFVAzYsw4raZblwTg5GtIjH7e0cbXxSAx50Ne2Hd850nkO8= =eQGJ -----END PGP SIGNATURE----- Merge tag 'wireless-drivers-next-2021-02-12' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for v5.12 Second set of patches for v5.12. Last time there was a smaller pull request so unsurprisingly this time we have a big one. mt76 has new hardware support and lots of new features, iwlwifi getting new features and rtw88 got NAPI support. And the usual cleanups and fixes all over. Major changes: ath10k * support setting SAR limits via nl80211 rtw88 * support 8821 RFE type2 devices * NAPI support iwlwifi * add new FW API support * support for new So devices * support for RF interference mitigation (RFI) * support for PNVM (Platform Non-Volatile Memory, a firmware data file) from BIOS mt76 * add new mt7921e driver * 802.11 encap offload support * support for multiple pcie gen1 host interfaces on 7915 * 7915 testmode support * 7915 txbf support brcmfmac * support for CQM RSSI notifications wil6210 * support for extended DMG MCS 12.1 rate ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:43:13 -08:00
Vadim Fedorenko	295f830e53	rxrpc: Fix dependency on IPv6 in udp tunnel config As udp_port_cfg struct changes its members with dependency on IPv6 configuration, the code in rxrpc should also check for IPv6. Fixes: 1a9b86c9fd95 ("rxrpc: use udp tunnel APIs instead of open code in rxrpc_open_socket") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:42:05 -08:00
Doug Brown	39935dccb2	appletalk: Fix skb allocation size in loopback case If a DDP broadcast packet is sent out to a non-gateway target, it is also looped back. There is a potential for the loopback device to have a longer hardware header length than the original target route's device, which can result in the skb not being created with enough room for the loopback device's hardware header. This patch fixes the issue by determining that a loopback will be necessary prior to allocating the skb, and if so, ensuring the skb has enough room. This was discovered while testing a new driver that creates a LocalTalk network interface (LTALK_HLEN = 1). It caused an skb_under_panic. Signed-off-by: Doug Brown <doug@schmorgal.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:40:28 -08:00
David S. Miller	0a2f6b32cc	Merge branch 'mptcp-genl-events' Mat Martineau says: ==================== mptcp: Add genl events for connection info This series from the MPTCP tree adds genl multicast events that are important for implementing a userspace path manager. In MPTCP, a path manager is responsible for adding or removing additional subflows on each MPTCP connection. The in-kernel path manager (already part of the kernel) is a better fit for many server use cases, but the additional flexibility of userspace path managers is often useful for client devices. Patches 1, 2, 4, 5, and 6 do some refactoring to streamline the netlink event implementation in the final patch. Patch 3 improves the timeliness of subflow destruction to ensure the 'subflow closed' event will be sent soon enough. Patch 7 allows use of the GENL_UNS_ADMIN_PERM flag on genl mcast groups to mandate CAP_NET_ADMIN, which is important to protect token information in the MPTCP events. This is a genetlink change. Patch 8 adds the MPTCP netlink events. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:31:46 -08:00
Florian Westphal	b911c97c7d	mptcp: add netlink event support Allow userspace (mptcpd) to subscribe to mptcp genl multicast events. This implementation reuses the same event API as the mptcp kernel fork to ease integration of existing tools, e.g. mptcpd. Supported events include: 1. start and close of an mptcp connection 2. start and close of subflows (joins) 3. announce and withdrawals of addresses 4. subflow priority (backup/non-backup) change. Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:31:46 -08:00
Florian Westphal	4d54cc3211	mptcp: avoid lock_fast usage in accept path Once event support is added this may need to allocate memory while msk lock is held with softirqs disabled. Not using lock_fast also allows to do the allocation with GFP_KERNEL. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:31:46 -08:00
Florian Westphal	6c714f1b54	mptcp: pass subflow socket to a few helpers Pass the first/initial subflow to the existing functions so they can pass this on to the notification handler that is added later in the series. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:31:45 -08:00
Florian Westphal	b263b0d7d6	mptcp: move subflow close loop after sk close check In case mptcp socket is already dead the entire mptcp socket will be freed. We can avoid the close check in this case. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:31:45 -08:00
Florian Westphal	40947e1399	mptcp: schedule worker when subflow is closed When remote side closes a subflow we should schedule the worker to dispose of the subflow in a timely manner. Otherwise, SF_CLOSED event won't be generated until the mptcp socket itself is closing or local side is closing another subflow. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:31:45 -08:00
Florian Westphal	a141e02e39	mptcp: split __mptcp_close_ssk helper Prepare for subflow close events: When mptcp connection is torn down its enough to send the mptcp socket close notification rather than a subflow close event for all of the subflows followed by the mptcp close event. This splits the helper: mptcp_close_ssk() will emit the close notification, __mptcp_close_ssk will not. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:31:45 -08:00
Florian Westphal	e980143068	mptcp: move pm netlink work into pm_netlink Allows to make some functions static and avoids acquire of the pm spinlock in protocol.c. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:31:45 -08:00
David S. Miller	0a82c37e34	Merge branch 'mptcp-selftests' Mat Martineau says: ==================== mptcp: Selftest enhancement and fixes This is a collection of selftest updates from the MPTCP tree. Patch 1 uses additional 'ss' command line parameters and 'nstat' to improve output when certain MPTCP tests fail. Patches 2 & 3 fix a copy/paste error and some output formatting. Patch 4 makes sure tests still pass if certain connection-related packets are retransmitted. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:24:31 -08:00
Matthieu Baerts	5f88117f25	selftests: mptcp: fail if not enough SYN/3rd ACK If we receive less MPCapable SYN or 3rd ACK than expected, we now mark the test as failed. On the other hand, if we receive more, we keep the warning but we add a hint that it is probably due to retransmissions and that's why we don't mark the test as failed. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/148 Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:20:34 -08:00
Matthieu Baerts	45759a8715	selftests: mptcp: display warnings on one line Before we had this in case of SYN retransmissions: (...) # ns4 MPTCP -> ns2 (10.0.1.2:10034 ) MPTCP (duration 1201ms) [ OK ] # ns4 MPTCP -> ns2 (dead:beef:1::2:10035) MPTCP (duration 1242ms) [ OK ] # ns4 MPTCP -> ns2 (10.0.2.1:10036 ) MPTCP ns2-60143c00-cDZWo4 SYNRX: MPTCP -> MPTCP: expect 11, got # 13 # (duration 6221ms) [ OK ] # ns4 MPTCP -> ns2 (dead:beef:2::1:10037) MPTCP (duration 1427ms) [ OK ] # ns4 MPTCP -> ns3 (10.0.2.2:10038 ) MPTCP (duration 881ms) [ OK ] (...) Now we have: (...) # ns4 MPTCP -> ns2 (10.0.1.2:10034 ) MPTCP (duration 1201ms) [ OK ] # ns4 MPTCP -> ns2 (dead:beef:1::2:10035) MPTCP (duration 1242ms) [ OK ] # ns4 MPTCP -> ns2 (10.0.2.1:10036 ) MPTCP (duration 6221ms) [ OK ] WARN: SYNRX: expect 11, got 13 # ns4 MPTCP -> ns2 (dead:beef:2::1:10037) MPTCP (duration 1427ms) [ OK ] # ns4 MPTCP -> ns3 (10.0.2.2:10038 ) MPTCP (duration 881ms) [ OK ] (...) So we put everything on one line, keep the durations and "OK" aligned and removed duplicated info to short the warning. Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:20:34 -08:00
Matthieu Baerts	f384221a38	selftests: mptcp: fix ACKRX debug message Info from received MPCapable SYN were printed instead of the ones from received MPCapable 3rd ACK. Fixes: fed61c4b584c ("selftests: mptcp: make 2nd net namespace use tcp syn cookies unconditionally") Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:20:34 -08:00
Paolo Abeni	767389c8dd	selftests: mptcp: dump more info on errors Even if that may sound completely unlikely, the mptcp implementation is not perfect, yet. When the self-tests report an error we usually need more information of what the scripts currently report. iproute allow provides some additional goodies since a few releases, let's dump them. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:20:34 -08:00
Jesper Dangaard Brouer	b62eba5632	selftests/bpf: Tests using bpf_check_mtu BPF-helper Adding selftest for BPF-helper bpf_check_mtu(). Making sure it can be used from both XDP and TC. V16: - Fix 'void' function definition V11: - Addresse nitpicks from Andrii Nakryiko V10: - Remove errno non-zero test in CHECK_ATTR() - Addresse comments from Andrii Nakryiko Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/161287791989.790810.13612620012522164562.stgit@firesoul	2021-02-13 01:15:28 +01:00
Jesper Dangaard Brouer	6b8838be7e	selftests/bpf: Use bpf_check_mtu in selftest test_cls_redirect This demonstrate how bpf_check_mtu() helper can easily be used together with bpf_skb_adjust_room() helper, prior to doing size adjustment, as delta argument is already setup. Hint: This specific test can be selected like this: ./test_progs -t cls_redirect Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/161287791481.790810.4444271170546646080.stgit@firesoul	2021-02-13 01:15:28 +01:00
Jesper Dangaard Brouer	5f7d57280c	bpf: Drop MTU check when doing TC-BPF redirect to ingress The use-case for dropping the MTU check when TC-BPF does redirect to ingress, is described by Eyal Birger in email[0]. The summary is the ability to increase packet size (e.g. with IPv6 headers for NAT64) and ingress redirect packet and let normal netstack fragment packet as needed. [0] https://lore.kernel.org/netdev/CAHsH6Gug-hsLGHQ6N0wtixdOa85LDZ3HNRHVd0opR=19Qo4W4Q@mail.gmail.com/ V15: - missing static for function declaration V9: - Make net_device "up" (IFF_UP) check explicit in skb_do_redirect V4: - Keep net_device "up" (IFF_UP) check. - Adjustment to handle bpf_redirect_peer() helper Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/161287790971.790810.11785274340154740591.stgit@firesoul	2021-02-13 01:15:28 +01:00
Jesper Dangaard Brouer	34b2021cc6	bpf: Add BPF-helper for MTU checking This BPF-helper bpf_check_mtu() works for both XDP and TC-BPF programs. The SKB object is complex and the skb->len value (accessible from BPF-prog) also include the length of any extra GRO/GSO segments, but without taking into account that these GRO/GSO segments get added transport (L4) and network (L3) headers before being transmitted. Thus, this BPF-helper is created such that the BPF-programmer don't need to handle these details in the BPF-prog. The API is designed to help the BPF-programmer, that want to do packet context size changes, which involves other helpers. These other helpers usually does a delta size adjustment. This helper also support a delta size (len_diff), which allow BPF-programmer to reuse arguments needed by these other helpers, and perform the MTU check prior to doing any actual size adjustment of the packet context. It is on purpose, that we allow the len adjustment to become a negative result, that will pass the MTU check. This might seem weird, but it's not this helpers responsibility to "catch" wrong len_diff adjustments. Other helpers will take care of these checks, if BPF-programmer chooses to do actual size adjustment. V14: - Improve man-page desc of len_diff. V13: - Enforce flag BPF_MTU_CHK_SEGS cannot use len_diff. V12: - Simplify segment check that calls skb_gso_validate_network_len. - Helpers should return long V9: - Use dev->hard_header_len (instead of ETH_HLEN) - Annotate with unlikely req from Daniel - Fix logic error using skb_gso_validate_network_len from Daniel V6: - Took John's advice and dropped BPF_MTU_CHK_RELAX - Returned MTU is kept at L3-level (like fib_lookup) V4: Lot of changes - ifindex 0 now use current netdev for MTU lookup - rename helper from bpf_mtu_check to bpf_check_mtu - fix bug for GSO pkt length (as skb->len is total len) - remove __bpf_len_adj_positive, simply allow negative len adj Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/161287790461.790810.3429728639563297353.stgit@firesoul	2021-02-13 01:15:28 +01:00
David S. Miller	0c9fc2ede9	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2021-02-13 The following pull-request contains BPF updates for your net tree. We've added 2 non-merge commits during the last 3 day(s) which contain a total of 2 files changed, 9 insertions(+), 11 deletions(-). The main changes are: 1) Fix mod32 truncation handling in verifier, from Daniel Borkmann. 2) Fix XDP redirect tests to explicitly use bash, from Björn Töpel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 16:15:23 -08:00
Jesper Dangaard Brouer	e1850ea9bd	bpf: bpf_fib_lookup return MTU value as output when looked up The BPF-helpers for FIB lookup (bpf_xdp_fib_lookup and bpf_skb_fib_lookup) can perform MTU check and return BPF_FIB_LKUP_RET_FRAG_NEEDED. The BPF-prog don't know the MTU value that caused this rejection. If the BPF-prog wants to implement PMTU (Path MTU Discovery) (rfc1191) it need to know this MTU value for the ICMP packet. Patch change lookup and result struct bpf_fib_lookup, to contain this MTU value as output via a union with 'tot_len' as this is the value used for the MTU lookup. V5: - Fixed uninit value spotted by Dan Carpenter. - Name struct output member mtu_result Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/161287789952.790810.13134700381067698781.stgit@firesoul	2021-02-13 01:15:22 +01:00
Jesper Dangaard Brouer	2c0a10af68	bpf: Fix bpf_fib_lookup helper MTU check for SKB ctx BPF end-user on Cilium slack-channel (Carlo Carraro) wants to use bpf_fib_lookup for doing MTU-check, but prior to extending packet size, by adjusting fib_params 'tot_len' with the packet length plus the expected encap size. (Just like the bpf_check_mtu helper supports). He discovered that for SKB ctx the param->tot_len was not used, instead skb->len was used (via MTU check in is_skb_forwardable() that checks against netdev MTU). Fix this by using fib_params 'tot_len' for MTU check. If not provided (e.g. zero) then keep existing TC behaviour intact. Notice that 'tot_len' for MTU check is done like XDP code-path, which checks against FIB-dst MTU. V16: - Revert V13 optimization, 2nd lookup is against egress/resulting netdev V13: - Only do ifindex lookup one time, calling dev_get_by_index_rcu(). V10: - Use same method as XDP for 'tot_len' MTU check Fixes: 4c79579b44b1 ("bpf: Change bpf_fib_lookup to return lookup status") Reported-by: Carlo Carraro <colrack@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/161287789444.790810.15247494756551413508.stgit@firesoul	2021-02-13 01:14:08 +01:00
Jesper Dangaard Brouer	6306c1189e	bpf: Remove MTU check in __bpf_skb_max_len Multiple BPF-helpers that can manipulate/increase the size of the SKB uses __bpf_skb_max_len() as the max-length. This function limit size against the current net_device MTU (skb->dev->mtu). When a BPF-prog grow the packet size, then it should not be limited to the MTU. The MTU is a transmit limitation, and software receiving this packet should be allowed to increase the size. Further more, current MTU check in __bpf_skb_max_len uses the MTU from ingress/current net_device, which in case of redirects uses the wrong net_device. This patch keeps a sanity max limit of SKB_MAX_ALLOC (16KiB). The real limit is elsewhere in the system. Jesper's testing[1] showed it was not possible to exceed 8KiB when expanding the SKB size via BPF-helper. The limiting factor is the define KMALLOC_MAX_CACHE_SIZE which is 8192 for SLUB-allocator (CONFIG_SLUB) in-case PAGE_SIZE is 4096. This define is in-effect due to this being called from softirq context see code __gfp_pfmemalloc_flags() and __do_kmalloc_node(). Jakub's testing showed that frames above 16KiB can cause NICs to reset (but not crash). Keep this sanity limit at this level as memory layer can differ based on kernel config. [1] https://github.com/xdp-project/bpf-examples/tree/master/MTU-tests Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/161287788936.790810.2937823995775097177.stgit@firesoul	2021-02-13 01:14:08 +01:00
Daniel Borkmann	9b00f1b788	bpf: Fix truncation handling for mod32 dst reg wrt zero Recently noticed that when mod32 with a known src reg of 0 is performed, then the dst register is 32-bit truncated in verifier: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (b7) r0 = 0 1: R0_w=inv0 R1=ctx(id=0,off=0,imm=0) R10=fp0 1: (b7) r1 = -1 2: R0_w=inv0 R1_w=inv-1 R10=fp0 2: (b4) w2 = -1 3: R0_w=inv0 R1_w=inv-1 R2_w=inv4294967295 R10=fp0 3: (9c) w1 %= w0 4: R0_w=inv0 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 4: (b7) r0 = 1 5: R0_w=inv1 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 5: (1d) if r1 == r2 goto pc+1 R0_w=inv1 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 6: R0_w=inv1 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 6: (b7) r0 = 2 7: R0_w=inv2 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 7: (95) exit 7: R0=inv1 R1=inv(id=0,umin_value=4294967295,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2=inv4294967295 R10=fp0 7: (95) exit However, as a runtime result, we get 2 instead of 1, meaning the dst register does not contain (u32)-1 in this case. The reason is fairly straight forward given the 0 test leaves the dst register as-is: # ./bpftool p d x i 23 0: (b7) r0 = 0 1: (b7) r1 = -1 2: (b4) w2 = -1 3: (16) if w0 == 0x0 goto pc+1 4: (9c) w1 %= w0 5: (b7) r0 = 1 6: (1d) if r1 == r2 goto pc+1 7: (b7) r0 = 2 8: (95) exit This was originally not an issue given the dst register was marked as completely unknown (aka 64 bit unknown). However, after 468f6eafa6c4 ("bpf: fix 32-bit ALU op verification") the verifier casts the register output to 32 bit, and hence it becomes 32 bit unknown. Note that for the case where the src register is unknown, the dst register is marked 64 bit unknown. After the fix, the register is truncated by the runtime and the test passes: # ./bpftool p d x i 23 0: (b7) r0 = 0 1: (b7) r1 = -1 2: (b4) w2 = -1 3: (16) if w0 == 0x0 goto pc+2 4: (9c) w1 %= w0 5: (05) goto pc+1 6: (bc) w1 = w1 7: (b7) r0 = 1 8: (1d) if r1 == r2 goto pc+1 9: (b7) r0 = 2 10: (95) exit Semantics also match with {R,W}x mod{64,32} 0 -> {R,W}x. Invalid div has always been {R,W}x div{64,32} 0 -> 0. Rewrites are as follows: mod32: mod64: (16) if w0 == 0x0 goto pc+2 (15) if r0 == 0x0 goto pc+1 (9c) w1 %= w0 (9f) r1 %= r0 (05) goto pc+1 (bc) w1 = w1 Fixes: 468f6eafa6c4 ("bpf: fix 32-bit ALU op verification") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>	2021-02-13 00:53:12 +01:00
Jun'ichi Nomura	7d4553b69f	bpf, devmap: Use GFP_KERNEL for xdp bulk queue allocation The devmap bulk queue is allocated with GFP_ATOMIC and the allocation may fail if there is no available space in existing percpu pool. Since commit 75ccae62cb8d42 ("xdp: Move devmap bulk queue into struct net_device") moved the bulk queue allocation to NETDEV_REGISTER callback, whose context is allowed to sleep, use GFP_KERNEL instead of GFP_ATOMIC to let percpu allocator extend the pool when needed and avoid possible failure of netdev registration. As the required alignment is natural, we can simply use alloc_percpu(). Fixes: 75ccae62cb8d42 ("xdp: Move devmap bulk queue into struct net_device") Signed-off-by: Jun'ichi Nomura <junichi.nomura@nec.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210209082451.GA44021@jeru.linux.bs1.fc.nec.co.jp	2021-02-13 00:11:26 +01:00
Linus Torvalds	7989807dc0	4 small smb3 fixes to the new mount API -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmAm8mYACgkQiiy9cAdy T1FDMgwArdtgQtRL9hSOINBl19/OM9GSLszUtI/EctfZGgnGoNysIq4/pIvv7uqE egVohlVwJI4niGguU7AABj5vrthLsAbmzKi+e2N6kAcYtzpXeLvjkXVfyq/Bld66 oe7sYWMjH1TFEc64gejW7nYcxOsg93HQtAvNyyoAS3SlOOWsl2LI/AQiw3BXXVBo Jb1AkfpdBGglUW7esYVZUyVwCI/ZzYVEA0YTCpaGX+EIfdCWXm3ArNPP7E9gHOLb 3HUbNP6W5QKwgYL1tPX3s7AFEtj0+PxuREgB6mSTFOkWRRfZJUTma1AEfa9MUWGA KOnQKiIzIsmaOQGP/BumcrPr/7kgeqYEFZ2exNT8kVw6ETEHP1+A4j61KZI0mduz rgnQx21gPzVcDo0tfO8SjGSt3vzuRA+vkyZO4eB/nmTqJ4YVqX+E6E4vophKOUk2 ELqk0fUlX3uspqocZCor5nrLA0EadNV6P/LfFiRyGUTt+tOcOeYmyAdK7dWY1JTf wsd20mCY =vJKS -----END PGP SIGNATURE----- Merge tag '5.11-rc7-smb3-github' of git://github.com/smfrench/smb3-kernel Pull cifs fixes from Steve French: "Four small smb3 fixes to the new mount API (including a particularly important one for DFS links). These were found in testing this week of additional DFS scenarios, and a user testing of an apache container problem" * tag '5.11-rc7-smb3-github' of git://github.com/smfrench/smb3-kernel: cifs: Set CIFS_MOUNT_USE_PREFIX_PATH flag on setting cifs_sb->prepath. cifs: In the new mount api we get the full devname as source= cifs: do not disable noperm if multiuser mount option is not provided cifs: fix dfs-links	2021-02-12 14:45:39 -08:00
Yonghong Song	17d8beda27	bpf: Fix an unitialized value in bpf_iter Commit 15d83c4d7cef ("bpf: Allow loading of a bpf_iter program") cached btf_id in struct bpf_iter_target_info so later on if it can be checked cheaply compared to checking registered names. syzbot found a bug that uninitialized value may occur to bpf_iter_target_info->btf_id. This is because we allocated bpf_iter_target_info structure with kmalloc and never initialized field btf_id afterwards. This uninitialized btf_id is typically compared to a u32 bpf program func proto btf_id, and the chance of being equal is extremely slim. This patch fixed the issue by using kzalloc which will also prevent future likely instances due to adding new fields. Fixes: 15d83c4d7cef ("bpf: Allow loading of a bpf_iter program") Reported-by: syzbot+580f4f2a272e452d55cb@syzkaller.appspotmail.com Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210212005926.2875002-1-yhs@fb.com	2021-02-12 13:33:50 -08:00
David S. Miller	c3ff3b02e9	Merge branch 'hns3-cleanups' Huazhong Tan says: ==================== net: hns3: some cleanups for -next To improve code readability and maintainability, the series refactor out some bloated functions in the HNS3 ethernet driver. change log: V2: remove an unused variable in #5 previous version: V1: https://patchwork.kernel.org/project/netdevbpf/cover/1612943005-59416-1-git-send-email-tanhuazhong@huawei.com/ ==================== Acked-by: Jakub Kicinski <kuba@kernel.org>	2021-02-12 13:13:16 -08:00
Hao Chen	80a9f3f1fa	net: hns3: refactor out hclge_rm_vport_all_mac_table() hclge_rm_vport_all_mac_table() is bloated, so split it into separate functions for readability and maintainability. Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 13:13:15 -08:00
Huazhong Tan	5fd0e7b4f7	net: hns3: refactor out hclgevf_set_rss_tuple() To make it more readable and maintainable, split hclgevf_set_rss_tuple() into two parts. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 13:13:15 -08:00
Huazhong Tan	e291eff3bc	net: hns3: refactor out hclge_set_rss_tuple() To make it more readable and maintainable, split hclge_set_rss_tuple() into two parts. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 13:13:15 -08:00

... 4 5 6 7 8 ...

987828 Commits