1123538 Commits

Author SHA1 Message Date
Johannes Berg
b1622adaa5 wifi: mac80211_hwsim: refactor RX a bit
Refactor some common RX functionality between the netlink
and non-netlink paths, adding the special hwsim TLV (if
compiled) also in the netlink path.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 17:02:15 +02:00
Johannes Berg
b2c4aa35eb wifi: mac80211_hwsim: check STA magic in change_sta_links
Just as an additional check that mac80211 isn't doing
anything strange, add a check of the STA magic (which
gets assigned when the station is added, and cleared
when the station is removed).

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 17:02:05 +02:00
Johannes Berg
774e00c20c wifi: mac80211: remove unused arg to ieee80211_chandef_eht_oper
We don't need the sdata argument, and it doesn't make any
sense for a direct conversion from one value to another,
so just remove the argument

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 17:01:56 +02:00
Johannes Berg
86e74a08fe wifi: mac80211_hwsim: remove multicast workaround
Now that we have proper multicast TX in mac80211, there's
no longer a need to fake something here.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 17:01:37 +02:00
Jinpeng Cui
a21cd7d63b wifi: nl80211: remove redundant err variable
Return value from rdev_set_mcast_rate() directly instead of
taking this in another redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Jinpeng Cui <cui.jinpeng2@zte.com.cn>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 17:01:09 +02:00
James Prestwood
3c06e91b40 wifi: mac80211: Support POWERED_ADDR_CHANGE feature
Adds support in mac80211 for NL80211_EXT_FEATURE_POWERED_ADDR_CHANGE.
The motivation behind this functionality is to fix limitations of
address randomization on frequencies which are disallowed in world
roaming.

The way things work now, if a client wants to randomize their address
per-connection it must power down the device, change the MAC, and
power back up. Here lies a problem since powering down the device
may result in frequencies being disabled (until the regdom is set).
If the desired BSS is on one such frequency the client is unable to
connect once the phy is powered again.

For mac80211 based devices changing the MAC while powered is possible
but currently disallowed (-EBUSY). This patch adds some logic to
allow a MAC change while powered by removing the interface, changing
the MAC, and adding it again. mac80211 will advertise support for
this feature so userspace can determine the best course of action e.g.
disallow address randomization on certain frequencies if not
supported.

There are certain limitations put on this which simplify the logic:
 - No active connection
 - No offchannel work, including scanning.

Signed-off-by: James Prestwood <prestwoj@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 17:01:04 +02:00
James Prestwood
a36c421690 wifi: nl80211: Add POWERED_ADDR_CHANGE feature
Add a new extended feature bit signifying that the wireless hardware
supports changing the MAC address while the underlying net_device is
powered. Note that this has a different meaning from
IFF_LIVE_ADDR_CHANGE as additional restrictions might be imposed by
the hardware, such as:

 - No connection is active on this interface, carrier is off
 - No scan is in progress
 - No offchannel operations are in progress

Signed-off-by: James Prestwood <prestwoj@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 16:58:41 +02:00
Johannes Berg
90703ba9bb wifi: mac80211: prevent 4-addr use on MLDs
We haven't tried this yet, and it's not very likely to
work well right now, so for now disable 4-addr use on
interfaces that are MLDs.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20220902161143.f2e4cc2efaa1.I5924e8fb44a2d098b676f5711b36bbc1b1bd68e2@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 16:57:34 +02:00
Johannes Berg
ae960ee90b wifi: mac80211: prevent VLANs on MLDs
Do not allow VLANs to be added to AP interfaces that are
MLDs, this isn't going to work because the link structs
aren't propagated to the VLAN interfaces yet.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20220902161144.8c88531146e9.If2ef9a3b138d4f16ed2fda91c852da156bdf5e4d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 16:57:01 +02:00
Johannes Berg
2aec909912 wifi: use struct_group to copy addresses
We sometimes copy all the addresses from the 802.11 header
for the AAD, which may cause complaints from fortify checks.
Use struct_group() to avoid the compiler warnings/errors.

Change-Id: Ic3ea389105e7813b22095b295079eecdabde5045
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 16:40:06 +02:00
Soenke Huster
8c0427842a wifi: mac80211_hwsim: check length for virtio packets
An invalid packet with a length shorter than the specified length in the
netlink header can lead to use-after-frees and slab-out-of-bounds in the
processing of the netlink attributes, such as the following:

  BUG: KASAN: slab-out-of-bounds in __nla_validate_parse+0x1258/0x2010
  Read of size 2 at addr ffff88800ac7952c by task kworker/0:1/12

  Workqueue: events hwsim_virtio_rx_work
  Call Trace:
   <TASK>
   dump_stack_lvl+0x45/0x5d
   print_report.cold+0x5e/0x5e5
   kasan_report+0xb1/0x1c0
   __nla_validate_parse+0x1258/0x2010
   __nla_parse+0x22/0x30
   hwsim_virtio_handle_cmd.isra.0+0x13f/0x2d0
   hwsim_virtio_rx_work+0x1b2/0x370
   process_one_work+0x8df/0x1530
   worker_thread+0x575/0x11a0
   kthread+0x29d/0x340
   ret_from_fork+0x22/0x30
 </TASK>

Discarding packets with an invalid length solves this.
Therefore, skb->len must be set at reception.

Change-Id: Ieaeb9a4c62d3beede274881a7c2722c6c6f477b6
Signed-off-by: Soenke Huster <soenke.huster@eknoes.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 16:40:06 +02:00
Johannes Berg
69371801f9 wifi: mac80211: fix locking in auth/assoc timeout
If we hit an authentication or association timeout, we only
release the chanctx for the deflink, and the other link(s)
are released later by ieee80211_vif_set_links(), but we're
not locking this correctly.

Fix the locking here while releasing the channels and links.

Change-Id: I9e08c1a5434592bdc75253c1abfa6c788f9f39b1
Fixes: 81151ce462e5 ("wifi: mac80211: support MLO authentication/association with one link")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 16:40:06 +02:00
Johannes Berg
7a2c6d1616 wifi: mac80211: mlme: release deflink channel in error case
In the prep_channel error case we didn't release the deflink
channel leaving it to be left around. Fix that.

Change-Id: If0dfd748125ec46a31fc6045a480dc28e03723d2
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 16:40:06 +02:00
Mukesh Sisodiya
4a86c54626 wifi: mac80211: fix link warning in RX agg timer expiry
The rx data link pointer isn't set from the RX aggregation timer,
resulting in a later warning. Fix that by setting it to the first
valid link for now, with a FIXME to worry about statistics later,
it's not very important since it's just the timeout case.

Reported-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/498d714c-76be-9d04-26db-a1206878de5e@redhat.com
Fixes: 56057da4569b ("wifi: mac80211: rx: track link in RX data")
Signed-off-by: Mukesh Sisodiya <mukesh.sisodiya@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03 16:40:03 +02:00
ye xingchen
ac9284db6b LoongArch: mm: Remove the unneeded result variable
Return the value pa_to_nid() directly instead of storing it in another
redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-09-03 18:01:27 +08:00
Yupeng Li
1a470ce4e9 LoongArch: Fix arch_remove_memory() undefined build error
The kernel build error when unslected CONFIG_MEMORY_HOTREMOVE because
arch_remove_memory() is needed by mm/memory_hotplug.c but undefined.

Some build error messages like:

 LD      vmlinux.o
 MODPOST vmlinux.symvers
 MODINFO modules.builtin.modinfo
 GEN     modules.builtin
 LD      .tmp_vmlinux.kallsyms1
loongarch64-linux-gnu-ld: mm/memory_hotplug.o: in function `.L242':
memory_hotplug.c:(.ref.text+0x930): undefined reference to `arch_remove_memory'
make: *** [Makefile:1169:vmlinux] 错误 1

Removed CONFIG_MEMORY_HOTREMOVE requirement and rearrange the file refer
to the definitions of other platform architectures.

Signed-off-by: Yupeng Li <liyupeng@zbhlos.com>
Signed-off-by: Caicai <caizp2008@163.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-09-03 18:01:27 +08:00
Huacai Chen
e0fba87c85 LoongArch: Fix section mismatch due to acpi_os_ioremap()
Now acpi_os_ioremap() is marked with __init because it calls memblock_
is_memory() which is also marked with __init in the !ARCH_KEEP_MEMBLOCK
case. However, acpi_os_ioremap() is called by ordinary functions such
as acpi_os_{read, write}_memory() and causes section mismatch warnings:

WARNING: modpost: vmlinux.o: section mismatch in reference: acpi_os_read_memory (section: .text) -> acpi_os_ioremap (section: .init.text)
WARNING: modpost: vmlinux.o: section mismatch in reference: acpi_os_write_memory (section: .text) -> acpi_os_ioremap (section: .init.text)

Fix these warnings by selecting ARCH_KEEP_MEMBLOCK unconditionally and
removing the __init modifier of acpi_os_ioremap(). This can also give a
chance to track "memory" and "reserved" memblocks after early boot.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-09-03 18:01:27 +08:00
Huacai Chen
ad6846196a LoongArch: Improve dump_tlb() output messages
1, Use nr/nx to replace ri/xi;
2, Add 0x prefix for hexadecimal data.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-09-03 18:01:27 +08:00
Huacai Chen
0163005374 LoongArch: Adjust arch_do_signal_or_restart() to adapt generic entry
Commit 8ba62d37949e248c69 ("task_work: Call tracehook_notify_signal from
get_signal on all architectures") adjust arch_do_signal_or_restart() for
all architectures. LoongArch hasn't been upstream yet at that time and
can be still built successfully without adjustment because this function
has a weak version with the correct prototype. It is obviously that we
should convert LoongArch to use new API, otherwise some signal handlings
will be lost.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-09-03 18:01:27 +08:00
Ard Biesheuvel
1429cfde90 LoongArch: Avoid orphan input sections
Ensure that all input sections are listed explicitly in the linker
script, and issue a warning otherwise. This ensures that the binary
image matches the PE/COFF and other image metadata exactly, which is
important for things like code signing.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-09-03 18:01:27 +08:00
David S. Miller
d9c0103b9c Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-09-02 (i40e, iavf)

This series contains updates to i40e and iavf drivers.

Przemyslaw adds reset to ADQ configuration to allow for setting of rate
limit beyond TC0 for i40e.

Ivan Vecera does not free client on failure to open which could cause
NULL pointer dereference to occur on i40e. He also detaches device
during reset to prevent NDO calls with could cause races for iavf.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-03 10:46:24 +01:00
David S. Miller
cf5c15d1e9 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
This series contains updates to ice driver only.

Przemyslaw fixes memory leak of DMA memory due to incorrect freeing of
rx_buf.

Michal S corrects incorrect call to free memory.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-03 10:44:01 +01:00
Oleksij Rempel
3015c50384 net: dsa: microchip: fix kernel oops on ksz8 switches
After driver refactoring we was running ksz9477 specific CPU port
configuration on ksz8 family which ended with kernel oops. So, make sure
we run this code only on ksz9477 compatible devices.

Tested on KSZ8873 and KSZ9477.

Fixes: da8cd08520f3 ("net: dsa: microchip: add support for common phylink mac link up")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-03 10:42:10 +01:00
David S. Miller
aa3fab0110 Merge branch 'net_sched-redundant-resource-cleanups'
Zhengchao Shao says:

====================
net: sched: remove redundant resource cleanup when init() fails

qdisc_create() calls .init() to initialize qdisc. If the initialization
fails, qdisc_create() will call .destroy() to release resources.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-03 10:40:40 +01:00
Zhengchao Shao
d59f4e1d1f net: sched: htb: remove redundant resource cleanup in htb_init()
If htb_init() fails, qdisc_create() invokes htb_destroy() to clear
resources. Therefore, remove redundant resource cleanup in htb_init().

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-03 10:40:40 +01:00
Zhengchao Shao
494f5063b8 net: sched: fq_codel: remove redundant resource cleanup in fq_codel_init()
If fq_codel_init() fails, qdisc_create() invokes fq_codel_destroy() to
clear resources. Therefore, remove redundant resource cleanup in
fq_codel_init().

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-03 10:40:40 +01:00
Wei Fang
40c79ce13b net: fec: add stop mode support for imx8 platform
The current driver support stop mode by calling machine api.
The patch add dts support to set GPR register for stop request.

imx8mq enter stop/exit stop mode by setting GPR bit, which can
be accessed by A core.
imx8qm enter stop/exit stop mode by calling IMX_SC ipc APIs that
communicate with M core ipc service, and the M core set the related
GPR bit at last.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-03 09:54:34 +01:00
André Apitzsch
e26c258434 r8152: Add MAC passthrough support for Lenovo Travel Hub
The Lenovo USB-C Travel Hub supports MAC passthrough.

Signed-off-by: André Apitzsch <git@apitzsch.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-03 09:53:38 +01:00
Paul Durrant
c55f34b6ae xen-netback: only remove 'hotplug-status' when the vif is actually destroyed
Removing 'hotplug-status' in backend_disconnected() means that it will be
removed even in the case that the frontend unilaterally disconnects (which
it is free to do at any time). The consequence of this is that, when the
frontend attempts to re-connect, the backend gets stuck in 'InitWait'
rather than moving straight to 'Connected' (which it can do because the
hotplug script has already run).
Instead, the 'hotplug-status' mode should be removed in netback_remove()
i.e. when the vif really is going away.

Fixes: 0f4558ae9187 ("Revert "xen-netback: remove 'hotplug-status' once it has served its purpose"")
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-03 09:53:02 +01:00
Gustavo A. R. Silva
5854a09b49 net/ipv4: Use __DECLARE_FLEX_ARRAY() helper
We now have a cleaner way to keep compatibility with user-space
(a.k.a. not breaking it) when we need to keep in place a one-element
array (for its use in user-space) together with a flexible-array
member (for its use in kernel-space) without making it hard to read
at the source level. This is through the use of the new
__DECLARE_FLEX_ARRAY() helper macro.

The size and memory layout of the structure is preserved after the
changes. See below.

Before changes:

$ pahole -C ip_msfilter net/ipv4/igmp.o
struct ip_msfilter {
	union {
		struct {
			__be32     imsf_multiaddr_aux;   /*     0     4 */
			__be32     imsf_interface_aux;   /*     4     4 */
			__u32      imsf_fmode_aux;       /*     8     4 */
			__u32      imsf_numsrc_aux;      /*    12     4 */
			__be32     imsf_slist[1];        /*    16     4 */
		};                                       /*     0    20 */
		struct {
			__be32     imsf_multiaddr;       /*     0     4 */
			__be32     imsf_interface;       /*     4     4 */
			__u32      imsf_fmode;           /*     8     4 */
			__u32      imsf_numsrc;          /*    12     4 */
			__be32     imsf_slist_flex[0];   /*    16     0 */
		};                                       /*     0    16 */
	};                                               /*     0    20 */

	/* size: 20, cachelines: 1, members: 1 */
	/* last cacheline: 20 bytes */
};

After changes:

$ pahole -C ip_msfilter net/ipv4/igmp.o
struct ip_msfilter {
	__be32                     imsf_multiaddr;       /*     0     4 */
	__be32                     imsf_interface;       /*     4     4 */
	__u32                      imsf_fmode;           /*     8     4 */
	__u32                      imsf_numsrc;          /*    12     4 */
	union {
		__be32             imsf_slist[1];        /*    16     4 */
		struct {
			struct {
			} __empty_imsf_slist_flex;       /*    16     0 */
			__be32     imsf_slist_flex[0];   /*    16     0 */
		};                                       /*    16     0 */
	};                                               /*    16     4 */

	/* size: 20, cachelines: 1, members: 5 */
	/* last cacheline: 20 bytes */
};

In the past, we had to duplicate the whole original structure within
a union, and update the names of all the members. Now, we just need to
declare the flexible-array member to be used in kernel-space through
the __DECLARE_FLEX_ARRAY() helper together with the one-element array,
within a union. This makes the source code more clean and easier to read.

Link: https://github.com/KSPP/linux/issues/193
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-03 09:51:10 +01:00
Zhengchao Shao
2e5fb32232 net/sched: cls_api: remove redundant 0 check in tcf_qevent_init()
tcf_qevent_parse_block_index() never returns a zero block_index.
Therefore, it is unnecessary to check the value of block_index in
tcf_qevent_init().

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20220901011617.14105-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-02 21:24:49 -07:00
GUO Zihua
c8ef3c94bd net: lantiq_etop: Fix return type for implementation of ndo_start_xmit
Since Linux now supports CFI, it will be a good idea to fix mismatched
return type for implementation of hooks. Otherwise this might get
cought out by CFI and cause a panic.

ltq_etop_tx() would return either NETDEV_TX_BUSY or NETDEV_TX_OK, so
change the return type to netdev_tx_t directly.

Signed-off-by: GUO Zihua <guozihua@huawei.com>
Link: https://lore.kernel.org/r/20220902081521.59867-1-guozihua@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-02 21:19:35 -07:00
GUO Zihua
7b620e1560 net: sunplus: Fix return type for implementation of ndo_start_xmit
Since Linux now supports CFI, it will be a good idea to fix mismatched
return type for implementation of hooks. Otherwise this might get
cought out by CFI and cause a panic.

spl2sw_ethernet_start_xmit() would return either NETDEV_TX_BUSY or
NETDEV_TX_OK, so change the return type to netdev_tx_t directly.

Signed-off-by: GUO Zihua <guozihua@huawei.com>
Link: https://lore.kernel.org/r/20220902081550.60095-1-guozihua@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-02 21:19:14 -07:00
GUO Zihua
0dbaf0fa62 net: xscale: Fix return type for implementation of ndo_start_xmit
Since Linux now supports CFI, it will be a good idea to fix mismatched
return type for implementation of hooks. Otherwise this might get
cought out by CFI and cause a panic.

eth_xmit() would return either NETDEV_TX_BUSY or NETDEV_TX_OK, so
change the return type to netdev_tx_t directly.

Signed-off-by: GUO Zihua <guozihua@huawei.com>
Link: https://lore.kernel.org/r/20220902081612.60405-1-guozihua@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-02 21:18:56 -07:00
GUO Zihua
12f7bd2522 net: broadcom: Fix return type for implementation of
Since Linux now supports CFI, it will be a good idea to fix mismatched
return type for implementation of hooks. Otherwise this might get
cought out by CFI and cause a panic.

bcm4908_enet_start_xmit() would return either NETDEV_TX_BUSY or
NETDEV_TX_OK, so change the return type to netdev_tx_t directly.

Signed-off-by: GUO Zihua <guozihua@huawei.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220902075407.52358-1-guozihua@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-02 21:18:43 -07:00
Csókás Bence
b353b241f1 net: fec: Use a spinlock to guard fep->ptp_clk_on
Mutexes cannot be taken in a non-preemptible context,
causing a panic in `fec_ptp_save_state()`. Replacing
`ptp_clk_mutex` by `tmreg_lock` fixes this.

Fixes: 6a4d7234ae9a ("net: fec: ptp: avoid register access when ipg clock is disabled")
Fixes: f79959220fa5 ("fec: Restart PPS after link state change")
Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/all/20220827160922.642zlcd5foopozru@pengutronix.de/
Signed-off-by: Csókás Bence <csokas.bence@prolan.hu>
Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Apalis iMX6
Link: https://lore.kernel.org/r/20220901140402.64804-1-csokas.bence@prolan.hu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-02 21:15:39 -07:00
Wei Fang
7d650df99d net: fec: add pm_qos support on imx6q platform
There is a very low probability that tx timeout will occur during
suspend and resume stress test on imx6q platform. So we add pm_qos
support to prevent system from entering low level idles which may
affect the transmission of tx.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://lore.kernel.org/r/20220830070148.2021947-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-02 21:12:01 -07:00
Alexei Starovoitov
0b20a133c0 Merge branch 'bpf: net: Remove duplicated code from bpf_getsockopt()'
Martin KaFai Lau says:

====================

From: Martin KaFai Lau <martin.lau@kernel.org>

The earlier commits [0] removed duplicated code from bpf_setsockopt().
This series is to remove duplicated code from bpf_getsockopt().

Unlike the setsockopt() which had already changed to take
the sockptr_t argument, the same has not been done to
getsockopt().  This is the extra step being done in this
series.

[0]: https://lore.kernel.org/all/20220817061704.4174272-1-kafai@fb.com/

v2:
- The previous v2 did not reach the list. It is a resend.
- Add comments on bpf_getsockopt() should not free
  the saved_syn (Stanislav)
- Explicitly null-terminate the tcp-cc name (Stanislav)
====================

Reviewed-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02 20:35:11 -07:00
Martin KaFai Lau
f649f992de selftest/bpf: Add test for bpf_getsockopt()
This patch removes the __bpf_getsockopt() which directly
reads the sk by using PTR_TO_BTF_ID.  Instead, the test now directly
uses the kernel bpf helper bpf_getsockopt() which supports all
the required optname now.

TCP_SAVE[D]_SYN and TCP_MAXSEG are not tested in a loop for all
the hooks and sock_ops's cb.  TCP_SAVE[D]_SYN only works
in passive connection.  TCP_MAXSEG only works when
it is setsockopt before the connection is established and
the getsockopt return value can only be tested after
the connection is established.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002937.2896904-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02 20:34:32 -07:00
Martin KaFai Lau
38566ec06f bpf: Change bpf_getsockopt(SOL_IPV6) to reuse do_ipv6_getsockopt()
This patch changes bpf_getsockopt(SOL_IPV6) to reuse
do_ipv6_getsockopt().  It removes the duplicated code from
bpf_getsockopt(SOL_IPV6).

This also makes bpf_getsockopt(SOL_IPV6) supporting the same
set of optnames as in bpf_setsockopt(SOL_IPV6).  In particular,
this adds IPV6_AUTOFLOWLABEL support to bpf_getsockopt(SOL_IPV6).

ipv6 could be compiled as a module.  Like how other code solved it
with stubs in ipv6_stubs.h, this patch adds the do_ipv6_getsockopt
to the ipv6_bpf_stub.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002931.2896218-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02 20:34:32 -07:00
Martin KaFai Lau
fd969f25fe bpf: Change bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt()
This patch changes bpf_getsockopt(SOL_IP) to reuse
do_ip_getsockopt() and remove the duplicated code.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002925.2895416-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02 20:34:32 -07:00
Martin KaFai Lau
273b7f0fb4 bpf: Change bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt()
This patch changes bpf_getsockopt(SOL_TCP) to reuse
do_tcp_getsockopt().  It removes the duplicated code from
bpf_getsockopt(SOL_TCP).

Before this patch, there were some optnames available to
bpf_setsockopt(SOL_TCP) but missing in bpf_getsockopt(SOL_TCP).
For example, TCP_NODELAY, TCP_MAXSEG, TCP_KEEPIDLE, TCP_KEEPINTVL,
and a few more.  It surprises users from time to time.  This patch
automatically closes this gap without duplicating more code.

bpf_getsockopt(TCP_SAVED_SYN) does not free the saved_syn,
so it stays in sol_tcp_sockopt().

For string name value like TCP_CONGESTION, bpf expects it
is always null terminated, so sol_tcp_sockopt() decrements
optlen by one before calling do_tcp_getsockopt() and
the 'if (optlen < saved_optlen) memset(..,0,..);'
in __bpf_getsockopt() will always do a null termination.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002918.2894511-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02 20:34:32 -07:00
Martin KaFai Lau
65ddc82d3b bpf: Change bpf_getsockopt(SOL_SOCKET) to reuse sk_getsockopt()
This patch changes bpf_getsockopt(SOL_SOCKET) to reuse
sk_getsockopt().  It removes all duplicated code from
bpf_getsockopt(SOL_SOCKET).

Before this patch, there were some optnames available to
bpf_setsockopt(SOL_SOCKET) but missing in bpf_getsockopt(SOL_SOCKET).
It surprises users from time to time.  For example, SO_REUSEADDR,
SO_KEEPALIVE, SO_RCVLOWAT, and SO_MAX_PACING_RATE.  This patch
automatically closes this gap without duplicating more code.
The only exception is SO_BINDTODEVICE because it needs to acquire a
blocking lock.  Thus, SO_BINDTODEVICE is not supported.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002912.2894040-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02 20:34:31 -07:00
Martin KaFai Lau
c2b063ca34 bpf: Embed kernel CONFIG check into the if statement in bpf_getsockopt
This patch moves the "#ifdef CONFIG_XXX" check into the "if/else"
statement itself.  The change is done for the bpf_getsockopt()
function only.  It will make the latter patches easier to follow
without the surrounding ifdef macro.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002906.2893572-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02 20:34:31 -07:00
Martin KaFai Lau
0f95f7d426 bpf: net: Avoid do_ipv6_getsockopt() taking sk lock when called from bpf
Similar to the earlier patch that changes sk_getsockopt() to
use sockopt_{lock,release}_sock() such that it can avoid taking the
lock when called from bpf.  This patch also changes do_ipv6_getsockopt()
to use sockopt_{lock,release}_sock() such that bpf_getsockopt(SOL_IPV6)
can reuse do_ipv6_getsockopt().

Although bpf_getsockopt(SOL_IPV6) currently does not support optname
that requires lock_sock(), using sockopt_{lock,release}_sock()
consistently across *_getsockopt() will make future optname addition
harder to miss the sockopt_{lock,release}_sock() usage. eg. when
adding new optname that requires a lock and the new optname is
needed in bpf_getsockopt() also.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002859.2893064-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02 20:34:31 -07:00
Martin KaFai Lau
6dadbe4bac bpf: net: Change do_ipv6_getsockopt() to take the sockptr_t argument
Similar to the earlier patch that changes sk_getsockopt() to
take the sockptr_t argument .  This patch also changes
do_ipv6_getsockopt() to take the sockptr_t argument such that
a latter patch can make bpf_getsockopt(SOL_IPV6) to reuse
do_ipv6_getsockopt().

Note on the change in ip6_mc_msfget().  This function is to
return an array of sockaddr_storage in optval.  This function
is shared between ipv6_get_msfilter() and compat_ipv6_get_msfilter().
However, the sockaddr_storage is stored at different offset of the
optval because of the difference between group_filter and
compat_group_filter.  Thus, a new 'ss_offset' argument is
added to ip6_mc_msfget().

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002853.2892532-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02 20:34:31 -07:00
Martin KaFai Lau
9c3f9707de net: Add a len argument to compat_ipv6_get_msfilter()
Pass the len to the compat_ipv6_get_msfilter() instead of
compat_ipv6_get_msfilter() getting it again from optlen.
Its counter part ipv6_get_msfilter() is also taking the
len from do_ipv6_getsockopt().

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002846.2892091-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02 20:34:31 -07:00
Martin KaFai Lau
75f2397988 net: Remove unused flags argument from do_ipv6_getsockopt
The 'unsigned int flags' argument is always 0, so it can be removed.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002840.2891763-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02 20:34:31 -07:00
Martin KaFai Lau
1985320c54 bpf: net: Avoid do_ip_getsockopt() taking sk lock when called from bpf
Similar to the earlier commit that changed sk_setsockopt() to
use sockopt_{lock,release}_sock() such that it can avoid taking
lock when called from bpf.  This patch also changes do_ip_getsockopt()
to use sockopt_{lock,release}_sock() such that a latter patch can
make bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt().

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002834.2891514-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02 20:34:31 -07:00
Martin KaFai Lau
728f064cd7 bpf: net: Change do_ip_getsockopt() to take the sockptr_t argument
Similar to the earlier patch that changes sk_getsockopt() to
take the sockptr_t argument.  This patch also changes
do_ip_getsockopt() to take the sockptr_t argument such that
a latter patch can make bpf_getsockopt(SOL_IP) to reuse
do_ip_getsockopt().

Note on the change in ip_mc_gsfget().  This function is to
return an array of sockaddr_storage in optval.  This function
is shared between ip_get_mcast_msfilter() and
compat_ip_get_mcast_msfilter().  However, the sockaddr_storage
is stored at different offset of the optval because of
the difference between group_filter and compat_group_filter.
Thus, a new 'ss_offset' argument is added to ip_mc_gsfget().

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20220902002828.2890585-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02 20:34:31 -07:00