linux

Author	SHA1	Message	Date
Sabrina Dubroca	f3e444e31f	tls: get cipher_name from cipher_desc in tls_set_sw_offload tls_cipher_desc also contains the algorithm name needed by crypto_alloc_aead, use it. Finally, use get_cipher_desc to check if the cipher_type coming from userspace is valid, and remove the cipher_type switch. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/53d021d80138aa125a9cef4468aa5ce531975a7b.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:42 -07:00
Sabrina Dubroca	48dfad27fd	tls: use tls_cipher_desc to access per-cipher crypto_info in tls_set_sw_offload The crypto_info_* helpers allow us to fetch pointers into the per-cipher crypto_info's data. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/c23af110caf0af6b68de2f86c58064913e2e902a.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:42 -07:00
Sabrina Dubroca	d9a6ca1a97	tls: use tls_cipher_desc to get per-cipher sizes in tls_set_sw_offload We can get rid of some local variables, but we have to keep nonce_size because tls1.3 uses nonce_size = 0 for all ciphers. We can also drop the runtime sanity checks on iv/rec_seq/tag size, since we have compile time checks on those values. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/deed9c4430a62c31751a72b8c03ad66ffe710717.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:42 -07:00
Sabrina Dubroca	077e05d135	tls: use tls_cipher_desc to simplify do_tls_getsockopt_conf Every cipher uses the same code to update its crypto_info struct based on the values contained in the cctx, with only the struct type and size/offset changing. We can get those from tls_cipher_desc, and use a single pair of memcpy and final copy_to_user. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/c21a904b91e972bdbbf9d1c6d2731ccfa1eedf72.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:42 -07:00
Sabrina Dubroca	5f309ade49	tls: get crypto_info size from tls_cipher_desc in do_tls_setsockopt_conf We can simplify do_tls_setsockopt_conf using tls_cipher_desc. Also use get_cipher_desc's result to check if the cipher_type coming from userspace is valid. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/e97658eb4c6a5832f8ba20a06c4f36a77763c59e.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:42 -07:00
Sabrina Dubroca	e907277aeb	tls: expand use of tls_cipher_desc in tls_sw_fallback_init tls_sw_fallback_init already gets the key and tag size from tls_cipher_desc. We can now also check that the cipher type is valid, and stop hard-coding the algorithm name passed to crypto_alloc_aead. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/c8c94b8fcafbfb558e09589c1f1ad48dbdf92f76.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:42 -07:00
Sabrina Dubroca	d2322cf5ed	tls: allocate the fallback aead after checking that the cipher is valid No need to allocate the aead if we're going to fail afterwards. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/335e32511ed55a0b30f3f81a78fa8f323b3bdf8f.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:42 -07:00
Sabrina Dubroca	3524dd4d5f	tls: expand use of tls_cipher_desc in tls_set_device_offload tls_set_device_offload is already getting iv and rec_seq sizes from tls_cipher_desc. We can now also check if the cipher_type coming from userspace is valid and can be offloaded. We can also remove the runtime check on rec_seq, since we validate it at compile time. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/8ab71b8eca856c7aaf981a45fe91ac649eb0e2e9.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:41 -07:00
Sabrina Dubroca	0d98cc0202	tls: validate cipher descriptions at compile time Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/b38fb8cf60e099e82ae9979c3c9c92421042417c.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:41 -07:00
Sabrina Dubroca	176a3f50bc	tls: extend tls_cipher_desc to fully describe the ciphers - add nonce, usually equal to iv_size but not for chacha - add offsets into the crypto_info for each field - add algorithm name - add offloadable flag Also add helpers to access each field of a crypto_info struct described by a tls_cipher_desc. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/39d5f476d63c171097764e8d38f6f158b7c109ae.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:41 -07:00
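The idea of describing each cipher by its field sizes, field offsets, algorithm name and an offloadable flag can be sketched roughly as follows. This is a standalone illustration, not the kernel's definition: `struct demo_crypto_info`, `struct demo_cipher_desc`, the `DEMO_FIELD_SIZE` macro and the `demo_crypto_info_*` helpers are all hypothetical names, and the field widths are made up.

```c
#include <stddef.h>

/* Hypothetical per-cipher crypto_info layout, for illustration only. */
struct demo_crypto_info {
	unsigned short version;
	unsigned short cipher_type;
	unsigned char iv[8];
	unsigned char key[16];
	unsigned char salt[4];
	unsigned char rec_seq[8];
};

/* Hypothetical descriptor: sizes, offsets, algorithm name, offload flag. */
struct demo_cipher_desc {
	unsigned int nonce;
	unsigned int iv;
	unsigned int key;
	unsigned int salt;
	unsigned int rec_seq;
	unsigned int iv_offset;
	unsigned int key_offset;
	unsigned int salt_offset;
	unsigned int rec_seq_offset;
	const char *cipher_name;
	int offloadable;
};

/* Size of a struct member without needing an instance. */
#define DEMO_FIELD_SIZE(type, field) sizeof(((type *)0)->field)

static const struct demo_cipher_desc demo_desc = {
	/* Assumed equal to the IV size here; the log notes chacha differs. */
	.nonce = DEMO_FIELD_SIZE(struct demo_crypto_info, iv),
	.iv = DEMO_FIELD_SIZE(struct demo_crypto_info, iv),
	.key = DEMO_FIELD_SIZE(struct demo_crypto_info, key),
	.salt = DEMO_FIELD_SIZE(struct demo_crypto_info, salt),
	.rec_seq = DEMO_FIELD_SIZE(struct demo_crypto_info, rec_seq),
	.iv_offset = offsetof(struct demo_crypto_info, iv),
	.key_offset = offsetof(struct demo_crypto_info, key),
	.salt_offset = offsetof(struct demo_crypto_info, salt),
	.rec_seq_offset = offsetof(struct demo_crypto_info, rec_seq),
	.cipher_name = "gcm(aes)", /* example algorithm name */
	.offloadable = 1,
};

/* Generic accessors: one helper per field, driven by the descriptor. */
static inline unsigned char *demo_crypto_info_key(void *info,
						  const struct demo_cipher_desc *desc)
{
	return (unsigned char *)info + desc->key_offset;
}

static inline unsigned char *demo_crypto_info_rec_seq(void *info,
						      const struct demo_cipher_desc *desc)
{
	return (unsigned char *)info + desc->rec_seq_offset;
}
```

With a table of such descriptors, code that previously needed one branch per cipher can plausibly be written once against the descriptor, which is the direction the later commits in this series take.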
Sabrina Dubroca	8db44ab26b	tls: rename tls_cipher_size_desc to tls_cipher_desc We're going to add other fields to it to fully describe a cipher, so the "_size" name won't match the contents. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/76ca6c7686bd6d1534dfa188fb0f1f6fabebc791.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:41 -07:00
Sabrina Dubroca	037303d676	tls: reduce size of tls_cipher_size_desc tls_cipher_size_desc indexes ciphers by their type, but we're not using indices 0..50 of the array. Each struct tls_cipher_size_desc is 20B, so that's a lot of unused memory. We can reindex the array starting at the lowest used cipher_type. Introduce the get_cipher_size_desc helper to find the right item and avoid out-of-bounds accesses, and make tls_cipher_size_desc's size explicit so that gcc reminds us to update TLS_CIPHER_MIN/MAX when we add a new cipher. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/5e054e370e240247a5d37881a1cd93a67c15f4ca.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:41 -07:00
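A minimal sketch of the reindexing described in this commit, assuming three made-up cipher types numbered from 51 (so that indices 0..50 would be unused in a type-indexed array). The `DEMO_*` constants, `struct demo_cipher_size_desc` and `demo_get_cipher_size_desc()` are hypothetical stand-ins, not the kernel's symbols, and the sizes are illustrative.

```c
#include <stddef.h>

/* Hypothetical cipher type values; only the "starts at 51" part mirrors the log. */
enum {
	DEMO_CIPHER_A = 51,
	DEMO_CIPHER_B = 52,
	DEMO_CIPHER_C = 53,
};

#define DEMO_CIPHER_MIN DEMO_CIPHER_A
#define DEMO_CIPHER_MAX DEMO_CIPHER_C

struct demo_cipher_size_desc {
	unsigned int iv;
	unsigned int key;
	unsigned int tag;
};

/*
 * The array size is explicit, so adding an entry beyond DEMO_CIPHER_MAX
 * without updating the bounds makes the compiler complain.
 */
static const struct demo_cipher_size_desc
demo_cipher_size_desc[DEMO_CIPHER_MAX + 1 - DEMO_CIPHER_MIN] = {
	[DEMO_CIPHER_A - DEMO_CIPHER_MIN] = { .iv = 8, .key = 16, .tag = 16 },
	[DEMO_CIPHER_B - DEMO_CIPHER_MIN] = { .iv = 8, .key = 32, .tag = 16 },
	[DEMO_CIPHER_C - DEMO_CIPHER_MIN] = { .iv = 12, .key = 32, .tag = 16 },
};

/* Bounds-checked lookup: returns NULL for types outside [MIN, MAX]. */
static inline const struct demo_cipher_size_desc *
demo_get_cipher_size_desc(unsigned int cipher_type)
{
	if (cipher_type < DEMO_CIPHER_MIN || cipher_type > DEMO_CIPHER_MAX)
		return NULL;

	return &demo_cipher_size_desc[cipher_type - DEMO_CIPHER_MIN];
}
```

Callers would check the returned pointer for NULL, which doubles as validation of a cipher type supplied by userspace.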
Sabrina Dubroca	200e231651	tls: add TLS_CIPHER_ARIA_GCM_* to tls_cipher_size_desc Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/b2e0fb79e6d0a4478be9bf33781dc9c9281c9d56.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:41 -07:00
Sabrina Dubroca	fd0fc6fdd8	tls: move tls_cipher_size_desc to net/tls/tls.h It's only used in net/tls/*, no need to bloat include/net/tls.h. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/dd9fad80415e5b3575b41f56b331871038362eab.1692977948.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:17:41 -07:00
Dima Chumak	390a24cbc3	devlink: Expose port function commands to control IPsec packet offloads Expose port function commands to enable / disable IPsec packet offloads, this is used to control the port IPsec capabilities. When IPsec packet is disabled for a function of the port (default), function cannot offload IPsec packet operations (encapsulation and XFRM policy offload). When enabled, IPsec packet operations can be offloaded by the function of the port, which includes crypto operation (Encrypt/Decrypt), IPsec encapsulation and XFRM state and policy offload. Example of a PCI VF port which supports IPsec packet offloads: $ devlink port show pci/0000:06:00.0/1 pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0 function: hw_addr 00:00:00:00:00:00 roce enable ipsec_packet disable $ devlink port function set pci/0000:06:00.0/1 ipsec_packet enable $ devlink port show pci/0000:06:00.0/1 pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0 function: hw_addr 00:00:00:00:00:00 roce enable ipsec_packet enable Signed-off-by: Dima Chumak <dchumak@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20230825062836.103744-3-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:08:45 -07:00
Dima Chumak	62b6442c58	devlink: Expose port function commands to control IPsec crypto offloads Expose port function commands to enable / disable IPsec crypto offloads, this is used to control the port IPsec capabilities. When IPsec crypto is disabled for a function of the port (default), function cannot offload any IPsec crypto operations (Encrypt/Decrypt and XFRM state offloading). When enabled, IPsec crypto operations can be offloaded by the function of the port. Example of a PCI VF port which supports IPsec crypto offloads: $ devlink port show pci/0000:06:00.0/1 pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0 function: hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto disable $ devlink port function set pci/0000:06:00.0/1 ipsec_crypto enable $ devlink port show pci/0000:06:00.0/1 pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0 function: hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto enable Signed-off-by: Dima Chumak <dchumak@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20230825062836.103744-2-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-27 17:08:44 -07:00
Budimir Markovic	b3d26c5702	net/sched: sch_hfsc: Ensure inner classes have fsc curve HFSC assumes that inner classes have an fsc curve, but it is currently possible for classes without an fsc curve to become parents. This leads to bugs including a use-after-free. Don't allow non-root classes without HFSC_FSC to become parents. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Budimir Markovic <markovicbudimir@gmail.com> Signed-off-by: Budimir Markovic <markovicbudimir@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20230824084905.422-1-markovicbudimir@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-25 18:57:54 -07:00
Jakub Kicinski	bebfbf07c7	bpf-next-for-netdev -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZOjkTAAKCRDbK58LschI gx32AP9gaaHFBtOYBfoenKTJfMgv1WhtQHIBas+WN9ItmBx9MAEA4gm/VyQ6oD7O EBjJKJQ2CZ/QKw7cNacXw+l5jF7/+Q0= =8P7g -----END PGP SIGNATURE----- Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-08-25 We've added 87 non-merge commits during the last 8 day(s) which contain a total of 104 files changed, 3719 insertions(+), 4212 deletions(-). The main changes are: 1) Add multi uprobe BPF links for attaching multiple uprobes and usdt probes, which is significantly faster and saves extra fds, from Jiri Olsa. 2) Add support BPF cpu v4 instructions for arm64 JIT compiler, from Xu Kuohai. 3) Add support BPF cpu v4 instructions for riscv64 JIT compiler, from Pu Lehui. 4) Fix LWT BPF xmit hooks wrt their return values where propagating the result from skb_do_redirect() would trigger a use-after-free, from Yan Zhai. 5) Fix a BPF verifier issue related to bpf_kptr_xchg() with local kptr where the map's value kptr type and locally allocated obj type mismatch, from Yonghong Song. 6) Fix BPF verifier's check_func_arg_reg_off() function wrt graph root/node which bypassed reg->off == 0 enforcement, from Kumar Kartikeya Dwivedi. 7) Lift BPF verifier restriction in networking BPF programs to treat comparison of packet pointers not as a pointer leak, from Yafang Shao. 8) Remove unmaintained XDP BPF samples as they are maintained in xdp-tools repository out of tree, from Toke Høiland-Jørgensen. 9) Batch of fixes for the tracing programs from BPF samples in order to make them more libbpf-aware, from Daniel T. Lee. 10) Fix a libbpf signedness determination bug in the CO-RE relocation handling logic, from Andrii Nakryiko. 11) Extend libbpf to support CO-RE kfunc relocations. 
Also follow-up fixes for bpf_refcount shared ownership implementation, both from Dave Marchevsky. 12) Add a new bpf_object__unpin() API function to libbpf, from Daniel Xu. 13) Fix a memory leak in libbpf to also free btf_vmlinux when the bpf_object gets closed, from Hao Luo. 14) Small error output improvements to test_bpf module, from Helge Deller. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (87 commits) selftests/bpf: Add tests for rbtree API interaction in sleepable progs bpf: Allow bpf_spin_{lock,unlock} in sleepable progs bpf: Consider non-owning refs to refcounted nodes RCU protected bpf: Reenable bpf_refcount_acquire bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes bpf: Consider non-owning refs trusted bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire selftests/bpf: Enable cpu v4 tests for RV64 riscv, bpf: Support unconditional bswap insn riscv, bpf: Support signed div/mod insns riscv, bpf: Support 32-bit offset jmp insn riscv, bpf: Support sign-extension mov insns riscv, bpf: Support sign-extension load insns riscv, bpf: Fix missing exception handling and redundant zext for LDX_B/H/W samples/bpf: Add note to README about the XDP utilities moved to xdp-tools samples/bpf: Cleanup .gitignore samples/bpf: Remove the xdp_sample_pkts utility samples/bpf: Remove the xdp1 and xdp2 utilities samples/bpf: Remove the xdp_rxq_info utility samples/bpf: Remove the xdp_redirect* utilities ... ==================== Link: https://lore.kernel.org/r/20230825194319.12727-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-25 18:40:15 -07:00
Jakub Kicinski	1fa6ffad12	wireless-next patches for v6.6 The second pull request for v6.6, this time with both stack and driver changes. Unusually we have only one major new feature but lots of small cleanup all over, I guess this is due to people have been on vacation the last month. Major changes: rtw89 * Introduce Time Averaged SAR (TAS) support -----BEGIN PGP SIGNATURE----- iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmToqosRHGt2YWxvQGtl cm5lbC5vcmcACgkQbhckVSbrbZv9XQf9HDq9smbuWLvwzNjbbS31hHFLmnfhN8Zp +Zzn47gpMCle9ahGLQyw8lcfNPWCMyqOu4sGQ6hyyuH+YXoxZryuq9QDwWo9L/b1 5Cpm4IaBYBMm0ZoOkWw2lQSzGyNrXgvCEKRVC+pYQMvr5V2aEWxT/kT4guiou9D5 OXPRFN2iqZP0Q3TKcfKWRnWn3S0Ok3kZCFuXcWkL0sgwjqP/wbAPO1XNI1IImKNM xUd0zT4vK/layYq7i20y8blglI5kcp/aKCFEwYpQC2WPeZ3Wtl1G9PQ8eze5Gc2Q NTw3xfr6tENIcAmYoLdBdKbUq6e6pwLwXlojlZ2beR6s7LHM30AinQ== =2Hja -----END PGP SIGNATURE----- Merge tag 'wireless-next-2023-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.6 The second pull request for v6.6, this time with both stack and driver changes. Unusually we have only one major new feature but lots of small cleanup all over, I guess this is due to people have been on vacation the last month. 
Major changes: rtw89 - Introduce Time Averaged SAR (TAS) support * tag 'wireless-next-2023-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (114 commits) wifi: rtlwifi: rtl8723: Remove unused function rtl8723_cmd_send_packet() wifi: rtw88: usb: kill and free rx urbs on probe failure wifi: rtw89: Fix clang -Wimplicit-fallthrough in rtw89_query_sar() wifi: rtw89: phy: modify register setting of ENV_MNTR, PHYSTS and DIG wifi: rtw89: phy: add phy_gen_def::cr_base to support WiFi 7 chips wifi: rtw89: mac: define register address of rx_filter to generalize code wifi: rtw89: mac: define internal memory address for WiFi 7 chip wifi: rtw89: mac: generalize code to indirectly access WiFi internal memory wifi: rtw89: mac: add mac_gen_def::band1_offset to map MAC band1 register address wifi: wlcore: sdio: Use module_sdio_driver macro to simplify the code wifi: rtw89: initialize multi-channel handling wifi: rtw89: provide functions to configure NoA for beacon update wifi: rtw89: call rtw89_chan_get() by vif chanctx if aware of vif wifi: rtw89: sar: let caller decide the center frequency to query wifi: rtw89: refine rtw89_correct_cck_chan() by rtw89_hw_to_nl80211_band() wifi: rtw89: add function prototype for coex request duration Fix nomenclature for USB and PCI wireless devices wifi: ath: Use is_multicast_ether_addr() to check multicast Ether address wifi: ath12k: Remove unused declarations wifi: ath12k: add check max message length while scanning with extraie ... ==================== Link: https://lore.kernel.org/r/20230825132230.A0833C433C8@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-25 18:35:09 -07:00
Matthew Wilcox (Oracle)	f9bff0e318	minmax: add in_range() macro Patch series "New page table range API", v6. This patchset changes the API used by the MM to set up page table entries. The four APIs are: set_ptes(mm, addr, ptep, pte, nr) update_mmu_cache_range(vma, addr, ptep, nr) flush_dcache_folio(folio) flush_icache_pages(vma, page, nr) flush_dcache_folio() isn't technically new, but no architecture implemented it, so I've done that for them. The old APIs remain around but are mostly implemented by calling the new interfaces. The new APIs are based around setting up N page table entries at once. The N entries belong to the same PMD, the same folio and the same VMA, so ptep++ is a legitimate operation, and locking is taken care of for you. Some architectures can do a better job of it than just a loop, but I have hesitated to make too deep a change to architectures I don't understand well. One thing I have changed in every architecture is that PG_arch_1 is now a per-folio bit instead of a per-page bit when used for dcache clean/dirty tracking. This was something that would have to happen eventually, and it makes sense to do it now rather than iterate over every page involved in a cache flush and figure out if it needs to happen. The point of all this is better performance, and Fengwei Yin has measured improvement on x86. I suspect you'll see improvement on your architecture too. Try the new will-it-scale test mentioned here: https://lore.kernel.org/linux-mm/20230206140639.538867-5-fengwei.yin@intel.com/ You'll need to run it on an XFS filesystem and have CONFIG_TRANSPARENT_HUGEPAGE set. This patchset is the basis for much of the anonymous large folio work being done by Ryan, so it's received quite a lot of testing over the last few months. This patch (of 38): Determine if a value lies within a range more efficiently (subtraction + comparison vs two comparisons and an AND). 
It also has useful (under some circumstances) behaviour if the range exceeds the maximum value of the type. Convert all the conflicting definitions of in_range() within the kernel; some can use the generic definition while others need their own definition. Link: https://lkml.kernel.org/r/20230802151406.3735276-1-willy@infradead.org Link: https://lkml.kernel.org/r/20230802151406.3735276-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-08-24 16:20:18 -07:00
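The "subtraction + comparison" trick mentioned in this commit can be illustrated with a small standalone sketch. It assumes the range is given as a start and a length and uses unsigned 64-bit arithmetic; `demo_in_range()` and `demo_in_range_naive()` are hypothetical names, not the kernel macro.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * One subtraction and one comparison: if val < start, the unsigned
 * subtraction wraps to a huge value, which fails the "< len" test.
 */
static inline bool demo_in_range(uint64_t val, uint64_t start, uint64_t len)
{
	return (val - start) < len;
}

/* The two-comparison form, for contrast; start + len may overflow. */
static inline bool demo_in_range_naive(uint64_t val, uint64_t start, uint64_t len)
{
	return val >= start && val < start + len;
}
```

The two forms agree while `start + len` does not overflow. When it does, for example `start = UINT64_MAX - 1` and `len = 4`, the subtraction form still accepts `UINT64_MAX` while the two-comparison form rejects it, which may be the kind of behaviour the commit message alludes to.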
Luiz Augusto von Dentz	253f3399f4	Bluetooth: HCI: Introduce HCI_QUIRK_BROKEN_LE_CODED This introduces HCI_QUIRK_BROKEN_LE_CODED which is used to indicate that LE Coded PHY shall not be used, it is then set for some Intel models that claim to support it but when used causes many problems. Cc: stable@vger.kernel.org # 6.4.y+ Link: https://github.com/bluez/bluez/issues/577 Link: https://github.com/bluez/bluez/issues/582 Link: https://lore.kernel.org/linux-bluetooth/CABBYNZKco-v7wkjHHexxQbgwwSz-S=GZ=dZKbRE1qxT1h4fFbQ@mail.gmail.com/T/# Fixes: 288c90224eec ("Bluetooth: Enable all supported LE PHY by default") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2023-08-24 12:23:46 -07:00
Claudia Draghicescu	9c0826310b	Bluetooth: ISO: Add support for periodic adv reports processing In the case of a Periodic Synchronized Receiver, the PA report received from a Broadcaster contains the BASE, which has information about codec and other parameters of a BIG. This information is stored and the application can retrieve it using getsockopt(BT_ISO_BASE). Signed-off-by: Claudia Draghicescu <claudia.rosu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2023-08-24 12:22:56 -07:00
Pauli Virtanen	3344d31833	Bluetooth: hci_conn: fail SCO/ISO via hci_conn_failed if ACL gone early Not calling hci_(dis)connect_cfm before deleting a conn referred to by a socket generally results in a use-after-free. When cleaning up SCO connections when the parent ACL is deleted too early, use hci_conn_failed to do the connection cleanup properly. We also need to clean up ISO connections in a similar situation when connecting has started but LE Create CIS is not yet sent, so do it too here. Fixes: ca1fd42e7dbf ("Bluetooth: Fix potential double free caused by hci_conn_unlink") Reported-by: syzbot+cf54c1da6574b6c1b049@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-bluetooth/00000000000013b93805fbbadc50@google.com/ Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2023-08-24 12:22:35 -07:00
Luiz Augusto von Dentz	db08722fc7	Bluetooth: hci_core: Fix missing instances using HCI_MAX_AD_LENGTH There are a few instances still using HCI_MAX_AD_LENGTH instead of max_adv_len, which detects the actual maximum length depending on whether the controller supports EA. Fixes: 112b5090c219 ("Bluetooth: MGMT: Fix always using HCI_MAX_AD_LENGTH") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2023-08-24 12:22:05 -07:00
Iulia Tanasescu	fbdc4bc472	Bluetooth: ISO: Use defer setup to separate PA sync and BIG sync This commit implements defer setup support for the Broadcast Sink scenario: By setting defer setup on a broadcast socket before calling listen, the user is able to trigger the PA sync and BIG sync procedures separately. This is useful if the user first wants to synchronize to the periodic advertising transmitted by a Broadcast Source, and trigger the BIG sync procedure later on. If defer setup is set, once a PA sync established event arrives, a new hcon is created and notified to the ISO layer. A child socket associated with the PA sync connection will be added to the accept queue of the listening socket. Once the accept call returns the fd for the PA sync child socket, the user should call read on that fd. This will trigger the BIG create sync procedure, and the PA sync socket will become a listening socket itself. When the BIG sync established event is notified to the ISO layer, the bis connections will be added to the accept queue of the PA sync parent. The user should call accept on the PA sync socket to get the final bis connections. Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2023-08-24 12:21:35 -07:00
Luiz Augusto von Dentz	3a15324fd4	Bluetooth: hci_conn: Fix sending BT_HCI_CMD_LE_CREATE_CONN_CANCEL This fixes sending BT_HCI_CMD_LE_CREATE_CONN_CANCEL when hci_le_create_conn_sync has not been called because HCI_CONN_SCANNING has been cleared too early, before its cmd_sync callback has run. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2023-08-24 12:20:16 -07:00
Luiz Augusto von Dentz	94d9ba9f98	Bluetooth: hci_sync: Fix UAF in hci_disconnect_all_sync Use-after-free can occur in hci_disconnect_all_sync if a connection is deleted by concurrent processing of a controller event. To prevent this, the code now tries to iterate over the list backwards to ensure the links are cleaned up before their parents. It also no longer relies on a cursor; instead it always uses the last element, since hci_abort_conn_sync is guaranteed to call hci_conn_del. UAF crash log: ================================================================== BUG: KASAN: slab-use-after-free in hci_set_powered_sync (net/bluetooth/hci_sync.c:5424) [bluetooth] Read of size 8 at addr ffff888009d9c000 by task kworker/u9:0/124 CPU: 0 PID: 124 Comm: kworker/u9:0 Tainted: G W 6.5.0-rc1+ #10 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014 Workqueue: hci0 hci_cmd_sync_work [bluetooth] Call Trace: <TASK> dump_stack_lvl+0x5b/0x90 print_report+0xcf/0x670 ? __virt_addr_valid+0xdd/0x160 ? hci_set_powered_sync+0x2c9/0x4a0 [bluetooth] kasan_report+0xa6/0xe0 ? hci_set_powered_sync+0x2c9/0x4a0 [bluetooth] ? __pfx_set_powered_sync+0x10/0x10 [bluetooth] hci_set_powered_sync+0x2c9/0x4a0 [bluetooth] ? __pfx_hci_set_powered_sync+0x10/0x10 [bluetooth] ? __pfx_lock_release+0x10/0x10 ? __pfx_set_powered_sync+0x10/0x10 [bluetooth] hci_cmd_sync_work+0x137/0x220 [bluetooth] process_one_work+0x526/0x9d0 ? __pfx_process_one_work+0x10/0x10 ? __pfx_do_raw_spin_lock+0x10/0x10 ? mark_held_locks+0x1a/0x90 worker_thread+0x92/0x630 ? __pfx_worker_thread+0x10/0x10 kthread+0x196/0x1e0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2c/0x50 </TASK> Allocated by task 1782: kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30 __kasan_kmalloc+0x8f/0xa0 hci_conn_add+0xa5/0xa80 [bluetooth] hci_bind_cis+0x881/0x9b0 [bluetooth] iso_connect_cis+0x121/0x520 [bluetooth] iso_sock_connect+0x3f6/0x790 [bluetooth] __sys_connect+0x109/0x130 __x64_sys_connect+0x40/0x50 do_syscall_64+0x60/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Freed by task 695: kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30 kasan_save_free_info+0x2b/0x50 __kasan_slab_free+0x10a/0x180 __kmem_cache_free+0x14d/0x2e0 device_release+0x5d/0xf0 kobject_put+0xdf/0x270 hci_disconn_complete_evt+0x274/0x3a0 [bluetooth] hci_event_packet+0x579/0x7e0 [bluetooth] hci_rx_work+0x287/0xaa0 [bluetooth] process_one_work+0x526/0x9d0 worker_thread+0x92/0x630 kthread+0x196/0x1e0 ret_from_fork+0x2c/0x50 ================================================================== Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2023-08-24 12:19:55 -07:00
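The "no cursor, always take the last element" pattern from the hci_disconnect_all_sync fix can be sketched in isolation. This is a simplified model rather than the Bluetooth code: the list type and every `demo_*` function are hypothetical, and it assumes that children are appended after their parents and that aborting a connection always unlinks it.

```c
#include <stddef.h>

struct demo_conn {
	int handle;
	struct demo_conn *prev, *next;
};

struct demo_conn_list {
	struct demo_conn *head, *tail;
};

/* Append at the tail, so later (child) links sit after their parents. */
static inline void demo_conn_add(struct demo_conn_list *l, struct demo_conn *c)
{
	c->next = NULL;
	c->prev = l->tail;
	if (l->tail)
		l->tail->next = c;
	else
		l->head = c;
	l->tail = c;
}

static inline void demo_conn_del(struct demo_conn_list *l, struct demo_conn *c)
{
	if (c->prev)
		c->prev->next = c->next;
	else
		l->head = c->next;
	if (c->next)
		c->next->prev = c->prev;
	else
		l->tail = c->prev;
	c->prev = c->next = NULL;
}

/* Stand-in for the abort step: assumed to always unlink the connection. */
static inline void demo_abort_conn(struct demo_conn_list *l, struct demo_conn *c)
{
	demo_conn_del(l, c);
}

/*
 * No saved cursor: re-read the tail on every pass, so an entry removed
 * elsewhere between passes is never dereferenced through a stale pointer.
 */
static inline int demo_disconnect_all(struct demo_conn_list *l)
{
	int aborted = 0;

	while (l->tail) {
		demo_abort_conn(l, l->tail);
		aborted++;
	}
	return aborted;
}
```

The loop terminates only because the abort step is guaranteed to remove the element; without that guarantee the same entry would be picked again on each pass.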
Thomas Weißschuh	5d21d0a65b	net: generalize calculation of skb extensions length Remove the necessity to modify skb_ext_total_length() when new extension types are added. Also reduces the line count a bit. With optimizations enabled the function is folded down to the same constant value as before during compilation. This has been validated on x86 with GCC 6.5.0 and 13.2.1. Also a similar construct has been validated on godbolt.org with GCC 5.1. In any case the compiler has to be able to evaluate the construct at compile-time for the BUILD_BUG_ON() in skb_extensions_init(). Even if not evaluated at compile-time this function would only ever be executed once at run-time, so the overhead would be very minuscule. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230823-skb_ext-simplify-v2-1-66e26cd66860@weissschuh.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-24 11:24:30 -07:00
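A rough sketch of the table-plus-loop shape this commit describes, where adding an extension type only requires a new table entry. The names and lengths below are hypothetical; in the kernel the result also feeds a BUILD_BUG_ON(), which this portable sketch does not attempt to reproduce.

```c
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical fixed header length, in chunks. */
#define DEMO_EXT_HEADER_LEN 1

/* Hypothetical per-extension lengths; a new extension adds one entry. */
static const unsigned char demo_ext_type_len[] = { 2, 3, 4 };

/*
 * Sum the header and every extension length. With optimizations enabled
 * a compiler can typically fold this loop into a constant, since the
 * table is const and its size is known at compile time.
 */
static inline unsigned int demo_ext_total_length(void)
{
	unsigned int len = DEMO_EXT_HEADER_LEN;
	size_t i;

	for (i = 0; i < DEMO_ARRAY_SIZE(demo_ext_type_len); i++)
		len += demo_ext_type_len[i];

	return len;
}
```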
Jakub Kicinski	57ce6427e0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: include/net/inet_sock.h f866fbc842de ("ipv4: fix data-races around inet->inet_id") c274af224269 ("inet: introduce inet->inet_flags") https://lore.kernel.org/all/679ddff6-db6e-4ff6-b177-574e90d0103d@tessares.net/ Adjacent changes: drivers/net/bonding/bond_alb.c e74216b8def3 ("bonding: fix macvlan over alb bond support") f11e5bd159b0 ("bonding: support balance-alb with openvswitch") drivers/net/ethernet/broadcom/bgmac.c d6499f0b7c7c ("net: bgmac: Return PTR_ERR() for fixed_phy_register()") 23a14488ea58 ("net: bgmac: Fix return value check for fixed_phy_register()") drivers/net/ethernet/broadcom/genet/bcmmii.c 32bbe64a1386 ("net: bcmgenet: Fix return value check for fixed_phy_register()") acf50d1adbf4 ("net: bcmgenet: Return PTR_ERR() for fixed_phy_register()") net/sctp/socket.c f866fbc842de ("ipv4: fix data-races around inet->inet_id") b09bde5c3554 ("inet: move inet->mc_loop to inet->inet_frags") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-24 10:51:39 -07:00
Trond Myklebust	cd18f24085	SUNRPC: Don't override connect timeouts in rpc_clnt_add_xprt() If the caller specifies the connect timeouts in the arguments to rpc_clnt_add_xprt(), then we shouldn't override them. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2023-08-24 13:24:15 -04:00
Trond Myklebust	d2ee413884	SUNRPC: Allow specification of TCP client connect timeout at setup When we create a TCP transport, the connect timeout parameters are currently fixed to be 90s. This is problematic in the pNFS flexfiles case, where we may have multiple mirrors, and we would like to fail over quickly to the next mirror if a data server is down. This patch adds the ability to specify the connection parameters at RPC client creation time. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2023-08-24 13:24:15 -04:00
Trond Myklebust	3e6ff89d2e	SUNRPC: Refactor and simplify connect timeout Instead of requiring the requests to redrive the connection several times, just let the TCP connect code manage it now that we've adjusted the TCP_SYNCNT value. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2023-08-24 13:24:15 -04:00
Trond Myklebust	3a107f0740	SUNRPC: Set the TCP_SYNCNT to match the socket timeout Set the TCP SYN count so that we abort the connection attempt at around the expected timeout value. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2023-08-24 13:24:15 -04:00
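To make the relationship between a connect timeout and a SYN retry count concrete, here is a userspace sketch. It is not the calculation used by the SUNRPC code: it assumes a 1 second initial retransmission timeout that doubles on every retry with no upper cap, and an allowed retry range of 1 to 127. `demo_syn_retries_for_timeout()` and `demo_set_syncnt()` are hypothetical helpers; `TCP_SYNCNT` is the Linux socket option documented in tcp(7).

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#define DEMO_MAX_SYNCNT 127 /* assumed upper bound for TCP_SYNCNT */

/*
 * Smallest retry count whose cumulative backoff reaches the timeout.
 * With one retry the attempt is abandoned after 1s + 2s = 3s; each
 * further retry adds the next doubled interval (4s, 8s, ...).
 */
static inline int demo_syn_retries_for_timeout(unsigned int timeout_secs)
{
	unsigned long long total = 3; /* give-up time with one retry */
	unsigned long long next = 4;  /* interval added by the next retry */
	int retries = 1;

	while (total < timeout_secs && retries < DEMO_MAX_SYNCNT) {
		total += next;
		next *= 2;
		retries++;
	}
	return retries;
}

/* Apply the computed count to a TCP socket; returns setsockopt()'s result. */
static inline int demo_set_syncnt(int fd, unsigned int timeout_secs)
{
	int syncnt = demo_syn_retries_for_timeout(timeout_secs);

	return setsockopt(fd, IPPROTO_TCP, TCP_SYNCNT, &syncnt, sizeof(syncnt));
}
```

Under these assumptions a 90 second timeout maps to 6 retries, since 5 retries would give up after 63 seconds and 6 retries after 127 seconds.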
Linus Torvalds	b5cc3833f1	Networking fixes for 6.5-rc8, including fixes from wifi, can and netfilter Fixes to fixes: - nf_tables: - GC transaction race with abort path - defer gc run if previous batch is still pending Previous releases - regressions: - ipv4: fix data-races around inet->inet_id - phy: fix deadlocking in phy_error() invocation - mdio: fix C45 read/write protocol - ipvlan: fix a reference count leak warning in ipvlan_ns_exit() - ice: fix NULL pointer deref during VF reset - i40e: fix potential NULL pointer dereferencing of pf->vf in i40e_sync_vsi_filters() - tg3: use slab_build_skb() when needed - mtk_eth_soc: fix NULL pointer on hw reset Previous releases - always broken: - core: validate veth and vxcan peer ifindexes - sched: fix a qdisc modification with ambiguous command request - devlink: add missing unregister linecard notification - wifi: mac80211: limit reorder_buf_filtered to avoid UBSAN warning - batman: - do not get eth header before batadv_check_management_packet - fix batadv_v_ogm_aggr_send memory leak - bonding: fix macvlan over alb bond support - mlxsw: set time stamp fields also when its type is MIRROR_UTC Signed-off-by: Paolo Abeni <pabeni@redhat.com> Merge tag 'net-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni. * tag 'net-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits) selftests: bonding: add macvlan over bond testing selftest: bond: add new topo bond_topo_2d1c.sh bonding: fix macvlan over alb bond support rtnetlink: Reject negative ifindexes in RTM_NEWLINK netfilter: nf_tables: defer gc run if previous batch is still pending netfilter: nf_tables: fix out of memory error handling netfilter: nf_tables: use correct lock to protect gc_list netfilter: nf_tables: GC transaction race with abort path netfilter: nf_tables: flush pending destroy work before netlink notifier netfilter: nf_tables: validate all pending tables ibmveth: Use dcbf rather than dcbfl i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters() net/sched: fix a qdisc modification with ambiguous command request igc: Fix the typo in the PTM Control macro batman-adv: Hold rtnl lock during MTU update via netlink igb: Avoid starting unnecessary workqueues can: raw: add missing refcount for memory leak fix can: isotp: fix support for transmission of SF without flow control bnx2x: new flag for track HW resource allocation sfc: allocate a big enough SKB for loopback selftest packet ...	2023-08-24 08:23:13 -07:00
Herbert Xu	e6a28d6303	libceph: do not include crypto/algapi.h The header file crypto/algapi.h is for internal use only. Use the header file crypto/utils.h instead. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2023-08-24 11:24:37 +02:00
Jeff Layton	4e8c4c2355	libceph: allow ceph_osdc_new_request to accept a multi-op read Currently we have some special-casing for multi-op writes, but in the case of a read, we can't really handle it. All of the current multi-op callers call it with CEPH_OSD_FLAG_WRITE set. Have ceph_osdc_new_request check for CEPH_OSD_FLAG_READ and if it's set, allocate multiple reply ops instead of multiple request ops. If neither flag is set, return -EINVAL. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-and-tested-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2023-08-24 11:24:35 +02:00
Jeff Layton	69dd3b3930	libceph: add CEPH_OSD_OP_ASSERT_VER support ...and record the user_version in the reply in a new field in ceph_osd_request, so we can populate the assert_ver appropriately. Shuffle the fields a bit too so that the new field fits in an existing hole on x86_64. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-and-tested-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Milind Changire <mchangir@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2023-08-24 11:24:35 +02:00
Paolo Abeni	8938fc0c7e	netfilter pull request 2023-08-23 Merge tag 'nf-23-08-23' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Florian Westphal says: ==================== netfilter updates for net This PR contains nf_tables updates for your net tree. First patch fixes table validation, I broke this in 6.4 when tracking validation state per table, reported by Pablo, fixup from myself. Second patch makes sure objects waiting for memory release have been released, this was broken in 6.1, patch from Pablo Neira Ayuso. Patch three is a fix-for-fix from previous PR: In case a transaction gets aborted, gc sequence counter needs to be incremented so pending gc requests are invalidated, from Pablo. Same for patch 4: gc list needs to use gc list lock, not destroy lock, also from Pablo. Patch 5 fixes a UaF in a set backend, but this should only occur when failslab is enabled for GFP_KERNEL allocations, broken since feature was added in 5.6, from myself. Patch 6 fixes a double-free bug that was also added via previous PR: We must not schedule gc work if the previous batch is still queued. * tag 'nf-23-08-23' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: defer gc run if previous batch is still pending netfilter: nf_tables: fix out of memory error handling netfilter: nf_tables: use correct lock to protect gc_list netfilter: nf_tables: GC transaction race with abort path netfilter: nf_tables: flush pending destroy work before netlink notifier netfilter: nf_tables: validate all pending tables ==================== Link: https://lore.kernel.org/r/20230823152711.15279-1-fw@strlen.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-08-24 10:33:22 +02:00
Ido Schimmel	30188bd783	rtnetlink: Reject negative ifindexes in RTM_NEWLINK Negative ifindexes are illegal, but the kernel does not validate the ifindex in the ancillary header of RTM_NEWLINK messages, resulting in the kernel generating a warning [1] when such an ifindex is specified. Fix by rejecting negative ifindexes. [1] WARNING: CPU: 0 PID: 5031 at net/core/dev.c:9593 dev_index_reserve+0x1a2/0x1c0 net/core/dev.c:9593 [...] Call Trace: <TASK> register_netdevice+0x69a/0x1490 net/core/dev.c:10081 br_dev_newlink+0x27/0x110 net/bridge/br_netlink.c:1552 rtnl_newlink_create net/core/rtnetlink.c:3471 [inline] __rtnl_newlink+0x115e/0x18c0 net/core/rtnetlink.c:3688 rtnl_newlink+0x67/0xa0 net/core/rtnetlink.c:3701 rtnetlink_rcv_msg+0x439/0xd30 net/core/rtnetlink.c:6427 netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2545 netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline] netlink_unicast+0x536/0x810 net/netlink/af_netlink.c:1368 netlink_sendmsg+0x93c/0xe40 net/netlink/af_netlink.c:1910 sock_sendmsg_nosec net/socket.c:728 [inline] sock_sendmsg+0xd9/0x180 net/socket.c:751 ____sys_sendmsg+0x6ac/0x940 net/socket.c:2538 ___sys_sendmsg+0x135/0x1d0 net/socket.c:2592 __sys_sendmsg+0x117/0x1e0 net/socket.c:2621 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 38f7b870d4a6 ("[RTNETLINK]: Link creation API") Reported-by: syzbot+5ba06978f34abb058571@syzkaller.appspotmail.com Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230823064348.2252280-1-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-08-24 09:45:52 +02:00
Herbert Xu	8da1985ff7	wifi: mac80211: Do not include crypto/algapi.h The header file crypto/algapi.h is for internal use only. Use the header file crypto/utils.h instead. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/r/E1qYlA0-006vFr-Ts@formenos.hmeau.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2023-08-24 08:42:36 +02:00
Yue Haibing	f9597ba887	xprtrdma: Remove unused function declaration rpcrdma_bc_post_recv() rpcrdma_bc_post_recv() is never implemented since introduction in commit f531a5dbc451 ("xprtrdma: Pre-allocate backward rpc_rqst and send/receive buffers"). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2023-08-23 15:58:47 -04:00
Anna Schumaker	61182c796d	SUNRPC: kmap() the xdr pages during decode If the pages are in HIGHMEM then we need to make sure they're mapped before trying to read data off of them, otherwise we could end up with a NULL pointer dereference. The downside to this is that we need an extra cleanup step at the end of decode to kunmap() the last page. I introduced an xdr_finish_decode() function to do this. Right now this function only calls the unmap_current_page() function, but other generic cleanup steps could be added in the future if we come across anything else. Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2023-08-23 15:58:47 -04:00
Florian Westphal	8e51830e29	netfilter: nf_tables: defer gc run if previous batch is still pending Don't queue more gc work, else we may queue the same elements multiple times. If an element is flagged as dead, this can mean that either the previous gc request was invalidated/discarded by a transaction or that the previous request is still pending in the system work queue. The latter will happen if the gc interval is set to a very low value, e.g. 1ms, and the system work queue is backlogged. The set's refcount is 1 if no previous gc requests are queued, so add a helper for this and skip the gc run if old requests are pending. Fixes: f6c383b8c31a ("netfilter: nf_tables: adapt set backend to use GC transaction API") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-08-23 16:12:59 +02:00
Florian Westphal	5e1be4cdc9	netfilter: nf_tables: fix out of memory error handling Several instances of pipapo_resize() don't propagate allocation failures; this causes a crash when fault injection is enabled for GFP_KERNEL slabs. Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Stefano Brivio <sbrivio@redhat.com>	2023-08-23 16:12:10 +02:00
Pablo Neira Ayuso	8357bc946a	netfilter: nf_tables: use correct lock to protect gc_list Use nf_tables_gc_list_lock spinlock, not nf_tables_destroy_list_lock to protect the gc list. Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-08-23 16:10:01 +02:00
Pablo Neira Ayuso	720344340f	netfilter: nf_tables: GC transaction race with abort path The abort path is missing a synchronization point with GC transactions. Add a GC sequence number so that any GC transaction losing the race is discarded. Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-08-23 16:10:01 +02:00
Pablo Neira Ayuso	2c9f029328	netfilter: nf_tables: flush pending destroy work before netlink notifier Destroy work waits for the RCU grace period, then it releases the objects with no mutex held. All released objects follow this path for transactions; therefore, order is guaranteed and references to top-level objects in the hierarchy remain valid. However, the netlink notifier might interfere with pending destroy work. rcu_barrier() is not correct because objects are not released via RCU callback. Flush destroy work before releasing objects from the netlink notifier path. Fixes: d4bc8271db21 ("netfilter: nf_tables: netlink notifier might race to release objects") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-08-23 16:10:01 +02:00
Florian Westphal	4b80ced971	netfilter: nf_tables: validate all pending tables We have to validate all tables in the transaction that are in VALIDATE_DO state; the blamed commit below did not move the break statement to its right location, so we only validate one table. Moreover, we can't init table->validate to _SKIP when a table object is allocated. If we do, then if a transaction creates a new table and then fails, nfnetlink will loop and nft will hang until the user cancels the command. Add back the pernet state as a place to stash the last state encountered. This is either _DO (we hit an error during commit validation) or _SKIP (transaction passed all checks). Fixes: 00c320f9b755 ("netfilter: nf_tables: make validation state per table") Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>	2023-08-23 16:10:01 +02:00
Jamal Hadi Salim	da71714e35	net/sched: fix a qdisc modification with ambiguous command request When replacing an existing root qdisc with one that is of the same kind, the request boils down to essentially a parameterization change, i.e. not one that requires allocation and grafting of a new qdisc. syzbot was able to create a scenario which resulted in a taprio qdisc replacing an existing taprio qdisc with a combination of NLM_F_CREATE, NLM_F_REPLACE and NLM_F_EXCL, leading to a create and graft scenario. The fix ensures that a create and graft is allowed only when the qdisc kinds are different; otherwise the request goes into the "change" codepath. While at it, fix the code and comments to improve readability. While syzbot was able to create the issue, it did not zero in on the root cause. Analysis from Vladimir Oltean <vladimir.oltean@nxp.com> helped narrow it down. v1->v2 changes: - remove "inline" function definition (Vladimir) - remove extraneous braces in branches (Vladimir) - change inline function names (Pedro) - Run tdc tests (Victor) v2->v3 changes: - don't break else/if (Simon) Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+a3618a167af2021433cd@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/20230816225759.g25x76kmgzya2gei@skbuf/T/ Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-23 09:44:48 +01:00
Jordan Rife	0bdf399342	net: Avoid address overwrite in kernel_connect BPF programs that run on connect can rewrite the connect address. For the connect system call this isn't a problem, because a copy of the address is made when it is moved into kernel space. However, kernel_connect simply passes through the address it is given, so the caller may observe its address value unexpectedly change. A practical example where this is problematic is where NFS is combined with a system such as Cilium which implements BPF-based load balancing. A common pattern in software-defined storage systems is to have an NFS mount that connects to a persistent virtual IP which in turn maps to an ephemeral server IP. This is usually done to achieve high availability: if your server goes down you can quickly spin up a replacement and remap the virtual IP to that endpoint. With BPF-based load balancing, mounts will forget the virtual IP address when the address rewrite occurs because a pointer to the only copy of that address is passed down the stack. Server failover then breaks, because clients have forgotten the virtual IP address. Reconnects fail and mounts remain broken. This patch was tested by setting up a scenario like this and ensuring that NFS reconnects worked after applying the patch. Signed-off-by: Jordan Rife <jrife@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-08-23 09:42:05 +01:00
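
The address-overwrite problem described in the kernel_connect entry above can be illustrated with a small userspace sketch. This is not the kernel patch itself: `struct demo_addr`, `demo_rewrite_hook()`, `demo_connect_passthrough()` and `demo_connect_copy()` are hypothetical names invented for illustration, and the hook merely stands in for a BPF connect program that rewrites the destination. The sketch only shows why copying the address before the hook runs keeps the caller's value intact.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a socket address (not the kernel's struct sockaddr). */
struct demo_addr {
	uint32_t ip;
	uint16_t port;
};

/* Hypothetical hook: like a BPF connect program, it rewrites the address in place. */
static void demo_rewrite_hook(struct demo_addr *addr)
{
	addr->ip = 0x0a000002u; /* 10.0.0.2, standing in for the ephemeral server IP */
}

/* Pass-through variant: the hook modifies the caller's only copy of the address. */
static uint32_t demo_connect_passthrough(struct demo_addr *addr)
{
	demo_rewrite_hook(addr);
	return addr->ip; /* address actually connected to */
}

/* Copying variant: the hook sees a local copy, so the caller's address survives. */
static uint32_t demo_connect_copy(const struct demo_addr *addr)
{
	struct demo_addr local;

	memcpy(&local, addr, sizeof(local));
	demo_rewrite_hook(&local);
	return local.ip; /* address actually connected to */
}
```

Calling `demo_connect_copy()` with a virtual IP leaves the caller's structure unchanged, so a later reconnect can still target the virtual IP; calling `demo_connect_passthrough()` overwrites it with the rewritten address, which is the failure mode the commit message describes for NFS mounts.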


74468 Commits