linux

iv/linux

Author	SHA1	Message	Date
Dinghao Liu	b65d52ac9c	qed: Fix a potential use-after-free in qed_cxt_tables_alloc qed_ilt_shadow_alloc() will call qed_ilt_shadow_free() to free p_hwfn->p_cxt_mngr->ilt_shadow on error. However, qed_cxt_tables_alloc() accesses the freed pointer on failure of qed_ilt_shadow_alloc() through calling qed_cxt_mngr_free(), which may lead to use-after-free. Fix this issue by setting p_mngr->ilt_shadow to NULL in qed_ilt_shadow_free(). Fixes: fe56b9e6a8d9 ("qed: Add module with basic common support") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Link: https://lore.kernel.org/r/20231210045255.21383-1-dinghao.liu@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 13:33:51 -08:00
Dmitry Antipov	2a62644800	net: asix: fix fortify warning When compiling with gcc version 14.0.0 20231129 (experimental) and CONFIG_FORTIFY_SOURCE=y, I've noticed the following warning: ... In function 'fortify_memcpy_chk', inlined from 'ax88796c_tx_fixup' at drivers/net/ethernet/asix/ax88796c_main.c:287:2: ./include/linux/fortify-string.h:588:25: warning: call to '__read_overflow2_field' declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 588 \| __read_overflow2_field(q_size_field, size); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ... This call to 'memcpy()' is interpreted as an attempt to copy TX_OVERHEAD (which is 8) bytes from 4-byte 'sop' field of 'struct tx_pkt_info' and thus overread warning is issued. Since we actually want to copy both 'sop' and 'seg' fields at once, use the convenient 'struct_group()' here. Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Acked-by: Łukasz Stelmach <l.stelmach@samsung.com> Link: https://lore.kernel.org/r/20231211090535.9730-1-dmantipov@yandex.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 13:20:37 -08:00
Ilpo Järvinen	bf88f7d920	e1000e: Use pcie_capability_read_word() for reading LNKSTA Use pcie_capability_read_word() for reading LNKSTA and remove the custom define that matches to PCI_EXP_LNKSTA. As only single user for cap_offset remains, replace it with a call to pci_pcie_cap(). Instead of e1000_adapter, make local variable out of pci_dev because both users are interested in it. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2023-12-12 12:21:29 -08:00
Linus Torvalds	cf52eed70e	Fix various bugs / regressions for ext4, including a soft lockup, a WARN_ON, and a BUG. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAmV4tMwACgkQ8vlZVpUN gaNZRAf/ejQZne9iZck8SSV62mkR9E7EwN9J2+gkWFrlsyurErZlVsBA5yRB+i9A V1v6DRGDnYFwKFNHJhR/RW9NhEwpYkX9Vo3miksSCq8rsAB1kjSs3xVrTBIYi/8c ztw4ncyxW7RRFRmruzFfUEKriiyJzxJYx+EqbNsQHcl5ET6Y2/5zM0bChV9MwuN3 iS1Rm98RbHVrylzKbGG562MaGdJyUYvQ+mnRCgma1mTu6K9SWLJg211icLTsDhHg XEB/QGWji2O7xOudcry8wLIpoR6rYPAhWfbkLekW1K9hjV3iXuJoVjj7eB9LctMf FAXr8u0FKJI0iIQyrQrEEqIuh+jKBA== =4zQL -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Fix various bugs / regressions for ext4, including a soft lockup, a WARN_ON, and a BUG" * tag 'ext4_for_linus-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: jbd2: fix soft lockup in journal_finish_inode_data_buffers() ext4: fix warning in ext4_dio_write_end_io() jbd2: increase the journal IO's priority jbd2: correct the printing of write_flags in jbd2_write_superblock() ext4: prevent the normalized size from exceeding EXT_MAX_BLOCKS	2023-12-12 11:37:04 -08:00
Slawomir Laba	7ae42ef308	iavf: Fix iavf_shutdown to call iavf_remove instead iavf_close Make the flow for pci shutdown be the same to the pci remove. iavf_shutdown was implementing an incomplete version of iavf_remove. It misses several calls to the kernel like iavf_free_misc_irq, iavf_reset_interrupt_capability, iounmap that might break the system on reboot or hibernation. Implement the call of iavf_remove directly in iavf_shutdown to close this gap. Fixes below error messages (dmesg) during shutdown stress tests - [685814.900917] ice 0000:88:00.0: MAC 02:d0:5f:82:43:5d does not exist for VF 0 [685814.900928] ice 0000:88:00.0: MAC 33:33:00:00:00:01 does not exist for VF 0 Reproduction: 1. Create one VF interface: echo 1 > /sys/class/net/<interface_name>/device/sriov_numvfs 2. Run live dmesg on the host: dmesg -wH 3. On SUT, script below steps into vf_namespace_assignment.sh <#!/bin/sh> // Remove <>. Git removes # line if=<VF name> (edit this per VF name) loop=0 while true; do echo test round $loop let loop++ ip netns add ns$loop ip link set dev $if up ip link set dev $if netns ns$loop ip netns exec ns$loop ip link set dev $if up ip netns exec ns$loop ip link set dev $if netns 1 ip netns delete ns$loop done 4. Run the script for at least 1000 iterations on SUT: ./vf_namespace_assignment.sh Expected result: No errors in dmesg. Fixes: 129cf89e5856 ("iavf: rename functions and structs to new name") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Co-developed-by: Ranganatha Rao <ranganatha.rao@intel.com> Signed-off-by: Ranganatha Rao <ranganatha.rao@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2023-12-12 11:21:30 -08:00
Piotr Gardocki	09d23b8918	iavf: Handle ntuple on/off based on new state machines for flow director ntuple-filter feature on/off: Default is on. If turned off, the filters will be removed from both PF and iavf list. The removal is irrespective of current filter state. Steps to reproduce: ------------------- 1. Ensure ntuple is on. ethtool -K enp8s0 ntuple-filters on 2. Create a filter to receive the traffic into non-default rx-queue like 15 and ensure traffic is flowing into queue into 15. Now, turn off ntuple. Traffic should not flow to configured queue 15. It should flow to default RX queue. Fixes: 0dbfbabb840d ("iavf: Add framework to enable ethtool ntuple filters") Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Ranganatha Rao <ranganatha.rao@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2023-12-12 11:21:30 -08:00
Piotr Gardocki	3a0b5a2929	iavf: Introduce new state machines for flow director New states introduced: IAVF_FDIR_FLTR_DIS_REQUEST IAVF_FDIR_FLTR_DIS_PENDING IAVF_FDIR_FLTR_INACTIVE Current FDIR state machines (SM) are not adequate to handle a few scenarios in the link DOWN/UP event, reset event and ntuple-feature. For example, when VF link goes DOWN and comes back UP administratively, the expectation is that previously installed filters should also be restored. But with current SM, filters are not restored. So with new SM, during link DOWN filters are marked as INACTIVE in the iavf list but removed from PF. After link UP, SM will transition from INACTIVE to ADD_REQUEST to restore the filter. Similarly, with VF reset, filters will be removed from the PF, but marked as INACTIVE in the iavf list. Filters will be restored after reset completion. Steps to reproduce: ------------------- 1. Create a VF. Here VF is enp8s0. 2. Assign IP addresses to VF and link partner and ping continuously from remote. Here remote IP is 1.1.1.1. 3. Check default RX Queue of traffic. ethtool -S enp8s0 \| grep -E "rx-[[:digit:]]+\.packets" 4. Add filter - change default RX Queue (to 15 here) ethtool -U ens8s0 flow-type ip4 src-ip 1.1.1.1 action 15 loc 5 5. Ensure filter gets added and traffic is received on RX queue 15 now. Link event testing: ------------------- 6. Bring VF link down and up. If traffic flows to configured queue 15, test is success, otherwise it is a failure. Reset event testing: -------------------- 7. Reset the VF. If traffic flows to configured queue 15, test is success, otherwise it is a failure. Fixes: 0dbfbabb840d ("iavf: Add framework to enable ethtool ntuple filters") Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Ranganatha Rao <ranganatha.rao@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2023-12-12 11:20:40 -08:00
Linus Torvalds	eaadbbaaff	fuse fixes for 6.7-rc6 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCZXhQgQAKCRDh3BK/laaZ PHL9AQC0y7A+HLH6oXM8uI8rqC8e78qGdoGGl+Ppapae+BhO8gD+Or5B4yR0MZKR Z/j0zMe57mmRMplxcwz/LXXCqeE9+w0= =cn6g -----END PGP SIGNATURE----- Merge tag 'fuse-fixes-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: - Fix a couple of potential crashes, one introduced in 6.6 and one in 5.10 - Fix misbehavior of virtiofs submounts on memory pressure - Clarify naming in the uAPI for a recent feature * tag 'fuse-fixes-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: disable FOPEN_PARALLEL_DIRECT_WRITES with FUSE_DIRECT_IO_ALLOW_MMAP fuse: dax: set fc->dax to NULL in fuse_dax_conn_free() fuse: share lookup state between submount and its parent docs/fuse-io: Document the usage of DIRECT_IO_ALLOW_MMAP fuse: Rename DIRECT_IO_RELAX to DIRECT_IO_ALLOW_MMAP	2023-12-12 11:06:41 -08:00
Ilpo Järvinen	4c39e76846	e1000e: Use PCI_EXP_LNKSTA_NLW & FIELD_GET() instead of custom defines/code e1000e has own copy of PCI Negotiated Link Width field defines. Use the ones from include/uapi/linux/pci_regs.h instead of the custom ones and remove the custom ones and convert to FIELD_GET(). Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2023-12-12 10:55:22 -08:00
Ilpo Järvinen	4f6011678d	igb: Use FIELD_GET() to extract Link Width Use FIELD_GET() to extract PCIe Negotiated Link Width field instead of custom masking and shifting. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2023-12-12 10:55:22 -08:00
Linus Torvalds	8b8cd4beea	nine smb3 server fixes -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmV3pUoACgkQiiy9cAdy T1GQCgv/YURd8zz5k+GSOvUF2tCl6zW6h0NJQbWRIjgl4i7eGHZwIslgCI6kZIN1 AFrSyUj4tZQmFvh0aVWLZeWsoKETbSkOYkz2dC4X/lC8LJD3VAy3vzAhu4oSAWva +pItQVlOTG0CcmMSTANSfw0sSsCwC2BHAUJnnu7ypgERI3wllOPtxE1xN9mT/8Bf NxJDZa3jtZd2hC4Cda1NTYYEfaSGufEOzPZIW9/h5ftpRo0qtEZkKh9TPddBMGm4 yMnt1sSp4DHoW6xOyGOt+7kJAGA5NtP3/voLSjirG558Bb4HjWhBT+Dkxe6dUiXn i9gi1bFJ/8gRulv1cTdOxTFGE+i9Wr4PzpG2g82qugYRTl3LqLoJBa8NH+WzKz+q AX8EySFdlJtE++wTMNZB5hgFuJNGkzRi3YbjrQjvHFDQvaSVHvtayyhuEN+UcqAe gWuj1PTDKy6cfkxFYPDEBtMgp1u4+72nWOxoYUE5LyvzkLCLjfgMKCDX03RlAvfZ zB76cU/3 =yMkH -----END PGP SIGNATURE----- Merge tag '6.7-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd Pull smb server fixes from Steve French: - Memory leak fix (in lock error path) - Two fixes for create with allocation size - FIx for potential UAF in lease break error path - Five directory lease (caching) fixes found during additional recent testing * tag '6.7-rc5-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix wrong name of SMB2_CREATE_ALLOCATION_SIZE ksmbd: fix wrong allocation size update in smb2_open() ksmbd: avoid duplicate opinfo_put() call on error of smb21_lease_break_ack() ksmbd: lazy v2 lease break on smb2_write() ksmbd: send v2 lease break notification for directory ksmbd: downgrade RWH lease caching state to RH for directory ksmbd: set v2 lease capability ksmbd: set epoch in create context v2 lease ksmbd: fix memory leak in smb2_lock()	2023-12-12 10:30:10 -08:00
Ye Bin	6c02757c93	jbd2: fix soft lockup in journal_finish_inode_data_buffers() There's issue when do io test: WARN: soft lockup - CPU#45 stuck for 11s! [jbd2/dm-2-8:4170] CPU: 45 PID: 4170 Comm: jbd2/dm-2-8 Kdump: loaded Tainted: G OE Call trace: dump_backtrace+0x0/0x1a0 show_stack+0x24/0x30 dump_stack+0xb0/0x100 watchdog_timer_fn+0x254/0x3f8 __hrtimer_run_queues+0x11c/0x380 hrtimer_interrupt+0xfc/0x2f8 arch_timer_handler_phys+0x38/0x58 handle_percpu_devid_irq+0x90/0x248 generic_handle_irq+0x3c/0x58 __handle_domain_irq+0x68/0xc0 gic_handle_irq+0x90/0x320 el1_irq+0xcc/0x180 queued_spin_lock_slowpath+0x1d8/0x320 jbd2_journal_commit_transaction+0x10f4/0x1c78 [jbd2] kjournald2+0xec/0x2f0 [jbd2] kthread+0x134/0x138 ret_from_fork+0x10/0x18 Analyzed informations from vmcore as follows: (1) There are about 5k+ jbd2_inode in 'commit_transaction->t_inode_list'; (2) Now is processing the 855th jbd2_inode; (3) JBD2 task has TIF_NEED_RESCHED flag; (4) There's no pags in address_space around the 855th jbd2_inode; (5) There are some process is doing drop caches; (6) Mounted with 'nodioread_nolock' option; (7) 128 CPUs; According to informations from vmcore we know 'journal->j_list_lock' spin lock competition is fierce. So journal_finish_inode_data_buffers() maybe process slowly. Theoretically, there is scheduling point in the filemap_fdatawait_range_keep_errors(). However, if inode's address_space has no pages which taged with PAGECACHE_TAG_WRITEBACK, will not call cond_resched(). So may lead to soft lockup. journal_finish_inode_data_buffers filemap_fdatawait_range_keep_errors __filemap_fdatawait_range while (index <= end) nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, PAGECACHE_TAG_WRITEBACK); if (!nr_pages) break; --> If 'nr_pages' is equal zero will break, then will not call cond_resched() for (i = 0; i < nr_pages; i++) wait_on_page_writeback(page); cond_resched(); To solve above issue, add scheduling point in the journal_finish_inode_data_buffers(); Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20231211112544.3879780-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-12-12 10:25:46 -05:00
Yan Jun	df83a0df82	HID: apple: Add "hfd.cn" and "WKB603" to the list of non-apple keyboards JingZao(京造) WKB603 keyboard is a rebranded product of Jamesdonkey RS2 keyboard, identified as "hfd.cn WKB603" in wired mode, "WKB603" in bluetooth mode. Adding them to the list of non-apple keyboards fixes function key. Signed-off-by: Yan Jun <jerrysteve1101@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2023-12-12 14:50:47 +01:00
Mikhail Khvainitski	43527a0094	HID: lenovo: Restrict detection of patched firmware only to USB cptkbd Commit 46a0a2c96f0f ("HID: lenovo: Detect quirk-free fw on cptkbd and stop applying workaround") introduced a regression for ThinkPad TrackPoint Keyboard II which has similar quirks to cptkbd (so it uses the same workarounds) but slightly different so that there are false-positives during detecting well-behaving firmware. This commit restricts detecting well-behaving firmware to the only model which known to have one and have stable enough quirks to not cause false-positives. Fixes: 46a0a2c96f0f ("HID: lenovo: Detect quirk-free fw on cptkbd and stop applying workaround") Link: https://lore.kernel.org/linux-input/ZXRiiPsBKNasioqH@jekhomev/ Link: https://bbs.archlinux.org/viewtopic.php?pid=2135468#p2135468 Signed-off-by: Mikhail Khvainitski <me@khvoinitsky.org> Tested-by: Yauhen Kharuzhy <jekhor@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2023-12-12 14:45:34 +01:00
Paolo Abeni	609c767f2c	Merge branch 'net-dsa-realtek-two-rtl8366rb-fixes' Linus Walleij says: ==================== net: dsa: realtek: Two RTL8366RB fixes These minor fixes were found while digging into other issues: a weirdly named variable and bogus MTU handling. Fix it up. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> ==================== Link: https://lore.kernel.org/r/20231209-rtl8366rb-mtu-fix-v1-0-df863e2b2b2a@linaro.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-12-12 14:17:08 +01:00
Linus Walleij	d577ca429a	net: dsa: realtek: Rewrite RTL8366RB MTU handling The MTU callbacks are in layer 1 size, so for example 1500 bytes is a normal setting. Cache this size, and only add the layer 2 framing right before choosing the setting. On the CPU port this will however include the DSA tag since this is transmitted from the parent ethernet interface! Add the layer 2 overhead such as ethernet and VLAN framing and FCS before selecting the size in the register. This will make the code easier to understand. The rtl8366rb_max_mtu() callback returns a bogus MTU just subtracting the CPU tag, which is the only thing we should NOT subtract. Return the correct layer 1 max MTU after removing headers and checksum. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-12-12 14:17:06 +01:00
Linus Walleij	389119c842	net: dsa: realtek: Rename bogus RTL8368S variable Rename the register name to RTL8366RB instead of the bogus RTL8368S (internal product name?) Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Reviewed-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-12-12 14:17:06 +01:00
Hyunwoo Kim	810c38a369	net/rose: Fix Use-After-Free in rose_ioctl Because rose_ioctl() accesses sk->sk_receive_queue without holding a sk->sk_receive_queue.lock, it can cause a race with rose_accept(). A use-after-free for skb occurs with the following flow. ``` rose_ioctl() -> skb_peek() rose_accept() -> skb_dequeue() -> kfree_skb() ``` Add sk->sk_receive_queue.lock to rose_ioctl() to fix this issue. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Link: https://lore.kernel.org/r/20231209100538.GA407321@v4bel-B760M-AORUS-ELITE-AX Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-12-12 13:24:58 +01:00
Hyunwoo Kim	24e90b9e34	atm: Fix Use-After-Free in do_vcc_ioctl Because do_vcc_ioctl() accesses sk->sk_receive_queue without holding a sk->sk_receive_queue.lock, it can cause a race with vcc_recvmsg(). A use-after-free for skb occurs with the following flow. ``` do_vcc_ioctl() -> skb_peek() vcc_recvmsg() -> skb_recv_datagram() -> skb_free_datagram() ``` Add sk->sk_receive_queue.lock to do_vcc_ioctl() to fix this issue. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Link: https://lore.kernel.org/r/20231209094210.GA403126@v4bel-B760M-AORUS-ELITE-AX Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-12-12 13:14:08 +01:00
Ahelenia Ziemiańska	26c79ec96e	net: dns_resolver: the module is called dns_resolver, not dnsresolver $ modinfo dnsresolver dns_resolver \| grep name modinfo: ERROR: Module dnsresolver not found. filename: /lib/modules/6.1.0-9-amd64/kernel/net/dns_resolver/dns_resolver.ko name: dns_resolver Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Link: https://lore.kernel.org/r/gh4sxphjxbo56n2spgmc66vtazyxgiehpmv5f2gkvgicy6f4rs@tarta.nabijaczleweli.xyz Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-12-12 12:14:19 +01:00
Andy Shevchenko	68cbdb150d	net: dl2k: Use proper conversion of dev_addr before IO to device The driver is using iowriteXX()/ioreadXX() APIs which are LE IO accessors simplified as 1. Convert given value _from_ CPU _to_ LE 2. Write it to the device as is The dev_addr is a byte stream, but because the driver uses 16-bit IO accessors, it wants to perform double conversion on BE CPUs, but it took it wrong, as it effectivelly does two times _from_ CPU _to_ LE. What it has to do is to consider dev_addr as an array of LE16 and hence do _from_ LE _to_ CPU conversion, followed by implied _from_ CPU _to_ LE in the iowrite16(). To achieve that, use get_unaligned_le16(). This will make it correct and allows to avoid sparse warning as reported by LKP. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312030058.hfZPTXd7-lkp@intel.com/ Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20231208153327.3306798-1-andriy.shevchenko@linux.intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-12-12 11:25:36 +01:00
Swarup Laxman Kotiaklapudi	68c84289bc	netlink: specs: devlink: add some(not all) missing attributes in devlink.yaml Add some missing(not all) attributes in devlink.yaml. Signed-off-by: Swarup Laxman Kotiaklapudi <swarupkotikalapudi@gmail.com> Suggested-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231208182515.1206616-1-swarupkotikalapudi@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-11 18:54:13 -08:00
Jakub Kicinski	b72137ecd5	Merge branch 'net-sched-conditional-notification-of-events-for-cls-and-act' Pedro Tammela says: ==================== net/sched: conditional notification of events for cls and act This is an optimization we have been leveraging on P4TC but we believe it will benefit rtnl users in general. It's common to allocate an skb, build a notification message and then broadcast an event. In the absence of any user space listeners, these resources (cpu and memory operations) are wasted. In cases where the subsystem is lockless (such as in tc-flower) this waste is more prominent. For the scenarios where the rtnl_lock is held it is not as prominent. The idea is simple. Build and send the notification iif: - The user requests via NLM_F_ECHO or - Someone is listening to the rtnl group (tc mon) On a simple test with tc-flower adding 1M entries, using just a single core, there's already a noticeable difference in the cycles spent in tc_new_tfilter with this patchset. before: - 43.68% tc_new_tfilter + 31.73% fl_change + 6.35% tfilter_notify + 1.62% nlmsg_notify 0.66% __tcf_qdisc_find.part.0 0.64% __tcf_chain_get 0.54% fl_get + 0.53% tcf_proto_lookup_ops after: - 39.20% tc_new_tfilter + 34.58% fl_change 0.69% __tcf_qdisc_find.part.0 0.67% __tcf_chain_get + 0.61% tcf_proto_lookup_ops Note, the above test is using iproute2:tc which execs a shell. We expect people using netlink directly to observe even greater reductions. The qdisc side needs some refactoring of the notification routines to fit in this new model, so they will be sent in a later patchset. ==================== Link: https://lore.kernel.org/r/20231208192847.714940-1-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-11 18:53:01 -08:00
Pedro Tammela	93775590b1	net/sched: cls_api: conditional notification of events As of today tc-filter/chain events are unconditionally built and sent to RTNLGRP_TC. As with the introduction of rtnl_notify_needed we can check before-hand if they are really needed. This will help to alleviate system pressure when filters are concurrently added without the rtnl lock as in tc-flower. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Link: https://lore.kernel.org/r/20231208192847.714940-8-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-11 18:52:57 -08:00
Pedro Tammela	e522755520	net/sched: cls_api: remove 'unicast' argument from delete notification This argument is never called while set to true, so remove it as there's no need for it. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231208192847.714940-7-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-11 18:52:57 -08:00
Pedro Tammela	8d4390f519	net/sched: act_api: conditional notification of events As of today tc-action events are unconditionally built and sent to RTNLGRP_TC. As with the introduction of rtnl_notify_needed we can check before-hand if they are really needed. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231208192847.714940-6-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-11 18:52:57 -08:00
Pedro Tammela	c73724bfde	net/sched: act_api: don't open code max() Use max() in a couple of places that are open coding it with the ternary operator. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231208192847.714940-5-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-11 18:52:57 -08:00
Pedro Tammela	ddb6b284bd	rtnl: add helper to send if skb is not null This is a convenience helper for routines handling conditional rtnl events, that is code that might send a notification depending on rtnl_has_listeners/rtnl_notify_needed. Instead of: if (skb) rtnetlink_send(...) Use: rtnetlink_maybe_send(...) Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Link: https://lore.kernel.org/r/20231208192847.714940-4-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-11 18:52:57 -08:00
Victor Nogueira	8439109b76	rtnl: add helper to check if a notification is needed Building on the rtnl_has_listeners helper, add the rtnl_notify_needed helper to check if we can bail out early in the notification routines. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Link: https://lore.kernel.org/r/20231208192847.714940-3-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-11 18:52:56 -08:00
Jamal Hadi Salim	c5e2a97344	rtnl: add helper to check if rtnl group has listeners As of today, rtnl code creates a new skb and unconditionally fills and broadcasts it to the relevant group. For most operations this is okay and doesn't waste resources in general. When operations are done without the rtnl_lock, as in tc-flower, such skb allocation, message fill and no-op broadcasting can happen in all cores of the system, which contributes to system pressure and wastes precious cpu cycles when no one will receive the built message. Introduce this helper so rtnetlink operations can simply check if someone is listening and then proceed if necessary. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Link: https://lore.kernel.org/r/20231208192847.714940-2-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-11 18:52:56 -08:00
Linus Torvalds	26aff84943	More bcachefs bugfixes for 6.7: - Fix a rare emergency shutdown path bug: dropping journal pins after the filesystem has mostly been torn down is not what we want. - Fix some concurrency issues with the btree write buffer and journal replay by not using the btree write buffer until journal replay is finished - A fixup from the prior patch to kill journal pre-reservations: at the start of the btree update path, where previously we took a pre-reservation, we do at least want to check the journal watermark. - Fix a race between dropping device metadata and btree node writes, which would re-add a pointer to a device that had just been dropped - Fix one of the SCRU lock warnings, in bch2_compression_stats_to_text(). - Partial fix for a rare transaction paths overflow, when indirect extents had been split by background tasks, by not running certain triggers when they're not needed. - Fix for creating a snapshot with implicit source in a subdirectory of the containing subvolume - Don't unfreeze when we're emergency read-only - Fix for rebalance spinning trying to compress unwritten extentns - Another deleted_inodes fix, for directories - Fix a rare deadlock (usually just an unecessary wait) when flushing the journal with an open journal entry. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmV2T4EACgkQE6szbY3K bnay1w/+PyH5qwE2gOy17rno6cWSNyKJELUkcqVNqrSTZpuA+TbMbcV8+oOeBnG1 9/ShwKRvwwNC4HVk6KySoTMo9lRkaZ5wX6DpEsOqxoN8aCp6kqiCUxr0inAAyVdu O8FktP83eSX/vERWNlCeGLdi1KsCK0BWVbVMpkiVEO9QhLpS9eo1C8btstDIjbsv TVGvKO7IpVgibSBwymQPKpZa6BGN4d6emLlgKStdpVVR1RwJW3eLJwi1EV2hSp1f LBnTI5eD64pu+phEb4zE83JX932XAbxdBWaHlN1y3i4l6+sJDu63Y4R8bkbW+rnJ cbiyYM5IuAH6MFbbh9rIW8kEIvjrX13mY94oGlK8ClCI9WX129jD5538tEH624U5 KnhCZpkuzeGC5CVXNAzdJ8NP/Aj9qtKvSyssG6R5ZTitQ1FnTZ391Wb2pIRgj9pm yVfpJ/Q4cizVfSsKBvtr0U5I444zq50z+brKwegIoH8uMuGHKXcIgTUOu4q5pKDD znjS9eFrQTN2li2HB3LMxuS94yUmozqwgxClMptynLsHVknQH7F3cAdD+mYbwW5Q GUOd/QTlpskBYAUfBS8ewllowRjLGDJyrGvbR9Mvitk8CxOLRgoDipdh1K13jDMS zCmG1eQgdbtPHTM6fqif8Bu8xtgK7p2r099dcBhhiWmRyLPo5Qw= =l5sa -----END PGP SIGNATURE----- Merge tag 'bcachefs-2023-12-10' of https://evilpiepirate.org/git/bcachefs Pull more bcachefs bugfixes from Kent Overstreet: - Fix a rare emergency shutdown path bug: dropping journal pins after the filesystem has mostly been torn down is not what we want. - Fix some concurrency issues with the btree write buffer and journal replay by not using the btree write buffer until journal replay is finished - A fixup from the prior patch to kill journal pre-reservations: at the start of the btree update path, where previously we took a pre-reservation, we do at least want to check the journal watermark. - Fix a race between dropping device metadata and btree node writes, which would re-add a pointer to a device that had just been dropped - Fix one of the SCRU lock warnings, in bch2_compression_stats_to_text(). - Partial fix for a rare transaction paths overflow, when indirect extents had been split by background tasks, by not running certain triggers when they're not needed. - Fix for creating a snapshot with implicit source in a subdirectory of the containing subvolume - Don't unfreeze when we're emergency read-only - Fix for rebalance spinning trying to compress unwritten extentns - Another deleted_inodes fix, for directories - Fix a rare deadlock (usually just an unecessary wait) when flushing the journal with an open journal entry. * tag 'bcachefs-2023-12-10' of https://evilpiepirate.org/git/bcachefs: bcachefs: Close journal entry if necessary when flushing all pins bcachefs: Fix uninitialized var in bch2_journal_replay() bcachefs: Fix deleted inode check for dirs bcachefs: rebalance shouldn't attempt to compress unwritten extents bcachefs: don't attempt rw on unfreeze when shutdown bcachefs: Fix creating snapshot with implict source bcachefs: Don't run indirect extent trigger unless inserting/deleting bcachefs: Convert compression_stats to for_each_btree_key2 bcachefs: Fix bch2_extent_drop_ptrs() call bcachefs: Fix a journal deadlock in replay bcachefs; Don't use btree write buffer until journal replay is finished bcachefs: Don't drop journal pins in exit path	2023-12-11 16:13:51 -08:00
David Howells	52bf9f6c09	afs: Fix refcount underflow from error handling race If an AFS cell that has an unreachable (eg. ENETUNREACH) server listed (VL server or fileserver), an asynchronous probe to one of its addresses may fail immediately because sendmsg() returns an error. When this happens, a refcount underflow can happen if certain events hit a very small window. The way this occurs is: (1) There are two levels of "call" object, the afs_call and the rxrpc_call. Each of them can be transitioned to a "completed" state in the event of success or failure. (2) Asynchronous afs_calls are self-referential whilst they are active to prevent them from evaporating when they're not being processed. This reference is disposed of when the afs_call is completed. Note that an afs_call may only be completed once; once completed completing it again will do nothing. (3) When a call transmission is made, the app-side rxrpc code queues a Tx buffer for the rxrpc I/O thread to transmit. The I/O thread invokes sendmsg() to transmit it - and in the case of failure, it transitions the rxrpc_call to the completed state. (4) When an rxrpc_call is completed, the app layer is notified. In this case, the app is kafs and it schedules a work item to process events pertaining to an afs_call. (5) When the afs_call event processor is run, it goes down through the RPC-specific handler to afs_extract_data() to retrieve data from rxrpc - and, in this case, it picks up the error from the rxrpc_call and returns it. The error is then propagated to the afs_call and that is completed too. At this point the self-reference is released. (6) If the rxrpc I/O thread manages to complete the rxrpc_call within the window between rxrpc_send_data() queuing the request packet and checking for call completion on the way out, then rxrpc_kernel_send_data() will return the error from sendmsg() to the app. (7) Then afs_make_call() will see an error and will jump to the error handling path which will attempt to clean up the afs_call. (8) The problem comes when the error handling path in afs_make_call() tries to unconditionally drop an async afs_call's self-reference. This self-reference, however, may already have been dropped by afs_extract_data() completing the afs_call (9) The refcount underflows when we return to afs_do_probe_vlserver() and that tries to drop its reference on the afs_call. Fix this by making afs_make_call() attempt to complete the afs_call rather than unconditionally putting it. That way, if afs_extract_data() manages to complete the call first, afs_make_call() won't do anything. The bug can be forced by making do_udp_sendmsg() return -ENETUNREACH and sticking an msleep() in rxrpc_send_data() after the 'success:' label to widen the race window. The error message looks something like: refcount_t: underflow; use-after-free. WARNING: CPU: 3 PID: 720 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110 ... RIP: 0010:refcount_warn_saturate+0xba/0x110 ... afs_put_call+0x1dc/0x1f0 [kafs] afs_fs_get_capabilities+0x8b/0xe0 [kafs] afs_fs_probe_fileserver+0x188/0x1e0 [kafs] afs_lookup_server+0x3bf/0x3f0 [kafs] afs_alloc_server_list+0x130/0x2e0 [kafs] afs_create_volume+0x162/0x400 [kafs] afs_get_tree+0x266/0x410 [kafs] vfs_get_tree+0x25/0xc0 fc_mount+0xe/0x40 afs_d_automount+0x1b3/0x390 [kafs] __traverse_mounts+0x8f/0x210 step_into+0x340/0x760 path_openat+0x13a/0x1260 do_filp_open+0xaf/0x160 do_sys_openat2+0xaf/0x170 or something like: refcount_t: underflow; use-after-free. ... RIP: 0010:refcount_warn_saturate+0x99/0xda ... afs_put_call+0x4a/0x175 afs_send_vl_probes+0x108/0x172 afs_select_vlserver+0xd6/0x311 afs_do_cell_detect_alias+0x5e/0x1e9 afs_cell_detect_alias+0x44/0x92 afs_validate_fc+0x9d/0x134 afs_get_tree+0x20/0x2e6 vfs_get_tree+0x1d/0xc9 fc_mount+0xe/0x33 afs_d_automount+0x48/0x9d __traverse_mounts+0xe0/0x166 step_into+0x140/0x274 open_last_lookups+0x1c1/0x1df path_openat+0x138/0x1c3 do_filp_open+0x55/0xb4 do_sys_openat2+0x6c/0xb6 Fixes: 34fa47612bfe ("afs: Fix race in async call refcounting") Reported-by: Bill MacAllister <bill@ca-zephyr.org> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052304 Suggested-by: Jeffrey E Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/2633992.1702073229@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-12-11 15:40:41 -08:00
Ard Biesheuvel	50d7cdf7a9	efi/x86: Avoid physical KASLR on older Dell systems River reports boot hangs with v6.6 and v6.7, and the bisect points to commit a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") which moves the memory allocation and kernel decompression from the legacy decompressor (which executes after ExitBootServices()) to the EFI stub, using boot services for allocating the memory. The memory allocation succeeds but the subsequent call to decompress_kernel() never returns, resulting in a failed boot and a hanging system. As it turns out, this issue only occurs when physical address randomization (KASLR) is enabled, and given that this is a feature we can live without (virtual KASLR is much more important), let's disable the physical part of KASLR when booting on AMI UEFI firmware claiming to implement revision v2.0 of the specification (which was released in 2006), as this is the version these systems advertise. Fixes: a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218173 Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2023-12-11 17:57:42 +01:00
David S. Miller	70028b2e51	Merge branch 'ipv6-data-races' Eric Dumazet says: ==================== ipv6: more data-race annotations Small follow up series, taking care of races around np->mcast_oif and np->ucast_oif. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-11 10:59:17 +00:00
Eric Dumazet	1ac13efd61	ipv6: annotate data-races around np->ucast_oif np->ucast_oif is read locklessly in some contexts. Make all accesses to this field lockless, adding appropriate annotations. This also makes setsockopt( IPV6_UNICAST_IF ) lockless. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-11 10:59:17 +00:00
Eric Dumazet	d2f011a0bf	ipv6: annotate data-races around np->mcast_oif np->mcast_oif is read locklessly in some contexts. Make all accesses to this field lockless, adding appropriate annotations. This also makes setsockopt( IPV6_MULTICAST_IF ) lockless. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-11 10:59:17 +00:00
Johannes Berg	9a64d4c93e	Revert "net: rtnetlink: remove local list in __linkwatch_run_queue()" This reverts commit b8dbbbc535a9 ("net: rtnetlink: remove local list in __linkwatch_run_queue()"). It's evidently broken when there's a non-urgent work that gets added back, and then the loop can never finish. While reverting, add a note about that. Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Fixes: b8dbbbc535a9 ("net: rtnetlink: remove local list in __linkwatch_run_queue()") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-11 10:57:16 +00:00
Hariprasad Kelam	e307b5a845	octeontx2-af: Fix pause frame configuration The current implementation's default Pause Forward setting is causing unnecessary network traffic. This patch disables Pause Forward to address this issue. Fixes: 1121f6b02e7a ("octeontx2-af: Priority flow control configuration support") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-11 10:55:12 +00:00
Wang Yao	271f2a4a95	efi/loongarch: Use load address to calculate kernel entry address The efi_relocate_kernel() may load the PIE kernel to anywhere, the loaded address may not be equal to link address or EFI_KIMG_PREFERRED_ADDRESS. Acked-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Wang Yao <wangyao@lemote.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2023-12-11 11:18:26 +01:00
David S. Miller	c3e041425a	Merge branch 'octeontx2-fixes' Hariprasad Kelam says: ==================== octeontx2: Fix issues with promisc/allmulti mode When interface is configured in promisc/all multi mode, low network performance observed. This series patches address the same. Patch1: Change the promisc/all multi mcam entry action to unicast if there are no trusted vfs associated with PF. Patch2: Configures RSS flow algorithm in promisc/all multi mcam entries to address flow distribution issues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-11 10:06:05 +00:00
Hariprasad Kelam	570ba37898	octeontx2-af: Update RSS algorithm index The RSS flow algorithm is not set up correctly for promiscuous or all multi MCAM entries. This has an impact on flow distribution. This patch fixes the issue by updating flow algorithm index in above mentioned MCAM entries. Fixes: 967db3529eca ("octeontx2-af: add support for multicast/promisc packet replication feature") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-11 10:06:05 +00:00
Hariprasad Kelam	dbda436824	octeontx2-pf: Fix promisc mcam entry action Current implementation is such that, promisc mcam entry action is set as multicast even when there are no trusted VFs. multicast action causes the hardware to copy packet data, which reduces the performance. This patch fixes this issue by setting the promisc mcam entry action to unicast instead of multicast when there are no trusted VFs. The same change is made for the 'allmulti' mcam entry action. Fixes: ffd2f89ad05c ("octeontx2-pf: Enable promisc/allmulti match MCAM entries.") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-11 10:06:04 +00:00
Fei Qin	18c5c0a845	nfp: support UDP segmentation offload The device supports UDP hardware segmentation offload, which helps improving the performance. Thus, this patch adds support for UDP segmentation offload from the driver side. Signed-off-by: Fei Qin <fei.qin@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-11 10:01:56 +00:00
Shinas Rasheed	284f717622	octeon_ep: explicitly test for firmware ready value The firmware ready value is 1, and get firmware ready status function should explicitly test for that value. The firmware ready value read will be 2 after driver load, and on unbind till firmware rewrites the firmware ready back to 0, the value seen by driver will be 2, which should be regarded as not ready. Fixes: 10c073e40469 ("octeon_ep: defer probe if firmware not ready") Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-11 10:00:59 +00:00
Vlad Buslov	125f1c7f26	net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table The referenced change added custom cleanup code to act_ct to delete any callbacks registered on the parent block when deleting the tcf_ct_flow_table instance. However, the underlying issue is that the drivers don't obtain the reference to the tcf_ct_flow_table instance when registering callbacks which means that not only driver callbacks may still be on the table when deleting it but also that the driver can still have pointers to its internal nf_flowtable and can use it concurrently which results either warning in netfilter[0] or use-after-free. Fix the issue by taking a reference to the underlying struct tcf_ct_flow_table instance when registering the callback and release the reference when unregistering. Expose new API required for such reference counting by adding two new callbacks to nf_flowtable_type and implementing them for act_ct flowtable_ct type. This fixes the issue by extending the lifetime of nf_flowtable until all users have unregistered. [0]: [106170.938634] ------------[ cut here ]------------ [106170.939111] WARNING: CPU: 21 PID: 3688 at include/net/netfilter/nf_flow_table.h:262 mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core] [106170.940108] Modules linked in: act_ct nf_flow_table act_mirred act_skbedit act_tunnel_key vxlan cls_matchall nfnetlink_cttimeout act_gact cls_flower sch_ingress mlx5_vdpa vringh vhost_iotlb vdpa bonding openvswitch nsh rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_regis try overlay mlx5_core [106170.943496] CPU: 21 PID: 3688 Comm: kworker/u48:0 Not tainted 6.6.0-rc7_for_upstream_min_debug_2023_11_01_13_02 #1 [106170.944361] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [106170.945292] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core] [106170.945846] RIP: 0010:mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core] [106170.946413] Code: 89 ef 48 83 05 71 a4 14 00 01 e8 f4 06 04 e1 48 83 05 6c a4 14 00 01 48 83 c4 28 5b 5d 41 5c 41 5d c3 48 83 05 d1 8b 14 00 01 <0f> 0b 48 83 05 d7 8b 14 00 01 e9 96 fe ff ff 48 83 05 a2 90 14 00 [106170.947924] RSP: 0018:ffff88813ff0fcb8 EFLAGS: 00010202 [106170.948397] RAX: 0000000000000000 RBX: ffff88811eabac40 RCX: ffff88811eabad48 [106170.949040] RDX: ffff88811eab8000 RSI: ffffffffa02cd560 RDI: 0000000000000000 [106170.949679] RBP: ffff88811eab8000 R08: 0000000000000001 R09: ffffffffa0229700 [106170.950317] R10: ffff888103538fc0 R11: 0000000000000001 R12: ffff88811eabad58 [106170.950969] R13: ffff888110c01c00 R14: ffff888106b40000 R15: 0000000000000000 [106170.951616] FS: 0000000000000000(0000) GS:ffff88885fd40000(0000) knlGS:0000000000000000 [106170.952329] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [106170.952834] CR2: 00007f1cefd28cb0 CR3: 000000012181b006 CR4: 0000000000370ea0 [106170.953482] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [106170.954121] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [106170.954766] Call Trace: [106170.955057] <TASK> [106170.955315] ? __warn+0x79/0x120 [106170.955648] ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core] [106170.956172] ? report_bug+0x17c/0x190 [106170.956537] ? handle_bug+0x3c/0x60 [106170.956891] ? exc_invalid_op+0x14/0x70 [106170.957264] ? asm_exc_invalid_op+0x16/0x20 [106170.957666] ? mlx5_del_flow_rules+0x10/0x310 [mlx5_core] [106170.958172] ? mlx5_tc_ct_block_flow_offload_add+0x1240/0x1240 [mlx5_core] [106170.958788] ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core] [106170.959339] ? mlx5_tc_ct_del_ft_cb+0xc6/0x2b0 [mlx5_core] [106170.959854] ? mapping_remove+0x154/0x1d0 [mlx5_core] [106170.960342] ? mlx5e_tc_action_miss_mapping_put+0x4f/0x80 [mlx5_core] [106170.960927] mlx5_tc_ct_delete_flow+0x76/0xc0 [mlx5_core] [106170.961441] mlx5_free_flow_attr_actions+0x13b/0x220 [mlx5_core] [106170.962001] mlx5e_tc_del_fdb_flow+0x22c/0x3b0 [mlx5_core] [106170.962524] mlx5e_tc_del_flow+0x95/0x3c0 [mlx5_core] [106170.963034] mlx5e_flow_put+0x73/0xe0 [mlx5_core] [106170.963506] mlx5e_put_flow_list+0x38/0x70 [mlx5_core] [106170.964002] mlx5e_rep_update_flows+0xec/0x290 [mlx5_core] [106170.964525] mlx5e_rep_neigh_update+0x1da/0x310 [mlx5_core] [106170.965056] process_one_work+0x13a/0x2c0 [106170.965443] worker_thread+0x2e5/0x3f0 [106170.965808] ? rescuer_thread+0x410/0x410 [106170.966192] kthread+0xc6/0xf0 [106170.966515] ? kthread_complete_and_exit+0x20/0x20 [106170.966970] ret_from_fork+0x2d/0x50 [106170.967332] ? kthread_complete_and_exit+0x20/0x20 [106170.967774] ret_from_fork_asm+0x11/0x20 [106170.970466] </TASK> [106170.970726] ---[ end trace 0000000000000000 ]--- Fixes: 77ac5e40c44e ("net/sched: act_ct: remove and free nf_table callbacks") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-11 09:59:58 +00:00
Linus Torvalds	a39b6ac378	Linux 6.7-rc5 v6.7-rc5	2023-12-10 14:33:40 -08:00
Kent Overstreet	a66ff26b0f	bcachefs: Close journal entry if necessary when flushing all pins Since outstanding journal buffers hold a journal pin, when flushing all pins we need to close the current journal entry if necessary so its pin can be released. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-12-10 16:53:46 -05:00
David S. Miller	6e944cc686	Merge branch 'rswitch-jumbo-frames' Yoshihiro Shimoda says: ==================== net: rswitch: Add jumbo frames support This patch series is based on the latest net-next.git / main branch. Changes from v3: https://lore.kernel.org/all/20231204012058.3876078-1-yoshihiro.shimoda.uh@renesas.com/ - Based on the latest net-next.git / main branch. - Modify for code consistancy in the patch 3/9. - Add a condition in the patch 3/9. - Fix usage of dma_addr in the patch 8/9. Changes from v2: https://lore.kernel.org/all/20231201054655.3731772-1-yoshihiro.shimoda.uh@renesas.com/ - Based on the latest net-next.git / main branch. - Fix using a variable in the patch 8/9. - Add Reviewed-by tag in the patch 1/9. Changes from v1: https://lore.kernel.org/all/20231127115334.3670790-1-yoshihiro.shimoda.uh@renesas.com/ - Based on the latest net-next.git / main branch. - Fix commit descriptions (s/near the future/the near future/). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-10 19:31:42 +00:00
Yoshihiro Shimoda	c71517fe73	net: rswitch: Allow jumbo frames Allow jumbo frames by changing maximum MTU size and number of RX queues. Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-10 19:31:42 +00:00
Yoshihiro Shimoda	d2c96b9d5f	net: rswitch: Add jumbo frames handling for TX If the driver would like to transmit a jumbo frame like 2KiB or more, it should be split into multiple queues. In the near future, to support this, add handling specific descriptor types F{START,MID,END}. However, such jumbo frames will not happen yet because the maximum MTU size is still default for now. Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-12-10 19:31:42 +00:00

... 3 4 5 6 7 ...

1235902 Commits