linux

iv/linux

Author	SHA1	Message	Date
Stafford Horne	c0ed9a711e	openrisc: traps: Don't send signals to kernel mode threads [ Upstream commit c88cfb5cea5f8f9868ef02cc9ce9183a26dcf20f ] OpenRISC exception handling sends signals to user processes on floating point exceptions and trap instructions (for debugging) among others. There is a bug where the trap handling logic may send signals to kernel threads, we should not send these signals to kernel threads, if that happens we treat it as an error. This patch adds conditions to die if the kernel receives these exceptions in kernel mode code. Fixes: 27267655c531 ("openrisc: Support floating point user api") Signed-off-by: Stafford Horne <shorne@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:42 +02:00
Gabriel Krisman Bertazi	71d865be7c	udp: Avoid call to compute_score on multiple sites [ Upstream commit 50aee97d15113b95a68848db1f0cb2a6c09f753a ] We've observed a 7-12% performance regression in iperf3 UDP ipv4 and ipv6 tests with multiple sockets on Zen3 cpus, which we traced back to commit f0ea27e7bfe1 ("udp: re-score reuseport groups when connected sockets are present"). The failing tests were those that would spawn UDP sockets per-cpu on systems that have a high number of cpus. Unsurprisingly, it is not caused by the extra re-scoring of the reused socket, but due to the compiler no longer inlining compute_score, once it has the extra call site in udp4_lib_lookup2. This is augmented by the "Safe RET" mitigation for SRSO, needed in our Zen3 cpus. We could just explicitly inline it, but compute_score() is quite a large function, around 300b. Inlining in two sites would almost double udp4_lib_lookup2, which is a silly thing to do just to workaround a mitigation. Instead, this patch shuffles the code a bit to avoid the multiple calls to compute_score. Since it is a static function used in one spot, the compiler can safely fold it in, as it did before, without increasing the text size. With this patch applied I ran my original iperf3 testcases. The failing cases all looked like this (ipv4): iperf3 -c 127.0.0.1 --udp -4 -f K -b $R -l 8920 -t 30 -i 5 -P 64 -O 2 where $R is either 1G/10G/0 (max, unlimited). I ran 3 times each. baseline is v6.9-rc3. harmean == harmonic mean; CV == coefficient of variation. ipv4: 1G 10G MAX HARMEAN (CV) HARMEAN (CV) HARMEAN (CV) baseline 1743852.66(0.0208) 1725933.02(0.0167) 1705203.78(0.0386) patched 1968727.61(0.0035) 1962283.22(0.0195) 1923853.50(0.0256) ipv6: 1G 10G MAX HARMEAN (CV) HARMEAN (CV) HARMEAN (CV) baseline 1729020.03(0.0028) 1691704.49(0.0243) 1692251.34(0.0083) patched 1900422.19(0.0067) 1900968.01(0.0067) 1568532.72(0.1519) This restores the performance we had before the change above with this benchmark. We obviously don't expect any real impact when mitigations are disabled, but just to be sure it also doesn't regresses: mitigations=off ipv4: 1G 10G MAX HARMEAN (CV) HARMEAN (CV) HARMEAN (CV) baseline 3230279.97(0.0066) 3229320.91(0.0060) 2605693.19(0.0697) patched 3242802.36(0.0073) 3239310.71(0.0035) 2502427.19(0.0882) Cc: Lorenz Bauer <lmb@isovalent.com> Fixes: f0ea27e7bfe1 ("udp: re-score reuseport groups when connected sockets are present") Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:42 +02:00
Juergen Gross	edcdeb8a4f	x86/pat: Fix W^X violation false-positives when running as Xen PV guest [ Upstream commit 5bc8b0f5dac04cd4ebe47f8090a5942f2f2647ef ] When running as Xen PV guest in some cases W^X violation WARN()s have been observed. Those WARN()s are produced by verify_rwx(), which looks into the PTE to verify that writable kernel pages have the NX bit set in order to avoid code modifications of the kernel by rogue code. As the NX bits of all levels of translation entries are or-ed and the RW bits of all levels are and-ed, looking just into the PTE isn't enough for the decision that a writable page is executable, too. When running as a Xen PV guest, the direct map PMDs and kernel high map PMDs share the same set of PTEs. Xen kernel initialization will set the NX bit in the direct map PMD entries, and not the shared PTEs. Fixes: 652c5bf380ad ("x86/mm: Refuse W^X violations") Reported-by: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240412151258.9171-5-jgross@suse.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:42 +02:00
Juergen Gross	29681171ff	x86/pat: Restructure _lookup_address_cpa() [ Upstream commit 02eac06b820c3eae73e5736ae62f986d37fed991 ] Modify _lookup_address_cpa() to no longer use lookup_address(), but only lookup_address_in_pgd(). This is done in preparation of using lookup_address_in_pgd_attr(). No functional change intended. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240412151258.9171-4-jgross@suse.com Stable-dep-of: 5bc8b0f5dac0 ("x86/pat: Fix W^X violation false-positives when running as Xen PV guest") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:42 +02:00
Juergen Gross	308fba77bc	x86/pat: Introduce lookup_address_in_pgd_attr() [ Upstream commit ceb647b4b529fdeca9021cd34486f5a170746bda ] Add lookup_address_in_pgd_attr() doing the same as the already existing lookup_address_in_pgd(), but returning the effective settings of the NX and RW bits of all walked page table levels, too. This will be needed in order to match hardware behavior when looking for effective access rights, especially for detecting writable code pages. In order to avoid code duplication, let lookup_address_in_pgd() call lookup_address_in_pgd_attr() with dummy parameters. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240412151258.9171-2-jgross@suse.com Stable-dep-of: 5bc8b0f5dac0 ("x86/pat: Fix W^X violation false-positives when running as Xen PV guest") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:41 +02:00
Viresh Kumar	3e99f060cf	cpufreq: exit() callback is optional [ Upstream commit b8f85833c05730d631576008daaa34096bc7f3ce ] The exit() callback is optional and shouldn't be called without checking a valid pointer first. Also, we must clear freq_table pointer even if the exit() callback isn't present. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Fixes: 91a12e91dc39 ("cpufreq: Allow light-weight tear down and bring up of CPUs") Fixes: f339f3541701 ("cpufreq: Rearrange locking in cpufreq_remove_dev()") Reported-by: Lizhe <sensor1010@163.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:41 +02:00
Hechao Li	99f3af0a1a	tcp: increase the default TCP scaling ratio [ Upstream commit 697a6c8cec03c2299f850fa50322641a8bf6b915 ] After commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), we noticed an application-level timeout due to reduced throughput. Before the commit, for a client that sets SO_RCVBUF to 65k, it takes around 22 seconds to transfer 10M data. After the commit, it takes 40 seconds. Because our application has a 30-second timeout, this regression broke the application. The reason that it takes longer to transfer data is that tp->scaling_ratio is initialized to a value that results in ~0.25 of rcvbuf. In our case, SO_RCVBUF is set to 65536 by the application, which translates to 2 * 65536 = 131,072 bytes in rcvbuf and hence a ~28k initial receive window. Later, even though the scaling_ratio is updated to a more accurate skb->len/skb->truesize, which is ~0.66 in our environment, the window stays at ~0.25 * rcvbuf. This is because tp->window_clamp does not change together with the tp->scaling_ratio update when autotuning is disabled due to SO_RCVBUF. As a result, the window size is capped at the initial window_clamp, which is also ~0.25 * rcvbuf, and never grows bigger. Most modern applications let the kernel do autotuning, and benefit from the increased scaling_ratio. But there are applications such as kafka that has a default setting of SO_RCVBUF=64k. This patch increases the initial scaling_ratio from ~25% to 50% in order to make it backward compatible with the original default sysctl_tcp_adv_win_scale for applications setting SO_RCVBUF. Fixes: dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale") Signed-off-by: Hechao Li <hli@netflix.com> Reviewed-by: Tycho Andersen <tycho@tycho.pizza> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/netdev/20240402215405.432863-1-hli@netflix.com/ Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:41 +02:00
Paolo Abeni	ca19418abc	tcp: define initial scaling factor value as a macro [ Upstream commit 849ee75a38b297187c760bb1d23d8f2a7b1fc73e ] So that other users could access it. Notably MPTCP will use it in the next patch. No functional change intended. Acked-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231023-send-net-next-20231023-2-v1-4-9dc60939d371@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 697a6c8cec03 ("tcp: increase the default TCP scaling ratio") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:41 +02:00
Geliang Tang	a7fba17a05	selftests/bpf: Fix umount cgroup2 error in test_sockmap [ Upstream commit d75142dbeb2bd1587b9cc19f841578f541275a64 ] This patch fixes the following "umount cgroup2" error in test_sockmap.c: (cgroup_helpers.c:353: errno: Device or resource busy) umount cgroup2 Cgroup fd cg_fd should be closed before cleanup_cgroup_environment(). Fixes: 13a5f3ffd202 ("bpf: Selftests, sockmap test prog run without setting cgroup") Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/0399983bde729708773416b8488bac2cd5e022b8.1712639568.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:41 +02:00
Ard Biesheuvel	dc03a37553	x86/boot/64: Clear most of CR4 in startup_64(), except PAE, MCE and LA57 [ Upstream commit a0025f587c685e5ff842fb0194036f2ca0b6eaf4 ] The early 64-bit boot code must be entered with a 1:1 mapping of the bootable image, but it cannot operate without a 1:1 mapping of all the assets in memory that it accesses, and therefore, it creates such mappings for all known assets upfront, and additional ones on demand when a page fault happens on a memory address. These mappings are created with the global bit G set, as the flags used to create page table descriptors are based on __PAGE_KERNEL_LARGE_EXEC defined by the core kernel, even though the context where these mappings are used is very different. This means that the TLB maintenance carried out by the decompressor is not sufficient if it is entered with CR4.PGE enabled, which has been observed to happen with the stage0 bootloader of project Oak. While this is a dubious practice if no global mappings are being used to begin with, the decompressor is clearly at fault here for creating global mappings and not performing the appropriate TLB maintenance. Since commit: f97b67a773cd84b ("x86/decompressor: Only call the trampoline when changing paging levels") CR4 is no longer modified by the decompressor if no change in the number of paging levels is needed. Before that, CR4 would always be set to a consistent value with PGE cleared. So let's reinstate a simplified version of the original logic to put CR4 into a known state, and preserve the PAE, MCE and LA57 bits, none of which can be modified freely at this point (PAE and LA57 cannot be changed while running in long mode, and MCE cannot be cleared when running under some hypervisors). This effectively clears PGE and works around the project Oak bug. Fixes: f97b67a773cd84b ("x86/decompressor: Only call the trampoline when ...") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20240410151354.506098-2-ardb+git@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:41 +02:00
Andreas Gruenbacher	abea81e6a7	gfs2: Fix "ignore unlock failures after withdraw" [ Upstream commit 5d9231111966b6c5a65016d58dcbeab91055bc91 ] Commit 3e11e53041502 tries to suppress dlm_lock() lock conversion errors that occur when the lockspace has already been released. It does that by setting and checking the SDF_SKIP_DLM_UNLOCK flag. This conflicts with the intended meaning of the SDF_SKIP_DLM_UNLOCK flag, so check whether the lockspace is still allocated instead. (Given the current DLM API, checking for this kind of error after the fact seems easier that than to make sure that the lockspace is still allocated before calling dlm_lock(). Changing the DLM API so that users maintain the lockspace references themselves would be an option.) Fixes: 3e11e53041502 ("GFS2: ignore unlock failures after withdraw") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:41 +02:00
Andreas Gruenbacher	21d78e4c36	gfs2: Don't forget to complete delayed withdraw [ Upstream commit b01189333ee91c1ae6cd96dfd1e3a3c2e69202f0 ] Commit fffe9bee14b0 ("gfs2: Delay withdraw from atomic context") switched from gfs2_withdraw() to gfs2_withdraw_delayed() in gfs2_ail_error(), but failed to then check if a delayed withdraw had occurred. Fix that by adding the missing check in __gfs2_ail_flush(), where the spin locks are already dropped and a withdraw is possible. Fixes: fffe9bee14b0 ("gfs2: Delay withdraw from atomic context") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:40 +02:00
Arnd Bergmann	673f7120a6	ACPI: disable -Wstringop-truncation [ Upstream commit a3403d304708f60565582d60af4316289d0316a0 ] gcc -Wstringop-truncation warns about copying a string that results in a missing nul termination: drivers/acpi/acpica/tbfind.c: In function 'acpi_tb_find_table': drivers/acpi/acpica/tbfind.c:60:9: error: 'strncpy' specified bound 6 equals destination size [-Werror=stringop-truncation] 60 \| strncpy(header.oem_id, oem_id, ACPI_OEM_ID_SIZE); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/acpi/acpica/tbfind.c:61:9: error: 'strncpy' specified bound 8 equals destination size [-Werror=stringop-truncation] 61 \| strncpy(header.oem_table_id, oem_table_id, ACPI_OEM_TABLE_ID_SIZE); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The code works as intended, and the warning could be addressed by using a memcpy(), but turning the warning off for this file works equally well and may be easier to merge. Fixes: 47c08729bf1c ("ACPICA: Fix for LoadTable operator, input strings") Link: https://lore.kernel.org/lkml/CAJZ5v0hoUfv54KW7y4223Mn9E7D4xvR7whRFNLTBqCZMUxT50Q@mail.gmail.com/#t Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:40 +02:00
Zenghui Yu	acb5503dbb	irqchip/loongson-pch-msi: Fix off-by-one on allocation error path [ Upstream commit b327708798809328f21da8dc14cc8883d1e8a4b3 ] When pch_msi_parent_domain_alloc() returns an error, there is an off-by-one in the number of interrupts to be freed. Fix it by passing the number of successfully allocated interrupts, instead of the relative index of the last allocated one. Fixes: 632dcc2c75ef ("irqchip: Add Loongson PCH MSI controller") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Link: https://lore.kernel.org/r/20240327142334.1098-1-yuzenghui@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:40 +02:00
Zenghui Yu	a9bbafa46c	irqchip/alpine-msi: Fix off-by-one in allocation error path [ Upstream commit ff3669a71afa06208de58d6bea1cc49d5e3fcbd1 ] When alpine_msix_gic_domain_alloc() fails, there is an off-by-one in the number of interrupts to be freed. Fix it by passing the number of successfully allocated interrupts, instead of the relative index of the last allocated one. Fixes: 3841245e8498 ("irqchip/alpine-msi: Fix freeing of interrupts on allocation error path") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20240327142305.1048-1-yuzenghui@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:40 +02:00
Uros Bizjak	1d4e1fa2f2	locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() [ Upstream commit 929ad065ba2967be238dfdc0895b79fda62c7f16 ] Correct the definition of __arch_try_cmpxchg128(), introduced by: b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()") Fixes: b23e139d0b66 ("arch: Introduce arch_{,try_}_cmpxchg128{,_local}()") Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20240408091547.90111-2-ubizjak@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:40 +02:00
Andy Shevchenko	040c3a0024	ACPI: LPSS: Advertise number of chip selects via property [ Upstream commit 07b73ee599428b41d0240f2f7b31b524eba07dd0 ] Advertise number of chip selects via property for Intel Braswell. Fixes: 620c803f42de ("ACPI: LPSS: Provide an SSP type to the driver") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:40 +02:00
Andrew Halaney	776bad0b1f	scsi: ufs: core: Perform read back after disabling UIC_COMMAND_COMPL [ Upstream commit 4bf3855497b60765ca03b983d064b25e99b97657 ] Currently, the UIC_COMMAND_COMPL interrupt is disabled and a wmb() is used to complete the register write before any following writes. wmb() ensures the writes complete in that order, but completion doesn't mean that it isn't stored in a buffer somewhere. The recommendation for ensuring this bit has taken effect on the device is to perform a read back to force it to make it all the way to the device. This is documented in device-io.rst and a talk by Will Deacon on this can be seen over here: https://youtu.be/i6DayghhA8Q?si=MiyxB5cKJXSaoc01&t=1678 Let's do that to ensure the bit hits the device. Because the wmb()'s purpose wasn't to add extra ordering (on top of the ordering guaranteed by writel()/readl()), it can safely be removed. Fixes: d75f7fe495cf ("scsi: ufs: reduce the interrupts for power mode change requests") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Can Guo <quic_cang@quicinc.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20240329-ufs-reset-ensure-effect-before-delay-v5-9-181252004586@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:40 +02:00
Andrew Halaney	92374b6a5a	scsi: ufs: core: Perform read back after disabling interrupts [ Upstream commit e4a628877119bd40164a651d20321247b6f94a8b ] Currently, interrupts are cleared and disabled prior to registering the interrupt. An mb() is used to complete the clear/disable writes before the interrupt is registered. mb() ensures that the write completes, but completion doesn't mean that it isn't stored in a buffer somewhere. The recommendation for ensuring these bits have taken effect on the device is to perform a read back to force it to make it all the way to the device. This is documented in device-io.rst and a talk by Will Deacon on this can be seen over here: https://youtu.be/i6DayghhA8Q?si=MiyxB5cKJXSaoc01&t=1678 Let's do that to ensure these bits hit the device. Because the mb()'s purpose wasn't to add extra ordering (on top of the ordering guaranteed by writel()/readl()), it can safely be removed. Fixes: 199ef13cac7d ("scsi: ufs: avoid spurious UFS host controller interrupts") Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Can Guo <quic_cang@quicinc.com> Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20240329-ufs-reset-ensure-effect-before-delay-v5-8-181252004586@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:39 +02:00
Andrew Halaney	bfd29d5ea6	scsi: ufs: core: Perform read back after writing UTP_TASK_REQ_LIST_BASE_H [ Upstream commit 408e28086f1c7a6423efc79926a43d7001902fae ] Currently, the UTP_TASK_REQ_LIST_BASE_L/UTP_TASK_REQ_LIST_BASE_H regs are written to and then completed with an mb(). mb() ensures that the write completes, but completion doesn't mean that it isn't stored in a buffer somewhere. The recommendation for ensuring these bits have taken effect on the device is to perform a read back to force it to make it all the way to the device. This is documented in device-io.rst and a talk by Will Deacon on this can be seen over here: https://youtu.be/i6DayghhA8Q?si=MiyxB5cKJXSaoc01&t=1678 Let's do that to ensure the bits hit the device. Because the mb()'s purpose wasn't to add extra ordering (on top of the ordering guaranteed by writel()/readl()), it can safely be removed. Fixes: 88441a8d355d ("scsi: ufs: core: Add hibernation callbacks") Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Can Guo <quic_cang@quicinc.com> Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20240329-ufs-reset-ensure-effect-before-delay-v5-7-181252004586@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:39 +02:00
Andrew Halaney	872f68019b	scsi: ufs: cdns-pltfrm: Perform read back after writing HCLKDIV [ Upstream commit b715c55daf598aac8fa339048e4ca8a0916b332e ] Currently, HCLKDIV is written to and then completed with an mb(). mb() ensures that the write completes, but completion doesn't mean that it isn't stored in a buffer somewhere. The recommendation for ensuring this bit has taken effect on the device is to perform a read back to force it to make it all the way to the device. This is documented in device-io.rst and a talk by Will Deacon on this can be seen over here: https://youtu.be/i6DayghhA8Q?si=MiyxB5cKJXSaoc01&t=1678 Let's do that to ensure the bit hits the device. Because the mb()'s purpose wasn't to add extra ordering (on top of the ordering guaranteed by writel()/readl()), it can safely be removed. Fixes: d90996dae8e4 ("scsi: ufs: Add UFS platform driver for Cadence UFS") Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20240329-ufs-reset-ensure-effect-before-delay-v5-6-181252004586@redhat.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:39 +02:00
Andrew Halaney	8e5ede836b	scsi: ufs: qcom: Perform read back after writing CGC enable [ Upstream commit d9488511b3ac7eb48a91bc5eded7027525525e03 ] Currently, the CGC enable bit is written and then an mb() is used to ensure that completes before continuing. mb() ensures that the write completes, but completion doesn't mean that it isn't stored in a buffer somewhere. The recommendation for ensuring this bit has taken effect on the device is to perform a read back to force it to make it all the way to the device. This is documented in device-io.rst and a talk by Will Deacon on this can be seen over here: https://youtu.be/i6DayghhA8Q?si=MiyxB5cKJXSaoc01&t=1678 Let's do that to ensure the bit hits the device. Because the mb()'s purpose wasn't to add extra ordering (on top of the ordering guaranteed by writel()/readl()), it can safely be removed. Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Can Guo <quic_cang@quicinc.com> Fixes: 81c0fc51b7a7 ("ufs-qcom: add support for Qualcomm Technologies Inc platforms") Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20240329-ufs-reset-ensure-effect-before-delay-v5-5-181252004586@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:39 +02:00
Andrew Halaney	d2741b23b1	scsi: ufs: qcom: Perform read back after writing unipro mode [ Upstream commit 823150ecf04f958213cf3bf162187cd1a91c885c ] Currently, the QUNIPRO_SEL bit is written to and then an mb() is used to ensure that completes before continuing. mb() ensures that the write completes, but completion doesn't mean that it isn't stored in a buffer somewhere. The recommendation for ensuring this bit has taken effect on the device is to perform a read back to force it to make it all the way to the device. This is documented in device-io.rst and a talk by Will Deacon on this can be seen over here: https://youtu.be/i6DayghhA8Q?si=MiyxB5cKJXSaoc01&t=1678 But, there's really no reason to even ensure completion before continuing. The only requirement here is that this write is ordered to this endpoint (which readl()/writel() guarantees already). For that reason the mb() can be dropped altogether without anything forcing completion. Fixes: f06fcc7155dc ("scsi: ufs-qcom: add QUniPro hardware support and power optimizations") Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20240329-ufs-reset-ensure-effect-before-delay-v5-4-181252004586@redhat.com Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:39 +02:00
Andrew Halaney	32402b2a9c	scsi: ufs: qcom: Perform read back after writing REG_UFS_SYS1CLK_1US [ Upstream commit a862fafa263aea0f427d51aca6ff7fd9eeaaa8bd ] Currently after writing to REG_UFS_SYS1CLK_1US a mb() is used to ensure that write has gone through to the device. mb() ensures that the write completes, but completion doesn't mean that it isn't stored in a buffer somewhere. The recommendation for ensuring this bit has taken effect on the device is to perform a read back to force it to make it all the way to the device. This is documented in device-io.rst and a talk by Will Deacon on this can be seen over here: https://youtu.be/i6DayghhA8Q?si=MiyxB5cKJXSaoc01&t=1678 Let's do that to ensure the bit hits the device. Because the mb()'s purpose wasn't to add extra ordering (on top of the ordering guaranteed by writel()/readl()), it can safely be removed. Fixes: f06fcc7155dc ("scsi: ufs-qcom: add QUniPro hardware support and power optimizations") Reviewed-by: Can Guo <quic_cang@quicinc.com> Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20240329-ufs-reset-ensure-effect-before-delay-v5-2-181252004586@redhat.com Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:39 +02:00
Andrew Halaney	8f01dda10c	scsi: ufs: qcom: Perform read back after writing reset bit [ Upstream commit c4d28e06b0c94636f6e35d003fa9ebac0a94e1ae ] Currently, the reset bit for the UFS provided reset controller (used by its phy) is written to, and then a mb() happens to try and ensure that hit the device. Immediately afterwards a usleep_range() occurs. mb() ensures that the write completes, but completion doesn't mean that it isn't stored in a buffer somewhere. The recommendation for ensuring this bit has taken effect on the device is to perform a read back to force it to make it all the way to the device. This is documented in device-io.rst and a talk by Will Deacon on this can be seen over here: https://youtu.be/i6DayghhA8Q?si=MiyxB5cKJXSaoc01&t=1678 Let's do that to ensure the bit hits the device. By doing so and guaranteeing the ordering against the immediately following usleep_range(), the mb() can safely be removed. Fixes: 81c0fc51b7a7 ("ufs-qcom: add support for Qualcomm Technologies Inc platforms") Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Can Guo <quic_cang@quicinc.com> Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20240329-ufs-reset-ensure-effect-before-delay-v5-1-181252004586@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:39 +02:00
Andrii Nakryiko	b17592380f	bpf: prevent r10 register from being marked as precise [ Upstream commit 1f2a74b41ea8b902687eb97c4e7e3f558801865b ] r10 is a special register that is not under BPF program's control and is always effectively precise. The rest of precision logic assumes that only r0-r9 SCALAR registers are marked as precise, so prevent r10 from being marked precise. This can happen due to signed cast instruction allowing to do something like `r0 = (s8)r10;`, which later, if r0 needs to be precise, would lead to an attempt to mark r10 as precise. Prevent this with an extra check during instruction backtracking. Fixes: 8100928c8814 ("bpf: Support new sign-extension mov insns") Reported-by: syzbot+148110ee7cf72f39f33e@syzkaller.appspotmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240404214536.3551295-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:38 +02:00
Anton Protopopov	7a7d4237e3	bpf: Pack struct bpf_fib_lookup [ Upstream commit f91717007217d975aa975ddabd91ae1a107b9bff ] The struct bpf_fib_lookup is supposed to be of size 64. A recent commit 59b418c7063d ("bpf: Add a check for struct bpf_fib_lookup size") added a static assertion to check this property so that future changes to the structure will not accidentally break this assumption. As it immediately turned out, on some 32-bit arm systems, when AEABI=n, the total size of the structure was equal to 68, see [1]. This happened because the bpf_fib_lookup structure contains a union of two 16-bit fields: union { __u16 tot_len; __u16 mtu_result; }; which was supposed to compile to a 16-bit-aligned 16-bit field. On the aforementioned setups it was instead both aligned and padded to 32-bits. Declare this inner union as __attribute__((packed, aligned(2))) such that it always is of size 2 and is aligned to 16 bits. [1] https://lore.kernel.org/all/CA+G9fYtsoP51f-oP_Sp5MOq-Ffv8La2RztNpwvE6+R1VtFiLrw@mail.gmail.com/#t Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: e1850ea9bd9e ("bpf: bpf_fib_lookup return MTU value as output when looked up") Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240403123303.1452184-1-aspsk@isovalent.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:38 +02:00
Sahil Siddiq	f92aebf170	bpftool: Mount bpffs on provided dir instead of parent dir [ Upstream commit 478a535ae54ad3831371904d93b5dfc403222e17 ] When pinning programs/objects under PATH (eg: during "bpftool prog loadall") the bpffs is mounted on the parent dir of PATH in the following situations: - the given dir exists but it is not bpffs. - the given dir doesn't exist and the parent dir is not bpffs. Mounting on the parent dir can also have the unintentional side- effect of hiding other files located under the parent dir. If the given dir exists but is not bpffs, then the bpffs should be mounted on the given dir and not its parent dir. Similarly, if the given dir doesn't exist and its parent dir is not bpffs, then the given dir should be created and the bpffs should be mounted on this new dir. Fixes: 2a36c26fe3b8 ("bpftool: Support bpffs mountpoint as pin path for prog loadall") Signed-off-by: Sahil Siddiq <icegambit91@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/2da44d24-74ae-a564-1764-afccf395eeec@isovalent.com/T/#t Link: https://lore.kernel.org/bpf/20240404192219.52373-1-icegambit91@gmail.com Closes: https://github.com/libbpf/bpftool/issues/100 Changes since v1: - Split "mount_bpffs_for_pin" into two functions. This is done to improve maintainability and readability. Changes since v2: - mount_bpffs_for_pin: rename to "create_and_mount_bpffs_dir". - mount_bpffs_given_file: rename to "mount_bpffs_given_file". - create_and_mount_bpffs_dir: - introduce "dir_exists" boolean. - remove new dir if "mnt_fs" fails. - improve error handling and error messages. Changes since v3: - Rectify function name. - Improve error messages and formatting. - mount_bpffs_for_file: - Check if dir exists before block_mount check. Changes since v4: - Use strdup instead of strcpy. - create_and_mount_bpffs_dir: - Use S_IRWXU instead of 0700. - Improve error handling and formatting. Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:38 +02:00
Arnd Bergmann	8758646709	wifi: carl9170: re-fix fortified-memset warning [ Upstream commit 066afafc10c9476ee36c47c9062527a17e763901 ] The carl9170_tx_release() function sometimes triggers a fortified-memset warning in my randconfig builds: In file included from include/linux/string.h:254, from drivers/net/wireless/ath/carl9170/tx.c:40: In function 'fortify_memset_chk', inlined from 'carl9170_tx_release' at drivers/net/wireless/ath/carl9170/tx.c:283:2, inlined from 'kref_put' at include/linux/kref.h:65:3, inlined from 'carl9170_tx_put_skb' at drivers/net/wireless/ath/carl9170/tx.c:342:9: include/linux/fortify-string.h:493:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] 493 \| __write_overflow_field(p_size_field, size); Kees previously tried to avoid this by using memset_after(), but it seems this does not fully address the problem. I noticed that the memset_after() here is done on a different part of the union (status) than the original cast was from (rate_driver_data), which may confuse the compiler. Unfortunately, the memset_after() trick does not work on driver_rates[] because that is part of an anonymous struct, and I could not get struct_group() to do this either. Using two separate memset() calls on the two members does address the warning though. Fixes: fb5f6a0e8063b ("mac80211: Use memset_after() to clear tx status") Link: https://lore.kernel.org/lkml/20230623152443.2296825-1-arnd@kernel.org/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://msgid.link/20240328135509.3755090-2-arnd@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:38 +02:00
Alexander Aring	bc236ebc2a	dlm: fix user space lock decision to copy lvb [ Upstream commit ad191e0eeebf64a60ca2d16ca01a223d2b1dd25e ] This patch fixes the copy lvb decision for user space lock requests. Checking dlm_lvb_operations is done earlier, where granted/requested lock modes are available to use in the matrix. The decision had been moved to the wrong location, where granted mode and requested mode where the same, which causes the dlm_lvb_operations matix to produce the wrong copy decision. For PW or EX requests, the caller could get invalid lvb data. Fixes: 61bed0baa4db ("fs: dlm: use a non-static queue for callbacks") Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:38 +02:00
Alexander Lobakin	0fdbbe7ee7	bitops: add missing prototype check [ Upstream commit 72cc1980a0ef3ccad0d539e7dace63d0d7d432a4 ] Commit 8238b4579866 ("wait_on_bit: add an acquire memory barrier") added a new bitop, test_bit_acquire(), with proper wrapping in order to try to optimize it at compile-time, but missed the list of bitops used for checking their prototypes a bit below. The functions added have consistent prototypes, so that no more changes are required and no functional changes take place. Fixes: 8238b4579866 ("wait_on_bit: add an acquire memory barrier") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:38 +02:00
Arnd Bergmann	f3531ac07b	mlx5: stop warning for 64KB pages [ Upstream commit a5535e5336943b33689f558199366102387b7bbf ] When building with 64KB pages, clang points out that xsk->chunk_size can never be PAGE_SIZE: drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c:19:22: error: result of comparison of constant 65536 with expression of type 'u16' (aka 'unsigned short') is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (xsk->chunk_size > PAGE_SIZE \|\| ~~~~~~~~~~~~~~~ ^ ~~~~~~~~~ In older versions of this code, using PAGE_SIZE was the only possibility, so this would have never worked on 64KB page kernels, but the patch apparently did not address this case completely. As Maxim Mikityanskiy suggested, 64KB chunks are really not all that useful, so just shut up the warning by adding a cast. Fixes: 282c0c798f8e ("net/mlx5e: Allow XSK frames smaller than a page") Link: https://lore.kernel.org/netdev/20211013150232.2942146-1-arnd@kernel.org/ Link: https://lore.kernel.org/lkml/a7b27541-0ebb-4f2d-bd06-270a4d404613@app.fastmail.com/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com> Reviewed-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240328143051.1069575-9-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:38 +02:00
Arnd Bergmann	7dd2a9bb7b	mlx5: avoid truncating error message [ Upstream commit b324a960354b872431d25959ad384ab66a7116ec ] clang warns that one error message is too long for its destination buffer: drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c:1876:4: error: 'snprintf' will always be truncated; specified size is 80, but format string expands to at least 94 [-Werror,-Wformat-truncation-non-kprintf] Reword it to be a bit shorter so it always fits. Fixes: 70f0302b3f20 ("net/mlx5: Bridge, implement mdb offload") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/20240326223825.4084412-5-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:37 +02:00
Arnd Bergmann	6541f8ea76	qed: avoid truncating work queue length [ Upstream commit 954fd908f177604d4cce77e2a88cc50b29bad5ff ] clang complains that the temporary string for the name passed into alloc_workqueue() is too short for its contents: drivers/net/ethernet/qlogic/qed/qed_main.c:1218:3: error: 'snprintf' will always be truncated; specified size is 16, but format string expands to at least 18 [-Werror,-Wformat-truncation] There is no need for a temporary buffer, and the actual name of a workqueue is 32 bytes (WQ_NAME_LEN), so just use the interface as intended to avoid the truncation. Fixes: 59ccf86fe69a ("qed: Add driver infrastucture for handling mfw requests.") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240326223825.4084412-4-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:37 +02:00
Arnd Bergmann	997a53102a	enetc: avoid truncating error message [ Upstream commit 9046d581ed586f3c715357638ca12c0e84402002 ] As clang points out, the error message in enetc_setup_xdp_prog() still does not fit in the buffer and will be truncated: drivers/net/ethernet/freescale/enetc/enetc.c:2771:3: error: 'snprintf' will always be truncated; specified size is 80, but format string expands to at least 87 [-Werror,-Wformat-truncation] Replace it with an even shorter message that should fit. Fixes: f968c56417f0 ("net: enetc: shorten enetc_setup_xdp_prog() error message to fit NETLINK_MAX_FMTMSG_LEN") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240326223825.4084412-3-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:37 +02:00
Armin Wolf	c5202a3889	ACPI: Fix Generic Initiator Affinity _OSC bit [ Upstream commit d0d4f1474e36b195eaad477373127ae621334c01 ] The ACPI spec says bit 17 should be used to indicate support for Generic Initiator Affinity Structure in SRAT, but we currently set bit 13 ("Interrupt ResourceSource support"). Fix this by actually setting bit 17 when evaluating _OSC. Fixes: 01aabca2fd54 ("ACPI: Let ACPI know we support Generic Initiator Affinity Structures") Signed-off-by: Armin Wolf <W_Armin@gmx.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:37 +02:00
Shrikanth Hegde	2bd572d421	sched/fair: Add EAS checks before updating root_domain::overutilized [ Upstream commit be3a51e68f2f1b17250ce40d8872c7645b7a2991 ] root_domain::overutilized is only used for EAS(energy aware scheduler) to decide whether to do load balance or not. It is not used if EAS not possible. Currently enqueue_task_fair and task_tick_fair accesses, sometime updates this field. In update_sd_lb_stats it is updated often. This causes cache contention due to true sharing and burns a lot of cycles. ::overload and ::overutilized are part of the same cacheline. Updating it often invalidates the cacheline. That causes access to ::overload to slow down due to false sharing. Hence add EAS check before accessing/updating this field. EAS check is optimized at compile time or it is a static branch. Hence it shouldn't cost much. With the patch, both enqueue_task_fair and newidle_balance don't show up as hot routines in perf profile. 6.8-rc4: 7.18% swapper [kernel.vmlinux] [k] enqueue_task_fair 6.78% s [kernel.vmlinux] [k] newidle_balance +patch: 0.14% swapper [kernel.vmlinux] [k] enqueue_task_fair 0.00% swapper [kernel.vmlinux] [k] newidle_balance While at it: trace_sched_overutilized_tp expect that second argument to be bool. So do a int to bool conversion for that. Fixes: 2802bf3cd936 ("sched/fair: Add over-utilization/tipping point indicator") Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Qais Yousef <qyousef@layalina.io> Reviewed-by: Srikar Dronamraju <srikar@linux.ibm.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lore.kernel.org/r/20240307085725.444486-2-sshegde@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:37 +02:00
Johannes Berg	c078f2b492	wifi: iwlwifi: mvm: fix check in iwl_mvm_sta_fw_id_mask [ Upstream commit d69aef8084cc72df7b0f2583096d9b037c647ec8 ] In the previous commit, I renamed the variable to differentiate mac80211/mvm link STA, but forgot to adjust the check. The one from mac80211 is already non-NULL anyway, but the mvm one can be NULL when the mac80211 isn't during link switch conditions. Fix the check. Fixes: 2783ab506eaa ("wifi: iwlwifi: mvm: select STA mask only for active links") Reviewed-by: Daniel Gabay <daniel.gabay@intel.com> Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240325180850.e95b442bafe9.I8c0119fce7b00cb4f65782930d2c167ed5dd0a6e@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:37 +02:00
Johannes Berg	f0fe67ca75	wifi: iwlwifi: reconfigure TLC during HW restart [ Upstream commit 96833fb3c7abfd57bb3ee2de2534c5a3f52b0838 ] Since the HW restart flow with multi-link is very similar to the initial association, we do need to reconfigure TLC there. Remove the check that prevented that. Fixes: d2d0468f60cd ("wifi: iwlwifi: mvm: configure TLC on link activation") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240320232419.a00adcfe381a.Ic798beccbb7b7d852dc976d539205353588853b0@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:36 +02:00
Johannes Berg	adde919099	wifi: iwlwifi: mvm: select STA mask only for active links [ Upstream commit 2783ab506eaa36dbef40bda0f96eb49fe149790e ] During reconfig, we might send keys, but those should be only sent to already active link stations. Iterate only active ones to fix that issue. Fixes: aea99650f731 ("wifi: iwlwifi: mvm: set STA mask for keys in MLO") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240320232419.c6818d1c6033.I6357f05c55ef111002ddc169287eb356ca0c1b21@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:36 +02:00
Johannes Berg	29caa34239	wifi: iwlwifi: mvm: allocate STA links only for active links [ Upstream commit 62bdd97598f8be82a24f556f78336b05d1c3e84b ] For the mvm driver, data structures match what's in the firmware, we allocate FW IDs for them already etc. During link switch we already allocate/free the STA links appropriately, but initially we'd allocate them always. Fix this to allocate memory, a STA ID, etc. only for active links. Fixes: 57974a55d995 ("wifi: iwlwifi: mvm: refactor iwl_mvm_mac_sta_state_common()") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240319100755.f2093ff73465.Ie891e1cc9c9df09ae22be6aad5c143e376f40f0e@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:36 +02:00
Johannes Berg	6c166d1646	wifi: ieee80211: fix ieee80211_mle_basic_sta_prof_size_ok() [ Upstream commit c121514df0daa800cc500dc2738e0b8a1c54af98 ] If there was a possibility of an MLE basic STA profile without subelements, we might reject it because we account for the one octet for sta_info_len twice (it's part of itself, and in the fixed portion). Like in ieee80211_mle_reconf_sta_prof_size_ok, subtract 1 to adjust that. When reading the elements we did take this into account, and since there are always elements, this never really mattered. Fixes: 7b6f08771bf6 ("wifi: ieee80211: Support validating ML station profile length") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240318184907.00bb0b20ed60.I8c41dd6fc14c4b187ab901dea15ade73c79fb98c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:36 +02:00
Guixiong Wei	9fa391354a	x86/boot: Ignore relocations in .notes sections in walk_relocs() too [ Upstream commit 76e9762d66373354b45c33b60e9a53ef2a3c5ff2 ] Commit: aaa8736370db ("x86, relocs: Ignore relocations in .notes section") ... only started ignoring the .notes sections in print_absolute_relocs(), but the same logic should also by applied in walk_relocs() to avoid such relocations. [ mingo: Fixed various typos in the changelog, removed extra curly braces from the code. ] Fixes: aaa8736370db ("x86, relocs: Ignore relocations in .notes section") Fixes: 5ead97c84fa7 ("xen: Core Xen implementation") Fixes: da1a679cde9b ("Add /sys/kernel/notes") Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20240317150547.24910-1-weiguixiong@bytedance.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:36 +02:00
Lorenzo Bianconi	22c3d94cd4	wifi: mt76: mt7915: workaround too long expansion sparse warnings [ Upstream commit 2d5cde1143eca31c72547dfd589702c6b4a7e684 ] Fix the following sparse warnings: drivers/net/wireless/mediatek/mt76/mt7915/debugfs.c:1133:29: error: too long token expansion drivers/net/wireless/mediatek/mt76/mt7915/debugfs.c:1133:29: error: too long token expansion drivers/net/wireless/mediatek/mt76/mt7915/debugfs.c:1133:29: error: too long token expansion drivers/net/wireless/mediatek/mt76/mt7915/debugfs.c:1133:29: error: too long token expansion No functional changes, compile tested only. Fixes: e3296759f347 ("wifi: mt76: mt7915: enable per bandwidth power limit support") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Acked-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://msgid.link/5457b92e41909dd75ab3db7a0e9ec372b917a386.1710858172.git.lorenzo@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:36 +02:00
Aloka Dixit	9cf8052afc	wifi: ath12k: use correct flag field for 320 MHz channels [ Upstream commit 020e08ae5e68cbc0791e8d842443a86eb6aa99f6 ] Due to an error during rebasing the patchset 320 MHz channel support got broken. ath12k was setting the QoS bit instead of the correct flag. WMI_PEER_EXT_320MHZ (0x2) is defined as an extended flag, replace peer_flags by peer_flags_ext while sending peer data. This affected both QCN9274 and WCN7850 which use the same flag. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1 Fixes: 6734cf9b4cc7 ("wifi: ath12k: peer assoc for 320 MHz") Signed-off-by: Aloka Dixit <quic_alokad@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://msgid.link/20240314204651.11075-1-quic_alokad@quicinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:36 +02:00
Yonghong Song	ba3647aa16	bpftool: Fix missing pids during link show [ Upstream commit fe879bb42f8a6513ed18e9d22efb99cb35590201 ] Current 'bpftool link' command does not show pids, e.g., $ tools/build/bpftool/bpftool link ... 4: tracing prog 23 prog_type lsm attach_type lsm_mac target_obj_id 1 target_btf_id 31320 Hack the following change to enable normal libbpf debug output, # --- a/tools/bpf/bpftool/pids.c # +++ b/tools/bpf/bpftool/pids.c # @@ -121,9 +121,9 @@ int build_obj_refs_table(struct hashmap *map, enum bpf_obj_type type) # / we don't want output polluted with libbpf errors if bpf_iter is not # * supported # / # - default_print = libbpf_set_print(libbpf_print_none); # + / default_print = libbpf_set_print(libbpf_print_none); / # err = pid_iter_bpf__load(skel); # - libbpf_set_print(default_print); # + / libbpf_set_print(default_print); / Rerun the above bpftool command: $ tools/build/bpftool/bpftool link libbpf: prog 'iter': BPF program load failed: Permission denied libbpf: prog 'iter': -- BEGIN PROG LOAD LOG -- 0: R1=ctx() R10=fp0 ; struct task_struct task = ctx->task; @ pid_iter.bpf.c:69 0: (79) r6 = (u64 )(r1 +8) ; R1=ctx() R6_w=ptr_or_null_task_struct(id=1) ; struct file file = ctx->file; @ pid_iter.bpf.c:68 ... ; struct bpf_link link = (struct bpf_link ) file->private_data; @ pid_iter.bpf.c:103 80: (79) r3 = (u64 )(r8 +432) ; R3_w=scalar() R8=ptr_file() ; if (link->type == bpf_core_enum_value(enum bpf_link_type___local, @ pid_iter.bpf.c:105 81: (61) r1 = (u32 *)(r3 +12) R3 invalid mem access 'scalar' processed 39 insns (limit 1000000) max_states_per_insn 0 total_states 3 peak_states 3 mark_read 2 -- END PROG LOAD LOG -- libbpf: prog 'iter': failed to load: -13 ... The 'file->private_data' returns a 'void' type and this caused subsequent 'link->type' (insn #81) failed in verification. To fix the issue, restore the previous BPF_CORE_READ so old kernels can also work. With this patch, the 'bpftool link' runs successfully with 'pids'. $ tools/build/bpftool/bpftool link ... 4: tracing prog 23 prog_type lsm attach_type lsm_mac target_obj_id 1 target_btf_id 31320 pids systemd(1) Fixes: 44ba7b30e84f ("bpftool: Use a local copy of BPF_LINK_TYPE_PERF_EVENT in pid_iter.bpf.c") Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Quentin Monnet <quentin@isovalent.com> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20240312023249.3776718-1-yonghong.song@linux.dev Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:35 +02:00
Baochen Qiang	424e5ac976	wifi: ath11k: don't force enable power save on non-running vdevs [ Upstream commit 01296b39d3515f20a1db64d3c421c592b1e264a0 ] Currently we force enable power save on non-running vdevs, this results in unexpected ping latency in below scenarios: 1. disable power save from userspace. 2. trigger suspend/resume. With step 1 power save is disabled successfully and we get a good latency: PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data. 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=5.13 ms 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=5.45 ms 64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=5.99 ms 64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=6.34 ms 64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=4.47 ms 64 bytes from 192.168.1.1: icmp_seq=6 ttl=64 time=6.45 ms While after step 2, the latency becomes much larger: PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data. 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=17.7 ms 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=15.0 ms 64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=14.3 ms 64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=16.5 ms 64 bytes from 192.168.1.1: icmp_seq=5 ttl=64 time=20.1 ms The reason is, with step 2, power save is force enabled due to vdev not running, although mac80211 was trying to disable it to honor userspace configuration: ath11k_pci 0000:03:00.0: wmi cmd sta powersave mode psmode 1 vdev id 0 Call Trace: ath11k_wmi_pdev_set_ps_mode ath11k_mac_op_bss_info_changed ieee80211_bss_info_change_notify ieee80211_reconfig ieee80211_resume wiphy_resume This logic is taken from ath10k where it was added due to below comment: Firmware doesn't behave nicely and consumes more power than necessary if PS is disabled on a non-started vdev. However we don't know whether such an issue also occurs to ath11k firmware or not. But even if it does, it's not appropriate because it goes against userspace, even cfg/mac80211 don't know we have enabled it in fact. Remove it to fix this issue. In this way we not only get a better latency, but also, and the most important, keeps the consistency between userspace and kernel/driver. The biggest price for that would be the power consumption, which is not that important, compared with the consistency. Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30 Fixes: b2beffa7d9a6 ("ath11k: enable 802.11 power save mode in station mode") Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://msgid.link/20240309113115.11498-1-quic_bqiang@quicinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:35 +02:00
Duoming Zhou	c37466406f	wifi: brcmfmac: pcie: handle randbuf allocation failure [ Upstream commit 316f790ebcf94bdf59f794b7cdea4068dc676d4c ] The kzalloc() in brcmf_pcie_download_fw_nvram() will return null if the physical memory has run out. As a result, if we use get_random_bytes() to generate random bytes in the randbuf, the null pointer dereference bug will happen. In order to prevent allocation failure, this patch adds a separate function using buffer on kernel stack to generate random bytes in the randbuf, which could prevent the kernel stack from overflow. Fixes: 91918ce88d9f ("wifi: brcmfmac: pcie: Provide a buffer of random bytes to the device") Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://msgid.link/20240306140437.18177-1-duoming@zju.edu.cn Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:35 +02:00
Baochen Qiang	014e4e9275	wifi: ath10k: poll service ready message before failing [ Upstream commit e57b7d62a1b2f496caf0beba81cec3c90fad80d5 ] Currently host relies on CE interrupts to get notified that the service ready message is ready. This results in timeout issue if the interrupt is not fired, due to some unknown reasons. See below logs: [76321.937866] ath10k_pci 0000:02:00.0: wmi service ready event not received ... [76322.016738] ath10k_pci 0000:02:00.0: Could not init core: -110 And finally it causes WLAN interface bring up failure. Change to give it one more chance here by polling CE rings, before failing directly. Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00157-QCARMSWPZ-1 Fixes: 5e3dd157d7e7 ("ath10k: mac80211 driver for Qualcomm Atheros 802.11ac CQA98xx devices") Reported-by: James Prestwood <prestwoj@gmail.com> Tested-By: James Prestwood <prestwoj@gmail.com> # on QCA6174 hw3.2 Link: https://lore.kernel.org/linux-wireless/304ce305-fbe6-420e-ac2a-d61ae5e6ca1a@gmail.com/ Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://msgid.link/20240227030409.89702-1-quic_bqiang@quicinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:35 +02:00
Yu Kuai	e5d98cc331	block: support to account io_ticks precisely [ Upstream commit 99dc422335d8b2bd4d105797241d3e715bae90e9 ] Currently, io_ticks is accounted based on sampling, specifically update_io_ticks() will always account io_ticks by 1 jiffies from bdev_start_io_acct()/blk_account_io_start(), and the result can be inaccurate, for example(HZ is 250): Test script: fio -filename=/dev/sda -bs=4k -rw=write -direct=1 -name=test -thinktime=4ms Test result: util is about 90%, while the disk is really idle. This behaviour is introduced by commit 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting"), however, there was a key point that is missed that this patch also improve performance a lot: Before the commit: part_round_stats: if (part->stamp != now) stats \|= 1; part_in_flight() -> there can be lots of task here in 1 jiffies. part_round_stats_single() __part_stat_add() part->stamp = now; After the commit: update_io_ticks: stamp = part->bd_stamp; if (time_after(now, stamp)) if (try_cmpxchg()) __part_stat_add() -> only one task can reach here in 1 jiffies. Hence in order to account io_ticks precisely, we only need to know if there are IO inflight at most once in one jiffies. Noted that for rq-based device, iterating tags should not be used here because 'tags->lock' is grabbed in blk_mq_find_and_get_req(), hence part_stat_lock_inc/dec() and part_in_flight() is used to trace inflight. The additional overhead is quite little: - per cpu add/dec for each IO for rq-based device; - per cpu sum for each jiffies; And it's verified by null-blk that there are no performance degration under heavy IO pressure. Fixes: 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240509123717.3223892-2-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:11:35 +02:00

1 2 3 4 5 ...

1224217 Commits