linux

iv/linux

Author	SHA1	Message	Date
Brian Norris	90f922646f	tracefs: Only clobber mode/uid/gid on remount if asked [ Upstream commit 47311db8e8f33011d90dee76b39c8886120cdda4 ] Users may have explicitly configured their tracefs permissions; we shouldn't overwrite those just because a second mount appeared. Only clobber if the options were provided at mount time. Note: the previous behavior was especially surprising in the presence of automounted /sys/kernel/debug/tracing/. Existing behavior: ## Pre-existing status: tracefs is 0755. # stat -c '%A' /sys/kernel/tracing/ drwxr-xr-x ## (Re)trigger the automount. # umount /sys/kernel/debug/tracing # stat -c '%A' /sys/kernel/debug/tracing/. drwx------ ## Unexpected: the automount changed mode for other mount instances. # stat -c '%A' /sys/kernel/tracing/ drwx------ New behavior (after this change): ## Pre-existing status: tracefs is 0755. # stat -c '%A' /sys/kernel/tracing/ drwxr-xr-x ## (Re)trigger the automount. # umount /sys/kernel/debug/tracing # stat -c '%A' /sys/kernel/debug/tracing/. drwxr-xr-x ## Expected: the automount does not change other mount instances. # stat -c '%A' /sys/kernel/tracing/ drwxr-xr-x Link: https://lkml.kernel.org/r/20220826174353.2.Iab6e5ea57963d6deca5311b27fb7226790d44406@changeid Cc: stable@vger.kernel.org Fixes: 4282d60689d4f ("tracefs: Add new tracefs file system") Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-20 12:39:43 +02:00
Yipeng Zou	3c90af5a77	tracing: hold caller_addr to hardirq_{enable,disable}_ip [ Upstream commit 54c3931957f6a6194d5972eccc36d052964b2abe ] Currently, The arguments passing to lockdep_hardirqs_{on,off} was fixed in CALLER_ADDR0. The function trace_hardirqs_on_caller should have been intended to use caller_addr to represent the address that caller wants to be traced. For example, lockdep log in riscv showing the last {enabled,disabled} at __trace_hardirqs_{on,off} all the time(if called by): [ 57.853175] hardirqs last enabled at (2519): __trace_hardirqs_on+0xc/0x14 [ 57.853848] hardirqs last disabled at (2520): __trace_hardirqs_off+0xc/0x14 After use trace_hardirqs_xx_caller, we can get more effective information: [ 53.781428] hardirqs last enabled at (2595): restore_all+0xe/0x66 [ 53.782185] hardirqs last disabled at (2596): ret_from_exception+0xa/0x10 Link: https://lkml.kernel.org/r/20220901104515.135162-2-zouyipeng@huawei.com Cc: stable@vger.kernel.org Fixes: c3bc8fd637a96 ("tracing: Centralize preemptirq tracepoints and unify their usage") Signed-off-by: Yipeng Zou <zouyipeng@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-20 12:39:43 +02:00
Borislav Petkov	64840a4a2d	task_stack, x86/cea: Force-inline stack helpers [ Upstream commit e87f4152e542610d0b4c6c8548964a68a59d2040 ] Force-inline two stack helpers to fix the following objtool warnings: vmlinux.o: warning: objtool: in_task_stack()+0xc: call to task_stack_page() leaves .noinstr.text section vmlinux.o: warning: objtool: in_entry_stack()+0x10: call to cpu_entry_stack() leaves .noinstr.text section Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220324183607.31717-2-bp@alien8.de Stable-dep-of: 54c3931957f6 ("tracing: hold caller_addr to hardirq_{enable,disable}_ip") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-20 12:39:43 +02:00
Borislav Petkov	0b009e5fd1	x86/mm: Force-inline __phys_addr_nodebug() [ Upstream commit ace1a98519270c586c0d4179419292df67441cd1 ] Fix: vmlinux.o: warning: objtool: __sev_es_nmi_complete()+0x8b: call to __phys_addr_nodebug() leaves .noinstr.text section Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220324183607.31717-4-bp@alien8.de Stable-dep-of: 54c3931957f6 ("tracing: hold caller_addr to hardirq_{enable,disable}_ip") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-20 12:39:43 +02:00
Nick Desaulniers	f9571a9699	lockdep: Fix -Wunused-parameter for _THIS_IP_ [ Upstream commit 8b023accc8df70e72f7704d29fead7ca914d6837 ] While looking into a bug related to the compiler's handling of addresses of labels, I noticed some uses of _THIS_IP_ seemed unused in lockdep. Drive by cleanup. -Wunused-parameter: kernel/locking/lockdep.c:1383:22: warning: unused parameter 'ip' kernel/locking/lockdep.c:4246:48: warning: unused parameter 'ip' kernel/locking/lockdep.c:4844:19: warning: unused parameter 'ip' Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20220314221909.2027027-1-ndesaulniers@google.com Stable-dep-of: 54c3931957f6 ("tracing: hold caller_addr to hardirq_{enable,disable}_ip") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-20 12:39:42 +02:00
Claudiu Beznea	dee782da39	ARM: dts: at91: sama7g5ek: specify proper regulator output ranges [ Upstream commit 7f41d52ced9e1b7ed4ff8e1ae9cacbf46b64e6db ] Min and max output ranges of regulators need to satisfy board requirements not PMIC requirements. Thus adjust device tree to cope with this. Fixes: 7540629e2fc7 ("ARM: dts: at91: add sama7g5 SoC DT and sama7g5-ek") Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20220826083927.3107272-7-claudiu.beznea@microchip.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-20 12:39:42 +02:00
Claudiu Beznea	424ac5929d	ARM: dts: at91: fix low limit for CPU regulator [ Upstream commit 279d626d737486363233b9b99c30b5696c389b41 ] Fix low limit for CPU regulator. Otherwise setting voltages lower than 1.125V will not be allowed (CPUFreq will not be allowed to set proper voltages on proper frequencies). Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20220113144900.906370-7-claudiu.beznea@microchip.com Stable-dep-of: 7f41d52ced9e ("ARM: dts: at91: sama7g5ek: specify proper regulator output ranges") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-20 12:39:42 +02:00
Marco Felsch	8be25fa7cf	ARM: dts: imx6qdl-kontron-samx6i: fix spi-flash compatible [ Upstream commit af7d78c957017f8b3a0986769f6f18e57f9362ea ] Drop the "winbond,w25q16dw" compatible since it causes to set the MODALIAS to w25q16dw which is not specified within spi-nor id table. Fix this by use the common "jedec,spi-nor" compatible. Fixes: 2125212785c9 ("ARM: dts: imx6qdl-kontron-samx6i: add Kontron SMARC SoM Support") Signed-off-by: Marco Felsch <m.felsch@pengutronix.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-20 12:39:42 +02:00
Krzysztof Kozlowski	78eb5e326a	ARM: dts: imx: align SPI NOR node name with dtschema [ Upstream commit ba9fe460dc2cfe90dc115b22af14dd3f13cffa0f ] The node names should be generic and SPI NOR dtschema expects "flash". Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Stable-dep-of: af7d78c95701 ("ARM: dts: imx6qdl-kontron-samx6i: fix spi-flash compatible") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-20 12:39:42 +02:00
Chuanhong Guo	3bb12efc5e	ACPI: resource: skip IRQ override on AMD Zen platforms commit 9946e39fe8d0a5da9eb947d8e40a7ef204ba016e upstream. IRQ override isn't needed on modern AMD Zen systems. There's an active low keyboard IRQ on AMD Ryzen 6000 and it will stay this way on newer platforms. This IRQ override breaks keyboards for almost all Ryzen 6000 laptops currently on the market. Skip this IRQ override for all AMD Zen platforms because this IRQ override is supposed to be a workaround for buggy ACPI DSDT and we can't have a long list of all future AMD CPUs/Laptops in the kernel code. If a device with buggy ACPI DSDT shows up, a separated list containing just them should be created. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216118 Suggested-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Chuanhong Guo <gch981213@gmail.com> Acked-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: XiaoYan Li <lxy.lixiaoyan@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-20 12:39:42 +02:00
Dave Wysochanski	a68a734b19	NFS: Fix WARN_ON due to unionization of nfs_inode.nrequests commit 0ebeebcf59601bcfa0284f4bb7abdec051eb856d upstream. Fixes the following WARN_ON WARNING: CPU: 2 PID: 18678 at fs/nfs/inode.c:123 nfs_clear_inode+0x3b/0x50 [nfs] ... Call Trace: nfs4_evict_inode+0x57/0x70 [nfsv4] evict+0xd1/0x180 dispose_list+0x48/0x60 evict_inodes+0x156/0x190 generic_shutdown_super+0x37/0x110 nfs_kill_super+0x1d/0x40 [nfs] deactivate_locked_super+0x36/0xa0 Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-20 12:39:42 +02:00
Greg Kroah-Hartman	dd20085f2a	Linux 5.15.68 Link: https://lore.kernel.org/r/20220913140357.323297659@linuxfoundation.org Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>=20 Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> v5.15.68	2022-09-15 11:30:08 +02:00
Claudiu Beznea	e04b25638a	ARM: at91: ddr: remove CONFIG_SOC_SAMA7 dependency [ Upstream commit dc3005703f8cd893d325081c20b400e08377d9bb ] Remove CONFIG_SOC_SAMA7 dependency to avoid having #ifdef preprocessor directives in driver code (arch/arm/mach-at91/pm.c). This prepares the code for next commits. Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20220113144900.906370-2-claudiu.beznea@microchip.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:08 +02:00
Arnaldo Carvalho de Melo	154e72a4b3	perf machine: Use path__join() to compose a path instead of snprintf(dir, '/', filename) commit 9d5f0c36438eeae7566ca383b2b673179e3cc613 upstream. Its more intention revealing, and if we're interested in the odd cases where this may end up truncating we can do debug checks at one centralized place. Motivation, of all the container builds, fedora rawhide started complaining of: util/machine.c: In function ‘machine__create_modules’: util/machine.c:1419:50: error: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size between 0 and 4095 [-Werror=format-truncation=] 1419 \| snprintf(path, sizeof(path), "%s/%s", dir_name, dent->d_name); \| ^~ In file included from /usr/include/stdio.h:894, from util/branch.h:9, from util/callchain.h:8, from util/machine.c:7: In function ‘snprintf’, inlined from ‘maps__set_modules_path_dir’ at util/machine.c:1419:3, inlined from ‘machine__set_modules_path’ at util/machine.c:1473:9, inlined from ‘machine__create_modules’ at util/machine.c:1519:7: /usr/include/bits/stdio2.h:71:10: note: ‘__builtin___snprintf_chk’ output between 2 and 4352 bytes into a destination of size 4096 There are other places where we should use path__join(), but lets get rid of this one first. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Link: Link: https://lore.kernel.org/r/YebZKjwgfdOz0lAs@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-15 11:30:08 +02:00
Neil Armstrong	590b4f10e3	drm/bridge: display-connector: implement bus fmts callbacks commit 7cd70656d1285b79c001f041a017fcfee4292ff9 upstream. Since this bridge is tied to the connector, it acts like a passthrough, so concerning the output & input bus formats, either pass the bus formats from the previous bridge or return fallback data like done in the bridge function: drm_atomic_bridge_chain_select_bus_fmts() & select_bus_fmt_recursive. This permits avoiding skipping the negociation if the remaining bridge chain has all the bits in place. Without this bus fmt negociation breaks on drm/meson HDMI pipeline when attaching dw-hdmi with DRM_BRIDGE_ATTACH_NO_CONNECTOR, because the last bridge of the display-connector doesn't implement buf fmt callbacks and MEDIA_BUS_FMT_FIXED is used leading to select an unsupported default bus format from dw-hdmi. Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20211020123947.2585572-2-narmstrong@baylibre.com Cc: Stefan Agner <stefan@agner.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-15 11:30:08 +02:00
Ionela Voinescu	e084c6ab37	arm64: errata: add detection for AMEVCNTR01 incrementing incorrectly commit e89d120c4b720e232cc6a94f0fcbd59c15d41489 upstream. The AMU counter AMEVCNTR01 (constant counter) should increment at the same rate as the system counter. On affected Cortex-A510 cores, AMEVCNTR01 increments incorrectly giving a significantly higher output value. This results in inaccurate task scheduler utilization tracking and incorrect feedback on CPU frequency. Work around this problem by returning 0 when reading the affected counter in key locations that results in disabling all users of this counter from using it either for frequency invariance or as FFH reference counter. This effect is the same to firmware disabling affected counters. Details on how the two features are affected by this erratum: - AMU counters will not be used for frequency invariance for affected CPUs and CPUs in the same cpufreq policy. AMUs can still be used for frequency invariance for unaffected CPUs in the system. Although unlikely, if no alternative method can be found to support frequency invariance for affected CPUs (cpufreq based or solution based on platform counters) frequency invariance will be disabled. Please check the chapter on frequency invariance at Documentation/scheduler/sched-capacity.rst for details of its effect. - Given that FFH can be used to fetch either the core or constant counter values, restrictions are lifted regarding any of these counters returning a valid (!0) value. Therefore FFH is considered supported if there is a least one CPU that support AMUs, independent of any counters being disabled or affected by this erratum. Clarifying comments are now added to the cpc_ffh_supported(), cpu_read_constcnt() and cpu_read_corecnt() functions. The above is achieved through adding a new erratum: ARM64_ERRATUM_2457168. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20220819103050.24211-1-ionela.voinescu@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-15 11:30:08 +02:00
Lu Baolu	4740910867	iommu/vt-d: Correctly calculate sagaw value of IOMMU commit 53fc7ad6edf210b497230ce74b61b322a202470c upstream. The Intel IOMMU driver possibly selects between the first-level and the second-level translation tables for DMA address translation. However, the levels of page-table walks for the 4KB base page size are calculated from the SAGAW field of the capability register, which is only valid for the second-level page table. This causes the IOMMU driver to stop working if the hardware (or the emulated IOMMU) advertises only first-level translation capability and reports the SAGAW field as 0. This solves the above problem by considering both the first level and the second level when calculating the supported page table levels. Fixes: b802d070a52a1 ("iommu/vt-d: Use iova over first level") Cc: stable@vger.kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220817023558.3253263-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-15 11:30:08 +02:00
Mark Brown	f9e792035a	arm64/bti: Disable in kernel BTI when cross section thunks are broken commit c0a454b9044fdc99486853aa424e5b3be2107078 upstream. GCC does not insert a `bti c` instruction at the beginning of a function when it believes that all callers reach the function through a direct branch[1]. Unfortunately the logic it uses to determine this is not sufficiently robust, for example not taking account of functions being placed in different sections which may be loaded separately, so we may still see thunks being generated to these functions. If that happens, the first instruction in the callee function will result in a Branch Target Exception due to the missing landing pad. While this has currently only been observed in the case of modules having their main code loaded sufficiently far from their init section to require thunks it could potentially happen for other cases so the safest thing is to disable BTI for the kernel when building with an affected toolchain. [1]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106671 Reported-by: D Scott Phillips <scott@os.amperecomputing.com> [Bits of the commit message are lifted from his report & workaround] Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220905142255.591990-1-broonie@kernel.org Cc: <stable@vger.kernel.org> # v5.10+ Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-15 11:30:07 +02:00
Sasha Levin	a8a007c5b1	Revert "arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"" This reverts commit add4bc9281e8704e5ab15616b429576c84f453a2. On Mon, Sep 12, 2022 at 10:52:45AM +0100, Catalin Marinas wrote: >I missed this (holidays) and it looks like it's in stable already. On >its own it will likely break kasan_hw if used together with user-space >MTE as this change relies on two previous commits: > >70c248aca9e7 ("mm: kasan: Skip unpoisoning of user pages") >6d05141a3930 ("mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON") > >The reason I did not cc stable is that there are other dependencies in >this area. The potential issues without the above commits were rather >theoretical, so take these patches rather as clean-ups/refactoring than >fixes. Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:07 +02:00
Eliav Farber	7aa57d869e	hwmon: (mr75203) enable polling for all VM channels [ Upstream commit e43212e0f55dc2d6b15d6c174cc0a64b25fab5e7 ] Configure ip-polling register to enable polling for all voltage monitor channels. This enables reading the voltage values for all inputs other than just input 0. Fixes: 9d823351a337 ("hwmon: Add hardware monitoring driver for Moortec MR75203 PVT controller") Signed-off-by: Eliav Farber <farbere@amazon.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20220908152449.35457-7-farbere@amazon.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:07 +02:00
Eliav Farber	5e0fddad71	hwmon: (mr75203) fix multi-channel voltage reading [ Upstream commit 91a9e063cdcfca8fe642b078d6fae4ce49187975 ] Fix voltage allocation and reading to support all channels in all VMs. Prior to this change allocation and reading were done only for the first channel in each VM. This change counts the total number of channels for allocation, and takes into account the channel offset when reading the sample data register. Fixes: 9d823351a337 ("hwmon: Add hardware monitoring driver for Moortec MR75203 PVT controller") Signed-off-by: Eliav Farber <farbere@amazon.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20220908152449.35457-6-farbere@amazon.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:07 +02:00
Eliav Farber	948b7beb00	hwmon: (mr75203) fix voltage equation for negative source input [ Upstream commit 227a3a2fc31d8e4bb9c88d4804e19530af245b1b ] According to Moortec Embedded Voltage Monitor (MEVM) series 3 data sheet, the minimum input signal is -100mv and maximum input signal is +1000mv. The equation used to convert the digital word to voltage uses mixed types (val signed and n unsigned), and on 64 bit machines also has different size, since sizeof(u32) = 4 and sizeof(long) = 8. So when measuring a negative input, n will be small enough, such that PVT_N_CONST n < PVT_R_CONST, and the result of (PVT_N_CONST * n - PVT_R_CONST) will overflow to a very big positive 32 bit number. Then when storing the result in val it will be the same value just in 64 bit (instead of it representing a negative number which will what happen when sizeof(long) = 4). When -1023 <= (PVT_N_CONST n - PVT_R_CONST) <= -1 dividing the number by 1024 should result of in 0, but because ">> 10" is used, and the sign bit is used to fill the vacated bit positions, it results in -1 (0xf...fffff) which is wrong. This change fixes the sign problem and supports negative values by casting n to long and replacing the shift right with div operation. Fixes: 9d823351a337 ("hwmon: Add hardware monitoring driver for Moortec MR75203 PVT controller") Signed-off-by: Eliav Farber <farbere@amazon.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20220908152449.35457-5-farbere@amazon.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:07 +02:00
Eliav Farber	a02267ebb2	hwmon: (mr75203) update pvt->v_num and vm_num to the actual number of used sensors [ Upstream commit bb9195bd6664d94d71647631593e09f705ff5edd ] This issue is relevant when "intel,vm-map" is set in device-tree, and defines a lower number of VMs than actually supported. This change is needed for all places that use pvt->v_num or vm_num later on in the code. Fixes: 9d823351a337 ("hwmon: Add hardware monitoring driver for Moortec MR75203 PVT controller") Signed-off-by: Eliav Farber <farbere@amazon.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20220908152449.35457-4-farbere@amazon.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:07 +02:00
Eliav Farber	000f335323	hwmon: (mr75203) fix VM sensor allocation when "intel,vm-map" not defined [ Upstream commit 81114fc3d27bf5b06b2137d2fd2b63da656a8b90 ] Bug - in case "intel,vm-map" is missing in device-tree ,'num' is set to 0, and no voltage channel infos are allocated. The reason num is set to 0 when "intel,vm-map" is missing is to set the entire pvt->vm_idx[] with incremental channel numbers, but it didn't take into consideration that same num is used later in devm_kcalloc(). If "intel,vm-map" does exist there is no need to set the unspecified channels with incremental numbers, because the unspecified channels can't be accessed in pvt_read_in() which is the only other place besides the probe functions that uses pvt->vm_idx[]. This change fixes the bug by moving the incremental channel numbers setting to be done only if "intel,vm-map" property is defined (starting loop from 0), and removing 'num = 0'. Fixes: 9d823351a337 ("hwmon: Add hardware monitoring driver for Moortec MR75203 PVT controller") Signed-off-by: Eliav Farber <farbere@amazon.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20220908152449.35457-3-farbere@amazon.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:07 +02:00
Alexander Gordeev	4b198c41d7	s390/boot: fix absolute zero lowcore corruption on boot [ Upstream commit 12dd19c159659ec9050f45dc8a2ff3c3917f4be3 ] Crash dump always starts on CPU0. In case CPU0 is offline the prefix page is not installed and the absolute zero lowcore is used. However, struct lowcore::mcesad is never assigned and stays zero. That leads to __machine_kdump() -> save_vx_regs() call silently stores vector registers to the absolute lowcore at 0x11b0 offset. Fixes: a62bc0739253 ("s390/kdump: add support for vector extension") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:07 +02:00
John Sperbeck	a557ae0942	iommu/amd: use full 64-bit value in build_completion_wait() [ Upstream commit 94a568ce32038d8ff9257004bb4632e60eb43a49 ] We started using a 64 bit completion value. Unfortunately, we only stored the low 32-bits, so a very large completion value would never be matched in iommu_completion_wait(). Fixes: c69d89aff393 ("iommu/amd: Use 4K page for completion wait write-back semaphore") Signed-off-by: John Sperbeck <jsperbeck@google.com> Link: https://lore.kernel.org/r/20220801192229.3358786-1-jsperbeck@google.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:07 +02:00
Chao Gao	4f8d658848	swiotlb: avoid potential left shift overflow [ Upstream commit 3f0461613ebcdc8c4073e235053d06d5aa58750f ] The second operand passed to slot_addr() is declared as int or unsigned int in all call sites. The left-shift to get the offset of a slot can overflow if swiotlb size is larger than 4G. Convert the macro to an inline function and declare the second argument as phys_addr_t to avoid the potential overflow. Fixes: 26a7e094783d ("swiotlb: refactor swiotlb_tbl_map_single") Signed-off-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:07 +02:00
Przemyslaw Patynowski	df82f5ce4f	i40e: Fix ADQ rate limiting for PF [ Upstream commit 45bb006d3c924b1201ed43c87a96b437662dcaa8 ] Fix HW rate limiting for ADQ. Fallback to kernel queue selection for ADQ, as it is network stack that decides which queue to use for transmit with ADQ configured. Reset PF after creation of VMDq2 VSIs required for ADQ, as to reprogram TX queue contexts in i40e_configure_tx_ring. Without this patch PF would limit TX rate only according to TC0. Fixes: a9ce82f744dc ("i40e: Enable 'channel' mode in mqprio for TC configs") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:07 +02:00
Przemyslaw Patynowski	39d9de5872	i40e: Refactor tc mqprio checks [ Upstream commit 2313e69c84c024a85d017a60ae925085de47530a ] Refactor bitwise checks for whether TC MQPRIO is enabled into one single method for improved readability. Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Stable-dep-of: 45bb006d3c92 ("i40e: Fix ADQ rate limiting for PF") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:07 +02:00
Masahiro Yamada	657d9d8ac3	kbuild: disable header exports for UML in a straightforward way [ Upstream commit 1b620d539ccc18a1aca1613d9ff078115a7891a1 ] Previously 'make ARCH=um headers' stopped because of missing arch/um/include/uapi/asm/Kbuild. The error is not shown since commit ed102bf2afed ("um: Fix W=1 missing-include-dirs warnings") added arch/um/include/uapi/asm/Kbuild. Hard-code the unsupported architecture, so it works like before. Fixes: ed102bf2afed ("um: Fix W=1 missing-include-dirs warnings") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:07 +02:00
Yang Ling	12202e0f74	MIPS: loongson32: ls1c: Fix hang during startup [ Upstream commit 35508d2424097f9b6a1a17aac94f702767035616 ] The RTCCTRL reg of LS1C is obselete. Writing this reg will cause system hang. Fixes: 60219c563c9b6 ("MIPS: Add RTC support for Loongson1C board") Signed-off-by: Yang Ling <gnaygnil@gmail.com> Tested-by: Keguang Zhang <keguang.zhang@gmail.com> Acked-by: Keguang Zhang <keguang.zhang@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:07 +02:00
Nathan Chancellor	166ae43f02	ASoC: mchp-spdiftx: Fix clang -Wbitfield-constant-conversion commit 5c5c2baad2b55cc0a4b190266889959642298f79 upstream. A recent change in clang strengthened its -Wbitfield-constant-conversion to warn when 1 is assigned to a 1-bit signed integer bitfield, as it can only be 0 or -1, not 1: sound/soc/atmel/mchp-spdiftx.c:505:20: error: implicit truncation from 'int' to bit-field changes value from 1 to -1 [-Werror,-Wbitfield-constant-conversion] dev->gclk_enabled = 1; ^ ~ 1 error generated. The actual value of the field is never checked, just that it is not zero, so there is not a real bug here. However, it is simple enough to silence the warning by making the bitfield unsigned, which matches the mchp-spdifrx driver. Fixes: 06ca24e98e6b ("ASoC: mchp-spdiftx: add driver for S/PDIF TX Controller") Link: https://github.com/ClangBuiltLinux/linux/issues/1686 Link: `82afc9b169` Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20220810010809.2024482-1-nathan@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-15 11:30:07 +02:00
Claudiu Beznea	4643fbc79d	ASoC: mchp-spdiftx: remove references to mchp_i2s_caps commit 403fcb5118a0f4091001a537e76923031fb45eaf upstream. Remove references to struct mchp_i2s_caps as they are not used. Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20220727090814.2446111-3-claudiu.beznea@microchip.com Signed-off-by: Mark Brown <broonie@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-15 11:30:06 +02:00
Alexandru Gagniuc	30a455ac38	hwmon: (tps23861) fix byte order in resistance register commit 1f05f65bddd6958d25b133f886da49c1d4bff3fa upstream. The tps23861 registers are little-endian, and regmap_read_bulk() does not do byte order conversion. On BE machines, the bytes were swapped, and the interpretation of the resistance value was incorrect. To make it work on both big and little-endian machines, use le16_to_cpu() to convert the resitance register to host byte order. Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> Fixes: fff7b8ab22554 ("hwmon: add Texas Instruments TPS23861 driver") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220905142806.110598-1-mr.nuke.me@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-15 11:30:06 +02:00
Zhengjun Xing	159d35a87e	perf script: Fix Cannot print 'iregs' field for hybrid systems [ Upstream commit 82b2425fad2dd47204b3da589b679220f8aacc0e ] Commit b91e5492f9d7ca89 ("perf record: Add a dummy event on hybrid systems to collect metadata records") adds a dummy event on hybrid systems to fix the symbol "unknown" issue when the workload is created in a P-core but runs on an E-core. The added dummy event will cause "perf script -F iregs" to fail. Dummy events do not have "iregs" attribute set, so when we do evsel__check_attr, the "iregs" attribute check will fail, so the issue happened. The following commit [1] has fixed a similar issue by skipping the attr check for the dummy event because it does not have any samples anyway. It works okay for the normal mode, but the issue still happened when running the test in the pipe mode. In the pipe mode, it calls process_attr() which still checks the attr for the dummy event. This commit fixed the issue by skipping the attr check for the dummy event in the API evsel__check_attr, Otherwise, we have to patch everywhere when evsel__check_attr() is called. Before: #./perf record -o - --intr-regs=di,r8,dx,cx -e br_inst_retired.near_call:p -c 1000 --per-thread true 2>/dev/null\|./perf script -F iregs \|head -5 Samples for 'dummy:HG' event do not have IREGS attribute set. Cannot print 'iregs' field. 0x120 [0x90]: failed to process type: 64 # After: # ./perf record -o - --intr-regs=di,r8,dx,cx -e br_inst_retired.near_call:p -c 1000 --per-thread true 2>/dev/null\|./perf script -F iregs \|head -5 ABI:2 CX:0x55b8efa87000 DX:0x55b8efa7e000 DI:0xffffba5e625efbb0 R8:0xffff90e51f8ae100 ABI:2 CX:0x7f1dae1e4000 DX:0xd0 DI:0xffff90e18c675ac0 R8:0x71 ABI:2 CX:0xcc0 DX:0x1 DI:0xffff90e199880240 R8:0x0 ABI:2 CX:0xffff90e180dd7500 DX:0xffff90e180dd7500 DI:0xffff90e180043500 R8:0x1 ABI:2 CX:0x50 DX:0xffff90e18c583bd0 DI:0xffff90e1998803c0 R8:0x58 # [1]https://lore.kernel.org/lkml/20220831124041.219925-1-jolsa@kernel.org/ Fixes: b91e5492f9d7ca89 ("perf record: Add a dummy event on hybrid systems to collect metadata records") Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220908070030.3455164-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:06 +02:00
Toke Høiland-Jørgensen	4519d4e32f	sch_sfb: Also store skb len before calling child enqueue [ Upstream commit 2f09707d0c972120bf794cfe0f0c67e2c2ddb252 ] Cong Wang noticed that the previous fix for sch_sfb accessing the queued skb after enqueueing it to a child qdisc was incomplete: the SFB enqueue function was also calling qdisc_qstats_backlog_inc() after enqueue, which reads the pkt len from the skb cb field. Fix this by also storing the skb len, and using the stored value to increment the backlog after enqueueing. Fixes: 9efd23297cca ("sch_sfb: Don't assume the skb is still around after enqueueing to child") Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Acked-by: Cong Wang <cong.wang@bytedance.com> Link: https://lore.kernel.org/r/20220905192137.965549-1-toke@toke.dk Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:06 +02:00
Sindhu-Devale	a600a9baba	RDMA/irdma: Report RNR NAK generation in device caps [ Upstream commit a261786fdc0a5bed2e5f994dcc0ffeeeb0d662c7 ] Report RNR NAK generation when device capabilities are queried Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-6-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:06 +02:00
Sindhu-Devale	3ca173b217	RDMA/irdma: Return correct WC error for bind operation failure [ Upstream commit dcb23bbb1de7e009875fdfac2b8a9808a9319cc6 ] When a QP and a MR on a local host are in different PDs, the HW generates an asynchronous event (AE). The same AE is generated when a QP and a MW are in different PDs during a bind operation. Return the more appropriate IBV_WC_MW_BIND_ERR for the latter case by checking the OP type from the CQE in error. Fixes: 551c46edc769 ("RDMA/irdma: Add user/kernel shared libraries") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:06 +02:00
Sindhu-Devale	c1872dfde6	RDMA/irdma: Report the correct max cqes from query device [ Upstream commit 12faad5e5cf2372af2d51f348b697b5edf838daf ] Report the correct max cqes available to an application taking into account a reserved entry to detect overflow. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:06 +02:00
Dennis Maisenbacher	a1d7c8647c	nvmet: fix mar and mor off-by-one errors [ Upstream commit b7e97872a65e1d57b4451769610554c131f37a0a ] Maximum Active Resources (MAR) and Maximum Open Resources (MOR) are 0's based vales where a value of 0xffffffff indicates that there is no limit. Decrement the values that are returned by bdev_max_open_zones and bdev_max_active_zones as the block layer helpers are not 0's based. A 0 returned by the block layer helpers indicates no limit, thus convert it to 0xffffffff (U32_MAX). Fixes: aaf2e048af27 ("nvmet: add ZBD over ZNS backend support") Suggested-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Dennis Maisenbacher <dennis.maisenbacher@wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:06 +02:00
Neal Cardwell	a96b1d33ec	tcp: fix early ETIMEDOUT after spurious non-SACK RTO [ Upstream commit 686dc2db2a0fdc1d34b424ec2c0a735becd8d62b ] Fix a bug reported and analyzed by Nagaraj Arankal, where the handling of a spurious non-SACK RTO could cause a connection to fail to clear retrans_stamp, causing a later RTO to very prematurely time out the connection with ETIMEDOUT. Here is the buggy scenario, expanding upon Nagaraj Arankal's excellent report: (1) Send one data packet on a non-SACK connection (2) Because no ACK packet is received, the packet is retransmitted and we enter CA_Loss; but this retransmission is spurious. (3) The ACK for the original data is received. The transmitted packet is acknowledged. The TCP timestamp is before the retrans_stamp, so tcp_may_undo() returns true, and tcp_try_undo_loss() returns true without changing state to Open (because tcp_is_sack() is false), and tcp_process_loss() returns without calling tcp_try_undo_recovery(). Normally after undoing a CA_Loss episode, tcp_fastretrans_alert() would see that the connection has returned to CA_Open and fall through and call tcp_try_to_open(), which would set retrans_stamp to 0. However, for non-SACK connections we hold the connection in CA_Loss, so do not fall through to call tcp_try_to_open() and do not set retrans_stamp to 0. So retrans_stamp is (erroneously) still non-zero. At this point the first "retransmission event" has passed and been recovered from. Any future retransmission is a completely new "event". However, retrans_stamp is erroneously still set. (And we are still in CA_Loss, which is correct.) (4) After 16 minutes (to correspond with tcp_retries2=15), a new data packet is sent. Note: No data is transmitted between (3) and (4) and we disabled keep alives. The socket's timeout SHOULD be calculated from this point in time, but instead it's calculated from the prior "event" 16 minutes ago (step (2)). (5) Because no ACK packet is received, the packet is retransmitted. (6) At the time of the 2nd retransmission, the socket returns ETIMEDOUT, prematurely, because retrans_stamp is (erroneously) too far in the past (set at the time of (2)). This commit fixes this bug by ensuring that we reuse in tcp_try_undo_loss() the same careful logic for non-SACK connections that we have in tcp_try_undo_recovery(). To avoid duplicating logic, we factor out that logic into a new tcp_is_non_sack_preventing_reopen() helper and call that helper from both undo functions. Fixes: da34ac7626b5 ("tcp: only undo on partial ACKs in CA_Loss") Reported-by: Nagaraj Arankal <nagaraj.p.arankal@hpe.com> Link: https://lore.kernel.org/all/SJ0PR84MB1847BE6C24D274C46A1B9B0EB27A9@SJ0PR84MB1847.NAMPRD84.PROD.OUTLOOK.COM/ Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220903121023.866900-1-ncardwell.kernel@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:06 +02:00
Sagi Grimberg	8589bbfad2	nvme-tcp: fix regression that causes sporadic requests to time out [ Upstream commit 3770a42bb8ceb856877699257a43c0585a5d2996 ] When we queue requests, we strive to batch as much as possible and also signal the network stack that more data is about to be sent over a socket with MSG_SENDPAGE_NOTLAST. This flag looks at the pending requests queued as well as queue->more_requests that is derived from the block layer last-in-batch indication. We set more_request=true when we flush the request directly from .queue_rq submission context (in nvme_tcp_send_all), however this is wrongly assuming that no other requests may be queued during the execution of nvme_tcp_send_all. Due to this, a race condition may happen where: 1. request X is queued as !last-in-batch 2. request X submission context calls nvme_tcp_send_all directly 3. nvme_tcp_send_all is preempted and schedules to a different cpu 4. request Y is queued as last-in-batch 5. nvme_tcp_send_all context sends request X+Y, however signals for both MSG_SENDPAGE_NOTLAST because queue->more_requests=true. ==> none of the requests is pushed down to the wire as the network stack is waiting for more data, both requests timeout. To fix this, we eliminate queue->more_requests and only rely on the queue req_list and send_list to be not-empty. Fixes: 122e5b9f3d37 ("nvme-tcp: optimize network stack with setting msg flags according to batch size") Reported-by: Jonathan Nicklin <jnicklin@blockbridge.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Jonathan Nicklin <jnicklin@blockbridge.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:06 +02:00
Sagi Grimberg	13c80a6c11	nvme-tcp: fix UAF when detecting digest errors [ Upstream commit 160f3549a907a50e51a8518678ba2dcf2541abea ] We should also bail from the io_work loop when we set rd_enabled to true, so we don't attempt to read data from the socket when the TCP stream is already out-of-sync or corrupted. Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver") Reported-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:06 +02:00
Gao Xiang	8ddd001cef	erofs: fix pcluster use-after-free on UP platforms [ Upstream commit 2f44013e39984c127c6efedf70e6b5f4e9dcf315 ] During stress testing with CONFIG_SMP disabled, KASAN reports as below: ================================================================== BUG: KASAN: use-after-free in __mutex_lock+0xe5/0xc30 Read of size 8 at addr ffff8881094223f8 by task stress/7789 CPU: 0 PID: 7789 Comm: stress Not tainted 6.0.0-rc1-00002-g0d53d2e882f9 #3 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Call Trace: <TASK> .. __mutex_lock+0xe5/0xc30 .. z_erofs_do_read_page+0x8ce/0x1560 .. z_erofs_readahead+0x31c/0x580 .. Freed by task 7787 kasan_save_stack+0x1e/0x40 kasan_set_track+0x20/0x30 kasan_set_free_info+0x20/0x40 __kasan_slab_free+0x10c/0x190 kmem_cache_free+0xed/0x380 rcu_core+0x3d5/0xc90 __do_softirq+0x12d/0x389 Last potentially related work creation: kasan_save_stack+0x1e/0x40 __kasan_record_aux_stack+0x97/0xb0 call_rcu+0x3d/0x3f0 erofs_shrink_workstation+0x11f/0x210 erofs_shrink_scan+0xdc/0x170 shrink_slab.constprop.0+0x296/0x530 drop_slab+0x1c/0x70 drop_caches_sysctl_handler+0x70/0x80 proc_sys_call_handler+0x20a/0x2f0 vfs_write+0x555/0x6c0 ksys_write+0xbe/0x160 do_syscall_64+0x3b/0x90 The root cause is that erofs_workgroup_unfreeze() doesn't reset to orig_val thus it causes a race that the pcluster reuses unexpectedly before freeing. Since UP platforms are quite rare now, such path becomes unnecessary. Let's drop such specific-designed path directly instead. Fixes: 73f5c66df3e2 ("staging: erofs: fix `erofs_workgroup_{try_to_freeze, unfreeze}'") Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20220902045710.109530-1-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:06 +02:00
Chris Mi	5fbe35c94a	RDMA/mlx5: Set local port to one when accessing counters [ Upstream commit 74b30b3ad5cec95d2647e796d10137438a098bc1 ] When accessing Ports Performance Counters Register (PPCNT), local port must be one if it is Function-Per-Port HCA that HCA_CAP.num_ports is 1. The offending patch can change the local port to other values when accessing PPCNT after enabling switchdev mode. The following syndrome will be printed: # cat /sys/class/infiniband/rdmap4s0f0/ports/2/counters/* # dmesg mlx5_core 0000:04:00.0: mlx5_cmd_check:756:(pid 12450): ACCESS_REG(0x805) op_mod(0x1) failed, status bad parameter(0x3), syndrome (0x1e5585) Fix it by setting local port to one for Function-Per-Port HCA. Fixes: 210b1f78076f ("IB/mlx5: When not in dual port RoCE mode, use provided port as native") Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Chris Mi <cmi@nvidia.com> Link: https://lore.kernel.org/r/6c5086c295c76211169e58dbd610fb0402360bab.1661763459.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:06 +02:00
Yishai Hadas	819110054b	IB/core: Fix a nested dead lock as part of ODP flow [ Upstream commit 85eaeb5058f0f04dffb124c97c86b4f18db0b833 ] Fix a nested dead lock as part of ODP flow by using mmput_async(). From the below call trace [1] can see that calling mmput() once we have the umem_odp->umem_mutex locked as required by ib_umem_odp_map_dma_and_lock() might trigger in the same task the exit_mmap()->__mmu_notifier_release()->mlx5_ib_invalidate_range() which may dead lock when trying to lock the same mutex. Moving to use mmput_async() will solve the problem as the above exit_mmap() flow will be called in other task and will be executed once the lock will be available. [1] [64843.077665] task:kworker/u133:2 state:D stack: 0 pid:80906 ppid: 2 flags:0x00004000 [64843.077672] Workqueue: mlx5_ib_page_fault mlx5_ib_eqe_pf_action [mlx5_ib] [64843.077719] Call Trace: [64843.077722] <TASK> [64843.077724] __schedule+0x23d/0x590 [64843.077729] schedule+0x4e/0xb0 [64843.077735] schedule_preempt_disabled+0xe/0x10 [64843.077740] __mutex_lock.constprop.0+0x263/0x490 [64843.077747] __mutex_lock_slowpath+0x13/0x20 [64843.077752] mutex_lock+0x34/0x40 [64843.077758] mlx5_ib_invalidate_range+0x48/0x270 [mlx5_ib] [64843.077808] __mmu_notifier_release+0x1a4/0x200 [64843.077816] exit_mmap+0x1bc/0x200 [64843.077822] ? walk_page_range+0x9c/0x120 [64843.077828] ? __cond_resched+0x1a/0x50 [64843.077833] ? mutex_lock+0x13/0x40 [64843.077839] ? uprobe_clear_state+0xac/0x120 [64843.077860] mmput+0x5f/0x140 [64843.077867] ib_umem_odp_map_dma_and_lock+0x21b/0x580 [ib_core] [64843.077931] pagefault_real_mr+0x9a/0x140 [mlx5_ib] [64843.077962] pagefault_mr+0xb4/0x550 [mlx5_ib] [64843.077992] pagefault_single_data_segment.constprop.0+0x2ac/0x560 [mlx5_ib] [64843.078022] mlx5_ib_eqe_pf_action+0x528/0x780 [mlx5_ib] [64843.078051] process_one_work+0x22b/0x3d0 [64843.078059] worker_thread+0x53/0x410 [64843.078065] ? process_one_work+0x3d0/0x3d0 [64843.078073] kthread+0x12a/0x150 [64843.078079] ? set_kthread_struct+0x50/0x50 [64843.078085] ret_from_fork+0x22/0x30 [64843.078093] </TASK> Fixes: 36f30e486dce ("IB/core: Improve ODP to use hmm_range_fault()") Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/74d93541ea533ef7daec6f126deb1072500aeb16.1661251841.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:06 +02:00
David Lebrun	55195563ec	ipv6: sr: fix out-of-bounds read when setting HMAC data. [ Upstream commit 84a53580c5d2138c7361c7c3eea5b31827e63b35 ] The SRv6 layer allows defining HMAC data that can later be used to sign IPv6 Segment Routing Headers. This configuration is realised via netlink through four attributes: SEG6_ATTR_HMACKEYID, SEG6_ATTR_SECRET, SEG6_ATTR_SECRETLEN and SEG6_ATTR_ALGID. Because the SECRETLEN attribute is decoupled from the actual length of the SECRET attribute, it is possible to provide invalid combinations (e.g., secret = "", secretlen = 64). This case is not checked in the code and with an appropriately crafted netlink message, an out-of-bounds read of up to 64 bytes (max secret length) can occur past the skb end pointer and into skb_shared_info: Breakpoint 1, seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208 208 memcpy(hinfo->secret, secret, slen); (gdb) bt #0 seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208 #1 0xffffffff81e012e9 in genl_family_rcv_msg_doit (skb=skb@entry=0xffff88800b1f9f00, nlh=nlh@entry=0xffff88800b1b7600, extack=extack@entry=0xffffc90000ba7af0, ops=ops@entry=0xffffc90000ba7a80, hdrlen=4, net=0xffffffff84237580 <init_net>, family=<optimized out>, family=<optimized out>) at net/netlink/genetlink.c:731 #2 0xffffffff81e01435 in genl_family_rcv_msg (extack=0xffffc90000ba7af0, nlh=0xffff88800b1b7600, skb=0xffff88800b1f9f00, family=0xffffffff82fef6c0 <seg6_genl_family>) at net/netlink/genetlink.c:775 #3 genl_rcv_msg (skb=0xffff88800b1f9f00, nlh=0xffff88800b1b7600, extack=0xffffc90000ba7af0) at net/netlink/genetlink.c:792 #4 0xffffffff81dfffc3 in netlink_rcv_skb (skb=skb@entry=0xffff88800b1f9f00, cb=cb@entry=0xffffffff81e01350 <genl_rcv_msg>) at net/netlink/af_netlink.c:2501 #5 0xffffffff81e00919 in genl_rcv (skb=0xffff88800b1f9f00) at net/netlink/genetlink.c:803 #6 0xffffffff81dff6ae in netlink_unicast_kernel (ssk=0xffff888010eec800, skb=0xffff88800b1f9f00, sk=0xffff888004aed000) at net/netlink/af_netlink.c:1319 #7 netlink_unicast (ssk=ssk@entry=0xffff888010eec800, skb=skb@entry=0xffff88800b1f9f00, portid=portid@entry=0, nonblock=<optimized out>) at net/netlink/af_netlink.c:1345 #8 0xffffffff81dff9a4 in netlink_sendmsg (sock=<optimized out>, msg=0xffffc90000ba7e48, len=<optimized out>) at net/netlink/af_netlink.c:1921 ... (gdb) p/x ((struct sk_buff )0xffff88800b1f9f00)->head + ((struct sk_buff )0xffff88800b1f9f00)->end $1 = 0xffff88800b1b76c0 (gdb) p/x secret $2 = 0xffff88800b1b76c0 (gdb) p slen $3 = 64 '@' The OOB data can then be read back from userspace by dumping HMAC state. This commit fixes this by ensuring SECRETLEN cannot exceed the actual length of SECRET. Reported-by: Lucas Leong <wmliang.tw@gmail.com> Tested: verified that EINVAL is correctly returned when secretlen > len(secret) Fixes: 4f4853dc1c9c1 ("ipv6: sr: implement API to control SR HMAC structure") Signed-off-by: David Lebrun <dlebrun@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:06 +02:00
Linus Walleij	4c4eda1364	RDMA/siw: Pass a pointer to virt_to_page() [ Upstream commit 0d1b756acf60da5004c1e20ca4462f0c257bf6e1 ] Functions that work on a pointer to virtual memory such as virt_to_pfn() and users of that function such as virt_to_page() are supposed to pass a pointer to virtual memory, ideally a (void ) or other pointer. However since many architectures implement virt_to_pfn() as a macro, this function becomes polymorphic and accepts both a (unsigned long) and a (void ). If we instead implement a proper virt_to_pfn(void addr) function the following happens (occurred on arch/arm): drivers/infiniband/sw/siw/siw_qp_tx.c:32:23: warning: incompatible integer to pointer conversion passing 'dma_addr_t' (aka 'unsigned int') to parameter of type 'const void ' [-Wint-conversion] drivers/infiniband/sw/siw/siw_qp_tx.c:32:37: warning: passing argument 1 of 'virt_to_pfn' makes pointer from integer without a cast [-Wint-conversion] drivers/infiniband/sw/siw/siw_qp_tx.c:538:36: warning: incompatible integer to pointer conversion passing 'unsigned long long' to parameter of type 'const void ' [-Wint-conversion] Fix this with an explicit cast. In one case where the SIW SGE uses an unaligned u64 we need a double cast modifying the virtual address (va) to a platform-specific uintptr_t before casting to a (void ). Fixes: b9be6f18cf9e ("rdma/siw: transmit path") Cc: linux-rdma@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20220902215918.603761-1-linus.walleij@linaro.org Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:05 +02:00
Paul Durrant	595e3616f8	xen-netback: only remove 'hotplug-status' when the vif is actually destroyed [ Upstream commit c55f34b6aec2a8cb47eadaffea773e83bf85de91 ] Removing 'hotplug-status' in backend_disconnected() means that it will be removed even in the case that the frontend unilaterally disconnects (which it is free to do at any time). The consequence of this is that, when the frontend attempts to re-connect, the backend gets stuck in 'InitWait' rather than moving straight to 'Connected' (which it can do because the hotplug script has already run). Instead, the 'hotplug-status' mode should be removed in netback_remove() i.e. when the vif really is going away. Fixes: 0f4558ae9187 ("Revert "xen-netback: remove 'hotplug-status' once it has served its purpose"") Signed-off-by: Paul Durrant <pdurrant@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:05 +02:00
Ivan Vecera	c3efe896f1	iavf: Detach device during reset task [ Upstream commit aa626da947e9cd30c4cf727493903e1adbb2c0a0 ] iavf_reset_task() takes crit_lock at the beginning and holds it during whole call. The function subsequently calls iavf_init_interrupt_scheme() that grabs RTNL. Problem occurs when userspace initiates during the reset task any ndo callback that runs under RTNL like iavf_open() because some of that functions tries to take crit_lock. This leads to classic A-B B-A deadlock scenario. To resolve this situation the device should be detached in iavf_reset_task() prior taking crit_lock to avoid subsequent ndos running under RTNL and reattach the device at the end. Fixes: 62fe2a865e6d ("i40evf: add missing rtnl_lock() around i40evf_set_interrupt_capability") Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Patryk Piotrowski <patryk.piotrowski@intel.com> Cc: SlawomirX Laba <slawomirx.laba@intel.com> Tested-by: Vitaly Grinberg <vgrinber@redhat.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:05 +02:00

1 2 3 4 5 ...

1055424 Commits